Xiangci Li

I am a final-year Ph.D. student at UT Dallas, actively looking for full-time positions as a natural language processing scientist or engineer. My research interest is natural language processing, specifically knowledge-intensive NLP, including scholarly document processing, dialogue generation, and fact verification. My advisor is Dr. Jessica Ouyang.

李向磁

NLP Researcher, Ph.D. Student

  • University: UT Dallas
  • Major: Computer Science
  • Location: Dallas, TX, USA
  • Expected Degree: Ph.D.
  • Role: Student Researcher at Google
  • Email: lixiangci8 AT gmail DOT com

I have had five internships, at Google, Amazon, Tencent, Baidu, and the Chan Zuckerberg Initiative. Previously, I was a master's student at the University of Southern California (USC) and a research assistant in Prof. Nanyun Peng's PLUS Lab at the Information Sciences Institute (now moved to UCLA). Before that, I was a full-time computational neuroscience research assistant at the Erlich Lab at New York University Shanghai, where I graduated with a Bachelor's degree in computer science and neuroscience.

Publications

A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation

Xiangci Li, Linfeng Song, Haitao Mi, Lifeng Jin, Jessica Ouyang & Dong Yu

  • Conference: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
  • Publication Date: 2024/5
  • Institution: Tencent America
  • TLDR: We build a multi-source Wizard of Wikipedia (Ms.WoW) dataset as a test bed for multi-source open-domain dialogue generation and propose a challenge called dialogue knowledge plug-and-play to test a trained model's adaptability to newly available knowledge sources.

Contextualizing Generated Citation Texts

Biswadip Mandal, Xiangci Li & Jessica Ouyang

  • Conference: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
  • Publication Date: 2024/5
  • Institution: University of Texas at Dallas
  • TLDR: We show the benefit of generating citation contexts along with target citation texts.

Minimal Evidence Group Identification for Fact Verification

Xiangci Li, Sihao Chen, Rajvi Kapadia & Fan Zhang

  • Institution: Google
  • Year: 2024
  • TLDR: We propose a novel task called minimal evidence group identification to address fact-verification with multiple plausible sets of fully or partially supporting evidence.

Wizard of Shopping: Target-Oriented E-commerce Dialogue Generation with Decision Tree Branching

Xiangci Li, Zhiyu Chen, Jason Ingyu Choi, Nikhita Vedura, Besnik Fetahu, Oleg Rokhlenko & Shervin Malmasi

  • Institution: Amazon
  • Year: 2023
  • TLDR: We propose a shopping dialogue generation approach using decision trees and large language models, which greatly improves downstream conversational product search performance.
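A rough sketch of the decision-tree branching idea (not the paper's actual implementation): a tree over product attributes can be walked to emit one clarifying-question turn per branching decision. The tree shape, attribute names, and wording below are all invented for illustration.

```python
# Toy sketch: turning a product-attribute decision tree into dialogue turns.
# The tree, attributes, and question wording are invented for illustration.

def tree_to_dialogue(tree, choices):
    """Walk a nested-dict decision tree, emitting one (question, answer)
    turn per branching decision until a leaf (a product) is reached."""
    turns = []
    node = tree
    while isinstance(node, dict):
        attribute = node["ask"]
        answer = choices[attribute]
        turns.append((f"What {attribute} would you like?", answer))
        node = node["branches"][answer]
    turns.append(("Here is a match:", node))
    return turns

toy_tree = {
    "ask": "category",
    "branches": {
        "shoes": {
            "ask": "color",
            "branches": {"red": "Red Runner X", "blue": "Blue Walker Y"},
        },
        "hats": "Plain Cap Z",
    },
}

dialogue = tree_to_dialogue(toy_tree, {"category": "shoes", "color": "red"})
```

Each root-to-leaf path yields one synthetic dialogue, so a single tree can branch into many training conversations.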

Explaining Relationships Among Research Papers

Xiangci Li & Jessica Ouyang

  • arXiv: 2402.13426
  • Institution: University of Texas at Dallas
  • TLDR: We explore literature review generation with large language models.

Cited Text Spans for Citation Text Generation

Xiangci Li, Yi-Hui Lee & Jessica Ouyang

  • arXiv: 2309.06365
  • Institution: University of Texas at Dallas
  • TLDR: We show that distantly retrieved cited text spans greatly improve citation text generation.

CORWA: A Citation Oriented Related Work Annotation Dataset

Xiangci Li, Biswadip Mandal, Jessica Ouyang

  • Conference: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2022)
  • Publication Date: 2022/7/10
  • Institution: University of Texas at Dallas
  • TLDR: We collect a linguistically motivated, citation-span-based dataset for related work generation. We develop a strong baseline model to automatically annotate unlabeled related work sections. We propose a new task, citation span generation. Finally, we sketch a big-picture vision for future related work generation systems.
  • Resources: Video, Repository

Automatic Related Work Generation: A Meta Study

Xiangci Li, Jessica Ouyang

  • arXiv: 2201.01880
  • Upload Date: 2022/1
  • Institution: University of Texas at Dallas
  • TLDR: We survey prior work on the related work generation task, along with selected prior work on other relevant tasks. We point out the limitations of existing work and suggest new directions to explore.

CASPR: A Commonsense Reasoning-based Conversational Socialbot

Kinjal Basu, Huaduo Wang, Nancy Dominguez, Xiangci Li, Fang Li, Sarat Chandra Varanasi, Gopal Gupta

  • Venue: Alexa Prize Socialbot Grand Challenge 4 Proceedings
  • Publication Date: 2021/7
  • Institution: University of Texas at Dallas
  • TLDR: We report on the design and development of the CASPR system, a socialbot designed to compete in the Amazon Alexa Socialbot Challenge 4.

Scientific Discourse Tagging for Evidence Extraction

Xiangci Li, Gully Burns, Nanyun Peng

  • Conference: The 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021)
  • Publication Date: 2021/4/19
  • Institution: University of Southern California, Information Sciences Institute
  • TLDR: We develop a state-of-the-art model for scientific discourse tagging and demonstrate its strong performance and transferability across several datasets. We then demonstrate the benefit of leveraging scientific discourse tags on downstream tasks, using claim extraction and evidence fragment detection as two showcases.
  • Resources: Video, Code, Poster

A Paragraph-level Multi-task Learning Model for Scientific Fact-Verification

Xiangci Li, Gully Burns, Nanyun Peng

  • Conference: The AAAI-21 Workshop on Scientific Document Understanding
  • Publication Date: 2021/2/9
  • Institution: University of Southern California, Information Sciences Institute
  • TLDR: We propose a novel, paragraph-level, multi-task learning model for a scientific claim verification task (SciFact), directly computing a sequence of contextualized sentence embeddings from a BERT model and jointly training the model on rationale selection and stance prediction.
  • Resources: Video, Code
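A minimal sketch of the joint-training objective (not the paper's code; the shapes, toy logits, and the 1.0 default weight are assumptions for illustration): per-sentence rationale-selection losses and a paragraph-level stance loss are combined into one training signal.

```python
import math

def softmax_nll(logits, gold):
    """Negative log-likelihood of the gold class under softmax(logits)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[gold]

def joint_loss(rationale_logits, rationale_gold, stance_logits, stance_gold,
               rationale_weight=1.0):
    """Multi-task loss: average per-sentence rationale-selection loss
    (is this sentence a rationale?) plus paragraph-level stance loss,
    combined with a tunable weight (1.0 here is an assumed default)."""
    rationale_loss = sum(
        softmax_nll(l, g) for l, g in zip(rationale_logits, rationale_gold)
    ) / len(rationale_logits)
    stance_loss = softmax_nll(stance_logits, stance_gold)
    return rationale_weight * rationale_loss + stance_loss

# Toy example: two sentences' binary rationale logits, one 3-way stance.
loss = joint_loss([[2.0, -1.0], [0.0, 3.0]], [0, 1], [1.0, 0.5, -2.0], 0)
```

In the actual model, both heads would read contextualized sentence embeddings from a shared BERT encoder, so the summed loss trains one set of parameters for both tasks.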

Context-aware Stand-alone Neural Spelling Correction

Xiangci Li, Hairong Liu, Liang Huang

  • Conference: Findings of the Association for Computational Linguistics: EMNLP 2020
  • Publication Date: 2020/11/16
  • Institution: Baidu USA
  • TLDR: We present a simple yet powerful solution that jointly detects and corrects misspellings as a sequence labeling task by fine-tuning a pre-trained language model.
  • Resources: Code
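The sequence-labeling framing can be sketched as follows (a toy hand-made confusion dictionary stands in for the fine-tuned language model the paper actually uses): each input token gets one label, either KEEP or its corrected form, so detection and correction happen in a single tagging pass.

```python
# Toy sketch of spelling correction as sequence labeling: one label per
# input token, either "KEEP" or the corrected word. A real system would
# predict these labels with a fine-tuned language model; here a small
# hand-made confusion dictionary stands in for the model.

CORRECTIONS = {"teh": "the", "recieve": "receive", "adress": "address"}

def label_tokens(tokens):
    """Tag each token: 'KEEP' if it looks fine, else its correction."""
    return [CORRECTIONS.get(t.lower(), "KEEP") for t in tokens]

def apply_labels(tokens, labels):
    """Decode the tag sequence back into corrected text."""
    return [t if lab == "KEEP" else lab for t, lab in zip(tokens, labels)]

tokens = "I will recieve teh package".split()
labels = label_tokens(tokens)
corrected = apply_labels(tokens, labels)
# corrected == ["I", "will", "receive", "the", "package"]
```

Framing the problem this way keeps input and output aligned token by token, which is what makes a standard sequence-labeling head on a pre-trained encoder applicable.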

Building deep learning models for evidence classification from the open access biomedical literature

Gully Burns, Xiangci Li, Nanyun Peng

  • Journal: Database
  • Publication Year: 2019
  • Institution: University of Southern California, Information Sciences Institute
  • TLDR: We apply deep learning to experiment-type classification of biomedical experimental descriptions.
  • Resources: Code

Neural and computational mechanisms for task switching

Xiangci Li*, Chunyu Duan*, Ce Ma, Carlos Brody, Zheng Zhang, Jeffrey Erlich

  • Institution: New York University Shanghai
  • Conferences: Society for Neuroscience 2017, Computational and Systems Neuroscience (Cosyne) 2021 (Oral, 4.6% acceptance rate)
  • TLDR: We trained rats and recurrent neural networks to perform a task-switching paradigm using similar procedures. Our results elucidate how ongoing activity, shaped by the recent experience of animals and artificial systems, can influence new tasks at hand.

Resume

Education

University of Texas at Dallas

2020.8 - Present

Doctor of Philosophy

Computer Science

  • Advisor: Professor Jessica Ouyang
  • Research direction: knowledge-intensive natural language processing
  • GPA: 3.96

University of Southern California

2018.5 - 2019.12

Master of Science

Computer Science

  • Courses: Natural Language Processing, Machine Learning, Computer Vision, Information Integration, Algorithms, Web Technology, Database Systems
  • Advisors: Professor Nanyun Peng & Dr. Gully Burns
  • Research direction: scientific information extraction

New York University Shanghai

2013.8 - 2017.5

Bachelor of Science

Computer Science & Neuroscience

  • Honor: Cum Laude
  • GPA: 3.72

Professional Experience

Research Intern & Student Researcher

2023.9 - Present

Google LLC

  • Minimal Evidence Group Identification for Fact Verification
  • Supervised by Dr. Fan Zhang and Rajvi Kapadia.
  • Collaboration with Sihao Chen.

Applied Scientist Intern

2023.5 - 2023.8

Amazon.com Services LLC

Research Intern, Natural Language & Speech Processing

2022.5 - 2022.8

Tencent America, Tencent AI Lab

Graduate Research Assistant

2020.8 - 2023.5

University of Texas at Dallas

  • Advised by Professor Jessica Ouyang
  • Scholarly document processing: automatic related work section generation in scientific papers

Graduate Research Assistant

2020.12 - 2021.5

University of Texas at Dallas

  • Advised by Professor Gopal Gupta
  • Member of UT Dallas CASPR team for Alexa Prize Competition Grand Challenge 4

Teaching Assistant

2020.8 - 2020.12

University of Texas at Dallas

  • Undergraduate-level Machine Learning
  • Graduate-level Semantic Web

Research Scientist Intern

2020.1 - 2020.5

Baidu USA

  • Supervised by Professor Liang Huang & Dr. Hairong Liu
  • Full-time internship on stand-alone neural spelling correction
  • Paper accepted by Findings of EMNLP 2020

Graduate Student Worker & Research Assistant

2018.5 - 2020.8

University of Southern California, Information Sciences Institute

  • Co-mentored by Professor Nanyun Peng and Dr. Gully Burns
  • Worked on a series of projects leading to publications under evidX, which uses natural language processing techniques to extract scientific knowledge from biomedical literature, with an emphasis on evidence extraction
    • Biomedical experimental type classification
    • Scientific discourse tagging
    • Evidence fragment delineation
    • Automatic scientific claim-verification
  • Worked on collecting a challenging commonsense reasoning dataset for natural language processing models

Visiting Researcher

2019.5 - 2019.8

Chan Zuckerberg Initiative

  • Supervised by Dr. Gully Burns
  • Worked on the Meta team, which develops a system to recommend biomedical papers to researchers
  • Developed testing pipeline for a content-based recommendation system using clustering techniques

Research Assistant

2016.5 - 2018.4

Erlich lab, NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai

  • Co-mentored by Professor Jeffrey Erlich and Professor Zheng Zhang
  • Started as a student research assistant and switched to full-time research staff after graduation
  • Worked on the Virtual Rat project, building a recurrent neural network to model the animal's task-switch cost phenomenon and analyzing the dynamics of the model
  • Analyzed rats' behavioral data and found evidence supporting the computational model
  • Submitting a first-author paper for the project: Neurophysiological and computational evidence for the task-set inertia theory of switch cost
  • Trained mice and rats with behavior task on a regular basis
  • Contributed code for managing lab database using Python and MySQL

iGEM competition

2014.12 - 2015.9

NYU Shanghai iGEM team

  • International Genetically Engineered Machine (iGEM) is a synthetic biology competition held by MIT.
  • Developed project SYNTHESIZED (Bacteria Music Generator)
  • Played a key role in the team, including designing and developing the main product and team outreach
  • Attended the iGEM Giant Jamboree in Boston, MA, and won a silver medal

Skills

Computer Science Theories

Machine Learning, Deep Learning, Natural Language Processing, Large Language Models, Knowledge Graphs, Computer Vision, Theory of Computation

NLP, ML & CV Frameworks

OpenAI API, TensorFlow, Keras, PyTorch, PaddlePaddle, scikit-learn, MinPy, NLTK & OpenCV

Programming

Python, Java, C, MATLAB, MySQL, C++, Web (HTML, CSS, JavaScript, Angular 2+, PHP, Node.js), iOS (Swift), Verilog

Neuroscience

  • Electro-physiology data analysis
  • Rodent behavior training

Languages

  • Mandarin Chinese (native)
  • Japanese (native, Japanese-Language Proficiency Test N1 certificate)
  • English (English-medium education since college)

Misc.

  • Piano
  • Swimming
  • Stock trading
  • Driving
  • International politics