Yinghao Li

CV Google Scholar GitHub

I’m a Ph.D. student specializing in Machine Learning at the Georgia Institute of Technology, advised by Dr. Chao Zhang and Dr. Le Song. My research focuses on large language models (LLMs) and their applications in information extraction and reasoning, uncertainty estimation in molecular property prediction, and syntax-guided text generation.

Research Interests

Large Language Models

  • LLM low-rank adapter ensembling
    • Improving the performance of large language models (LLMs) on diverse tasks through an expert-ensemble framework that clusters training data by gradient profile to reduce update conflicts and aggregates the expert models’ predictions according to their relevance to the input [ELREA. 2024].
  • LLMs for information retrieval
    • Developing effective and efficient methods to fine-tune and utilize LLMs for extracting structured information from unstructured documents [G&O. 2024].
    • Leveraging LLMs to generate pseudo datasets to supervise the training of smaller, task-specific models like BERT [ProgGen. 2024].
  • Investigating the source of LLM reasoning abilities
    • Exploring whether the reasoning capabilities of LLMs are intrinsic or a mimicry of training data patterns [Minesweeper. 2023].

Information Extraction

  • Weakly-supervised named entity recognition and text classification
  • HTML information extraction
    • Extracting information from HTML documents using Transformer-based DOM node classifiers [TrENC. 2023].

Uncertainty Estimation

  • Benchmarking uncertainty quantification methods
    • Evaluating uncertainty quantification techniques for large molecular representation models in molecular property prediction [MUBen. 2024].

Text Generation

  • Syntax-guided paraphrase generation
    • Generating paraphrases under syntactic guidance using constituency parsing tags to enhance text diversity and quality [GuiG. 2020].

I received my Master’s degree from the School of ECE at Georgia Tech, where I worked with Prof. Ying Zhang on radar SCG signal processing and understanding [Li et al. 2020, Xia et al. 2021].


Feel free to reach out if you’d like to discuss collaborative opportunities or have any questions about my research.

Education

[Aug. 2020 – May 2025]

Ph.D. student @ Georgia Institute of Technology
Machine Learning, School of Electrical and Computer Engineering

  • Advised by Dr. Chao Zhang and Dr. Le Song

[Aug. 2018 – May 2020]

Master of Science @ Georgia Institute of Technology
School of Electrical and Computer Engineering

[Aug. 2014 – June 2018]

Bachelor of Science @ Southeast University, Nanjing
School of Instrument Science and Engineering

Experience

[May 2024 – Aug. 2024]

Applied scientist intern @ AWS, New York
SAAR

[May 2022 – Dec. 2022]

Applied scientist intern @ Amazon, Seattle
Product Graph