Rui Li
# *****@*******.*** 716-***-****
https://ruilialice.github.io https://scholar.google.com/citations?user=N9koJEQAAAAJ&hl=en
Education
University at Buffalo
Ph.D. in Computer Science and Engineering
Aug. 2018 – Aug. 2025
Buffalo, NY, USA
Zhejiang University
Master in Information Science and Electronic Engineering
Aug. 2015 – Aug. 2018
Hangzhou, China
Zhejiang University
Bachelor in Information Science and Electronic Engineering
Aug. 2011 – Aug. 2015
Hangzhou, China
Technical Skills
Programming Languages: Python, C++, C, SQL, MATLAB, Swift
Frameworks: PyTorch, TensorFlow, LangGraph, LangChain, Cursor
Deep Learning Techniques: Reinforcement Learning, Meta Learning, Generative Adversarial Networks, Representation Learning
Work Experience
UT Health Houston
Graduate Data Scientist
Nov. 2023 – Present
Houston, TX
Analyzed disease progression for rare diseases based on electronic health record data from the IQVIA dataset.
Evaluated the impact of various medications on obesity management using the Epic Cosmos dataset.
Used large language models (GPT-4o and Llama 3) to extract patients’ Social Determinants of Health (SDOH) status from public case reports, and analyzed the relationship between SDOH documentation and disease, region, and gender.
Designed and implemented a rare disease early diagnosis system based on Mamba and a reinforcement learning mechanism to promote early detection and intervention for rare diseases.
Mayo Clinic
Machine Learning Engineer Intern
Jun. 2022 – May 2023
Rochester, MN
Designed and implemented a time-aware, transformer-based differential diagnosis model for rare diseases, increasing F1 by 17% and recall by 46%.
Designed and implemented a module combining meta-learning (Meta-Weight-Net) and a generative adversarial network (GAN) to address the class-imbalance and positive-unlabeled challenges in rare disease detection, increasing recall by 80% and F1 by 25% on skewed data containing hundreds of thousands of samples with a 5% positive rate.
Didi Chuxing
Machine Learning Engineer Intern
Mar. 2017 – Jul. 2017
Hangzhou, China
Designed a predictive model for driver attrition rate, training a logistic regression model on attributes such as drivers’ income, driving time, and driving distance.
Wrote SQL queries and worked with product managers to support the decision-making process.
Awards
Chemotherapy Timelines Extraction [website] NAACL, Clinical NLP, 2024
Won 2nd place in the Chemotherapy Treatment Timelines Extraction from the Clinical Narrative competition at NAACL 2024.
Designed MedTimeline, an end-to-end NLP system comprising an event entity extractor, a temporal entity extractor, and a patient-level timeline aggregator. [paper]
Projects
SmartMediGo iOS app
Implemented an iOS app that helps patients schedule appointments and view past hospital visits, test results, and medications. The app also includes a chatbot that answers basic questions and helps patients schedule an appointment with a specialist.
Used LangGraph to construct the agent for the chatbot.
Selected Publications and Manuscripts
[manuscript] F. Wang, Z. Zhang, X. Zhang, Z. Wu, T. Mo, Q. Lu, W. Wang, Rui Li, J. Xu, X. Tang, Q. He, Y. Ma, M. Huang, S. Wang. A comprehensive survey of small language models in the era of large language models: Techniques, enhancements, applications, collaboration with LLMs, and trustworthiness, 2025. [paper]
[BCB 2023] Rui Li, Andrew Wen, Jing Gao, Hongfang Liu. MLGAN: a Meta-Learning based Generative Adversarial Network adapter for Rare Disease Differentiation, ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM BCB), 2023. [paper]
[ICHI 2022] Rui Li, Jing Gao. Multi-modal Contrastive Learning for Healthcare Data Analytics, IEEE International Conference on Healthcare Informatics, 2022. [paper]
[AMIA 2021] Rui Li, Fenglong Ma, Jing Gao. Integrating Multimodal Electronic Health Records for Diagnosis Prediction, AMIA Annual Symposium Proceedings, 2021. [paper, video]
[BigData 2019] Rui Li, Fenglong Ma, Wenjun Jiang, Jing Gao. Online federated multitask learning, IEEE International Conference on Big Data, 2019. [paper, code]