Harvard Medical School
Curriculum Vitae
TIANRUN CAI
Date Prepared:
3.10, 2024
Name:
Tianrun Cai
Office Address:
Division of Rheumatology, Inflammation and Immunology
Department of Medicine, Brigham and Women’s Hospital
60 Fenwood Road, Boston MA 02115
Work Phone:
617 - 264 - 5908
Work E-Mail:
*****@***.*******.***
Work FAX:
617 - 264 - 3019
Place of Birth:
Zhejiang, China
Education:
1992-1997
MBBS
Clinical Medicine
West China University of Medical Sciences
Postdoctoral Training:
07/97-09/01
Resident
General Surgery
Shanghai Songjiang Central Hospital
10/01-07/
Chief Resident
General Surgery
Shanghai Songjiang Central Hospital
06/13-12/13
Visiting Research Scholar
Department of Biostatistics
Harvard T. H. Chan School of Public Health
06/14-08/15
Research Fellow
Department of Radiology (Dr. Frank J. Rybicki)
Brigham and Women’s Hospital
09/15-07/17
Research Fellow
Division of Rheumatology, Immunology and Allergy (Dr. Katherine P. Liao)
Brigham and Women’s Hospital
Faculty Academic Appointments
07/17 –
Instructor
Medicine
Brigham and Women’s Hospital
Appointments at Hospitals/Affiliated Institutions:
08/04-09/08
Attending
General Surgery
Shanghai Songjiang Central Hospital
10/08-06/12
Attending
General Surgery
Ruian ZhouShuSong Hospital
10/08-06/12
Director of Information Systems
Department of Medical Information
Ruian ZhouShuSong Hospital
07/17 –
Associate Bioinformatician
Section of Clinical Science, Division of Rheumatology, Immunology and Allergy
Brigham and Women’s Hospital
Major Administrative Leadership Positions:
Local
2008-2012
Director of Information Systems
Ruian ZhouShuSong Hospital
Professional Societies:
2014-2015
Radiological Society of North America
Member
2016-
American College of Rheumatology
Member
2017-
American Medical Informatics Association
Member
Editorial Activities:
2016
Ad hoc reviewer
The International Journal of Cardiovascular Imaging
2017-2018
Abstract reviewer
American Medical Informatics Association
2019
Ad hoc reviewer
American Medical Informatics Association 2019 Annual Symposium
Honors and Prizes:
1994-1995
Student Leadership Award
West China University of Medical Sciences
Excellence in managing student activities
2009
Outstanding Physician Award
Ruian ZhouShuSong Hospital
Quality of patient care
2011
Outstanding Management Award
Ruian ZhouShuSong Hospital
Quality of managing Informatics program
Report of Funded and Unfunded Projects:
Funding Information:
Current:
2019-2024
Studying Pseudogout using natural language processing and novel imaging approaches
National Institutes of Health (1K23AR075070-01)
The objective of this project is to identify patients with pseudogout by combining coded data and narrative medical reports using natural language processing, and to investigate cardiovascular risk in these patients.
Role: Co-Investigator (PI: Sara K. Tedeschi, Division of Rheumatology, Immunology and Allergy, Brigham and Women’s Hospital)
2022-2025
Recurrence Risk in Early-Stage Renal Cell Carcinoma (RCC): Patient characteristics and Predictors
Merck Sharp & Dohme LLC. (LKR204888)
The objective of this study is to build a prediction model with the consolidated dataset using electronic medical record (EMR) data including image data for the prognosis of RCC.
Role: Sub-PI
2021-2025
Semi-supervised Approaches to Denoising Electronic Health Records Data for Risk Prediction
National Institutes of Health (1R01LM013614-01)
The objective of this study is to incorporate clinical billing code data and information extracted from narrative notes by natural language processing (NLP) to build a model for predicting the risk of having IBD.
Role: Sub-PI
2024-2029
Real-world evidence of comparative effectiveness among multiple sclerosis treatments
National Institute of Neurological Disorders and Stroke (R01NS098023)
The overall goal of the proposed study is to generate robust real-world evidence on the comparative effectiveness of multiple sclerosis (MS) disease-modifying therapies (DMTs) from integrated analyses of prospective observational cohort and EHR data.
Role: Co-Investigator (Overall PI: Zongqi Xia, University of Pittsburgh; Sub-PI: Tanuja Chitnis, Division of Neurology, Brigham and Women’s Hospital)
2023-2024
Neurogranin and Traumatic Brain Injury
Massachusetts Veteran’s Epidemiology Research Center (36C24E23N0218)
Past:
2014-2015
Retrospective Evaluation of Trends in Pulmonary Embolism
Research assistant (PI: Frank Rybicki, Department of Radiology, Brigham and Women's Hospital)
2016
Effect of calcium plaque inclusion and exclusion on Transluminal Attenuation Gradient (TAG)
Data Manager (PI: Dimitrios Mitsouras, Department of Radiology, Brigham and Women's Hospital)
2016-2021
Million Veteran Program (MVP)
U.S. Department of Veterans Affairs
The objective of MVP is to determine how genes affect health and leverage these data to improve the health of our veterans. MVP is recruiting 1 million veterans to create a cohort with linked electronic medical record (EMR), genotype, and questionnaire data.
Role: Collaborator (PI: Gaziano and Concato)
2016-2021
Integrating EHR and Genomics to Predict Multiple Sclerosis Drug Response
National Institutes of Health (R01 NS098023)
The overall goal of this project is to leverage electronic health records (EHR) data to define treatment response and integrate clinical features from EHR data with genomics data to improve prediction of treatment response in MS.
Role: Co-Investigator (PI: Zongqi Xia, University of Pittsburgh)
2016-2021
Lipids, Inflammation, and Cardiovascular Risk in Rheumatoid Arthritis
National Institutes of Health (R01 HL127118)
Advancing understanding of the association between inflammation and lipids may lead to new effective strategies to prevent and treat heart disease. This project is relevant to the missions of NHLBI and NIAMS as highlighted by our objective to study the clinical mechanisms behind the burden of inflammation on heart disease and utilization of advanced imaging techniques to provide insight into anatomic and physiologic changes associated with the rheumatic diseases.
Role: Co-Investigator (PI: Katherine P. Liao, Division of Rheumatology, Immunology and Allergy, Brigham and Women’s Hospital)
2017-2022
VERITY: Value and Evidence in Rheumatology using bioInformaTics, and advanced analYtics
National Institutes of Health (1P30 AR072577-01)
The VERITY Core proposes to serve as one of the NIAMS Core Centers for Clinical Research. VERITY will implement novel analytic methods and outcome measures that address existing and emerging needs in clinical research for preventing and treating rheumatic and musculoskeletal disorders.
Role: Resource Bioinformatics Core (PI: Daniel H. Solomon, Division of Rheumatology, Immunology and Allergy, Brigham and Women’s Hospital)
2019-2022
Comparative Effectiveness of Treat-To-Target Approach versus Routine Care in Management of Gout
National Institutes of Health (1R01AR073314-01A1)
The goal of this project is to examine the effect of a treat-to-target (TTT) strategy on the risk of gout flares compared with usual care and to assess the effect of a TTT strategy on the risk of chronic kidney disease and cardiovascular disease compared with usual care.
Role: Co-Investigator (PI: Seoyoung Catherine Kim, Division of Rheumatology, Immunology and Allergy, Brigham and Women’s Hospital)
2022
Method Development for Medical Device Safety Studies Using EHR Data
International Consulting Associates, Inc.
The overall goal of this project is to develop methodologies for inferring safety of medical devices based on analyses using real-world data from electronic health records (EHR).
Role: Sub-PI
2018-2023
Million Veteran Program (MVP)
U.S. Department of Veterans Affairs (Contract No. 36C24E18D0052)
The objective of MVP is to determine how genes affect health and leverage these data to improve the health of our veterans. MVP is recruiting 1 million veterans to create a cohort with linked electronic medical record (EMR), genotype, and questionnaire data.
Role: Bioinformatician (PI: J. Michael Gaziano, VA CSP Massachusetts Veteran’s Epidemiology Research Center (MAVERIC), Boston)
Projects Submitted for Funding:
2022
ESSENCE: EHR-Based Patient Eligibility Screening Pipelines for Clinical Trial Recruitment
National Institutes of Health (R01)
The goal of this project is to develop AI/ML algorithms to automatically identify potentially eligible patients using EHR data. In addition, we aim to validate the pipeline using finished and ongoing clinical trials and build a real-time patient screening system.
Role: Co-PI
2023
Empowering Multi-institutional Electronic Health Records Data for Clinical Discovery
National Institutes of Health (R01)
Our objective is to create statistical methodologies and computational resources for organizing and aligning electronic health records (EHR) data across multiple institutions. The ultimate goal is to facilitate the transfer of prediction algorithms for clinical outcomes (illustrated with multiple sclerosis) trained within one healthcare system to another, eliminating the need for extensive manual data curation and retraining.
Role: Co-Investigator
2023
Enabling Reliable Real World Evidence Generation with Multi-institutional EHR Data
National Institutes of Health (P01)
This application comprises multiple projects with different goals. The goal of proposed Project 1 is to develop scalable, transportable, and unbiased embedding and network analysis algorithms from multiple EHR data sources for generating knowledge maps that account for biomedical data inequality across institutions. The goal of proposed Project 2 is to develop statistical methods to handle one of the most important limitations of routinely collected data: selection bias due to missing data. The methods developed in Project 2 will be used to address four key questions in cancer prevention, detection, and treatment, including the use of data from the Veterans Administration health care system to investigate cancer outcomes following bariatric surgery. The goal of proposed Project 3 is to leverage both EHR data and registry data to support cancer precision medicine for recurrence-free and progression-free survival in both non-small cell lung cancer and colorectal cancer. Its goals also include developing causally interpretable cancer survival metrics for real-world evidence (RWE) and optimizing individualized cancer therapy selection.
Role: Co-Investigator
2023
Leveraging electronic health records to optimize treatment selection and response in multiple sclerosis: Alzheimer’s disease supplement
National Institutes of Health supplemental funding
The goal of the proposed project is to build on the experience learned from the parent project involving multiple sclerosis (MS) and generate a robust framework for investigating real-world evidence of Alzheimer’s disease (AD) treatment comparative effectiveness by leveraging electronic health records (EHR) data.
Role: Co-Investigator
Unfunded Current Projects:
2018
National NLP Clinical Challenges (n2c2)
2019
Scalable Relevance Ranking Algorithm via Semantic Similarity Assessment with Application to Electronic Health Record Chart Review
2020
Active machine learning for data extraction using electronic medical records
2020
Ensemble learning for identification of gout flare
Report of Local Teaching and Training
Formal Teaching of Residents, Clinical Fellows and Research Fellows (post-docs):
2004-2008
Biweekly case discussion
Incorporating real cases to help residents and clinical fellows understand disease mechanisms, diagnosis, and treatment in general surgery patients
Shanghai Songjiang Hospital, Shanghai, China
1 hour per two weeks
2014
3D printing tutorial course
CT images post processing for 3D printing model preparation
Radiological Society of North America annual conference
Chicago, IL
1 hour each session for 2 sessions
Laboratory and Other Research Supervisory and Training Responsibilities
2018
Medical chart review
Tutorial session with researchers and clinical fellows on conducting medical chart review to obtain a gold standard for phenotyping of coronary artery disease
VA Boston Healthcare System
1 hour
2018
Medical chart review
Tutorial session with clinical fellows on conducting medical chart review to obtain a gold standard for phenotyping of rheumatoid arthritis
VA Boston Healthcare System
1 hour
2018
Concept extraction using NLP tools
Extracting concepts from narrative clinical notes stored in an MSSQL database using an NLP tool (NILE)
Brigham and Women’s Hospital
1 hour
VA Boston Healthcare System
1 hour
2018
Building dictionaries by parsing online articles for phenotypes of interest using UMLS
Brigham and Women’s Hospital
1 hour
VA Boston Healthcare System
1 hour each, 3 times
2018
NLP for CPPD
Brigham and Women’s Hospital
1 hour each for 10 times
2018-2019
NLP for AOSD with team from UKY
Conference Call
Half an hour every 2 weeks, 21 times
2019
Hands-on CHANL with VA Gulf War research team
Conference Call
1 hour each, for 5 times
2020
Natural language processing for medical data extraction
1 hour each, for 10 times (ongoing)
2021
Hands-on CHANL implementation for chart review with gout flare research team from Division of Pharmacoepidemiology and Pharmacoeconomics at Brigham and Women’s Hospital
2 hours
2021
Hands-on CHANL implementation for chart review with lung cancer research team from Harvard Medical School
2 hours each, 2 times
2021
Hands-on CHANL implementation for chart review with rheumatoid arthritis research team from the Section of Clinical Science at Brigham and Women’s Hospital
5 hours
2021
Introduction of Narrative Information Linear Extraction (NILE) at VA Boston Healthcare Jamaica Plain campus
2 hours
2023
Improving the Efficiency of Clinical Trial Recruitment Using EHR via Natural Language Processing and Machine Learning
2 hours
2023
Introduction of CHANL at VA Boston Healthcare Jamaica Plain campus
2 hours
2023
Knowledge-Driven Online Multimodal Automated Phenotyping System
1 hour
2023
Online Narrative and Codified feature Search Engine
1 hour
Clinical Supervisory and Training Responsibilities:
2004-2008
Mentoring residents on general surgery/ Shanghai Songjiang Hospital
1 hour/day ward round
Local Invited Presentations:
2015
Numerical Data Extraction for Clinical Outcomes Research
Thirty-minute lecture with research groups from the Department of Radiology, BWH
2017
UMLS Metathesaurus Basic and Concept Collection in Phenotyping
One-hour lecture for Million Veteran Program phenotyping pilot meeting
VA Boston Healthcare System
2018
Natural Language Processing in Clinical Application
One-hour lecture
Bioinformatics Club, Department of Medicine, BWH
2018
Hands-on wxPython – Interface Building in Python Tutorial
One-hour lecture with Post-Docs from HSPH
Department of Biostatistics, HSPH
2019
Make a GUI application with wxPython
2019
Improving Chart Review Efficiency Via Artificial Intelligence
2019
Improving the Efficiency of Clinical Trial Recruitment using Electronic Health Record Data, Natural Language Processing, and Machine Learning
2020
Using EHR via Natural Language Processing and Machine Learning to Improve the Efficiency of Clinical Trial Recruitment
AMIA 2020 Clinical Informatics Conference
2021
Natural Language Processing basic and clinical application
VERITY Core Meeting
2021
UMLS for natural language processing in medical research
2023
Knowledge-Driven Online Multimodal Automated Phenotyping System
2023
Online Narrative and Codified feature Search Engine
2023
Enhance Data Diversity for Clinical Studies Using Natural Language Processing and Machine Learning
Report of Regional, National and International Invited Teaching and Presentations
National Teaching Presentations:
2020
Basics of natural language processing
2020 Virtual VERITY/Brigham Course in Rheumatology
Boston, MA
2021
Natural language processing in clinical research and introduction of tools for medical data extraction and chart review
VERITY Bioinformatics Core mini course (national, virtual)
2021
Introduction to Natural Language Processing (NLP)
VERITY Bioinformatics Core mini course (national, virtual)
2022
Unified Medical Language System for natural language processing in medicine
VERITY Bioinformatics Core course (national, virtual)
2023
VERITY 2023 Bioinformatics Mini Course: NLP Workshop & Knowledge Networks for Clinical Research
VERITY Bioinformatics Core course (national, virtual)
International Abstract Oral Presentations:
2014
A Novel Tool to EXTEND Clinical Radiology Research Using Automated Numerical Data Collection/Selected Oral Abstract
Radiological Society of North America
Chicago, IL
2019
Improving the Efficiency of Clinical Trial Recruitment using Electronic Health Record Data, Natural Language Processing, and Machine Learning
American College of Rheumatology
Atlanta, GA
Report of Clinical Activities and Innovations
Current Licensure and Certification:
2002
Practicing Physician Qualification Certificate
2004
Certificate of Physician Credentials
Practice Activities:
2004-2008
Inpatient, general surgery
Shanghai Songjiang Hospital, Shanghai, China
Three and a half days per week
2008–2012
Outpatient, minor surgery
Ruian ZhouShuSong Hospital, Ruian, China
Report of Technological and Other Scientific Innovations
2018
The Chart Review Tool Powered by NLP (CHANL)
I developed this software (licensed by BWH, case number: 25109) to perform intelligent medical chart review with high accuracy and efficiency. It’s currently one of the Verity Bioinformatics Resource Core services.
2019
Extraction of EMR Numerical Data (EXTEND)
This is a natural language processing tool for extracting numerical data, such as vital signs and ejection fraction, from clinical narrative notes. It is also licensed by BWH (license case number: 25122).
2019
Narrative Information Linear Extraction (NILE)
NILE is efficient and effective software for natural language processing of clinical narrative texts (licensed by Harvard University, case number: 7630). The design of this software aims at direct translation of linguistic and clinical knowledge into code, which allows us to develop functions to parse complex language patterns.
2020
NLP interpreter for Cancer Extraction (NICE)
NICE (Natural language processing Interpreter for Cancer Extraction) is a natural language processing tool developed to efficiently extract cancer-related variables such as cancer stages, histological types, and gene alterations. NICE employs a set of rule-based algorithms using prebuilt dictionaries to identify target information in narrative notes. NICE reports its results at different confidence levels. NICE has been applied to extract clinical stages, TNM stages, histological types, gene alteration variables, and relevant dates for lung cancer and colorectal cancer studies. NICE has been licensed by BWH with case number: BWH 2020-514.
2020
Medical Information Analysis and Navigation System (MEDIANS)
MEDIANS (Medical Information Analysis and Navigation System) is designed for medical researchers and clinicians to discover both structured and narrative medical information in electronic medical record systems. Using natural language processing technology, MEDIANS can perform semantic analysis of narrative notes and intelligent search for thousands of medical concepts simultaneously, and can automatically identify and highlight key information in notes. Using MEDIANS, researchers and physicians can obtain accurate patient information much more efficiently than with traditional methods. Using deep learning technologies, MEDIANS ranks the relevance of patient notes to a topic, such as a disease phenotype, and returns notes sorted by ranking score. MEDIANS has been licensed by BWH with case number: BWH 2020-523.
Report of Scholarship:
Peer-Reviewed Scholarship in print or other media:
Research Investigations
1.Chatzizisis YS, George E, Cai T, Fulwadhva UP, Kumamaru KK, Schultz K, Fujisawa Y, Rassi C, Steigner M, Mather RT: Accuracy and reproducibility of automated, standardized coronary transluminal attenuation gradient measurements. The international journal of cardiovascular imaging 2014, 30(6):1181-1189.
2.Liu H, Juan Y-H, Wang Q, Lin Y-C, Liang C, Zhang X, Cai T, Saboo S: Foreign body venous transmigration to the heart. QJM: An International Journal of Medicine 2014, 107(9):743-745.
3.Bedayat A, Sewatkar R, Cai T, George E, Imanzadeh A, Hussain Z, Dunne RM, Hunsaker AR, Rybicki FJ, Kumamaru KK: Association Between Confidence Level of Acute Pulmonary Embolism Diagnosis on CTPA images and Clinical Outcomes. Academic radiology 2015, 22(12):1555-1561.
4.Mitsouras D, Liacouras P, Imanzadeh A, Giannopoulos AA, Cai T, Kumamaru KK, George E, Wake N, Caterson EJ, Pomahac B: Medical 3D printing for the radiologist. Radiographics 2015, 35(7):1965-1988.
5.Cai T, Rybicki FJ, Giannopoulos AA, Schultz K, Kumamaru KK, Liacouras P, Demehri S, Small KMS, Mitsouras D: The residual STL volume as a metric to evaluate accuracy and reproducibility of anatomic models for 3D printing: application in the validation of 3D-printable models of maxillofacial bone from reduced radiation dose CT images. 3D Printing in Medicine 2015, 1(1):2.
6.Kumamaru KK, George E, Aghayev A, Saboo SS, Khandelwal A, Rodríguez-López S, Cai T, Jiménez-Carretero D, Estépar RSJ, Ledesma-Carbayo MJ: Implementation and performance of automated software for computing right-to-left ventricular diameter ratio from computed tomography pulmonary angiography Images. Journal of computer assisted tomography 2016, 40(3):387-392.
7.Cai T, Giannopoulos AA, Yu S, Kelil T, Ripley B, Kumamaru KK, Rybicki FJ, Mitsouras D: Natural language processing technologies in radiology research and clinical applications. Radiographics 2016, 36(1):176-191.
8.Kumamaru KK, Saboo SS, Aghayev A, Cai P, Quesada CG, George E, Hussain Z, Cai T, Rybicki FJ: CT pulmonary angiography-based scoring system to predict the prognosis of acute pulmonary embolism. Journal of cardiovascular computed tomography 2016, 10(6):473-479.
9.Yu S, Chakrabortty A, Liao KP, Cai T, Ananthakrishnan AN, Gainer VS, Churchill SE, Szolovits P, Murphy SN, Kohane IS: Surrogate-assisted feature extraction for high-throughput phenotyping. Journal of the American Medical Informatics Association 2016, 24(e1):e143-e149.
10.Geva A, Gronsbell JL, Cai T, Cai T, Murphy SN, Lyons JC, Heinz MM, Natter MD, Patibandla N, Bickel J: A Computable Phenotype Improves Cohort Ascertainment in a Pediatric Pulmonary Hypertension Registry. The Journal of pediatrics 2017, 188:224-231.e225.
11.Yu S, Ma Y, Gronsbell J, Cai T, Ananthakrishnan AN, Gainer VS, Churchill SE, Szolovits P, Murphy SN, Kohane IS et al: Enabling phenotypic big data with PheNorm. Journal of the American Medical Informatics Association 2018, 25(1):54-60.
12.Cai T, Lin T-C, Bond A, Huang J, Wanger GK, Cagan A, Murphy SN, Ananthakrishnan AN, Liao KP: The association between arthralgias and vedolizumab using natural language processing. Inflammatory Bowel Diseases 2018, 24(10):2242-2246.
13.Cai T, Zhang Y, Ho Y-L, Link N, Sun J, Huang J, Cai TA, Damrauer S, Ahuja Y, Honerlaw J: Association of interleukin 6 receptor variant with cardiovascular disease effects of interleukin 6 receptor blocking therapy: A Phenome-Wide association study. JAMA cardiology 2018, 3(9):849-857.
14.Zhong Q-Y, Mittal LP, Nathan MD, Brown KM, González DK, Cai T, Finan S, Gelaye B, Avillach P, Smoller JW: Use of natural language processing in electronic medical records to identify pregnant women with suicidal behavior: towards a solution to the complex classification problem. European journal of epidemiology 2018:1-10.
15.Jorge A, Castro VM, Barnado A, Gainer V, Hong C, Cai T, Cai T, Carroll R, Denny JC, Crofford L: Identifying Lupus Patients in Electronic Health Records: Development and Validation of Machine Learning Algorithms and Application of Rule-Based Algorithms. Seminars in Arthritis and Rheumatism 2019 Aug;49(1):84-90.
16.Zhang Y*, Cai T*, Yu S*, Cho K, Hong C, Sun J, Huang J, Lam H-Y, Ananthakrishnan A, Xia Z et al: Methods for high-throughput phenotyping using electronic medical record data with a common semi-supervised pipeline (PheCAP). Nature Protocols 14, 3426–3444 (2019).
17.Zhao S, Hong C, Cai T, Chang X, Huang J, Ermann J, Goodson NJ, Solomon DH, Cai T, Liao KP. Incorporating natural language processing to improve classification of axial spondyloarthritis using electronic health records. Rheumatology 2019 Sep 19. pii: kez375.
18.Liao KP, Sun J, Cai T, Link N, Hong C, Huang J, Huffman JE, Gronsbell J, Zhang Y, Ho Y-L: High-throughput Multimodal Automated Phenotyping (MAP) with Application to PheWAS. Journal of the American Medical Informatics Association 2019 Nov 1;26(11):1255-1262.
19.Cai T, Zhang L, Yang N, Kumamaru KK, Rybicki FJ, Cai T, Liao KP: EXTraction of EMR Numerical Data: An Efficient and Generalizable Tool to EXTEND Clinical Research BMC medical informatics and decision making 2019 Nov 15;19(1):226.
20.Tedeschi SK, Cai T, He Z, Ahuja Y, Hong C, Yates KA, Dahal K, Xu C, Lyu H, Yoshida K, Solomon DH. Classifying Pseudogout using Machine Learning Approaches with Electronic Health Record Data. Arthritis Care & Research. 2020 Jan 7.
21.Huang S, Huang J, Cai T, Dahal KP, Cagan A, He Z, Stratton J, Gorelik I, Hong C, Cai T, Liao KP. Impact of ICD10 and secular changes on electronic medical record rheumatoid arthritis algorithms. Rheumatology. 2020 May 15.
22.Ahuja Y, Kim N, Liang L, Cai T, Dahal K, Seyok T, Lin C, Finan S, Liao K, Savova G, Chitnis T. Leveraging electronic health records data to predict multiple sclerosis disease activity. Annals of clinical and translational neurology. 2021 Apr;8(4):800-10.
23.Yuan Q*, Cai T*, Hong C, Du M, Johnson BE, Lanuti M, Cai T, Christiani DC. Performance of a machine learning algorithm using electronic health record data to identify and estimate survival in a longitudinal cohort of patients with lung cancer. JAMA Network Open. 2021 Jul 1;4(7):e2114723.
24.Hong C, Rush E, Liu M, Zhou D, Sun J, Sonabend A, Castro VM, Schubert P, Panickan VA, Cai T, Costa L. Clinical Knowledge Extraction via Sparse Embedding Regression (KESER) with Multi-Center Large Scale Electronic Health Record Data. medRxiv. 2021 Jan 1.
25.Cai T, Cai F, Dahal KP, Cremone G, Lam E, Golnik C, Seyok T, Hong C, Cai T, Liao KP. Improving the Efficiency of Clinical Trial Recruitment Using an Ensemble Machine Learning to Assist with Eligibility Screening. ACR Open Rheumatology. 2021 May 18.
26.Ashburner, J.M., Chang, Y., Wang, X., Khurshid, S., Anderson, C.D., Dahal, K., Weisenfeld, D., Cai, T., Liao, K.P., Wagholikar, K.B. and Murphy, S.N., 2022. Natural Language Processing to Improve Prediction of Incident Atrial Fibrillation Using Electronic Health Records. Journal of the American Heart Association, 11(15), p.e026014.
27.Link, N.B., Huang, S., Cai, T., Sun, J., Dahal, K., Costa, L., Cho, K., Liao, K., Cai, T., Hong, C. and Program, M.V., 2022. Binary acronym disambiguation in clinical notes from electronic health records with an application in computational phenotyping. International Journal of Medical Informatics, 162, p.104753.
28.Hou, J., Kim, N., Cai, T., Dahal, K., Weiner, H., Chitnis, T., Cai, T. and Xia, Z., 2021. Comparison of dimethyl fumarate vs fingolimod and rituximab vs natalizumab for treatment of multiple sclerosis. JAMA network open, 4(11), pp.e2134627-e2134627.
29.Hou, J., Zhao, R., Cai, T., Beaulieu-Jones, B., Seyok, T., Dahal, K., Yuan, Q., Xiong, X., Bonzel, C.L., Fox, C. and Christiani, D.C., 2022. Temporal Trends in Clinical Evidence of 5-Year Survival Within Electronic Health Records Among Patients With Early-Stage Colon Cancer Managed With Laparoscopy-Assisted Colectomy vs Open Colectomy. JAMA network open, 5(6), pp.e2218371-e2218371.
30.Cai, T., He, Z., Hong, C., Zhang, Y., Ho, Y.L., Honerlaw, J., Geva, A., Panickan, V.A., King, A., Gagnon, D.R. and Gaziano, M., Scalable Relevance Ranking Algorithm via Semantic Similarity Assessment Improves Efficiency of Medical Chart Review. Journal of Biomedical Informatics, p.104109. 2022.
31.Zhou, D., Gan, Z., Shi, X., Patwari, A., Rush, E., Bonzel, C.L., Panickan, V.A., Hong, C., Ho, Y.L., Cai, T. and Costa, L., 2022. Multiview Incomplete Knowledge Graph Integration with application to cross-institutional EHR data harmonization. Journal of Biomedical Informatics, 133, p.104147.
32.Ulysse, S.N., Chandler, M.T., Santacroce, L., Cai, T., Liao, K.P. and Feldman, C.H., 2023. Social Determinants of Health Documentation Among Individuals with Rheumatic and Musculoskeletal Conditions in an Integrated Care Management Program. Arthritis Care & Research.
33.Hou, J., Zhao, R., Gronsbell, J., Lin, Y., Bonzel, C.L., Zeng, Q., Zhang, S., Beaulieu-Jones, B.K., Cai, T., Weber, G.M., Jemielita, T. and Wan, S.S., 2023. Generate Analysis-Ready Data for Real-world Evidence: Tutorial for Harnessing Electronic Health Records With Advanced Informatic Technologies. Journal of Medical Internet Research, 25, p.e45662.
34.Gan, Z., Zhou, D., Rush, E., Panickan, V.A., Ho, Y.L., Cai, T., Ostrouchov, G., Xu, Z., Shen, S., Xiong, X., Greco, K.F. and Hong, C., 2023. ARCH: Large-scale Knowledge Graph via Aggregated Narrative Codified Health Records Analysis. medRxiv, pp.2023-05.
35.Wen, J., Zhang, X., Rush, E., Panickan, V.A., Li, X., Cai, T., Zhou, D., Ho, Y.L., Costa, L., Begoli, E. and Hong, C., 2023. Multimodal representation learning