MI YANG
NYC 650-***-**** **.********@*****.*** www.linkedin.com/in/miyang0586
Accomplished data scientist with experience in conducting complex research, specializing in computational biology and data-driven approaches within the pharmaceutical and biotech industries. Background in leading research initiatives from conception to execution and applying advanced analytics and machine learning techniques. Adept at developing and implementing robust computational frameworks for drug discovery and patient stratification. Extensive experience in multi-omics data integration, NGS analysis, and pre-clinical statistical modeling.
Research Project Management, Leadership, Cross-Functional Collaboration, Machine Learning, Deep Learning, Computational Biology, Cancer Drug Discovery, Multi-Omics Analysis, LLM, Knowledge Graph, GNN
PROFESSIONAL EXPERIENCE
Personal projects, Consulting, Academic collaborations 06/2024 – present
• Built an app that can (1) query a complex biomedical knowledge graph in natural language and (2) search, summarize, and compare articles online.
• Developed a Proximity Modulation strategy targeting Transcription Factors to prioritize target effector pairs for molecular oncology.
• TCR epitope binding prediction with protein language embeddings.
• Used a GNN pipeline to predict transcription factor–target gene interactions.
Sanofi US, Cambridge, MA 06/2022 – 06/2024
Senior Scientist
Molecular Oncology: Built a foundational machine learning pipeline for patient population selection with CRISPR (Patent filed).
NK cell therapy computational biology lead: designed and analyzed multi-condition scRNAseq and uncovered key MoA regarding NK cells’ impact on the TME: potential for sustained immune response through DC activation. (Manuscript in preparation)
• NGS data analysis, including bulk and single-cell RNA-seq, CITE-seq, WES, CRISPR/Cas9 data.
• Analysis of internal and external databases (e.g., TCGA, CCLE/Depmap, OpenTargets, etc.).
• Pre-clinical (in vivo and in vitro) statistical data analysis. Machine learning, gene expression deconvolution, cell-cell interaction, trajectory analysis.
• Built a machine learning classifier on large-scale bulk and single-cell RNAseq to predict survival of 1000 multiple myeloma patients (MMRF CoMMpass study); collaborated with Owkin on subtype classification.
PrognomIQ, San Mateo, CA 05/2021 – 06/2022
Data scientist
Built a multi-omics machine learning pipeline for cancer early detection.
• Multi-omics data modeling and integration (proteomics, transcriptomics, metabolomics, lipidomics, genomics).
• DNA methylation featurization.
• microRNAseq data processing pipeline.
Stanford University, CA 04/2019 - 04/2021
Post Doctoral
Advisor: Ash A. Alizadeh.
Built a computational framework for drug discovery in immuno-oncology, combining data-driven approaches with deep characterization of the tumor microenvironment signaling network to find new treatments for DLBCL.
• Several projects aimed at detecting immune cell fractions in any tissue type from bulk RNAseq data.
• Predicting antigen presentation (MHCI/II) with deep learning (MARIA, Chen et al. Nature Biotech). Prediction of B cell response to neoantigen.
Heidelberg University, Germany 09/2015 – 11/2018
PhD student
Advisor: Julio Saez-Rodriguez.
Cancer bioinformatics, drug combination discovery. 10+ publications, 3 as first author.
• Developed a computational framework to uncover drug response mechanisms at the target and pathway levels, offering robust insights for cancer treatment.
• Created workflows for drug synergy prediction and patient stratification to enhance efficacy and prioritize compounds in drug screenings.
• Led the NCI-CPTAC DREAM Proteogenomics Challenge, managing a team of >100 scientists across the world. This project represents a landmark community effort and the most comprehensive assessment to date of algorithms to infer proteomic from genomic data. Methods derived from this challenge could be used for predicting protein levels in patient samples for which only mRNA is available.
• Analyzed multi-omics data from HeLa cell lines, revealing genomic variability and stressing the need for careful cell line selection in cancer research.
• Built CELLector, an algorithm to match cancer cell lines with primary tumors for more relevant drug testing.
University of California, San Francisco, CA 02/2015 – 07/2015
Research Intern
Advisor: C. Antony Hunt
Modeling and simulation, systems pharmacology, use of MASON (Multi-Agent Simulation of Networks).
• In silico toxicity prediction of acetaminophen-induced hepatotoxicity and the mechanism of protection of N-acetylcysteine for the liver. New hypothesis for better treatment protocols for drug-induced liver injuries.
EDUCATION
• Doctor of Philosophy (PhD), Computational Biology, Heidelberg University, Germany, 2015-2018
• Master of Science (MS), Synthetic and Systems Biology, University of Paris-Saclay, France, 2014-2015
• Doctor of Pharmacy (PharmD), University of Paris-Saclay, France, 2007-2014
TECHNICAL SKILLS
• Programming/cloud/database: Python (OOP), R, AWS, Neo4j, Snowflake
• Machine learning algorithms: Linear regression, Matrix/Tensor factorization, Regularized Cox regression for survival analysis, Tree-based algorithms, Multi-Omics Factor Analysis, Bayesian sampling, Partial dependence plots, LDA
• AI software: LangChain, LLM, Knowledge Graph, PyTorch, TensorFlow, Keras
• Deep learning: Neural networks, LSTM, GNN
PUBLICATIONS
• Yang, Mi, Jaak Simm, Chi Chung Lam, Pooya Zakeri, Gerard J. P. van Westen, Yves Moreau, and Julio Saez-Rodriguez. 2018. “Linking Drug Target and Pathway Activation for Effective Therapy Using Multi-Task Learning.” Scientific Reports, May.
• Yang, Mi, Patricia Jaaks, Jonathan Dry, Mathew Garnett, Michael P. Menden, and Julio Saez-Rodriguez. 2020. “Stratification and Prediction of Drug Synergy Based on Target Functional Similarity.” NPJ Systems Biology and Applications 6 (1): 16.
• Yang, Mi, Francesca Petralia, Zhi Li, Hongyang Li, Weiping Ma, Xiaoyu Song, Sunkyu Kim, et al. 2020. “Community Assessment of the Predictability of Cancer Protein and Phosphoprotein Levels from Genomics and Transcriptomics.” Cell Systems, July. https://doi.org/10.1016/j.cels.2020.06.013.
• Liu, Yansheng, Yang Mi, Torsten Mueller, Saskia Kreibich, Evan G. Williams, Audrey Van Drogen, Christelle Borel, et al. 2019. “Multi-Omic Measurements of Heterogeneity in HeLa Cells across Laboratories.” Nature Biotechnology, February, 1.
• Najgebauer, Hanna, Mi Yang, Hayley E. Francies, Clare Pacini, Euan A. Stronach, Mathew J. Garnett, Julio Saez-Rodriguez, and Francesco Iorio. 2020. “CELLector: Genomics-Guided Selection of Cancer In Vitro Models.” Cell Systems 10 (5): 424–32.e6.
• Dry, Jonathan R., Mi Yang, and Julio Saez-Rodriguez. 2016. “Looking beyond the Cancer Cell for Effective Drug Combinations.” Genome Medicine 8 (1): 125.
• Tognetti, M., A. Gabor, M. Yang, and V. Cappelletti. 2020. “Deciphering the Signaling Network Landscape of Breast Cancer Improves Drug Sensitivity Prediction.” Cell Systems 12 (5): 401–18.e12.
• César-Razquin, Adrián, Enrico Girardi, Mi Yang, Marc Brehme, Julio Saez-Rodriguez, and Giulio Superti-Furga. 2018. “In Silico Prioritization of Transporter–Drug Relationships from Drug Sensitivity Screens.” Frontiers in Pharmacology 9.
• Roumeliotis, Theodoros I., Steven P. Williams, Emanuel Gonçalves, Clara Alsinet, Martin Del Castillo Velasco-Herrera, Nanne Aben, Fatemeh Zamanzad Ghavidel, et al. 2017. “Genomic Determinants of Protein Abundance Variation in Colorectal Cancer Cells.” Cell Reports 20 (9): 2201–14.
CONFERENCE TALK
• RECOMB/ISCB conference on Regulatory & Systems Genomics. December 8-10, 2018 - New York City, USA.
• Summer school of Machine Learning in Drug Design, August 20-22, 2018 - Leuven, Belgium.
• German Association for Medical Informatics, Biometry and Epidemiology (GMDS). 03/09/2018 - Osnabruck, Germany.
BOOK CHAPTER
• Dynamic logic models complement machine learning to improve cancer treatment.
OTHER PUBLICATIONS
• Personalized medicines and pharmacogenomic testing: from scientific basis to assessment, regulatory framework, labeling, and practical information.