Mengyao Zhao, Ph.D.
Brookline, MA 02445 • 781-***-**** • ***********@*****.*** • linkedin.com/in/mengyao-zhao-ph-d-9406455
Bioinformatics Software Leader
Summary of Qualifications
• Strong record of designing high-performance NGS pipelines and biomarkers, using extensive genomic data expertise.
• Proven ability to independently architect novel parallel computing algorithms and lead end-to-end delivery of industry-grade clinical software.
Core Competencies:
• Coding Languages: C/C++, Python (NumPy, SciPy, Pandas, Cython), SQL, Perl, R, Java
• Cloud & DevOps: AWS/Google Cloud, Docker/Podman, Bamboo/Jenkins CI/CD, Nextflow, Snakemake.
• HPC & Computing: Cluster usage (LSF/SGE/Slurm), SIMD, High Performance Computing.
• Machine Learning: TensorFlow/Keras, CNN/MLP/HMM, GenAI, Deep learning, supervised & unsupervised learning
• Bioinformatics: Large-scale NGS data analysis, cancer genomics (DNA/RNA), HLA typing, population/evolutionary genomics, PLINK
• Clinical Development & Regulatory: CDx development, Clinical trials, FDA filing, AV writing.
Experience Highlights
Manager (Staff Computational Biologist)—Pillar Biosciences Inc. (Natick, MA) 2022–2026
PiVAT Pipeline Team Manager
• Led the Pillar Variant Analysis Toolkit (PiVAT) Pipeline Team, including Algorithm, DevOps, and Software Manufacturing functions (10 members total, including outsourced contractors), to deliver end-to-end PiVAT release cycles, from new feature development and integration through software product distribution
• Oversee the patient selection pipeline development for the MoonRise phase 3 bladder cancer clinical drug trial.
Software Algorithm Team Manager
• Supervised a 4-person Software Algorithm Team to deliver critical pipeline architecture and performance upgrades:
o Rewrote PiVAT’s variant-calling methods from Python to pure C and, combined with algorithmic improvements, achieved ~24x speedup and ~99.9% memory reduction (59 GB to 66 MB), cutting overall pipeline runtime by 49.6% and memory usage by 49.32%.
o Decoupled pipeline and Web App codebases for independent development cycles and flexible pipeline and platform integration.
• Enhanced long-InDel/ITD calling algorithms, significantly boosting FLT3 ITD detection accuracy.
• Maintain the SSW Library, delivering ~50x speed improvements over standard implementations. Code utilized globally in Pillar PiVAT, Illumina DRAGEN, Google DeepVariant, NVIDIA Parabricks DeepVariant, and the Seven Bridges graph genome aligner.
Lead of Bioinformatics Team (Staff Scientist I)—IDEXX (Westbrook, ME) 2021–2022
• Led a 4-person Bioinformatics Team in cross-functional collaboration with the Genomics Wet Lab, Proteomics (LCMS), and IT departments to develop a multiomics-based dog cancer diagnostic platform, cysto canis (dog parasite) diagnostic biomarkers, and dog NSAID intolerance biomarkers.
• Spearheaded development of the IDEXX Oncology Database (IDEXX OncoDB).
• Built core bioinformatics pipelines to support multiomics research, including germline/somatic variant calling, sequencing coverage Limit of Detection (LoD) definition, and GWAS.
• Established and maintained the cloud computing environment on AWS for the Genomics Department.
Computational Biologist—Roche - Foundation Medicine (Boston, MA) 2018–2021
• Optimized a loss of heterozygosity (LOH) detection quality control (QC) tool using a multilayer perceptron (MLP), reducing manual curation labor by 90%.
• Developed a hybrid MLP and CNN deep learning model that boosted diffuse large B-cell lymphoma (DLBCL) cell of origin (COO) classification accuracy from 91.3% to 95.9%, successfully scaling the prototype into a production-level tool.
• Authored Analytical Validation (AV) report for FDA filing of FoundationOne Liquid Companion Diagnostic (F1LCDx).
• Conducted PIK3CA clinical trial data analysis, directly contributing to the FoundationOne Companion Diagnostic (F1CDx) Supplemental Premarket Approval (sPMA) FDA filing.
Senior Research Scientist—Velsera (Seven Bridges) (Boston, MA) 2016–2017
• Developed the HLA typing tool (Type), achieving an average of 4x higher accuracy than HLA*PRG, OptiType, and xHLA in 4-digit resolution typing of class I and II genes (based on low-coverage WGS CEU trio testing).
• Directed a 5-person team in extending the SSW Library to support multiple programming languages.
• Supervised and mentored an intern through a comprehensive HLA typing tool benchmarking project.
Staff Scientist (Advisor: Dr. David Reich)—Harvard Medical School (Boston, MA) 2014–2016
• Optimized data processing tools (cTools) for the Simons Genomic Diversity Project (SGDP): analyzed whole genome variation data of 300 genomes from 142 diverse populations.
• Accelerated software performance by ~30x with zero increase in memory usage; condensed the SGDP dataset footprint by 90%, reducing total size from 2 TB to 192 GB. Made SGDP project results a publicly accessible package.
• Documented and maintained the Reich Lab software packages for high-throughput analysis.
Patent
• Justin Newberg, …, Mengyao Zhao, …, Jason D Hughes, “Methods and systems for normalizing targeted sequencing data.” Serial No. PCT/US2023/069150, Jun 27, 2023.
Publications
• Austin Talbot, …, Mengyao Zhao, …, Yue Ke, “BayesCNV: A Bayesian Hierarchical Model for Sensitive and Specific Copy Number Estimation in Cell Free DNA.” Diagnostics 16(2), 280, January 16, 2026.
• Swapan Mallick, …, Mengyao Zhao, …, David Reich, “The Simons Genome Diversity Project: 300 genomes from 142 diverse populations.” Nature 538, 201–206, October 13, 2016.
• The 1000 Genomes Project Consortium (including Mengyao Zhao), “A global reference for human genetic variation.” Nature 526, 68–74, October 1, 2015.
• Mengyao Zhao (corresponding author), Wan-Ping Lee, Erik P. Garrison, Gabor T. Marth, “SSW library: an SIMD Smith-Waterman C/C++ library for use in genomic applications.” PLoS One 8, December 04, 2013.
• Mengyao Zhao: Co-translated book by R. Durbin et al., “Biological sequence analysis” (Press Syndicate of the University of Cambridge, Cambridge, UK, April 1998) into Chinese (Science Press, Beijing, August 2010).
• Helen Parkinson, …, Mengyao Zhao, Ugis Sarkans and Alvis Brazma, “ArrayExpress update – from an archive of functional genomics experiments to the atlas of gene expression.” Nucleic Acids Research 37, 868–872, 2009.
• Yali Xue, …, Mengyao Zhao, …, Chris Tyler-Smith, “Adaptive evolution of UGT2B17 copy number variation.” The American Journal of Human Genetics 83, 337–346, September 12, 2008.
Education and Professional Development
Bioinformatics—Ph.D., Boston College (Chestnut Hill, MA)
Dissertation: “Genomic variation detection using dynamic programming methods” (Advisor: Dr. Gabor Marth)
Software: https://github.com/mengyao
Genomics and Pathway Biology—M.Sc., The University of Edinburgh (Edinburgh, UK)
Thesis: “The integration of the dynamic pathway information into the Medical Microbiology Database and the improvement of the Boolean network modeling method”—Distinction Thesis Award
Computer Science—B.Sc., Beijing Normal University (Beijing, China)
Thesis: “DNA sequencing and human leukocyte antigen (HLA) matching software”—Excellent Graduation Thesis Award
Project Management—Certification, LinkedIn (Sunnyvale, CA)