Yingqin (Katherine) Luo, Ph.D.
Bronx, NY ***** 623-***-**** *******@*****.*** Green card holder
Education
Beijing Normal University 2008
Ph.D. in Bioinformatics (awarded First-Class Qiushi Fellowship, top 0.5%)
Lanzhou University 2003
Bachelor of Information and Computing Sciences (top 2% of graduating students)
Experience
Software Developer Albert Einstein College of Medicine 2011-Present
- Designed, developed, and maintained clinical trial data collection systems.
- Built a complete notification system in a large commercial CTMS actively used by the medical college and hospital.
- Collected legacy data, established a data integration pipeline, and prepared documentation for submission to official sites.
- Extracted cohort data from the enterprise data warehouse for site researchers.
- Harmonized data across multiple sources to produce the data sets users requested.
Bioinformatics Researcher Arizona State University 2008-2011
- Statistically investigated the evolutionary dynamics of overlapping genes.
- Discovered two specific virulence factors by systematically studying genomic variants.
Skills and Tools
Programming languages: Python (scikit-learn, NumPy, pandas, SciPy, seaborn), PL/SQL, R, Perl, SAS, Shell scripting
Machine learning: Classification, Regression, Clustering, Feature engineering
Statistical methods: Regression models, Hypothesis testing and confidence intervals, Principal component analysis, Dimensionality reduction
Software: Unix/Linux, Oracle, MySQL, Microsoft Excel
Data engineering: Data modeling, ETL
Relevant courses: Machine Learning, Introduction to Data Science in Python, Learning Python for Data Analysis and Visualization, Intro to Statistics, Stochastic Processes, Linear Algebra, Advanced Algebra, Advanced Calculus, Multivariable Calculus, Numerical Analysis, Probability Theory, Mathematical Modeling, Database and Data Structures
Projects and Publications
- Applied machine learning methods to investigate relationships among bacteria: built several models to hierarchically classify strains based on a distance matrix.
- Built a supervised learning model to globally identify transcribed regions from microarray data: feature engineering, data normalization, PCA, supervised model training, and prediction of unrecognized functional units.
- Performed a hypothesis test to show that mean housing prices in university towns are less affected by recessions.
- Published five first-author research papers in top journals.