Liyi (Lily) Kuo
Forest Hills, New York, ***** Email: *******@*****.*** Phone: (917) 334–0629
EDUCATION
NYC Data Science Academy, New York, New York January 2020
• Data Science certificate program involving over 420 hours of coursework
• Related Coursework: Machine Learning, Algorithms for Data Science, Foundations of Statistics and Probability, Exploratory Data Analysis & Visualization
Touro College, New York, New York June 2013
• Master of Science Secondary Education in Mathematics
New York University, New York, New York June 2010
• Master of Science Biomedical Engineering
• Bachelor of Science Chemical and Biological Engineering
PROJECTS
Credit Card Fraud Detection
• Distinguished fraudulent credit card transactions from genuine client transactions using clustering and classification methods, with an accuracy of 95% and a ROC AUC score of 97%
Lending Club - Predicting Loans with Positive Return on Investment
• Reverse engineered Lending Club's 7 most important loan approval criteria, distinguishing accepted from rejected loans with 96% accuracy using clustering and machine learning models
• Increased the mean ROI for investors and stakeholders from 8% to 15% by selecting loans fully paid off at maturity; results were validated using modern portfolio theory and machine learning models
Ames Housing Price Prediction
• Implemented hedonic pricing models and machine learning algorithms to predict house prices in Ames, Iowa, using 79 features
• Identified the 5 most important attributes for housing price prediction in Ames, Iowa. The final stacked model achieved an R-squared value of 89.2%
Leading Causes of Death in NYC Dashboard
• Used R to explore the leading causes of death in New York City from 2007 to 2014, based on a dataset of 1,100 entries provided by the Department of Health and Mental Hygiene (DOHMH)
• Classified the top 5 causes of death (Heart Diseases, Cancer, Accidents, Chronic Lower Respiratory Diseases, and Stroke), offering useful insights for health care providers
Hotwire Web Scraper
• Developed an extension to the Hotwire website, using Selenium, to search for the 5 lowest flight prices in the 3 days leading up to the designated travel date
SKILLS
Languages and Frameworks: R, Python, SQL, MS Office
Machine Learning: PCA, GLM, Linear Regression, Logistic Regression, Random Forest, Decision Trees, AdaBoost, XGBoost, SVM, Gradient Boosting, A/B Testing, Time Series Analysis
Tools and DBMS: RStudio, Jupyter, Git, MySQL, Tableau, Hadoop, PySpark, Matplotlib, Scikit-Learn, Pandas, NumPy, SciPy
Others: Probability Theory, Statistics, Financial Mathematics, Monte Carlo Simulation
EXPERIENCE
Department of Education/Classroom Mathematics Teacher, New York, New York September 2018
• Analyzed department-wide student data in MS Excel and monitored student requirements and performance to secure a 92-100% student passing rate on New York State standardized math exams
• Developed a predictive classification model to analyze and predict student performance using various features (e.g., grade records), and provided early intervention for students with the highest probability of dropping out
• Orchestrated more than 4 concurrent projects, strategically prioritizing each step to meet deadlines
NYU Langone Medical Center/MIS Junior Research Scientist, New York, New York April 2011
• Designed and revised projects according to budget and schedule, and implemented the Friedman and Anderson-Darling tests in R to validate experimental procedures at a confidence level of 0.95
• Coordinated clinical research involving 600+ cases to evaluate implant stability of Total Knee Arthroplasty six months post-surgery using statistical data analysis
• Derived mean resistive forces of 5.6 ± 1.2 N and 4.9 ± 1.2 N for the LFC and MFC sites, respectively, using a lab-constructed indentation device; results were analyzed using R and MS Excel