Rachel Khoo
***********@*****.*** GitHub
501-***-**** LinkedIn
Allen, TX
Summary
I am a data scientist skilled in Python, SQL, and R with an interest in NLP. With a background in chemistry and a recently completed data science apprenticeship, I bring an experimental approach, deep knowledge of statistical analysis, and strong evaluation abilities. My work is supported by my communication skills and comfort presenting findings to audiences with varying levels of industry knowledge. I am eager to bring my drive, proactivity, and passion for data-driven decision making to create effective business solutions.
Skills
Statistics: Hypothesis testing, linear regression, logistic regression, Bayesian statistics, non-parametric statistics
Data cleaning: Imputation, normalization, transformation, class imbalance
Machine Learning: Feature selection, feature engineering, regression, classification, supervised learning, unsupervised learning, hyperparameter tuning, Natural Language Processing, deep learning, time-series analysis
Data analysis tools: Python, SQL, R, and Tableau
Experience
THINKFUL Remote
Data Science Apprentice July 2020 – Present
● Completed learning capstones using a range of data science techniques, earning praise from industry professionals
● Led collaborative pair-work sessions with other students, resulting in improved communication and teamwork
● Received mentorship from data science professionals from top employers including Humana and Aptiviti
● Developed skills in data science methodology and coding, and deployed supervised and unsupervised learning models
● Project-based work includes:
o Determined, using a Kruskal-Wallis test, that there is no significant correlation between deaths due to opioid overdose and the proportion of synthetic opioids prescribed. Technologies used: Python, Pandas, Seaborn, and SQL - GitHub
o Classified breast mass biopsy images as benign or malignant with 95% accuracy using XGBClassifier. Technologies used: Python, scikit-learn, XGBoost - GitHub
o Clustered CDC data into four groups, showing that prevention alone is not indicative of health. Technologies used: Python, scikit-learn, SciPy, statsmodels - GitHub
o Deployed a cover letter generator using state-of-the-art NLP methods. Technologies used: NLTK, BERT, spaCy, TensorFlow, Keras - GitHub
MICROCONSULT, INC. Carrollton, TX
Chemist Technician May 2018 – November 2018
● Analyzed test results to determine whether client samples meet specifications
● Optimized inventory records using Excel functions
● Reported out-of-specification results and performed ad-hoc investigation
● Maintained consumable inventory and kept track of usage
Education
UNIVERSITY OF CENTRAL ARKANSAS
Bachelor of Science, Chemistry August 2014 – May 2018
● Senior Capstone Research
o FD&C Dye Content of Popular Beverages: A project in analytical chemistry to determine the quantity of common food dyes in popular beverages using calibration curves and linear regression.
● Awards: 2017 Award in Analytical Chemistry, Outstanding Undergraduate Thesis