Data Scientist

Location:
Pittsburgh, PA
Posted:
December 11, 2020

Resume:

Shulei (Flora) Yang

571-***-**** | adilg7@r.postjobfree.com

https://github.com/floraysl

EDUCATION

University of Virginia, Charlottesville, VA Aug 2019 – (Dec 2020)

Master of Science in Statistics

• Overall GPA: 3.6

• Relevant Courses: Stats Computing with SAS and R, Statistical Machine Learning, Linear Models, Statistical Consulting, Data Mining, Applied Time Series, Design and Analysis of Sample Surveys, Exploratory Data Analysis

University of Richmond, Richmond, VA Aug 2015 – May 2019

Bachelor of Science in Business Administration, Concentration in Finance and Minor in Mathematics

• Overall GPA: 3.52

• Honors: Graduated cum laude, Robins School of Business Partner Scholarship (2017, 2018), Robins School of Business Summer Fellowship (2017)

SKILLS & CERTIFICATES

Certificates: CFA Level candidate

Tools and Technology: Machine Learning, Data Mining, Excel, @Risk, Bloomberg Terminal, Tableau, MicroStrategy, and Microsoft Access

Programming Languages: R, Python, SAS, SQL, and VBA

OPEN SOURCE PROJECTS

Santander Bank Product Recommendation System – a Kaggle machine learning challenge Mar 2020 – May 2020

• Created a recommendation system that can predict ideal financial products for individual customers using Python.

• Designed and implemented an XGBoost solution with a sparsity-aware split algorithm to address the high-dimensionality and overfitting problems encountered with GBDT.

• Used SVD to implement a matrix factorization approach, which improved prediction accuracy.

Spam Email Detection – a machine learning project Nov 2019 – Dec 2019

• Performed exploratory data analysis on the dataset to observe statistical differences between spam emails and valid emails.

• Built three machine learning models: logistic regression, non-linear SVM, and random forest. Then performed five-fold cross-validation to validate these models’ performance and prevent overfitting.

• Analyzed results from each model and optimized the random forest model by using grid search to tune the parameters. Improved the specificity from 91.2% to 95.4% and the sensitivity from 87.5% to 93.4%.

Happiness Level Report – a machine learning project Oct 2019 – Nov 2019

• Led a four-person team through the design, planning, execution, and report write-up of a project that used various machine learning models in R to predict the happiness level of people within a country based on over 25 predictors.

• Focused on penalized linear regression models (Ridge and Lasso) and used the results to assess the credibility of the project’s assumption.

• Improved the prediction accuracy from 87.5% to 98.2% by proposing and applying the random forest algorithm to handle the high dimensionality of the data.

PROFESSIONAL EXPERIENCE

Research Intern, International Business Department Jun 2020 – Aug 2020

Industrial and Commercial Bank of China, Remote

• Research topic: how changes in the international industrial chain under the COVID-19 pandemic might affect the commercial banking industry.

• Developed currency-risk hedging strategies for companies facing currency risk because of the COVID-19 pandemic.

Statistical Consultant, Gut-Brain Axis in Infancy Project Team Feb 2020 – Mar 2020

UVA Brain Institute, Charlottesville, Virginia

• Served as a data scientist and supported the team's research on whether and how the gut microbiota in the newborn period predicts later social-emotional outcomes.

• Performed data cleaning and data exploration on raw data.

• Applied Elastic Net regularization and Random Forest algorithms in R to find the key features of gut microbiota that are most predictive of infants’ social responses.

Financial Analyst Intern, Investment Department Jun 2018 – Aug 2018

TenYall & Sumin Investment Management, Nanjing, China

• Conducted research and delivered weekly presentations on changes in biopharma markets.

• Performed due diligence on target companies and verified the valuations they provided.

• Used Data Table, Scenario Manager, and other Excel tools to analyze the potential financial risks of target companies.

• Participated in the final-round capital raise of BrightGene Bio-Medical Technology Co. Ltd before its IPO.

Independent Student Researcher, advised by Dr. Abdullah Kumas, Accounting Department May 2017 – Jun 2017

University of Richmond, Richmond, Virginia

• Collected data on several auto manufacturers’ recalls and compared it with NHTSA records to investigate whether these companies correctly reported their recalls in their financial statements.

• Presented findings and advised investors to be more cautious, when making investment decisions, about companies that issued recalls in recent years.


