Data Analyst

Location:
Brooklyn, NY
Salary:
70000
Posted:
April 15, 2021

Resume:

Weixing Yang

Data scientist with a double major in mathematics and computer science and demonstrated experience in big data analytics and intensive programming. Seeking a position within a creative and dynamic work environment.

****.******@*****.***

917-***-****

New York, United States

nycdatascience.com/blog/author/weixyang/

linkedin.com/in/weixing-yang-381485173

github.com/weixyang90

SKILLS

R(dplyr, ggplot2, shiny)

Python(numpy, pandas, scikit-learn)

SQL (MySQL)

Web-Scraping(scrapy)

KNN

Clustering

Linear Models

Tree based models

Naive Bayes

Java

HTML/CSS

Data Structures

Git/Github

LANGUAGES

Mandarin

PROFESSIONAL EXPERIENCE

Part-time Data Scientist

ABPHINA

10/2020 - Present

Propose business ideas to help the company make strategic decisions. Provide plans for gathering and organizing data from multiple sources. Ensure the quality and consistency of data and apply analytics to turn raw data into fact-based conclusions.

Develop and deliver reports on the tested hypotheses, providing useful information for the organization.

Data Science Intern

ABPHINA

06/2019 - 09/2019

Used supervised and unsupervised learning with Python packages (NumPy, pandas, scikit-learn, SciPy, etc.) to predict malaria trends and outbreaks.

Grouped countries using clustering models including K-means and LCA. Prepared the data by imputing missing values with KNN, removing collinear features, and re-sampling with the bootstrap. Predicted disease trends with different regression models, including penalized models (Lasso, Ridge, Elastic Net, etc.), polynomial regression with feature interactions, and tree models (Random Forest, Stochastic Gradient Boosting, and XGBoost).

Evaluated and improved model performance through cross-validation and hyperparameter tuning. Translated model results into business recommendations with key contributing factors and variable weights.

Data Science Fellow

NYC Data Science Academy

01/2019 - Present

Worked with a senior data scientist from UnitedHealthcare to build a model that predicts hospital readmission rates for diabetic patients, helping hospitals and insurance companies target high-risk patients. Implemented advanced feature engineering and used several classification models, including logistic regression, random forest, gradient boosting, and extreme gradient boosting, on a large, complex data set.

Predicted house sale prices using a high-dimensional dataset. Processed missing data for numerical and categorical features. Trained individual machine learning models, including Lasso regression, Ridge regression, ElasticNet, random forest, gradient boosting, and extreme gradient boosting, to predict the target price. Stacked all of the above models except random forest to improve prediction of the target.

EDUCATION

Bachelor of Science

Stony Brook University

02/2014 - 12/2017

Double Major in Computer Science and Applied Mathematics & Statistics

Courses

Data Structures, Analysis of Algorithms, Principles of Database Systems, Scripting Languages, Principles of Programming Languages, Software Engineering


