
Data Analysis, statistical analysis, machine learning, R, SQL, Python

Location:
New York City, NY
Posted:
March 27, 2020


Resume:

Shijie Cai

*** **** **** **, *** **D, *****, New York, New York

*************@*****.*** | Cell phone: +1-814-***-**** | GitHub: https://github.com/shijiecai

EDUCATION

The Pennsylvania State University, University Park State College, PA

Eberly College of Science Aug 2013 - May 2017

Bachelor of Science in Statistics with Actuarial Science Option

Boston University Boston, MA

The Graduate School of Arts & Sciences Sep 2017 - January 2019

Master in Statistical Practice.

GPA: 3.66/4.00

WORK EXPERIENCE

Mean Value Consulting New York, NY

Business Analyst (intern) May 2019 –

• Conduct research to identify new markets and customer needs.

• Arrange business meetings with prospective customers. Work with sales team to identify, develop and secure customer prospects using multiple information gathering techniques.

Nearr Corporation New York, NY

Data Analyst June – August 2018

• Defined the key features of the data structures and assisted the tech team in building them for eCommerce.

• Used data visualizations (ggplot) in R to present sales, margin, GMV, and other data, identifying key traits and making the data easier to comprehend.

• Built a k-means clustering model in R to segment customers based on recency, frequency, and monetary value, and designed different promotions for each customer level.

• Delivered two sales prediction models, using random forest and XGBoost, and chose random forest as the final model.

China UnionPay Wuhan, China

Analyst July – August 2016

• Analyzed and forecast daily transactions and completed the daily report based on daily performance.

• Retrieved credit and debit card transaction records to assist law enforcement with credit card fraud investigations.

• Investigated merchants wishing to join the UnionPay system against UnionPay standards.

PROJECT

Confirmatory factor analysis for survey questions, Boston University Athlete (collaborative)

• Explored and cleaned the 2016-2017 BU athlete survey data in R.

• Visualized responses to Likert-scale questions in R across all sports and athletes to find the distribution of scores.

• Used confirmatory factor analysis in R to select and build the model based on factor loadings.

Twitter Data Mining

• Used the Streaming API with R to capture words in tweets, created geo-location visualizations for all the tweets, and used word clouds and sentiment analysis to complete the analysis.

• Created a dashboard using Shiny Apps in R to present the visualizations.

Birth Canal Measurements in Primates (collaborative)

• Used ggplot to visualize distributions of measurements for all species.

• Used probabilistic principal component analysis (PPCA), an unsupervised method, to help locate the constraints of birth canals within the primate species.

• Used decision trees to locate constraints where rotations might be needed during primates’ labor and delivery.

CERTIFICATION AND SKILLS

Actuarial Examinations: Exam P: Probability Theory, FM/2: Financial Mathematics

Technical Skills: Proficient in MS Office (Excel, PowerPoint, Word), Tableau

Programming Skills: Proficient in R, familiar with Python and SQL

Language: Mandarin and English


