Shijie Cai
*** **** **** **, *** **D, *****, New York, New York
*************@*****.*** | Cell phone: +1-814-***-**** | GitHub: https://github.com/shijiecai
EDUCATION
The Pennsylvania State University, University Park State College, PA
Eberly College of Science Aug 2013 - May 2017
Bachelor of Science in Statistics with Actuarial Science Option
Boston University Boston, MA
The Graduate School of Arts & Sciences Sep 2017 - Jan 2019
Master in Statistical Practice
GPA: 3.66/4.00
WORK EXPERIENCE
Mean Value Consulting New York, NY
Business Analyst (intern) May 2019 –
• Conduct research to identify new markets and customer needs.
• Arrange business meetings with prospective customers. Work with sales team to identify, develop and secure customer prospects using multiple information gathering techniques.
Nearr Corporation New York, NY
Data Analyst June – August 2018
• Defined the key features of the data structures and helped the tech team build them for eCommerce.
• Visualized sales, margin, GMV, and other data with ggplot in R to identify key traits and make the data easier to comprehend.
• Built a k-means clustering model in R to segment customers by recency, frequency, and monetary value, and designed different promotions for each customer level.
• Delivered two sales prediction models, using random forest and XGBoost, and chose random forest as the final model.
China UnionPay Wuhan, China
Analyst July – August 2016
• Analyzed and forecasted daily transactions and completed the daily report based on daily performance.
• Retrieved credit and debit card transaction records to assist law enforcement with credit card fraud investigations.
• Investigated merchants wishing to join the UnionPay system against UnionPay standards.
PROJECT
Confirmatory factor analysis for survey questions, Boston University Athlete (collaborative)
• Explored and cleaned the 2016-2017 BU athlete survey data in R.
• Visualized responses to Likert-scale questions in R across all sports and athletes to examine the distribution of scores.
• Used confirmatory factor analysis in R to select and build the model based on factor loadings.
Twitter Data Mining
• Used the Streaming API with R to capture words in tweets, created visualizations of geolocation for all the tweets, and used word clouds and sentiment analysis to complete the analysis.
• Created a dashboard using Shiny in R to present the visualizations.
Birth Canal Measurements in Primates (collaborative)
• Used ggplot to visualize distributions of measurements for all species.
• Used probabilistic principal component analysis (PPCA), an unsupervised method, to help locate the constraints of birth canals within the primate species.
• Used decision trees to locate constraints where rotations might be needed during primates’ labor and delivery.
CERTIFICATION AND SKILLS
Actuarial Examinations: Exam P: Probability Theory, FM/2: Financial Mathematics
Technical Skills: Proficient in MS Office (Excel, PowerPoint, Word), Tableau
Programming Skills: Proficient in R, familiar with Python and SQL
Language: Mandarin and English