Data Analysis, R, SAS, prediction

Location:
Sunnyvale, CA
Posted:
May 16, 2017


Resume:

Jiexuan (Jessica) Cao

Sunnyvale, CA | 530-***-**** | ac0b8u@r.postjobfree.com | cn.linkedin.com/in/jiexuancao/

EDUCATION

Master of Science in Statistics, University of California, Davis (UC Davis) 03/2016

Bachelor of Science in Mathematics, Renmin University of China (RUC) 06/2014

PROFESSIONAL SKILLS

• R PROGRAMMING • MACHINE LEARNING • DATA ANALYSIS • SQL

• SAS PROGRAMMING • STATISTICAL MODELING • PYTHON • EXCEL

• DATA MINING • DATA VISUALIZATION • LATEX • CHINESE (NATIVE)

PROFESSIONAL EXPERIENCE

Data Analyst YONO Health Inc., Mountain View, CA, U.S.A 07/2016-Present

• Monitor and maintain the database on Google Cloud Platform. Apply an adaptive weighted moving average algorithm in R to smooth the temperature data and capture important patterns.

• Apply machine learning techniques such as Hidden Markov Models to estimate the biphasic property of the menstrual cycle. Develop a predictive model of customers' ovulation day, which helps them conceive or avoid pregnancy.

• Create a KPI matrix to assess data quality and give customers guidance on how to use the YONO thermometer.

• Visualize data in R to present results to customers and to gain insights for the prediction model.

• Collaborate with engineers to implement and deploy data cleaning and prediction solutions in the YONO Fertility app using Python.

• Write and maintain detailed specification documents for the data cleaning and prediction methods.

SAS Programmer Intern GCP ClinPlus Co., Ltd. (CRO), Carlsbad, CA, U.S.A 07/2015-09/2015

• Independently completed two projects. Cleaned thousands of data points with hundreds of variables and computed basic statistics such as standard deviation. Optimized the original SAS programs using macros and mapping tables, and implemented chi-square and rank-sum tests.

Credit Risk Rating Project Intern STANDARD & POOR’S, Beijing, China 09/2013-01/2014

• Contributed to a credit risk rating project for CHINA CITIC BANK. Cleaned, managed, and integrated the original rating information on CHINA CITIC BANK’s customers using SAS and SQL.

• Determined whether to partition models by enterprise scale using statistical methods such as the K-S test and K-means.

• Screened the indexes using AR ratio, Somers’ D, and the Spearman coefficient. Calculated the weights of qualitative indexes using the Analytic Hierarchy Process (AHP). Used a Genetic Algorithm to further select the indexes and calculate the weights of quantitative indexes.

RESEARCH EXPERIENCE

R Package Markov Chains, UC Davis 01/2016–02/2016

• Led a team of three to develop a CRAN-quality R package for the analysis of Markov chains. Solved continuous-time Markov chains for both finite and infinite state spaces by building on the discrete-time case. The package computes stationary distributions and a sub-matrix of expected hitting times, with rows and columns specified by the user.

Data Analysis Projects, UC Davis 09/2014–02/2016

• Classification of New Emails with R

Classified more than 6,000 new email messages as SPAM or HAM (valid mail) using a classification tree and k-nearest neighbors (KNN) fitted on the training data. Performed descriptive statistics to explore important email features for SPAM identification.

• Visualizing and Mining Yelp Data for Personal Ratings and Recommendations with R and Python (NLTK)

Preprocessed the data from Yelp JSON files and visualized it with heat maps, distribution plots, and K-means clustering plots. Recommended restaurants based on users’ preferences. Performed quality phrase mining and sentiment analysis on restaurants in Phoenix.

• Processing Big Data (about 40GB) on New York Taxi with Different Approaches in R and SQL

Used shell commands, C, parallel processing in R, and SQL to process the New York Taxi data and compared the approaches. Fitted a linear regression model to examine the relationship between trip time and total amount.

• Modeling Sparrow Survival Data with R

Developed a multiple logistic regression model to analyze the data. Fitted the model using forward stepwise selection with AIC and p-value as criteria. Validated the model by cross-validation.

Python Project Battleship Game, UC Davis 11/2015

• Implemented a one-player version of the Battleship game that lets the user play against the computer.


