Data Analyst

Location:
United States
Posted:
November 02, 2016

Resume:

Jiexuan (Jessica) Cao

530-***-**** | cn.linkedin.com/in/jiexuancao/

Sunnyvale, CA | ******@*******.***

EDUCATION

Master of Science in Statistics, University of California, Davis (UC Davis), GPA: 3.57/4.0, 03/2016

Bachelor of Science in Mathematics, Renmin University of China (RUC), GPA: 3.57/4.0, 06/2014

PROFESSIONAL SKILLS

• R PROGRAMMING • MACHINE LEARNING • DATA ANALYSIS • SQL

• SAS PROGRAMMING • STATISTICAL MODELING • PYTHON • EXCEL

• DATA MINING • DATA VISUALIZATION • LATEX • CHINESE (NATIVE)

PROFESSIONAL EXPERIENCE

Data Scientist Intern YONO Health Inc., Santa Clara, CA, U.S.A 09/2016-Present

• Cleaned raw data on users’ daily temperature. Corrected abnormal readings using linear regression and simulation. Reduced noise with Simple Moving Average (SMA) filters in Python.

• Visualized the data in R by plotting users’ nightly temperature charts and Basal Body Temperature (BBT) charts.

• Applied machine learning techniques such as smoothing splines, exponential smoothing, and moving averages to develop a predictive model. Predicted users’ ovulation days to help them conceive.

SAS Programmer Intern GCP ClinPlus Co., Ltd. (CRO), Carlsbad, CA, U.S.A 07/2015-09/2015

• Independently completed the Bilobalide4 project, using SAS to clean thousands of data points across hundreds of variables and to compute basic statistics such as standard deviation.

• Contributed to the Baqutting and Isis projects: optimized the original SAS programs using macros and mapping tables, and implemented chi-square and rank-sum tests.

Credit Risk Rating Project Intern STANDARD & POOR’S, Beijing, China 09/2013-01/2014

• Contributed to a credit risk rating project for CHINA CITIC BANK customers. Cleaned, managed, and integrated the bank’s original customer rating data using SAS and SQL.

• Determined whether to partition models by enterprise scale, using statistical methods such as the K-S test, LOESS, K-means, AUC, and the Wilcoxon-Mann-Whitney test.

• Screened the indexes using AR ratio, Somers’ D, and the Spearman coefficient. Calculated the weights of qualitative indexes using the Analytic Hierarchy Process (AHP). Used a genetic algorithm to further select indexes and calculate the weights of quantitative indexes.

RESEARCH EXPERIENCE

R Package Markov Chains, UC Davis 01/2016–02/2016

• Led a team of three to develop a CRAN-quality R package for the analysis of Markov chains. Solved continuous-time Markov chains for both finite and infinite state spaces by building on the discrete-time case. The package computes stationary distributions and sub-matrices of expected hitting times, with rows and columns specified by the user.

Data Analysis Projects, UC Davis 09/2014–02/2016

• Classification on New Emails with R

Classified more than 6000 new email messages as SPAM or HAM (valid mail) from the training data using classification trees and k-nearest neighbors (KNN). Used descriptive statistics to identify the email features most important for SPAM identification.

• Visualizing and Mining Yelp Data for Personal Ratings and Recommendations with R and Python (NLTK)

Preprocessed the data from Yelp JSON files and visualized it with heatmaps, distribution plots, and K-means clustering plots. Recommended restaurants based on users’ preferences. Performed quality phrase mining and sentiment analysis on restaurants in Phoenix.

• Processing Big Data (about 40GB) on New York Taxi with R in Different Approaches and SQL

Used shell commands, C, Bag of Little Bootstraps, parallel processing in R, and SQL to process the New York Taxi data, and compared the approaches. Fitted a linear regression model to examine the relationship between trip time and total amount.

• Modeling Sparrow Survival Data with R

Developed a multiple logistic regression model to analyze the data. Fitted the model using forward stepwise selection with AIC and p-value as criteria. Validated the model by cross-validation.

The Python Project Battleship Game, UC Davis 11/2015

• Implemented a one-player version of the Battleship game that lets the user play against an AI.