Resume

Data Analyst/Scientist

Location:

Novato, CA

Posted:

December 07, 2020

Resume:

Jiali Luan

San Francisco, CA 415-***-**** adift7@r.postjobfree.com

EDUCATION

University of Michigan, Ann Arbor May 2020

M.S. Applied Statistics GPA: 3.87/4.0

University of California, Santa Barbara Mar. 2018

B.S. Financial Mathematics and Statistics (Spatial Science Minor) GPA: 3.63/4.0

SKILLS

● Programming: R, Python, SQL, MATLAB, SAS, Tableau, Google Cloud, AWS, TensorFlow

● Data Science: Machine Learning, Deep Learning, Data Mining, Time Series, Experimental Design, A/B Testing

● Language: Fluent Chinese, Conversational Japanese

● Certification: Society of Actuaries Probability Exam Certification

EXPERIENCE

Data Analyst - Rethink Returns, Remote June - Oct. 2020

● Provided data analysis using Python and SQL to answer business questions and support the decision-making process.

● Analyzed consumer behavior using regression models and identified factors that can increase customer stickiness.

● Collaborated with the team to explore new product sources and build connections.

Graduate Student Instructor (GSI) - UMich Dept. of Statistics, Ann Arbor, MI Sept. 2018 - May 2020

● Guided over 100 students each semester in learning R coding for exploratory data analysis and linear regression.

● Led weekly GSI meetings to develop future plans, discuss grading rubrics, and share student feedback.

● Trained incoming GSIs and monitored their teaching to ensure the quality of education.

Undergraduate Researcher - UCSB Dept. of Electrical Engineering, Santa Barbara, CA Aug. 2017 - Mar. 2018

● Implemented a Bayesian tensor completion algorithm, a multi-dimensional computational method.

● Applied this algorithm to practical chip testing data across multiple dies of a wafer to predict spatial variation.

● Achieved a 0.20% relative error on the predicted result based on only 15% of 717,080 data samples in 67 seconds.

PUBLICATION

J. Luan and Z. Zhang, "Prediction of multi-dimensional spatial variation data via Bayesian tensor completion," IEEE Trans. CAD of Integrated Circuits and Systems (TCAD), vol. 39, no. 2, pp. 547-551, Feb. 2020.

PROJECT

Credit Fraud Detection Apr. 2020

● Applied 5-fold cross-validation before resampling to avoid overfitting.

● Addressed class imbalance using SMOTE (oversampling) and EasyEnsemble (undersampling) techniques.

● Applied logistic regression and random forest to classify fraudulent transactions.

● Achieved recall scores of 0.9042 and 0.8847 and F1 scores of 0.9294 and 0.9123 for the two algorithms, respectively.

Next Word Prediction Feb. 2020

● Preprocessed the original corpus through tokenization, punctuation and stop-word removal, and lemmatization.

● Applied n-gram (bigram, trigram) and 2-layer LSTM language models to predict the next word.

● Achieved perplexity scores of 178.62 and 117.83 for the n-gram models and 47.24 for the LSTM model after 50 epochs.

● Output the top five candidate next words for the current input word, similar to a smart keyboard.

Housing Price Prediction Sept. 2019

● Imputed missing data such as city information from longitude, latitude, or zip code, and vice versa.

● Handled high-dimensional data using PCA to reduce multicollinearity in the original data.

● Adopted regularized linear regression and random forest regression models to forecast future housing prices.

● Attained a Root Mean Squared Logarithmic Error (RMSLE) of 0.1663 and 0.1065 for the two models, respectively.