Jiali Luan
San Francisco, CA 415-***-**** adift7@r.postjobfree.com
EDUCATION
University of Michigan, Ann Arbor May 2020
M.S. Applied Statistics GPA: 3.87/4.0
University of California, Santa Barbara Mar. 2018
B.S. Financial Mathematics and Statistics (Spatial Science Minor) GPA: 3.63/4.0
SKILLS
● Programming: R, Python, SQL, MATLAB, SAS, Tableau, Google Cloud, AWS, TensorFlow
● Data Science: Machine Learning, Deep Learning, Data Mining, Time Series, Experimental Design, A/B Testing
● Language: Fluent Chinese, Conversational Japanese
● Certification: Society of Actuaries Probability Exam
EXPERIENCE
Data Analyst - Rethink Returns, Remote June - Oct. 2020
● Provided data analysis using Python and SQL to answer business questions and support the decision-making process.
● Analyzed consumer behaviors using regression models and found factors that can increase customer stickiness.
● Collaborated with the team to explore new product sources and build connections.
Graduate Student Instructor (GSI) - UMich Dept. of Statistics, Ann Arbor, MI Sept. 2018 - May 2020
● Guided over 100 students each semester in learning R coding for exploratory data analysis and linear regression.
● Led weekly GSI meetings to plan upcoming work, discuss grading rubrics, and share students’ feedback.
● Trained incoming GSIs and monitored their teaching to ensure the quality of education.
Undergraduate Researcher - UCSB Dept. of Electrical Engineering, Santa Barbara, CA Aug. 2017 - Mar. 2018
● Implemented a Bayesian tensor completion algorithm, a multi-dimensional computational method.
● Applied this algorithm to practical chip testing data across multiple dies of a wafer to predict spatial variation.
● Achieved a 0.20% relative error on the predicted result based on only 15% of 717,080 data samples in 67 seconds.
PUBLICATION
J. Luan and Z. Zhang, "Prediction of multi-dimensional spatial variation data via Bayesian tensor completion," IEEE Trans. CAD of Integrated Circuits and Systems (TCAD), vol. 39, no. 2, pp. 547-551, Feb. 2020.
PROJECT
Credit Fraud Detection Apr. 2020
● Used 5-fold cross-validation before applying resampling methods in order to avoid overfitting.
● Addressed the data imbalance problem using SMOTE (oversampling) and EasyEnsemble (undersampling) techniques.
● Applied logistic regression and random forest to classify fraudulent transactions.
● Achieved recall scores of 0.9042 and 0.8847 and F1 scores of 0.9294 and 0.9123 for the two algorithms, respectively.
Next Word Prediction Feb. 2020
● Preprocessed the original corpus through tokenization, punctuation and stop-word removal, and lemmatization.
● Applied n-gram (bigram, trigram) and 2-layer LSTM language models to predict the next word.
● Achieved perplexity scores of 178.62 and 117.83 for the n-gram models and 47.24 for the LSTM model after 50 epochs.
● Output the top five candidate next words for the current input word, as a smart keyboard does.
Housing Price Prediction Sept. 2019
● Predicted missing data such as city information from longitude, latitude, or zip code, and vice versa.
● Applied PCA to the high-dimensional data to reduce multicollinearity in the original data.
● Adopted regularized linear regression and random forest regression models to forecast future housing prices.
● Attained Root Mean Squared Logarithmic Error (RMSLE) values of 0.1663 and 0.1065 for the two models, respectively.