
Data Python

Location:
Phoenix, AZ
Posted:
March 30, 2020


Resume:

Jiaer Yuan (Vanessa) Phoenix, AZ ***** 458-***-**** adci6r@r.postjobfree.com www.linkedin.com/in/yuanjiaer

Educational Qualifications

Arizona State University, W. P. Carey School of Business (Tempe, AZ) Aug 2018 – May 2019

Master of Science in Business Analytics

University of Oregon, Department of Economics (Eugene, OR) Sep 2014 – Sep 2017

Bachelor of Science in Economics, Minor in Business Administration

Analytical Skills

Tableau, Power BI, Excel

SQL, MySQL, matplotlib

Python, sklearn, pandas

Azure Machine Learning

R, cart, caret, nFactors

IBM SPSS Modeler, SAS

AWS, EC2, EMR, Stata

Precision Tree, @RISK

Financial Modeling

Statistical Modeling

Decision Modeling

Risk Optimization

Time Series Analysis

Cost Analysis, Forecasting

Marketing Analytics, PPT

Sensitivity Analysis

Professional Experience (2+ years)

Enterprise Analytics Engineer, Best Western Hotels & Resorts, Phoenix, AZ Aug 2019 – Mar 2020

AWS: Generated reports in Amazon QuickSight using data from Amazon Redshift. Wrote complex SQL queries with joins and functions to aggregate data from multiple tables. Migrated reports and visualizations from Cognos for UAT.

Customer Aggregation: Matched customer records across the customer information and hotel booking history tables using Levenshtein distance. Created accurate customer behavior data on Amazon SageMaker.

Data Lake & ETL: Used AWS Glue crawlers to scan data in S3 and loaded it into Redshift for use in QuickSight. Used Python AWS libraries and SQL to extract, transform, and load (ETL) data.

Workflow: Used MS Visio to create visualizations for ETL job processes.

Management: Managed and optimized processes for data intake, validation, mining, and engineering, as well as modeling, visualization, and communication deliverables.
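
The customer-matching step above can be illustrated with a minimal sketch of Levenshtein distance in plain Python. This is not the code used at Best Western; the helper names, the lowercase normalization, and the `max_distance=2` threshold are assumptions chosen for illustration, and the names in the usage note are made up.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, substitutions to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def is_probable_match(name_a: str, name_b: str, max_distance: int = 2) -> bool:
    """Hypothetical matching rule: normalize, then compare against a threshold."""
    return levenshtein(name_a.strip().lower(), name_b.strip().lower()) <= max_distance
```

For example, `is_probable_match("Jon Smith", "John Smith")` is true because the two names differ by a single inserted character. In practice the threshold would be tuned against known duplicates.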

Advanced Data Analytics Intern, ASU Admission Office, Tempe, AZ Jan 2019 – April 2019

Optimization: Improved decision-making process for purchasing leads and maximizing enrollments of first-time freshmen at ASU. Reduced purchase redundancy from leads by collating and cleaning data from multiple sources.

Modeling: Preprocessed and transformed data in Python. Encoded categorical features and trained a logistic regression model using sklearn. Predicted the conversion probability of a student with 90% accuracy.

Analysis: Visualized relationships and trends among features and clusters in the data set using Tableau.

Business Assistant (Supply Chain), Lenovo, Beijing, China April 2018 – August 2018

KPI tracking: Used SQL queries to create quarterly performance reports for small and medium businesses. Planned for the next quarter using historical performance to achieve KPIs for the China Region Supply Chain Department.

Supply Chain: Assisted the supply chain manager in fulfilling order releases and the accounting department in identifying reporting errors in the database. Ensured daily work efficiency and client satisfaction for 4 different product lines.

Order Fulfillment: Built connections between sales and clients by working with cross-functional teams, including inside sales and supporting consulting services, to fulfill client orders.

Marketing Associate, State Farm, Eugene, OR June 2016 – Sept 2017

Forecasting: Analyzed insurance industry trends and future needs by researching the effects of autonomous cars on the insurance industry. Identified the relationship between the number of accidents and insurance claims.

Support: Proposed a quarterly marketing plan for the insurance agent. Initiated connections with Chinese customers by introducing different types of insurance. Helped them file claims and make payments.

Engagement: Engaged an additional 10% of international customers by being actively involved in events and organizations. Acted as a State Farm campus ambassador for events organized across different universities.

Applied Projects and Coursework

Premium Subscription Market Segmentation Model, Marketing Analytics in R

Performed unsupervised k-means clustering to create clusters and identified market segment composition.

Identified the best number of clusters using Elbow Method for K-Means and measured cluster proportions.

Created hierarchical clusters & visualized using dendrograms. Sliced the dendrograms to create optimum clusters.

Demand Forecasting, Time Series Analysis using Excel

Forecasted the demand of milk in a grocery store. Identified randomness, trend and seasonality in the data.

Used moving averages, Holt’s model, Winters’ model and ARIMA models. Picked the one with the least error.
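
The model-selection idea in this project (fit several forecasters, keep the one with the least error) can be sketched in Python, although the original work was done in Excel. The sketch covers only a moving average and Holt's linear-trend method; the function names, the smoothing parameters `alpha` and `beta`, the window size, and the choice of mean absolute error are all assumptions for illustration.

```python
def moving_average_forecasts(series, window=3):
    """One-step-ahead forecasts for series[window:], each the mean of the prior window."""
    return [sum(series[t - window:t]) / window for t in range(window, len(series))]


def holt_forecasts(series, alpha=0.5, beta=0.5):
    """One-step-ahead forecasts for series[1:] using Holt's linear-trend smoothing."""
    level, trend = series[0], series[1] - series[0]  # simple initialization
    forecasts = []
    for actual in series[1:]:
        forecasts.append(level + trend)  # forecast made before seeing `actual`
        new_level = alpha * actual + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return forecasts


def mean_absolute_error(actuals, forecasts):
    """Average absolute gap between aligned actual and forecast values."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(forecasts)


def compare_models(series, window=3):
    """Return a dict of model name -> MAE so the lowest-error model can be picked."""
    return {
        "moving_average": mean_absolute_error(
            series[window:], moving_average_forecasts(series, window)),
        "holt": mean_absolute_error(series[1:], holt_forecasts(series)),
    }
```

Calling `compare_models` on a demand series and taking the key with the smallest value mirrors the "least error" rule. A fuller comparison would score every model on the same holdout period and add seasonal (Winters') and ARIMA models.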

Advanced Tableau Product Sales Visualization, Business Analytics Strategy

Created bar, histogram, line, bullet, heatmap, pie, scatter, bubble, dual axis and area charts for data analysis.

Used multidimensional segmentation of data and Level of Detail (LOD) calculations to plot across dimensions.

Bank Marketing, Statistical Analysis in R, Kaggle

Created cross tables and performed chi-square tests to check independence between pairs of categorical features.

Built a logistic regression model to predict the probability of a person buying the long-term deposit.

Portfolio Financial Modeling, Probabilistic Simulation using Palisade’s @RISK in Excel

Simulated 1,000 iterations of normally distributed return rates with investments and repayments on a portfolio.

Customer Churn Prediction, Machine Learning in IBM SPSS Modeler

Built a machine learning model to identify cellphone users most likely to churn based on a large set of features.

Performed text mining and clustering in IBM SPSS Modeler. Analyzed clusters using Excel Pivot tables.

Identified feature importance and provided recommendations for customer retention and improved service.

Vehicle Insurance Claim Prediction, Statistical Analysis in R, Kaggle

Performed factor analysis for dimensionality reduction. Used undersampling to create a balanced dataset.

Performed univariate analysis with decision trees to identify feature importance. Scaled and transformed features.

Created a glm logistic regression model to predict claim probabilities, evaluated using balanced accuracy and F1 scores.

Housing Prices Prediction, Linear Regression in Python, Kaggle

Created a linear regression model in Python and predicted housing prices in the United States.

Optimized the regression model by including interaction terms using PolynomialFeatures from sklearn.

Used Ridge and Lasso regression for regularization, shrinking coefficients and keeping only the important features.

Calculated correlations for numeric factors, analyzed coefficients and measured slope significance using p-values.

Compared model fit using R-squared and adjusted R-squared. Analyzed residual plots to verify the model.

Online Shopper Purchasing Intention, Predictive modeling in R, UCI Repository

Built a classification model for data with 12000 rows and 18 columns (9 numeric and 8 categorical variables).

Used logistic regression, Random Forest, LDA and Naïve Bayes classification algorithms in R.

Performed factor analysis to find important latent factors and measured uniqueness. Achieved 88% accuracy.

Lean Six Sigma, Statistical Process Control using MiniTab

Implemented statistical process control using confidence intervals to measure manufacturing inconsistencies.

Tested null and alternative hypotheses for variables. Compared means using p-values and confidence intervals.

IBM HR Analytics Employee Attrition & Performance, Analysis in Python, Kaggle

Analyzed relationships between attrition, gender, income, education and work environment using Python.

Used matplotlib and seaborn to visualize data patterns, distributions and correlations across dimensions.

Used Decision Tree algorithm to train a model with 4 factors to identify the most relevant factors for attrition.


