
Data Analyst Machine Learning

Location:
Georgia
Posted:
August 08, 2023


Resume:

Xinyu Lou

+1-857-***-**** **********@*****.*** LinkedIn : linkedin.com/in/xinyu-lou/ Boston, MA

EDUCATION

Northeastern University Boston, MA Sept 2021 – Jul 2023 Master in Analytics

Coursework: Enterprise/Predictive Analytics, Visualization, Big Data, Database Management, Data Mining, Machine Learning

China University of Mining and Technology Beijing, China Sept 2017 - Jun 2021 Bachelor of Science in Mathematics

Coursework: Mathematical Analysis, Probability Theory, Theory of Complex Functions, Mathematical Modelling, Numerical Analysis, Time Series, Database Theory, Applied Stochastic Processes, Statistical Models, Mathematical Statistics

SKILLS

• Data Analytics: A/B Testing, Causal Inference, Tableau, Power BI, SAS, Google Analytics, MS Excel

• Machine Learning: Python (Scikit-Learn), R, SPSS Predictive Modeling, Hypothesis Testing, Regression Analysis

• Database Processing: SQL, AWS S3, Google Cloud BigQuery

• Data Engineering: Spark, Hadoop, Hive, MapReduce

WORK EXPERIENCE

DATA SCIENTIST San Francisco, CA

DeFiner Jan 2023 - present

• Extracted and processed 14GB+ of blockchain transaction and event log data and identified 5+ KPIs, including transaction times, transaction value, and transaction frequency

• Developed a dashboard report automation system that creates 4 interactive Tableau reports weekly from the transaction performance database, saving 30% of reporting time

• Applied anomaly detection algorithms, including Isolation Forest and K-Means, and statistical analysis to the sale price data to reduce potential errors, improving data accuracy by 13%

DATA ANALYST Gansu, China

Gansu Ministry of Construction Dec 2017 - Aug 2020

• Designed surveys to gather satisfaction data from 1k+ examinees to determine whether examinees' attitudes caused the decline in exam participation

• Performed sentiment analysis using BERT to assess examinees' attitudes toward the exam

• Converted the exam registration information from 34 provinces into latitude and longitude using the Geocoding API

• Visualized 40k+ exam registration records to analyze the geographic concentration of exam registrations

• Worked with a team of 5 to present a comprehensive analysis report to the government on changes to exam registration locations, increasing the number of tests by 46.6%

PROJECTS

Banking Customer Churn Prediction and Analysis (Git Link)

• Developed automated ETL pipelines using SageMaker and EC2 services to load bank customer data from CSV files, improving working efficiency by 20%

• Predicted customer churn using Random Forest, Logistic Regression, and K-Nearest Neighbors; the best model reached an accuracy of 86.5%

• Improved classification performance (accuracy, F1, or AUC score) by 4% via 5-fold cross-validation and identified the top factors that influenced the results

Amazon Customer Reviews Analysis and Topic Modeling (Git Link)

• Built a web crawler to scrape review data from the Amazon website using HTTP requests and regular expressions

• Preprocessed review text through tokenization, stemming, and stop-word removal, and extracted features using TF-IDF

• Extracted the top 10+ popular topics using K-Means clustering and Latent Dirichlet Allocation

MovieLens Movie Recommendation System

• Processed and analyzed a dataset of 27M ratings of 58K movies by 280K users and conducted OLAP with Spark SQL

• Built a recommendation system (ALS model) in Python that predicts movie ratings and makes recommendations based on users’ preferences

• Tuned the hyperparameters using 5-fold cross-validation and applied the optimal values to the final model, reaching RMSE = 0.88


