YIXUAN LU
* ***** ** **** ********* experience
203-***-**** admxpi@r.postjobfree.com www.linkedin.com/in/yixuanlu17 https://github.com/yixuanlu17
PROFESSIONAL EXPERIENCE
Data Analyst at DataReady DFW (Remote, USA) 10/2020 – 03/2021
●Developed web-scraping scripts (in Selenium with Python) to collect data on about 10,000 grant records for identifying suitable grants
●Developed a Natural Language Processing model (tokenization, stemming, and word vectorization in Python) to predict the most suitable grants and surface the top candidates for manual review, saving an estimated 80% of total research time
●Built complex dashboards (in Tableau) with advanced functions to deliver comprehensive analysis for selecting grant websites
Business Analyst at TWG Companies (Remote, USA) 12/2020 – 03/2021
●Liaised with stakeholders to gather user requirements, implemented SQL queries, and provided guidance to testers during the QA process
●Applied machine-learning regression models (in Python) to predict product users' ages and inform product UI/UX design, increasing the product's market appeal
●Provided project-level analysis by producing the required project documentation (business requirements, scope matrix, use cases, sequence diagrams, future state proposals, UAT plan), increasing adoption and customer satisfaction
Data Analyst at Pan-China Certified Public Accountants (Chongqing, China) 06/2018 – 01/2019
●Developed complex SQL queries (in BigQuery) to analyze financial transaction data, effectively identifying critical financial issues
●Collaborated with stakeholders to automate the workflow for handling data-analysis requests (in Python)
●Built operational reports (in Tableau) to stabilize business operations and maximize efficiency, resulting in $210,000 in annual incremental revenue
Data Analyst at AXA Insurance (Hong Kong, China) 03/2017 – 09/2017
●Wrote SQL queries to collect diverse data (from MySQL); assisted the fund manager in analyzing fund market trends (in Tableau)
●Conducted fund-portfolio analysis and built financial models (in Python), providing a reference for fund management
●Predicted customer churn rate and provided customer re-engagement recommendations through survival analysis, achieving 85% accuracy (in SAS)
DATA SCIENCE PROJECTS
Henkel Product Capstone Analysis (Python, Tableau) https://git.io/JGvME Aug 2020
●Worked with large datasets and performed Exploratory Data Analysis (in Python & Tableau) to drive targeted marketing campaigns
●Predicted the sentiment of user reviews using a Bidirectional LSTM model with Keras, achieving 85% accuracy (in Python)
●Diversified marketing campaign strategies for outer packaging by conducting NLP-based sentiment analysis (in Python)
●Collaborated on customer re-engagement recommendations for product launches (in Tableau) by analyzing consumer sentiment and review data
Road Accident Severity Prediction Project (Python, Tableau) https://git.io/JGv1X June 2020
●Developed machine learning model pipelines (Pandas and Scikit-learn libraries) to determine the most suitable model among 11 classifiers
●Predicted road accident severity using a Gradient Boosting model, improving accuracy from a 60.9% baseline to 86.7%
●Visualized feature importance scores and applied GridSearchCV hyperparameter tuning across 20 features (in Python)
●Created dashboards and applied complex and compound calculations to large, complex datasets (in Tableau) to inform better route recommendations for Google Maps
Yamibuy Customer Analysis Project (SQL, Python) Dec 2019
●Wrote complex SQL queries to analyze data on 100k+ products along with customer demographic, behavioral, and transaction data, informing marketing campaign strategy
●Built models and conducted clustering analysis to differentiate customer segments (K-means in Python)
●Developed an EERD and relations in third normal form; designed and implemented database and table schemas that reduce data duplication, eliminate data anomalies, and enforce referential integrity
TECHNICAL SKILLS
Programming Languages: Python (Scikit-learn, Pandas, NumPy, SciPy, Matplotlib, Seaborn, PySpark), R (RStudio), SQL, SAS
Big Data: BigQuery, Hadoop (Pig, Hive, Sqoop, MapReduce, Apache Spark)
Data Science & Statistics/Machine Learning & Deep Learning Models: Linear/Logistic Regression, Natural Language Processing (NLTK, spaCy, Gensim, VADER), Ensemble (Bagging, Boosting), Feature Engineering, SVM, Naive Bayes, KNN, K-means, GBDT, Neural Networks & CNNs (TensorFlow, Keras), PCA, Clustering, Time Series (ARIMA), Computer Vision (YOLO, OpenCV, Face Recognition, TensorFlow)
Database/Visualization/ETL: Microsoft SQL Server, MySQL, Oracle, PostgreSQL, Snowflake; Power BI, Tableau; SSIS
Cloud: AWS (SageMaker, Redshift), GCP
Other Data-Analytic Tools/Skills: Advanced Excel (Macros, Pivot Tables, VLOOKUP), Google Analytics, JMP, SPSS; Data Cleansing/Wrangling/Segmentation/Mining/Visualization, Business Intelligence
Software/Project Management: Jira, Microsoft Project
EDUCATION
Master of Science, Business Analytics & Project Management Sep 2019 - Dec 2020
University of Connecticut, Stamford, CT
Bachelor of Business Administration, Accounting Sep 2015 - Jun 2019
Chongqing Technology and Business University, Chongqing, China