NLP, Deep Learning, Data visualization

Location:
Austin, TX
Posted:
June 04, 2021

Resume:

YIXUAN LU

* ***** ** **** ********* experience

203-***-**** admxpi@r.postjobfree.com www.linkedin.com/in/yixuanlu17 https://github.com/yixuanlu17

PROFESSIONAL EXPERIENCE

Data Analyst at DataReady DFW (Remote, USA) 10/2020 – 03/2021

●Developed web-scraping scripts (Selenium with Python) to collect data on about 10,000 grant records and identify suitable grants

●Developed a Natural Language Processing model (tokenization, stemming, and word vectorization in Python) to predict the most suitable grants and surface the top candidates for manual review, saving an estimated 80% of total research time

●Built complex dashboards (in Tableau) with advanced functions to deliver comprehensive analysis for selecting grant websites
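The tokenizing, stemming, and word-vectorization steps named above can be illustrated with a minimal, standard-library-only sketch. This is not the model built at DataReady DFW: the crude suffix-stripping `stem` function stands in for a real stemmer, the ranking uses plain term-count cosine similarity, and every name and sample grant text here is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


def stem(token):
    # Crude suffix stripping; an assumed stand-in for a real stemmer.
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def vectorize(text):
    # Bag-of-words vector: stemmed token -> count.
    return Counter(stem(t) for t in tokenize(text))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_grants(query, grants):
    # Return grant names ordered from most to least similar to the query.
    q = vectorize(query)
    scored = [(cosine(q, vectorize(text)), name) for name, text in grants.items()]
    return [name for _, name in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical grant descriptions, for illustration only.
grants = {
    "grant_a": "Funding for youth education programs and teacher training",
    "grant_b": "Support for highway construction and bridge repairs",
}
ranking = rank_grants("education funding for teachers", grants)
print(ranking)
```

In a setup like this, only the highest-ranked records would be passed on for manual review.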

Business Analyst at TWG Companies (Remote, USA) 12/2020 – 03/2021

●Liaised with stakeholders to gather user requirements and implemented SQL queries, providing guidance to testers during the QA process

●Applied machine-learning regression models (in Python) to predict product users' ages and inform product UI/UX design, significantly increasing market appeal

●Provided project-level analysis – producing required project analysis documentation (business requirements, scope matrix, use cases, sequence diagrams, future state proposals, UAT plan), increasing adoption and customer satisfaction

Data Analyst at Pan-China Certified Public Accountants (Chongqing, China) 06/2018 – 01/2019

●Developed complex SQL queries (in BigQuery) to analyze financial transaction data, effectively identifying critical financial issues

●Collaborated with stakeholders to automate the workflow for handling data-analysis requests (in Python)

●Built operational reporting (in Tableau) to stabilize the business and maximize efficiency, resulting in $210,000 in annual incremental revenue

Data Analyst at AXA Insurance (Hong Kong, China) 03/2017 – 09/2017

●Wrote SQL queries to collect diverse data (from MySQL); assisted the fund manager in analyzing fund market trends (in Tableau)

●Conducted fund-portfolio analysis and built financial models (in Python), providing a reference for fund management

●Predicted customer churn rate and provided customer re-engagement recommendations through survival analysis (in SAS), achieving 85% accuracy

DATA SCIENCE PROJECTS

Henkel Product Capstone Analysis (Python, Tableau) https://git.io/JGvME Aug 2020

●Worked with large datasets and performed Exploratory Data Analysis (in Python & Tableau) to drive targeted marketing campaigns

●Predicted the sentiment of user reviews using Bidirectional LSTM Model with Keras, achieving 85% accuracy (in Python)

●Diversified marketing campaign strategies for outer packaging by conducting sentiment analysis with Natural Language Processing (in Python)

●Collaborated on customer re-engagement recommendations for product launches (in Tableau) by analyzing consumer sentiment and review data

Road Accident Severity Prediction Project (Python, Tableau) https://git.io/JGv1X June 2020

●Developed machine learning model pipelines (Pandas and Scikit-learn libraries) to determine the most suitable model among 11 classifiers

●Predicted road accident severity using a Gradient Boosting model, raising accuracy from the 60.9% baseline to 86.7%

●Visualized the feature importance scores, applied GridSearchCV and hyperparameter tuning for 20 features (in Python)

●Created dashboards and applied complex & compound calculations to large & complex data sets (in Tableau), providing better route recommendations for Google Maps

Yamibuy Customer Analysis Project (SQL, Python) Dec 2019

●Wrote complex SQL queries to analyze data on 100k+ products, customer demographics and behavior, and transactions, informing marketing campaign strategy

●Built models and conducted clustering analysis to differentiate customer segments (K-means in Python)

●Developed an EERD and relations in third normal form; designed and implemented database and table schemas that reduce data duplication, eliminate data anomalies, and enforce referential integrity
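The K-means customer segmentation mentioned in this project can be sketched in plain Python as below. It is an illustrative implementation of Lloyd's algorithm, not the project's code: the two features (annual spend, order count), the sample values, and the fixed starting centroids are all assumptions made for the example.

```python
def squared_distance(p, q):
    # Squared Euclidean distance between two points of equal dimension.
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    # Label each point with the index of its nearest centroid.
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def kmeans(points, centroids, iterations=10):
    # Lloyd's algorithm: alternate assignment and centroid update.
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                # New centroid is the per-dimension mean of its members.
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical customers as (annual_spend, order_count) pairs.
customers = [(10, 1), (12, 2), (11, 1), (100, 20), (110, 22), (105, 21)]
# Assumed initial centroids: one low-spend and one high-spend customer.
centroids, labels = kmeans(customers, [customers[0], customers[3]])
print(labels)
```

With real data, features would normally be scaled first so that no single column dominates the distance.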

TECHNICAL SKILLS

Programming Languages: Python (Scikit-learn, Pandas, NumPy, SciPy, Matplotlib, Seaborn, PySpark), R (RStudio), SQL, SAS

Big Data: BigQuery, Hadoop (Pig, Hive, Sqoop, MapReduce, Apache Spark)

Data Science & Statistics/Machine Learning & Deep Learning Models: Linear/Logistic Regression, Natural Language Processing (NLTK, spaCy, Gensim, VADER), Ensemble (Bagging, Boosting), Feature Engineering, SVM, Naive Bayes, KNN, K-means, GBDT, Neural Networks & CNNs (TensorFlow, Keras), PCA, Clustering, Time Series (ARIMA), Computer Vision (YOLO, OpenCV, Face Recognition, TensorFlow)

Database/Visualization/ETL: Microsoft SQL Server, MySQL, Oracle, PostgreSQL, Snowflake; Power BI, Tableau; SSIS

Cloud: AWS (SageMaker, Redshift), GCP

Other Data-Analytic Tools/Skills: Advanced Excel (Macros, Pivot Tables, VLOOKUP), Google Analytics, JMP, SPSS; Data Cleansing/Wrangling/Segmentation/Mining/Visualization, Business Intelligence

Software/Project Management: Jira, Microsoft Project

EDUCATION

Master of Science, Business Analytics & Project Management Sep 2019 - Dec 2020

University of Connecticut, Stamford, CT

Bachelor of Business Administration, Accounting Sep 2015 - Jun 2019

Chongqing Technology and Business University, Chongqing, China


