
Data Analyst Machine Learning

Location:
Secaucus, NJ
Posted:
June 06, 2024


Resume:

Yue Liu

872-***-**** ********@*****.*** www.linkedin.com/in/yue-liu-475******-**** Frank E Rodgers Blvd S, Harrison, NJ

EDUCATION

University of Chicago, Chicago, IL Sept 2021 – Dec 2022

Master of Science in Applied Data Science GPA: 3.89

University of Rochester, Rochester, NY Sept 2016 – May 2020

Bachelor of Science in Applied Mathematics, Business: Finance GPA: 3.67

SKILLS & QUALIFICATIONS

● Statistics and machine learning: Linear Regression, Probability, Hypothesis Testing, SVM, Random Forest, XGBoost, AdaBoost; deep learning (TensorFlow, Keras); NLP

● Coding and software: Python (Pandas, NumPy, scikit-learn, Seaborn); Excel; Hive, MySQL; Tableau; Hadoop; Google Cloud Platform, BigQuery; Spark & PySpark; MongoDB, Neo4j

PROFESSIONAL EXPERIENCE

Ready Warehousing & Logistics (3PL) Oct 2023 – May 2024

Data Analyst Python, SQL, Tableau Secaucus, NJ

● Used Python to transform data and analyze storage capacity, turnover rate, demand variation, and labor cost, with visualizations including Pareto and ABC analysis; presented recommendations for charge adjustment and resource allocation to stakeholders, resulting in a 10% increase in profit

● Built daily summary dashboards in Tableau on operational performance, enabling close monitoring of staff supply and demand to optimize operations headcount

● Designed and implemented a predictive inventory control system using machine learning models that account for variability in ocean freight arrival times and historical outbound orders to optimize SKU-level inventory, mitigating stockouts and overstocking to improve customer satisfaction and turnover rate

● Automated the data entry process using Python (Tesseract, Regex) to extract order details for over 30 SKUs from image-based PDFs, including cost estimation and calculated routing; integrated the solution into the Warehouse Management System (WMS), yielding an 11% reduction in labor costs

● Maintained the SQL database and provided analytical support, running ad-hoc MySQL queries to address client inquiries regarding charges, orders, and storage inventory

ATZ Trading Inc July 2023 – Oct 2023

Data Analyst, Intern Python, Excel New York, NY

● Used Python to analyze customer purchasing behavior, identified high-value churned customers, generated a customer contact list, and coordinated with the sales team on customized promotions, improving customer retention by 16%

● Developed key performance indicators (KPIs) to monitor and report fetching accuracy, error frequency, and impact on delivery performance; automated the summarization of order quantities for each delivery route for the verification process, reducing the fetching error rate by 80%

● Used market basket analysis (Python) to identify trending products, slow-selling products, and bundle sale items, adjusting promotional strategies to optimize in-app and poster promotions, resulting in a 9% increase in average order value and conversion rate among Japanese restaurant customers in New York and New Jersey

V V Renter Inc March 2023 – July 2023

Data Scientist, Intern MySQL, Tableau Remote (Alhambra, CA)

● Identified the key factors that influence the number of rentals by building ML models (SVM, random forest, and AdaBoost) with feature importance (Python); suggested strategies to increase rentals, including rental price, discounts, and location-specific pickup charges; recommended 5 car models with rental prices and breakeven points for the upcoming business expansion

● Flattened a nested JSON file, cleaned and normalized car rental data from multiple sources into 5 tables; updated data and performed ad-hoc querying with MySQL to monitor monthly rental rates, price changes, and promotions

IRI Worldwide April 2022 – Dec 2022

Capstone Project Researcher Python Chicago, IL

● Led a group of 4 to research Synthetic Data Generation models and use cases, customizing statistical evaluation metrics such as boundary adherence and column correlation; developed and tested 4 Generative Adversarial Network models and selected the Copula GAN model as the most promising, with a statistical similarity score of 97%

● Reported progress to the R&D team monthly, delivered a demo and a 30-page academic research paper, and presented the final deck to both the R&D team and the vice president to guide further development in addressing data privacy issues

PROJECT EXPERIENCE

Yelp Recommendation System Big Data Project – PySpark and GCP January 2022 – March 2022

● Set up a Google Cloud cluster environment and imported 11GB of nested JSON data into BigQuery; performed Exploratory Data Analysis on 8 tables of restaurant information (PySpark.sql, Seaborn) in Dataproc

● Developed NLP classification models on user reviews (BERT) to generate 7 clusters of restaurants, then built ML models (Random Forest, Support Vector Machine, and AdaBoost) incorporating those 7 clusters to recommend the top 10 restaurants to each user (PySpark ML), ultimately identifying Random Forest as the best model through Grid Search tuning


