Data Scientist Machine Learning

Location: Chapel Hill, NC
Salary: 100000
Posted: November 09, 2024

Resume:

Jie Yang

Phone: 919-***-**** Email: *******@*****.*** Linkedin: /in/jieyang112/

SUMMARY

Data Scientist with 5+ years of experience in the retail industry. Proficient in applying Statistical Analysis and Machine Learning modeling to Marketing and Inventory Management. Specialized in Time Series Analysis and Sales Forecasting, data-driven decision-making, and optimized resource allocation. Experienced in working with cross-functional teams and communicating with stakeholders.

SKILLS

Python, NumPy, SciPy, Pandas, Matplotlib, Seaborn, Scikit-Learn, SQL, PCA, ARIMA, SARIMA, SVM, SHAP, GitHub, Flask, Beautiful Soup, Selenium, TensorFlow, Keras, PyTorch, CNN, RNN, LSTM, NLTK, Natural Language Processing (NLP), Large Language Models (LLMs), Generative AI, AWS, Microsoft Azure, Google Cloud Platform (GCP), Snowflake, Databricks, Hadoop, Spark, Tableau, Power BI, A/B Testing, Dataiku, MATLAB, VBA

EXPERIENCE

Techlent Inc. Remote

Data Scientist Fellow 06/2023 - Present

Drug Store Sales Predictor

To assist drug store managers in optimizing budget allocation based on sales demand, developed a Sales Forecast Model using time series models such as ARIMA, SARIMAX, and Prophet, and regression models such as Random Forest and XGBoost.

Collaborated with stakeholders to obtain and analyze the sales and store data sets. Identified SARIMAX as the top-performing model for predicting daily sales, with the lowest RMSE. Reduced staff costs by 30% and boosted revenue by the same margin.

Deployed the SARIMAX model as a Flask API on GCP, streamlining accessibility and utilization.

Old Friend Bakery Chapel Hill, NC

Data Scientist and Consultant 01/2023 - Present

Bakery Sale Predictor

To help predict weekly bakery sales effectively, built a Sales Forecasting Engine using time series forecasting models and classification and regression models.

Collaborated with sales and bakery teams to gather and analyze sales and product data. Used Linear Regression, Random Forest, XGBoost, ARIMA, and SARIMAX models to predict sales trends, accounting for stationarity and seasonality.

XGBoost emerged as the top-performing model, achieving a test accuracy score of 0.87 and surpassing the random-selection benchmark by 20%. This enabled a tailored production schedule for high-demand days and resulted in a 40% cost decrease and revenue increase.

South China University of Technology Guangzhou, China

Data Scientist and Professor 06/2004 - 03/2018

Spam Detector on Mobile Terminal

To address the prevalent issue of spam and scam text messages received by mobile users, devised an end-to-end spam detection model.

Collaborated with user managers and text database managers to build a regression model that ranks text content by the presence of suspicious keywords to assess its risk level. The model produced high-risk message alerts, enabling swift detection and mitigation of harmful content. Decreased spam and scam messages by 40% compared to the random-check benchmark.

Bio Products and Service Customer Detector

To enhance sales efficiency and revenue through targeted email marketing, developed a Customer Detector for Bio-Cytogen Co.

Collaborated with sales teams to gather historical sales data and biological/medical publications from journals and conferences. Engineered domain-specific features and trained binary classification models to identify valuable customers.

Achieved a 90% improvement over traditional methods, doubling revenue with existing sales investments.

EDUCATION

Wuhan University Wuhan, China

Ph.D. in Computer Science 09/2001 - 06/2004

Master's in Computer Science 07/1999 - 06/2001

Bachelor's in Computer Science 09/1994 - 07/1998


