Data Scientist Python

Location:
Alpharetta, GA
Posted:
June 22, 2020

Resume:

Guifeng Li

**** ********** *******, **********, ** 30022
Cell: 330-***-****
E-mail: addzvm@r.postjobfree.com
Visa: Green Card

Qualifications Summary

Thorough understanding of various statistical methods/models and machine learning techniques such as regression, random forest, XGBoost, and deep learning, as well as their implementation in retail;

Strong ability to compile, manage, analyze, and validate large structured and unstructured data sets (such as text for NLP), and to manage and generate reports, tables, and graphs;

Proficient with advanced statistical analysis tools such as Python and Spark. Strong ability to develop methods/models at various levels to meet business requirements;

Experience assisting with optimization processes using Python (NumPy, pandas, scikit-learn, Keras, NLTK, TensorFlow, PyTorch, etc.);

Excellent project management, problem-solving, and interpersonal skills. Works well both independently and as part of a team;

Skills: Proficient in Python, PySpark, SAS, R, MATLAB, Tableau, Origin, Windows/UNIX, and MS Office; familiar with predictive modeling libraries and working with MySQL databases

Professional Experience

1. 03/2020~ current, Lead Data Scientist, Macy's Technology, Duluth, GA 30097

Development of a universal model for e-Commerce order prediction using an advanced machine learning method (XGBoost)

Application of PySpark to the existing order prediction model

2. 11/2018~ 03/2020, Data Scientist, Macy's Technology, Duluth, GA 30097

Improvement of e-Commerce order prediction by order type (BOSS, BOPS) by combining sales/promotion information

Development of a model for predicting time-dependent transaction values for individual stores using random forest and XGBoost

Prediction of system metrics (CPU, transactions, and response time) at the application level using regression

Testing of an order drop prediction model using logistic regression

Anomaly detection for servers and JVMs using their key metrics such as CPU, transactions, and response time

Construction of a health score and network model for the Macy's e-Commerce online shopping process

Calculation of the number of online orders lost using system metrics correlation

3. 1/2017~ 11/2018, Associate Data Scientist, Macy's Technology, Duluth, GA 30097

Development of capacity planning as a service (CPaaS) tools for e-Commerce infrastructure

Extension of CPaaS tools to all applications in the different data centers of Macys.com and Bloomingdales.com

Independent order prediction using the similar-day concept and regression

4. 1/2016~1/2017, EDP Analyst, Macy's System and Technology, Duluth, GA 30097

Understanding of the holiday capacity project and management of holiday capacity for Bloomingdales.com

Development of an automated process for the holiday capacity project using data science methods

5. 5/2005~ 8/2014, Postdoc Researcher, U.S. Universities (Emory Univ. etc.)

Involved in National Institutes of Health (NIH) and National Science Foundation (NSF) projects. Research topics focused on energy flow in enzymes and molecular conformation change at polymer/oil interfaces. Data analysis methods such as singular value decomposition (SVD) and global fitting were used to analyze experimental data with the aid of MATLAB and Origin. 25+ publications in top peer-reviewed journals;

Certifications

(1) SAS Certified Base Programmer for SAS 9; (2) SAS Certified Advanced Programmer for SAS 9;

Education

M.S., Statistics, Industrial & Systems Engineering (ISyE), Georgia Institute of Technology, Atlanta, GA

Ph.D., Materials Science, Division of Material Science, Hokkaido University, Sapporo, Japan


