Guifeng Li
**** ********** *******, **********, ** 30022 Cell: 330-***-**** E-mail: addzvm@r.postjobfree.com Visa: Green Card
Qualifications Summary
Thorough understanding of various statistical methods/models and machine learning techniques such as regression, random forest, XGBoost, and deep learning, as well as their implementation in the retail field;
Strong capability to compile, manage, analyze, and validate large structured and unstructured datasets (such as text for NLP), and to manage and generate reports, tables, and graphs;
Proficiency with advanced statistical analysis tools such as Python and Spark. Strong capability to develop methods/models at various levels to meet business requirements;
Experience assisting with optimization processes using Python (NumPy, pandas, scikit-learn, Keras, NLTK, TensorFlow, PyTorch, etc.);
Excellent project management, problem-solving, and interpersonal skills. Works well both independently and as part of a team;
Skills: Proficient in Python, PySpark, SAS, R, MATLAB, Tableau, Origin, Windows/UNIX, and MS Office; familiar with predictive modeling libraries and working with MySQL databases
Professional Experience
1. 03/2020 ~ current, Lead Data Scientist, Macy's Technology, Duluth, GA 30097
Developed a universal model for e-Commerce order prediction using an advanced machine learning method (XGBoost)
Applied PySpark to the existing order prediction model
2. 11/2018 ~ 03/2020, Data Scientist, Macy's Technology, Duluth, GA 30097
Improved e-Commerce order prediction by order type (BOSS, BOPS) by incorporating sales/promotion information
Developed a model to predict time-dependent transaction values for individual stores using random forest and XGBoost
Predicted application-level system metrics (CPU, transactions, and response time) using regression
Tested an order drop prediction model using logistic regression
Detected anomalies in servers and JVMs using key metrics such as CPU, transactions, and response time
Built a health score and network model for the Macy's e-Commerce online shopping process
Calculated the number of lost online orders using system metrics correlation
3. 1/2017 ~ 11/2018, Associate Data Scientist, Macy's Technology, Duluth, GA 30097
Developed capacity planning as a service (CPaaS) tools for e-Commerce infrastructure
Extended CPaaS tools to all applications across the different data centers of Macys.com and Bloomingdales.com
Independently performed order prediction using the similar-day concept and regression
4. 1/2016 ~ 1/2017, EDP Analyst, Macy's System and Technology, Duluth, GA 30097
Gained an understanding of the holiday capacity project and managed holiday capacity for Bloomingdales.com
Developed an automated process for the holiday capacity project using data science methods
5. 5/2005 ~ 8/2014, Postdoc Researcher, U.S. Universities (Emory Univ., etc.)
Participated in National Institutes of Health (NIH) and National Science Foundation (NSF) projects. Research focused on energy flow in enzymes and molecular conformation change at polymer/oil interfaces. Data analysis methods such as singular value decomposition (SVD) and global fitting were used to analyze experimental data with the aid of MATLAB and Origin. Authored 25+ top peer-reviewed publications.
Certifications
(1) SAS Certified Base Programmer for SAS 9; (2) SAS Certified Advanced Programmer for SAS 9
Education
M.S., Statistics, Industrial & Systems Engineering (ISyE), Georgia Institute of Technology, Atlanta, GA
Ph.D., Materials Science, Division of Material Science, Hokkaido University, Sapporo, Japan