FENG SU
*** **** ***, *** ****, NY ***** 347-***-**** adcgym@r.postjobfree.com
PROFESSIONAL SUMMARY
Motivated, results-driven analyst with a proven track record in data analytics and process mapping. Demonstrated ability to identify business needs and develop solutions that improve accuracy and process efficiency. Drives business effectiveness through data-based recommendations. Proficient in the MS Office suite, Python (Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, Keras), R, SQL, Tableau, and SAS.
WORK HISTORY
Data Scientist Intern, 09/2019 to Present
72 Dragons – New York, NY
Designed clean data structures with Python packages such as Pandas and NumPy, handling over 1,500 missing values and integrating three data sources into one data frame
Built a value-added model using a random forest in Python Scikit-learn to estimate how features such as directors and actors contribute to final box office, helping filmmakers select suitable professionals more efficiently
Generated data visualizations in Tableau, which substantially shortened the time filmmakers needed to understand outcomes without digging into complex statistics or data science algorithms
Data Scientist Intern, 07/2019 to 09/2019
ICBC – New York, NY
Integrated and preprocessed data using SQL and Python Pandas; applied multiple feature selection methods, such as single-variable ranking and stepwise forward selection, which increased recall and precision by 3% and 2%, respectively
Assisted third-party technology consultants in fitting a baseline probability-of-default model with logistic regression in Python Scikit-learn, which increased recall from 79% to 83% compared to the previous model
Created data tables using Python data structures and built visualizations with Python packages such as Matplotlib and Seaborn for multiple internal reports, which reduced weekly meeting time
Data Analyst Intern, 06/2018 to 08/2018
UnionPay E-payment Services Co. LTD – Shanghai, CN
Queried millions of e-payment trade records with SQL scripts, used Python to visualize the number and types of failed trades, analyzed the causes and impacts of failures, and proposed solutions in two monthly business reports for UnionPay's agent service
Designed and specified the reporting format for UnionPay's product backtrack project by coding it in Python instead of relying on manual operation, which completed the project 2 days ahead of the deadline
Designed a client-facing presentation showing the value added by UnionPay's card payment data, highlighting the strengths and market positioning of the service in a rigorously structured slide deck
EDUCATION
Master of Arts: Statistics, GPA: 3.3, 12/2019
Columbia University in The City of New York - New York, NY
Relevant Coursework: Applied Data Science, Statistical Machine Learning, Linear Regression Model, Statistical Computing & Introduction to Data Science
Bachelor of Science: Mathematics with Financial Mathematics, GPA: 3.7, 06/2018
University of Manchester - Manchester, UK
PROJECT EXPERIENCE
Deep Learning (Python): Facial Expression Recognition
Understanding and Constructing Convolutional Neural Networks, Apr. 2019 - May 2019
Designed and tuned combinations of convolutional layers with appropriate kernel sizes and filter counts, using Keras with a TensorFlow backend to develop the baseline model (LeNet-5)
Independently developed an advanced model by adding two more convolutional and max-pooling layers to LeNet-5, and employed a pre-trained model (VGG16) with an SVM classifier, which increased prediction accuracy by 25%
NLP (Python): Text Mining with Machine Learning Models for a Corpus of 100,000 Crowd-Sourced Happy Moments
Understanding the Causes of Happiness, Jan. 2019 - Feb. 2019
Wrote Python code with packages such as "wordcloud" to clean the database and create visualizations
Used text mining techniques to compute TF-IDF, combined with the K-means clustering algorithm, to investigate clusters of happy moments and drive predictions
Applied bag-of-words modeling and built a logistic regression model to predict the gender of each happy moment's author