Kaiwen Zhang
Tysons, VA 319-***-**** *****.*******@************.***
EDUCATION
Northeastern University Boston, MA
Master of Science in Analytics, GPA: 3.35/4.0 March 2020
Coursework: Statistics for Business; Financial Accounting; Macroeconomics; Managerial Accounting; Entrepreneurship and Innovation; Marketing; Information Systems; Business Computing; Calculus and Matrix Algebra for Business
University of Iowa Iowa City, IA
Bachelor of Science in Economics May 2018
PROFESSIONAL EXPERIENCE
Alibaba Beijing, China
Business Analyst Intern January 2020 – March 2020
• Analyzed conversion rates in internal marketing data through statistical analysis (A/B testing), exploratory data analysis & visualization
• Extracted marketing data using SQL and manipulated it with Python & Excel, including data cleaning, imputation & feature transformation
• Built a customer segmentation model using K-means clustering on User Persona data
• Created a Tableau dashboard to visualize metric trends and identified business features affecting test results
Universal Processing LLC New York, NY
Data Analyst Intern June 2019 – September 2019
• Performed weekly data processing & quality checks on sales data using MySQL Workbench and Python
• Performed table operations such as merges & joins in SQL based on business needs
• Created an interactive Tableau dashboard to track sales performance and used Salesforce to assign daily tasks to each sales agent
PROJECT EXPERIENCE
Credit card transaction fraud detection
• Data Processing: handled missing values, encoded categorical variables, and resampled imbalanced data
• Exploratory Data Analysis: analyzed distributions and relationships among variables using histograms, scatter plots, box plots, and chi-square tests
• Feature Engineering: generated 10+ features, such as duplicated transactions and the number of attempts
• Model Training, Hyper-parameter Tuning and Model Selection: tuned hyper-parameters of Random Forest and Gradient Boosting Tree models, then selected the best model among them, Logistic Regression, SVM, and Naive Bayes using 5-fold cross-validation on the training set
• Model Evaluation: achieved a 0.93 F1 score on the test set and visualized the ROC curve
NLP September 2019 – October 2019
• Scraped 10 years of historical 10-Q and 10-K corporate filings from SEC.gov using Python (BeautifulSoup)
• Processed 300 GB of text data: removed stop words, numbers, and tables, and split documents into paragraphs using NLTK and regular expressions
• Generated features such as sentiment scores, bag of words, and similarity measures
• Constructed monthly rebalanced stock portfolios based on the generated features and backtested their performance, achieving a Sharpe ratio of 1.1
SKILLS
Programming: SQL, Python, Tableau, Microsoft Office (Excel, PowerPoint), Power BI, Minitab Statistical Software