Data Sales

New York City, NY
March 27, 2020


Qianqian Wu

New York · 646-***-**** · · Github · LinkedIn


• Specialties: Machine learning, Statistical analysis (Time series analysis, Clustering, Probabilistic models), Data mining, Natural language processing, Deep learning, Visualization, Quantitative modeling

• Tools: Python (scikit-learn, NLTK, spaCy, pandas, NumPy, SciPy, Matplotlib), R (tidyverse, ggplot2), SQL, Tableau

• Big Data: Hadoop, Spark, MongoDB, MySQL, PostgreSQL, ETL, TensorFlow, AWS EC2, Airflow

EDUCATION

• Columbia University, M.S. in Applied Analytics, STEM, GPA 3.8 Dec. 2019

• Zhongnan University of Economics and Law, B.B.A. in Accounting, GPA 3.5 Jun. 2014

PROFESSIONAL EXPERIENCE

nQ Medical Boston, Massachusetts

Data Capstone Research Consultant - Data Science Department Sep. 2019 - Dec. 2019

• Analyzed typing patterns of Parkinson’s patients vs. controls on mobile devices, with the goal of obtaining FDA approval

• Flattened the multi-dimensional dataset in MongoDB, added summarized statistics features for patient-level data

• Tackled skewed age distribution and unbalanced population size by propensity-matching session-level data

• Applied SVM, XGB, Decision tree, Random forest, Naive Bayes with cross validation; proposed GLM and ensemble to aggregate prediction of the previous models to the patient level for future prediction purpose

• The final model is proved to be robust with 80% accuracy, compared with 59% in baseline SVM model

Palm Drive Capital LLC (6 portfolio companies became unicorns) New York, United States

Investment Analyst Intern - Investment Division Jan. 2019 - May 2019

• Covered E-commerce and MedTech, initiated interviews with 10+ entrepreneurs, presented 1 deal at the IC meeting

• Predicted GMV of an Ecommerce startup by generating alternative data of price and units sold to validate its sales prediction model before deciding whether to make subsequent investment, improved the investment effectiveness

Sinovation Ventures (Top 5 Venture Capital Institution in China) Beijing, China

Investment Analyst Intern - Investment Division May 2019 - Aug. 2019

• Covered SaaS and AI, interviewed 20+ startups, narrowed them down to 8 using my own evaluation framework, of which 3 were approved

• Predicted sales by Neural Network with regularized Adam Optimizer using one-week time-lagged sales, holiday indicator and other macroeconomic features as inputs, trained on random forest, improved accuracy by 30%

China Merchants Securities Ltd. (Top 5 Investment bank in China) Shenzhen, China

IBD Associate Intern - Investment Banking Division Feb. 2018 - June 2018

• Led a 4-member team to issue 3 $300M CMBS deals, led C-level executives in resolving major issues, saving 25% in time

• Built the valuation model to predict future cash flows under analysis of default and prepayment risks; scraped relevant news coverage and applied the score of sentiment analysis as a factor in valuation model

Deloitte Touche Tohmatsu Ltd. Shenzhen, China

Senior Associate - Assurance Department Oct. 2014 - Apr. 2017

• Led a 5-member team in IPO due diligence and spin-off processes; built profit forecasting and DCF model

• Conducted a consulting engagement to refine supply chain management, composed an advisory report by leading cross-functional collaboration, and obtained agreement of clients’ C-suite via tactical negotiation

• Managed multiple audit engagements, fixed $8M misstatement via quantitative risk management analysis; established automated models to address major issues in financial reports

PROJECT EXPERIENCE

NLP: Topic Modeling and Sentiment Analysis of Yelp Data Challenge in Python

• Preprocessed review text with word2vec (w2v) trained on data from a different city, using Spark MLlib and SimHash

• Implemented topic modeling by training an LDA model and added a review-search filter by building a taxonomy

• Applied sentiment analysis on stemmed words and labeled them according to stars, used Naïve Bayes classifier as baseline, tried Logistic Regression and Linear SVM classifier and improved accuracy to 91%

• Obtained polarity score, conducted feature selection to predict stars with Naïve Bayes, Gradient boosting, etc.

Deep learning: Convert 1500 Images to Super-Resolution in Python

• Parallelized the train function on GCP for baseline GBM model, improved performance by 10-fold cross validation and tuning 20 sets of parameters, determined optimal learning rate at 0.0001 and depth at 15

• Built machine learning pipeline to resize low-resolution (LR) images by cubic interpolation, cropped patches of LR image as features and the corresponding HR image as labels, converted RGB into YCrCb to save bandwidth

• Applied SRCNN using TensorFlow, saved 50% time on feature engineering, and 81% time on model training compared to GBM, 53 minutes vs. 6 hours, improved MSE by 64% and achieved 0.001

SQL ETL Pipeline and Dashboard: Analytics of Medicare Data in Python and PostgreSQL

• Extracted data from API, explored data and designed data schema based on the 3NF normalization plan

• Cleaned data and developed database and ETL pipeline in PostgreSQL and Cassandra to optimize queries

• Created reports in Tableau Desktop using features - Tableau Function, Connector, Dashboard coloring, filtering

• Generated Tableau BI tools to create dashboard prototypes based on appropriate use cases of end users

HONORS & CERTIFICATES

• Deloitte Impact Award (7/600+), and regularly honored for top ranking on performance appraisals

• Certified Public Accountant (CPA) and Association of Chartered Certified Accountants (ACCA) license holder
