Data Science

Cambridge, Massachusetts, United States
April 23, 2018

Alice Wong

Summary & Objectives

A data scientist with six years of experience and strong technical, business strategy and communication skills, seeking a long-term role in a stable environment.

Education

Harvard University

M.S. in Biostatistics May 2012

Master’s Thesis: Multiple-Sequence Iterative Imputation, supervised by Dr. Nan Laird

University of Pennsylvania

Postbaccalaureate Graduate Studies Dec 2009

B.A. in Economics cum laude; Distinction in Economics May 2008

Minor: Mathematics


Hyperplane Consulting San Francisco, CA; Boston, MA

Contract Data Scientist Feb 2017-present

• For mobile games publisher Gameloft, prototyped a random forest model for paying-user detection that was put into production.

• Used a variety of machine learning models to predict cohort cumulative lifetime value for DMLegends from the first 3 days’ data, with an 18% error.

• Served as a team lead across various projects by providing recommendations on optimizing machine learning/artificial intelligence models, metrics for evaluating models, best practices for production and data to collect.

• For Suffolk Construction, performed statistical tests comparing Virtual Design and Construction models with standard ones on outcomes such as time to resolution of Requests For Information.

• Researched the monetary costs and benefits of innovation pilots such as drone capture technologies, virtual reality showrooms and real-time sensors.

• Made data governance recommendations, including standardizing spaghetti data via fuzzy matching of contractor names and using graph theory to produce table schemas.

• Analyzed Trade Partner surveys through Natural Language Processing, A/B testing and correlation tests.

Houghton Mifflin Harcourt Boston, MA

Data Scientist May 2015-Oct 2016

• Wrote algorithms to predict student performance over time using longitudinal mixed effects models and to identify students at risk of poor performance for adaptive educational software.

• Predicted IT issues using natural language processing, dynamic Bayesian networks and zero-inflated longitudinal mixed effects models.

• Predicted retention of student usage of educational software and other products over time using both regression and machine learning methodologies such as support vector machines, gradient-boosted trees, k-nearest-neighbors, sequential nearest neighbors, naive Bayes classifiers, RandomForest and survival analysis.

• Created prototypes of algorithm-driven products such as a literary appreciation/creative writing app centered on natural language processing and an algorithmic walking schoolbus centered on minimum spanning trees and geographic information systems.

• Performed various ad hoc analyses displayed on an interactive dashboard to track, for example, platform issue ticket causes, the effects of repeated effort on student performance and the impact of completing conceptual prerequisites.

• Served as a company-wide bridge across various departments to make infrastructure decisions, improve e-commerce outcomes and explore synergies between different textbooks’ instructional methods.

• Provided business development recommendations to HMH Marketplace and Labs.

• Led two interns in their work on interactive dashboards and app prototypes.

Localytics Boston, MA

Data Scientist July 2014-Apr 2015

• Developed retention, engagement and cumulative lifetime value predictive algorithms for (micro-)segmentation, personalization and remarketing using methodologies such as naive Bayes (text) classifiers, market basket analysis, support vector machines, random forests, k-nearest neighbors and sequential k-nearest neighbors, logistic regression and survival analysis.

• Detected event funnels leading to conversion events using various data mining techniques such as graph algorithms.

• Assisted in creating an online dashboard for non-technical teams to obtain metrics in real time.

Constant Contact Waltham, MA

Senior Analyst Nov 2013-July 2014

• Using high-dimensional algorithms, predicted churn at more than 50 times the random rate while keeping the false positive rate below random.

• Used email features such as subject categories, content, delivery time, images and structural features to predict open and click rates, primarily through text classification algorithms.

• Evaluated cross-selling success for Constant Contact and SinglePlatform products through longitudinal mixed models of clients’ menu (on Yelp, etc.) views.

• Product strategy: a customized app tool to display a client’s menu view forecasts over time.

comScore Reston, VA

Senior Analyst Aug 2012-Oct 2013

• Used various machine learning methodologies to predict the demographic profiles of web visitors from various large, mostly app event-based data sets.

• Increased the lift over the pre-existing model by 0.30 and reduced the dependence of model performance on measurement precision in web visitation.

• Developed algorithms, including graph algorithms, for the optimal clustering of cookies into likely web user profiles.

• Provided ad hoc support such as parsing search queries and A/B testing.

New England Research Institutes Watertown, MA

Statistics Intern Jun-Sep 2011

• Analyzed and managed limited-access public-use data for NIH by merging large datasets from the Thalassemia Cohort Research Network, producing frequency tables, removing outliers, randomly generating identification numbers and performing complex date conversions in SAS.

• Wrote codebooks and searched databases using SQL to obtain relevant information.

• Completed tasks one or two months ahead of schedule.

Summary of Additional Work Experience

Ministry of Trade and Industry

• Served as Assistant Director of Trade for the Asia-Pacific Economic Cooperation Policy (APEC) during Singapore’s 2009 hosting year.

• Under the Ministry’s statutory board Competition Commission of Singapore, detected fraud or bid-rigging using machine learning as Strategic Planning Intern.

• Under the Ministry’s statutory board Economic Development Board of Singapore, analyzed the economic potential of fuel cells in Singapore to assist in Singapore’s fuel cell roadmap.


R, SAS, Stata, Python, SQL (Sybase, Oracle, Postgres, MySQL, Vertica), PHP, HTML, Spark, Azure

Languages

English (native), Mandarin (native), Cantonese (native), French (fluent), Spanish (basic)

Websites/Portfolio

