SUMMARY
*+ year of experience applying machine learning and statistical modeling to extract value from large data sets, including biological data, user behavioral/demographic data and geo-spatial data
2+ years of industrial experience developing data pipelines using R, Python, Apache Hive and Spark
Proficient in big data analytics tools and libraries (dplyr, pandas, GraphLab, HBase) and visualization tools (Shiny, ggplot2, Tableau); familiar with MATLAB, Java, C/C++
Familiar with Linux, shell scripting and command-line tools, Git version control and GitHub
Fast learner and team player with strong curiosity and a sense of responsibility for work
EDUCATION
Ph.D. in Physics, University of California, Irvine (GPA: 3.75/4) March 2013
M.S. in Condensed Matter Physics, Institute of Physics, CAS June 2005
B.S. in Applied Physics, Tsinghua University, Beijing, China June 2002
EXPERIENCE
Data Scientist, LivingSocial Inc. June 2013-Present
Build machine learning models for multi-channel personalization to increase revenue and user engagement; coordinate with multiple teams to execute online A/B tests, generate reports and communicate results
Built a propensity-model-based pipeline that matches email message types with user interests at the individual user level; resulted in a 70% increase in email message types and a 5% increase in net revenue per user
Designed and built an offline testing framework and used it to evaluate third-party user data
Performed data exploration and visualization to support business decision making and new product initiatives
Research Analytics Intern, Adobe Systems Inc. June 2012-Sep 2012
Experimented with supervised machine learning methods such as neural networks, SVM and GBM within a reinforcement learning framework to model customer lifetime value using weblog data
Graduate Student Researcher, UC Irvine Sep 2006-June 2013
Bayesian modeling, machine learning, MCMC/sampling methods, computational systems biology
Developed novel MCMC-based parameter inference algorithms for stochastic dynamic systems and applied them to gene regulatory network modeling
Analyzed ChIP-Seq data of histone modification patterns using probabilistic graphical models; developed a parallel algorithm in Python for parameter learning that can handle whole-genome-scale data
Developed a regularized low-rank model to learn sparse correlations between genes using microarray data
Teaching Assistant, Department of Physics & Astronomy Mar 2008-Dec 2011
Gave lectures, led discussions and tutoring sessions for several physics undergraduate courses and one graduate course (Electro-Magnetism Theory)
PROJECTS
KDD Cup 2004 quantum physics problem (Machine Learning course project), ranked 1st in class and among the top 10 on the overall leaderboard
Stock sentiment analysis and visualization of Twitter posts through the Twitter API
CERTIFICATES AND COURSES
Data Manipulation at Scale: Systems and Algorithms, University of Washington, Coursera
Introduction to Big Data with Apache Spark, UC Berkeley, edX (non-certificate track)
Graduate courses: Machine Learning (CS273A), Numeric Methods, Applied Engr. Math I/II, Stochastic Process (Math 271B), Statistical Physics
HONORS
Distinguished Freshman Award, Tsinghua University
Regents Fellowship, UCI 2005-2007
SOCIAL ACTIVITIES AND VOLUNTEER EXPERIENCE
President of Chinese Student & Scholar Association at UCI 2007-2008
Member of Sino-American Biomedical & Pharmaceutical Professionals Association
PUBLICATIONS (SELECTED)
Parameter inference for discretely observed stochastic kinetic models using stochastic gradient descent, Wang Y, Christley S, Mjolsness E, Xie X. BMC Systems Biology 2010, 4:99
Efficient latent variable graphical model selection via split-Bregman method, Ye G, Wang Y, Chen Y, Xie X. arXiv:1110.3076
Discovering and mapping chromatin states using a tree hidden Markov model, Biesinger J, Wang Y (co-first author), Xie X. BMC Bioinformatics 2013, 14(Suppl 5)
Virtual screening using molecular simulations, Yang T, Wu JC, Yan C, Wang Y, Luo R, Gonzales MB, Dalby KN, Ren P. Proteins 2011, 79(6)
REFERENCES
My Banh (Dept of Physics & Astronomy, UCI) *****@***.*** 949-***-****
Xiaohui Xie (Academic advisor, Dept of Computer Science, UCI) ***@***.***.*** 949-***-****
Julie Del Rosso (LivingSocial Inc.) *****.********@************.*** 202-***-****