Ruifeng Wang, Ph.D. Candidate
Tel: 504-***-**** Email: ******@******.*** or ***********@******.***
Address: ***-*** **** **, *** #4, Poughkeepsie, New York, 12601 LinkedIn: https://www.linkedin.com/in/ruifengwang/
EDUCATION
Tulane University – GPA 3.92/4.0 New Orleans, LA
Doctor of Philosophy in Bio-Statistics Sep. 2014 – 2019
Georgia Institute of Technology Atlanta, GA
Master of Science in Computer Science (Machine Learning Track) Jan. 2018 - 2019
University of California, Irvine Irvine, CA
Master of Science in Statistics Sep. 2011 - 2013
University of California, Irvine Irvine, CA
Master of International Finance Sep. 2010 - 2011
(Registered 6 courses, changed major, graduated with a certificate)
SUMMARY
• A highly motivated individual with hybrid experience in model building, programming, and end-to-end data analysis, including querying, aggregation, analysis, and visualization.
• Extensive experience in devising and implementing machine learning methods to solve real-world problems.
• 8+ years of experience in statistical modeling, data analysis, machine learning, and deep learning. Methods include Linear Regression, Logistic Regression, LASSO/Ridge Regression, Decision Tree, Random Forest, PCA, KNN, K-Means, Naïve Bayes, Bagging, AdaBoost, and Gradient Boosting.
• Strong programming experience in Python, R, SQL, SAS, Linux Shell.
• Proficient in writing parallel computing programs on servers to manipulate and analyze terabyte-scale data, e.g., stock tick data from Thomson Reuters.
• Demonstrated ability to deliver high-quality, detail-oriented work and to work efficiently in fast-paced, results-driven environments.
• Excellent teamwork and communication skills.
PROJECTS
Lending Club risk-adjusted interest rate and default rate prediction:
• Extracted features from raw Lending Club loan data containing categorical, numerical, and time series variables; imputed missing data using the multivariate imputation by chained equations (MICE) algorithm.
• Performed feature selection and feature engineering through exploratory analysis.
• Fitted linear/logistic regression models with regularization to control for multicollinearity and achieved excellent RMSE on the test data set.
Yelp reviews clustering and recommender system:
• Constructed a personalized recommender system that can accurately predict users’ preference for a business.
• Used PCA to reduce dimensionality and Naive Bayes/logistic regression/K-means to predict the primary categories of businesses.
• Applied item-item collaborative filtering for the recommender system; the model beat the baseline MSE of 1.1375.
Energy firm bankruptcy rate prediction:
• Collected data from the Wharton database.
• Existing bankruptcy models are biased and inconsistent when predicting the bankruptcy risk of energy firms.
• Proposed a Cox proportional hazards model that combines financial ratios and market variables as predictors, substantially reducing the bias and inconsistency of the estimated bankruptcy probability.
• Improved goodness of fit (adjusted R^2 of 0.63) with a sensitivity of 0.81.
WORKING EXPERIENCE
Graduate Research and Teaching Assistant Sep. 2014 – Present
Tulane University
• Extensive experience in machine learning and statistical modeling for analyzing high-dimensional data.
• Develop heteroscedastic regression methods, construct causal models, and mine large DNA sequence data sets from African Americans, using R and Python to investigate the statistical properties and utility of harmonious statistical tests.
• Assist Professor Qin in writing and analyzing information for National Institutes of Health (NIH: R01AR050496) grants.
• As a teaching assistant, instruct students in intermediate biostatistics methods, including computer labs on conducting data analysis in R and SAS, review lectures, examinations, and assignments.
• Serve as an RA for Professor Trapani, associate dean of the Tulane Business School, coordinating various EMBA programs and corresponding courses, e.g., international finance.
Quantitative Analyst Intern May 2014 – Aug. 2014
Everbright Securities Co. Ltd
• Contributed to in-house data analysis packages and research framework development.
• Acquired new data sets and manipulated new and existing data sets.
• Collaborated with experienced traders to implement their trading strategies in Python.
Research Analyst Apr. 2013 – Apr. 2014
GP Capital Co., Ltd
• Assisted the senior team with due diligence on potential investment opportunities, including pre-IPO projects such as HiLan Optech, EasyGenomics™, and Shanghai Beite Technology.
• Prepared due diligence presentations for senior analysts.
• Built financial models for IPO projects, analyzing existing data to support the team's analysis.
AWARDS AND CERTIFICATES
• Paper reviewer for the Open Journal of Statistics (OJS)
• Chartered Financial Analyst (CFA) level II Candidate
• Securities Association Certificate (SAC) Holder
• 21st Summer Institute of Statistical Genetics Travel Award scholarship, University of Washington
• 2014-2015 Biostatistics and Bioinformatics Endowed Scholarship $5000
• 2018 Tulane Global Biostatistics and Data Science Research Grant $7000
• Basic Programming certificate for SAS 9
• Advanced Programming certificate for SAS 9
SELECTED PUBLICATIONS AND DISSERTATION
Book Chapter:
• Application of Clinical Bioinformatics (http://link.springer.com/chapter/10.1007/978-94-017-7543-4_9)
Publication:
• A Systems Genetics Approach Identified GPD1L and Its Molecular Mechanism for Obesity in Human Adipose Tissue (DOI:10.1038/s41598-017-01517-6)
Journal article to be submitted:
• A Meta-analysis of the ABCA7 rs3752246 Polymorphism and the Alzheimer’s Disease Susceptibility.
Dissertation:
• Small sample quasi-likelihood ratio test for human genetics and microbiome association data.