Weibin Ye
Dallas, TX ***** / ******.**.****@*****.*** /Cell 469-***-****
LinkedIn: https://www.linkedin.com/pub/weibin-ye/89/972/42a
Position: Test Data Analyst
CERTIFICATIONS
SAS Certified Business Analyst (No. SBARM002592v9)
SAS Certified Advanced Programmer (No. AP017353v9)
SAS Certified Base Programmer (No. BP059751v9)
SKILLS
SAS Predictive Modeling
SAS SQL, Macro
MS Access, Excel
Languages: English, Chinese, Cantonese
EDUCATION
University of Texas at Dallas, Richardson, TX May 2016
MS Information Technology and Management, GPA: 3.4
Guangdong University of Finance, China June 2012
BS Economics, Finance & Monetary Management, GPA: 3.3
ACADEMIC PROJECTS
Customer Purchase Behavior Analysis Model (SAS 9.4) Spring 2016
Analyzed a 2007 dataset of customer purchases from two competing booksellers, Amazon and B&N, including customer demographics.
Cleaned the data, then grouped it by combining a BY statement in PROC SORT with FIRST.variable/LAST.variable processing in a DATA step, and wrote the result to a temporary dataset.
To predict the number of books a customer would buy, assumed that the number of books purchased from both websites was Poisson distributed, then fit a Negative Binomial Distribution (NBD) regression model and a Poisson regression model to the temporary dataset.
Estimated both the NBD and Poisson regression models by maximum likelihood.
Compared the two models with a likelihood ratio test and found no significant difference between them.
Constructed new variables (percentage of weekend purchases and degree of loyalty) to improve model performance, then reran both models; AIC and BIC were much lower than before.
Coded Amazon as 1 and B&N as 0, then built a logistic regression model to identify the factors that make customers prefer Amazon to B&N.
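The model comparison described above can be illustrated with a minimal Python sketch (the project itself used SAS 9.4). It assumes the Poisson model is treated as the restricted case of the NBD model, so the likelihood ratio statistic is referred to a chi-square distribution with 1 degree of freedom. The function names and log-likelihood values are hypothetical and are not results from the project.

```python
import math

def likelihood_ratio_test(loglik_restricted, loglik_full):
    """Return (statistic, p_value) for a likelihood ratio test with 1 df."""
    # LR statistic: twice the gain in log-likelihood from the fuller model.
    stat = max(0.0, 2.0 * (loglik_full - loglik_restricted))
    # Survival function of a chi-square(1) variable: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

def aic(loglik, n_params):
    """Akaike information criterion (lower is better)."""
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion (lower is better)."""
    return n_params * math.log(n_obs) - 2 * loglik

# Hypothetical log-likelihoods for illustration only.
stat, p = likelihood_ratio_test(loglik_restricted=-1250.4, loglik_full=-1249.9)
print(f"LR statistic = {stat:.2f}, p-value = {p:.3f}")
```

A large p-value, as with these made-up inputs, would be read as no significant difference between the two models.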
Plane Ticket Online Booking Predictive Model (Enterprise Miner 13.1) Spring 2016
Adjusted the decision weights, setting the value of a true positive to twice that of a false negative or false positive.
Applied the StatExplore node to the dataset to check variable skewness and missing values.
Applied the Replacement node to correct data entry errors, then applied the Impute node to fill interval variables with the median and set Tree Surrogate as the input method for class variables.
Applied the Decision Tree and Gradient Boosting nodes to build models DT1 and GB1.
Reduced the skewness of interval variables by choosing the log10 method in the Transform Variables node, then applied the Regression node to build regression model Reg1.
Applied the Variable Selection and Neural Network nodes to build model NN1.
Connected models NN1, Reg1, DT1 and GB1 to the Model Comparison node, then ran the process flow.
To improve model performance, drew a stratified sample of the dataset before feeding it to the models.
Used bagging to improve model performance: built several decision trees with different random seeds and used the Ensemble node to find the best model.
Compared the performance of the five models and chose the champion model.
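The stratified sampling step above can be sketched in plain Python (the project used the Enterprise Miner nodes, not this code). The sketch assumes stratification on the binary target so that the sample keeps the same class proportions as the full dataset. The function name, field names, sampling fraction, and data are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(rows, key, fraction, seed=0):
    """Draw the same fraction of rows from every stratum defined by key(row)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Sample without replacement within each stratum.
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical data: 10 bookers (1) and 90 non-bookers (0).
rows = [{"id": i, "booked": 1 if i < 10 else 0} for i in range(100)]
sample = stratified_sample(rows, key=lambda r: r["booked"], fraction=0.5)
print(len(sample), sum(r["booked"] for r in sample))
```

Sampling within each class separately keeps the rare class from being under-represented by chance, which a simple random sample cannot guarantee.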