Tianqi Wu
tianqi-wu *****@********.*** wutianqidx 217-***-****
OBJECTIVE
To leverage data analysis to support decision making and deliver efficient solutions to business problems
EDUCATION
University of Illinois at Urbana-Champaign Champaign, IL
MS in Statistics GPA: 4.00/4.00 Aug 2019 - Dec 2020
MS in Industrial Engineering (Advanced Analytics) GPA: 3.85/4.00 Aug 2017 - May 2019
BS in Industrial Engineering (Math, CS minor) Aug 2013 - May 2017
EXPERIENCE
Xiaomi Technology Beijing, China
Applied Scientist Intern Jun - Aug 2019
Prepared and processed raw text of over 70,000 Chinese poems
Implemented Seq2Seq with attention to generate acrostic poems using TensorFlow
Scored 30% higher than an RNN on artistic conception, fluency and diversity
Modified regular expressions to increase F1 score by 20% for chatbot’s dialogue feature
Sina Corporation Beijing, China
Data Scientist Intern Jun - Aug 2018
Implemented a web crawler to collect the daily HTML source code of 10 Sina webpages
Built an analyzer in Python to detect potential cyber attacks by computing cosine similarity between past and current webpages
Created a database to manage the monitor and contact list using MySQL
PROJECT
LendingClub Loan Status Prediction Aug - Dec 2019
Performed feature selection through exploratory data analysis on 1.5 million records with imbalanced classes
Preprocessed data with standardization, categorical variable encoding and missing data imputation
Implemented Logistic Regression, Naive Bayes, Decision Tree, Random Forest and MLP for comparison
Raised F1 score from 0.08 to 0.68, with accuracy moving from 0.78 to 0.65, using undersampling, regularization, cross-validation and parameter tuning
Top Skills Employers Look For Aug - Dec 2020
Crawled and examined HTML source code of 5,000 job postings from Amazon.jobs
Extracted keywords from basic and preferred qualifications
Visualized skill sets for different job titles with WordCloud and bar chart using Python
Deployed the Dash web application using AWS Elastic Beanstalk
RESEARCH
Medical Text Generation Jan - May 2020
Processed 16,950 brain activity images with corresponding medical reports
Utilized CNN to extract features from images and encode them as embeddings for text generation
Implemented Transformer model to align CNN features with medical descriptions and generate reports
Achieved 0.561 BLEU@1 on test data with implementation in PyTorch
Yelp Review Sentiment Analysis Aug - Dec 2019
Preprocessed data with NLP techniques such as stemming, lemmatization and TF-IDF vectorization to transform unstructured review text into a numeric data set
Implemented LSTM, BiLSTM and BiLSTM+Attention models for Yelp Review Polarity
Compared performance with TF-IDF+logistic regression, BOW, RNN and BERT
Achieved 96.2% accuracy on 560,000 training and 38,000 testing samples
ADDITIONAL
Relevant Coursework: Data Structures and Algorithms, Machine Learning, Deep Learning
Statistical Knowledge: A/B Testing, Probability Distribution, Regression, Forecasting
Programming Languages and Tools: Python, SQL, R, Git, SAS, PyTorch, TensorFlow
Models: Linear/Logistic Regression, Random Forest, KNN, SVM, CNN, RNN, BERT