BOWEN YIN
734-***-**** Ann Arbor, MI *****@*****.***
EDUCATION
University of Michigan M.Sc. in Data Science GPA: 4.0/4.0 Aug 2018 - Dec 2019
Coursework: Natural Language Processing; Information Retrieval; Database Management System; Statistical Learning; Bayesian Modeling and Computation
Tsinghua University B.Sc. in Industrial Engineering, Minor in Economics GPA: 3.7/4.0 Aug 2014 - Jul 2018
Coursework: Data Structure and Algorithm Analysis; Programming Fundamentals; Applied Statistics and Probability; Time Series Analysis; Operations Research; Traffic Systems Planning and Control; Principle of Economics
Denmark Technical University Exchange Student Aug 2016 - Jan 2017
Coursework: Computational Tools for Big Data; Static and Dynamic Optimization
WORK EXPERIENCES
State Information Center of China Beijing, China
Data Analyst Intern (full-time) Jun 2017 - Sep 2017
Built a classification model in Python to distinguish opinion sentences from factual sentences in economic corpora, combining keyword matching, word embeddings, and an SVM classifier; achieved over 90% accuracy
Designed an algorithm and developed a program that automatically removes unreadable characters, normalizes text, and deletes irrelevant paragraphs, using regular expressions and machine learning methods
Developed a GUI application with Qt in Python to provide the functions above
Built a platform that automates routine text cleaning, tokenization, and word-frequency analysis
Beijing Transport Institute Beijing, China
Transport Modeler Intern Feb 2018 - Jun 2018
Designed a multinomial logit model in SAS, based on metropolitan transportation study data, to predict users' workplace preferences and identify significant factors in their choice decisions
Proposed strategies to locate users' residences and workplaces from mobile phone signaling data
PROJECTS
Personality Detection with Deep Learning University of Michigan Course Project
Built a deep learning model with Keras in Python to identify personality traits from collected essays
Extracted text features with a convolutional neural network (CNN): represented each word with a word embedding during preprocessing, obtained n-gram features by applying three kinds of convolution kernels, and max-pooled the kernel outputs before concatenating them
Fed the classifier the CNN extractor output, TF-IDF features, and document-level Mairesse features; achieved accuracy 5% higher than the benchmark
Sentiment Analysis of Online Reviews Tsinghua University Directed Research
Constructed a model to analyze the sentiment of reviews on dianping.com, yielding insights for product and service improvement
Vectorized review texts using sentence segmentation and word embeddings; applied an LDA topic model to obtain keywords for each class of texts, then clustered the texts with the k-means algorithm
Big Data Study on Riot Games Denmark Technical University Course Project
Built a clustering model of in-game character performance using the DBSCAN algorithm and optimized it through parameter tuning
Designed a web scraper with BeautifulSoup in Python to fetch game records through APIs, and preprocessed the data
Visualized the research results with matplotlib in Python
HONORS
International Exchange Student Scholarship for Undergraduates China Scholarship Council
SKILLS
Programming Languages: Python (numpy, scikit-learn, matplotlib, pyspark, nltk, gensim, keras), R (ggplot2, dplyr, rstan), SQL, Java, C/C++, Shell
Applications: SAS, MATLAB, LaTeX