Xuechun Wang
adgybo@r.postjobfree.com
*** ******** **, ********* *****
xuechun.georgetown.domains
linkedin.com/in/xcw1203
SKILLS
Languages:
R, Python, SQL, Java
Big Data:
AWS EMR, Pig, Hadoop, Hive, Spark
Analytical Tools:
Tableau, Power BI, Domo, Alteryx,
TIBCO Spotfire, SAS Viya, Qlik Sense
Microsoft Office:
Excel, Access, Word, PowerPoint,
Outlook
Web Development:
HTML, Django, JavaScript, CSS
EDUCATION
Georgetown University
MS in Analytics-Data Science
(3.85/4.0)
Expected Grad. May 2020
Merit Based Scholarship
TA for NLP (graduate-level) and
Intro to Computer Science
University of Minnesota
BA in Mathematics (3.9/4.0)
Minors: Economics, Computer
Science
Grad. May 2018
Graduated with High Distinction
COURSEWORK
Data Science/Analytics:
Data Analytics, Data Visualization,
Data Mining, Neural Networks &
Deep Learning, Strategic Business
Analytics
Computer Science & Math:
Streaming and High Dimension Data,
Optimization, Statistical Computing,
Data Structures & Algorithms
Finance & Economics:
Finance Fundamentals, Financial
Analysis & Modeling, Micro &
Macroeconomics, Cost-Benefit
Analysis
ACTIVITIES
Campus String Orchestra - violin
Chinese Students Association
Welcome Week Leader
K-5 Tutor at People Serving People
09.30.2020
PROFESSIONAL EXPERIENCE
OCBang Jun 2020 - Sep 2020
Data Scientist Santa Clara, CA
• Automated note sending and candidate matching (Python); automated data extraction and preprocessing, increasing efficiency by 50%.
• Scraped data from dynamic web pages using Python/Selenium; extracted key features with NLP techniques such as regular expressions.
• Clustered individuals using DBSCAN and KMeans; conducted feature engineering, feature selection, and model evaluation (Random Forest, AdaBoost, KNeighborsClassifier, XGBoost) to predict candidates' tendencies.
• Improved the company’s website SEO based on A/B testing; added new features and fixed bugs using WordPress on AWS EC2.
• Developed a web app to interactively display org charts and summary statistics.
Neighborhood Rescue of America Feb 2020 - present
Data Analyst Washington DC
• Collected demographic data, property market value data, and crime data; performed geographic data analysis and implemented machine learning algorithms (Regression, Random Forest) to measure the effect of weather, moon phase, income, education, and healthcare level on crime.
• Visualized current at-risk hot-spot data in an interactive webpage built with Python (Plotly and Folium).
• Developed a full-stack web app that lets users communicate with the back-end database using the Django framework.
• Led a team of 12 to work on multiple projects; coached new hires.
1010DATA May 2019 – Aug 2019
Data Analyst Intern New York, NY
• Conducted ad-hoc analysis and generated dashboards using data from multiple sources, including Gong and Salesforce (R, Excel).
• Utilized SQL and XML code to write complex data queries in NoSQL databases to automate the execution of ELT (1010 Platform).
• Worked on hackathon projects to construct a logistic regression model with R and MapReduce on retail data to predict client’s sales performance (16 leading parameters; achieved 95% accuracy).
• Researched and identified prospects for 1010Discover; conducted competitive analysis; held training sessions to construct prospect presentations and demos.
ACADEMIC PROJECTS
Deloitte Digital Camp - NLP Auto Grading & Web Development:
• Implemented sentiment analysis on essays using OpinionFinder; analyzed the correlation between grades and sentiments.
• Conducted feature engineering and L1 regularization on generated features.
• Trained essay auto-grading models using TensorFlow LSTM with word representations (BOW, GloVe, BERT); achieved a 0.97 Kappa score with GloVe.
• Developed a web app for the auto-scoring model with Django and SQLite; deployed the app on a Tencent Cloud server.
Reddit Topic Modeling with Apache Spark on Amazon EMR:
• Worked with 500 GB of JSON data from the Reddit Archive: loaded data from AWS S3 into Spark RDDs and conducted exploratory data analysis on EMR with PySpark.
• Implemented sentiment analysis on the body text and used an LDA model to identify cluster patterns in the jewelry subreddit.
Data Visualization of Soccer Matches:
• Visualized player positions, attacking tactics, and team performance in soccer matches using Python and Tableau.
• Built interactive plots and a web dashboard using Plotly Dash.
People for Productive - Mobile App Development:
• Developed a mobile app in Java (Android Studio) to help campus students stay productive with their daily work.
• Conducted A/B testing on the live app; modified app functions based on test results and increased the user rating by 1 point.