Data Analyst Python

Location:
Santa Clara, CA
Posted:
October 14, 2020

Resume:

Xuechun Wang

adgybo@r.postjobfree.com

*** ******** **, ********* *****

xuechun.georgetown.domains

linkedin.com/in/xcw1203

+1-612-***-****

SKILLS

Languages:

R, Python, SQL, Java

Big Data:

AWS EMR, Pig, Hadoop, Hive, Spark

Analytical Tools:

Tableau, Power BI, Domo, Alteryx, TIBCO Spotfire, SAS Viya, Qlik Sense

Microsoft Office:

Excel, Access, Word, PowerPoint, Outlook

Web Development:

HTML, Django, JavaScript, CSS

EDUCATION

Georgetown University

MS in Analytics-Data Science

(3.85/4.0)

Expected Grad. May 2020

Merit Based Scholarship

TA for NLP (graduate-level) and Intro to Computer Science

University of Minnesota

BA in Mathematics (3.9/4.0)

Minors: Economics, Computer Science

Grad. May 2018

Graduated with High Distinction

COURSEWORK

Data Science/Analytics:

Data Analytics, Data Visualization, Data Mining, Neural Networks & Deep Learning, Strategic Business Analytics

Computer Science & Math:

Streaming and High Dimension Data, Optimization, Statistical Computing, Data Structures & Algorithms

Finance & Economics:

Finance Fundamentals, Financial Analysis & Modeling, Micro & Macroeconomics, Cost-Benefit Analysis

ACTIVITIES

Campus String Orchestra - violin

Chinese Students Association

Welcome Week Leader

K-5 Tutor at People Serving People

09.30.2020

PROFESSIONAL EXPERIENCE

OCBang Jun 2020 - Sep 2020

Data Scientist Santa Clara, CA

• Automated note sending and candidate matching (Python); automated data extraction and preprocessing, increasing efficiency by 50%.

• Scraped data from dynamic web pages using Python/Selenium; extracted key features with NLP techniques such as regular expressions.

• Clustered individuals using DBSCAN and KMeans; conducted feature engineering, feature selection, and model evaluation (Random Forest, AdaBoost, KNeighborsClassifier, XGBoost) to predict candidates' tendencies.

• Improved the company website's SEO based on A/B testing; added new features and fixed bugs and defects using WordPress on AWS EC2.

• Developed a web app to interactively display org charts and summary statistics.

Neighborhood Rescue of America Feb 2020 - present

Data Analyst Washington DC

• Collected demographic data, property market value data, and crime data; performed geographic data analysis and implemented machine learning algorithms (Regression, Random Forest) to measure the effect of weather, moon phase, income, education, and healthcare level on crime.

• Visualized current at-risk hot-spot data by building an interactive web page with Python (Plotly and Folium).

• Developed a full-stack web app that lets users communicate with the back-end database using the Django framework.

• Led a team of 12 to work on multiple projects; coached new hires.

1010DATA May 2019 – Aug 2019

Data Analyst Intern New York, NY

• Conducted ad-hoc analysis and generated dashboards using data from multiple sources, including Gong and Salesforce (R, Excel).

• Wrote complex data queries in SQL and XML against NoSQL databases to automate the execution of ELT (1010 Platform).

• Worked on hackathon projects to construct a logistic regression model with R and MapReduce, using retail data to predict clients' sales performance (16 leading parameters, 95% accuracy).

• Researched and identified prospects for 1010Discover; conducted competitive analysis; held training sessions to construct prospect presentations and demos.

ACADEMIC PROJECTS

Deloitte Digital Camp - NLP Auto Grading & Web Development:

• Implemented sentiment analysis on essays using OpinionFinder and analyzed the correlation between grades and sentiment.

• Conducted feature engineering and applied L1 regularization to the generated features.

• Trained automatic essay-grading models using TensorFlow LSTM with word embeddings (BOW, GloVe, BERT); achieved a 0.97 kappa score with GloVe.

• Developed a web app for the auto-scoring model with Django and SQLite; deployed the app on a Tencent Cloud server.

Reddit Topic Modeling with Apache Spark on Amazon EMR:

• Worked with 500 GB of JSON data from the Reddit archive: loaded data from AWS S3 into Spark RDDs and conducted exploratory data analysis on EMR with PySpark.

• Implemented sentiment analysis on the body text and used an LDA model to identify cluster patterns in the jewelry subreddit.

Data Visualization of Soccer Matches:

• Visualized player positions, attacking tactics, and team performance in soccer matches using Python and Tableau.

• Interactive plotting and web dashboard development using Plotly Dash.

People for Productive - Mobile App Development:

• Developed a mobile app in Java (Android Studio) to help campus students stay productive with their daily work.

• Conducted A/B testing on the live app; modified app functions based on test results, increasing the user rating by 1 point.
