
Data Computer Science

Location:
New York, NY
Salary:
60000
Posted:
May 22, 2017


Resume:

SHENGZHONG YIN

*** * ***** **, *** York, NY *****

*********@*****.***

203-***-****

PROFILE

A well-educated data science graduate with a strong academic background (CS, Math, and Data Science); experience with big data analysis tools (Pig, Hive, Spark, Hadoop, Matlab, R); fluency in programming (Python, Java, C++, SQL, HTML, PHP); and real industry experience.

EDUCATION

Columbia University New York, NY

Master of Science Aug.2015 - Dec.2016

Master in Data Science.

University of Virginia Charlottesville, VA

Bachelor of Arts Aug.2011 - May.2015

Major in Mathematics (with Probability and Statistics concentration).

Major in Computer Science.

WORK EXPERIENCE

CITIC Securities Company Limited Shanghai, China

Analyst Assistant May.2014 - Aug.2014

Got up to speed quickly, took part in data collection, and wrote a weekly report.

Supported the business workflow.

PROJECT EXPERIENCE

Kaggle Competition New York, NY

Predictive modeling competition Apr.2016

Participated in a Kaggle in-class competition.

Built a model based on spoken dialogues to classify texts by speakers.

Achieved 93.6% accuracy (top accuracy approximately 95%).

Data Science Capstone New York, NY

Natural language processing project Sep.2016 – Dec.2016

Collaborated with Unilever.

Processed an Amazon product review dataset (~1G) and generated a summary for each product.

Processed a Unilever product survey answer dataset (~200M) and generated a summary for each survey question.

Techniques used include topic modeling (LDA, NMF), keyword selection (word embedding), and clustering (k-means), among others.

Big Data Analytics Project New York, NY

Pokémon GO data analysis with big data tools Sep.2016 – Dec.2016

Used historical Pokémon occurrence data (~1G) to predict future occurrences.

Linked Pokémon occurrences to local 311 service data (~2G) to analyze unusual events.

Techniques used include multiple machine learning algorithms, PySpark, and the graph database tool SystemG.

TECHNICAL SKILLS

Computer Skills

Programming languages: Python, Java, C++

Data processing tools: Matlab, R

Database tools: SQL

Big data tools: Pig, Hive, Spark, Hadoop

Web-design tools: HTML, PHP


