SQL, Matplotlib, seaborn, Python, R

Location:
Toronto, ON, Canada
Posted:
March 18, 2019

Resume:

Pengfei Liu

** ******** ***, ***, ***, Toronto, ON

+1-416-***-**** ac8tji@r.postjobfree.com medium.com/@oscarliu0928 oscarliu00 pengfei-liu

Summary

• Hands-on experience with both unstructured and structured data in large datasets, performing data cleaning and visualization, and applying unsupervised and supervised machine learning techniques to business problems

• Experienced MSc researcher who solves complex quantitative problems, designs research processes, and has presented and published a journal article

• Real-world business experience with a deep understanding of market research and strategic planning

Education

University of Regina Regina, SK

MASTER OF SCIENCE IN COMPUTER SCIENCE Jan 2016 - Dec 2018

• Courses: Machine Learning, KDD, Information Retrieval, Rough Sets, Artificial Intelligence.

Udacity

DATA ANALYST AND ARTIFICIAL INTELLIGENCE NANODEGREE July 2018

• Courses: Inferential Statistics, A/B Testing, Data Extraction and Wrangling, EDA, MongoDB, Time Series Forecasting, Data Visualization with D3.js, MapReduce, and Artificial Intelligence.

Northwestern Polytechnical University Xi’an, China

BACHELOR OF ENGINEERING IN COMPUTER SCIENCE Sep. 2011 - July 2015

• Courses: SQL Programming Language, Relational Database Management, Software Engineering, JavaEE Design, and HTML/CSS Web Application.

Skills

Machine Learning: Classification, Regression, Data Scraping, Manipulation & Visualization (Matplotlib, seaborn, ggplot), Tableau

Statistics: Regression, Confidence Interval, Bayesian

Coding: Python (scikit-learn, NumPy, SciPy, Pandas), SQL, R, Git, LaTeX, Java, JavaScript, HTML/CSS, C++

Projects

New York OpenStreetMap Data Wrangling

• Languages: Python, SQL

• Performed data wrangling on unstructured data

• Developed metadata using the Python ElementTree library

• Parsed a 400+ MB XML file, replaced missing values with the mean, and standardized five-digit postal codes using a pattern detection method

• Audited data quality of 1,834,638 nodes and 313,418, and ran SQL queries to find high-frequency street names

• Reported “Broadway” as the most common street name in New York City

Shareride Economy in New York Prediction

• Languages: SQL, Python, Tableau

• Analyzed NYC taxi records to predict hourly pickups in New York City.

• Cleaned and aggregated 20+ GB of taxi data by hour of day using SQL, built a statistical model to observe the distribution of each feature and select the appropriate data, and mapped taxi pickups using Python

• Transformed geographical information into region clusters using the unsupervised clustering algorithm K-means

• Built a regression model to predict taxi demand in different regions and generated a 2-month taxi record

• Discovered that over 20% of taxi trips between Manhattan and Brooklyn are shareable after 1pm on Sunday, suggesting a large profit opportunity in expanding the short share-ride business late on weekend nights

Working Experience

University of Regina Regina, SK

RESEARCH Jan. 2016 - Dec. 2018

• Directed research on game-theoretic rough sets

• Applied game-theoretic rough sets to three-way classification on spam filtering

• Investigated why spam filters struggle to classify more email messages in the real world, and determined a suitable spam-filtering trade-off between the accuracy and coverage measures

• Normalized continuous data with a binning method, transformed it to signals, and selected 5 of 57 features using random forest

• Conducted a competitive game between accuracy and coverage on three-way classification.

• Concluded that coverage improved by roughly 20% at the cost of a slight loss in accuracy.

Software Developer Intern University Program Xi’an, China

SOFTWARE DEVELOPER Apr. 2016 - May 2016

• Languages used: JavaScript, HTML, CSS, and SQL

• Techniques used: Struts2, Spring, Hibernate, and jQuery

• Compiled customer requirements on system feasibility, and designed the system using the MVC model

• Transformed customer units into registration and payment units using the SSH framework with object-oriented programming

• Built logical and physical data models using ERDs with one-to-one and one-to-many relationships

• Designed the system along with two other team members, who designed the cart and product display.

University of Regina Regina, SK

LAB INSTRUCTOR Sep. 2017 - Dec. 2018

• Taught basic C++ syntax

• Marked assignments and tutored students

• Helped students design small projects

Extracurricular Activity

Saskatchewan Gay-Straight Alliances Summit Regina, SK

VOLUNTEER Oct. 2018

• Helped organize events and keep them in order

Regina Chinese Canadian Association Regina, SK

VOLUNTEER May 2017 - PRESENT

• Participated in the Chinese New Year celebration and the Chinese Pavilion Mosaic.

• Communicated with crew and leaders to arrange a series of activities.

Student Union University of Regina Regina, SK

JUNIOR AMBASSADOR Sep. 2016 - PRESENT

• Organized events such as Student Orientation, Gay Pride, and the New Year Gala

• Tutored students for midterm and final examinations


