Python, R, C/C++, SQL, Scala, Hadoop, Spark, Pig, Hive

Location:
Richardson, TX
Posted:
August 31, 2016

Resume:

Gaoyang Ye

*** ****** ***, *** ***

Davis, CA 95616

*****.*********@*****.***

469-***-****

GitHub: https://github.com/LoganYe

LinkedIn: https://www.linkedin.com/in/gaoyang-logan-ye-645b1bab

EDUCATION

University of Texas at Dallas

M.S. Computer Science

Jan 2015 -- Dec 2016 (Expected)

Richardson, TX

GPA: 3.787/4.0

China University of Mining and Technology, Beijing

B.S. Electrical Engineering and Automation

Sep 2009 -- Jul 2013

Beijing, China

Class Rank: Top 10 of 70

COMPUTER SKILLS

Programming Languages:

Python, C/C++, Java, R, Scala, SQL, MATLAB, LaTeX.

Operating Systems:

Linux/Unix, Mac OS, Windows.

Environment:

Sublime Text, RStudio, GitHub, Eclipse.

Big Data:

Hadoop, Spark, Pig, Hive, HBase, Cassandra, MongoDB, Impala.

WORK EXPERIENCE

Alibaba Group Data Engineer

May 2016 -- Aug 2016 Hangzhou, China

Designed an algorithm that automatically assesses the rationality of monitor items using Naive Bayes and neural networks, raising accuracy to over 90 percent (measured against human judgment).

Designed a new anomaly detection algorithm for the Alibaba Global Operation Center that detects sudden drops in the trading system, using the Hierarchical Temporal Memory (HTM) algorithm. (The original system was based on STL time series analysis and a Kalman filter.)

PROJECTS

Big Data Analysis Big Data Jan 2016 -- May 2016

Designed a recommendation system with HDFS MapReduce.

Used Hive scripts to create and update a database built from three Yelp files, and Pig scripts for tasks such as top-N queries, averages, comment counts, and recommendations.

Implemented a machine learning algorithm with Spark MLlib to classify tweets as positive or negative and report the reason for each negative tweet, using the US Airline tweets dataset.

Collected tweets through the Twitter API to predict voting tendencies and the outcome of the 2016 United States presidential election.

Learning Algorithms Machine Learning Nov 2015

Implemented and tested a decision tree learning algorithm (information gain and gain ratio), and pruned the tree to avoid overfitting.

Implemented and compared Naive Bayes and logistic regression for classifying email as spam, and implemented a neural network with the scikit-learn package for the same task.

Implemented Perceptron, SVM, neural network, kNN, bagging, random forest, and boosting algorithms in R and compared their accuracy.

Implemented k-means as an unsupervised learning algorithm in Python (working out the E step and M step in code) to cluster raw tweets from a real-world dataset sampled from Twitter during the Boston Marathon bombing in April 2013.
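
Illustrative sketch only, not the original project code: a minimal k-means loop showing the two alternating steps mentioned above. It assumes points are numeric tuples compared by squared Euclidean distance, which may differ from the distance used on the tweet data. The names kmeans, sq_dist, and sample_points are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm; returns (centroids, label for each point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step ("E step"): attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # Update step ("M step"): move each centroid to the mean of its cluster.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    labels = [nearest_index(p, centroids) for p in points]
    return centroids, labels


# Hypothetical toy data: two well-separated groups of three points each.
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(sample_points, k=2)
```

On the toy data, the first three points end up with one label and the last three with the other.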

Library System Database Design Oct 2015

Designed a library system using MySQL to manage over 3000 records.

Designed a GUI for the library management system with multiple tables, SQL queries, and management functions such as data retrieval, modification, and search.

Implemented the GUI using the Python Flask framework.

Nine Men's Morris Game Artificial Intelligence Jul 2015

Designed game strategies for a board game based on the minimax decision tree algorithm, evaluating every move and returning the best one.

Used alpha-beta pruning to search the decision tree to greater depth, and tried different static evaluation functions for better performance.
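
Illustrative sketch only, not the original game code: minimax with alpha-beta pruning on a toy tree, to show how pruning skips branches that cannot affect the result. The tree encoding (nested lists with numeric leaves) and the names alphabeta, children_of, leaf_value, and toy_tree are hypothetical; a real Nine Men's Morris program would supply its own move generator and board evaluation.

```python
import math


def alphabeta(state, depth, alpha, beta, maximizing, moves, evaluate):
    """Minimax value of `state`, pruning branches that cannot change it."""
    children = moves(state) if depth > 0 else []
    if not children:
        return evaluate(state)  # leaf or depth limit reached
    if maximizing:
        best = -math.inf
        for child in children:
            value = alphabeta(child, depth - 1, alpha, beta, False, moves, evaluate)
            best = max(best, value)
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing parent will never allow this branch
        return best
    best = math.inf
    for child in children:
        value = alphabeta(child, depth - 1, alpha, beta, True, moves, evaluate)
        best = min(best, value)
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing parent already has a better option
    return best


def children_of(state):
    """Toy move generator: inner nodes are lists, leaves are numbers."""
    return state if isinstance(state, list) else []


def leaf_value(state):
    """Toy evaluation: a leaf's value is the number itself."""
    return state


# Maximizer picks among three replies; the minimizer then picks a leaf.
toy_tree = [[3, 5], [2, 9], [0, 1]]
best_value = alphabeta(toy_tree, 2, -math.inf, math.inf, True, children_of, leaf_value)
# best_value is 3; leaves 9 and 1 are never evaluated because of pruning.
```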

Smart Home System Internet of Things Jun 2013 -- Oct 2013

Built the web page in C, designed the GSM and cable network control systems, and developed the voice control system.

Publication: Gaoyang Ye, Ran Bi, "Designed Smart Home System Based on Internet of Things", CN51-1037/TP, Journal of Computer Applications, Vol.34(S1) No.1, Jun. 2014.

Patent: China Patent ZL201320633414.9, issued Oct. 15, 2013.

License Plate Recognition Thesis Apr 2013 -- Jun 2013

Improved vehicle license plate recognition using image binarization, edge extraction, and pattern recognition.

Excellent Graduation Design (3/67).

Publication: Gaoyang Ye, "License Plate Recognition", CN11-2739/N, China Science and Technology Information, Vol.482 No.21, Nov. 2013.

COURSES

Graduate:

Machine Learning

Big Data Management

Analysis of Computer Algorithms

Artificial Intelligence

Database Design

Data Structure and Algorithm

Operating Systems Concept

Discrete Structure

Statistical Methods for Data Science

Statistical Methods in AI and Machine Learning

Advanced Computational Methods for Data Science

Undergraduate:

Microcontroller

Control Theory

Sensor and Detection Technology

Signals and Systems

Automata

Electric Circuit
