Data Engineer

Location:
Ames, IA
Posted:
February 28, 2019

Resume:

Siying Lyu

Email: ac8m1s@r.postjobfree.com Tel: 626-***-**** GitHub: https://github.com/SiyingLyu

Education

Master's in Computer Science, Iowa State University (ISU) (GPA: 3.89/4.00) Aug. 2017 – Dec. 2019

Skills

• Languages: Java, Python, C/C++, JavaScript

• Web technologies: Django, HTML, CSS

• Libraries: scikit-learn, Keras, NumPy, Pandas, NLTK, spaCy

• Databases: SQL, JDBC

• Algorithms: CNN, Naïve Bayes, Decision Tree, etc.

• Big data: HDFS, MapReduce, Spark, Hive, Impala

Work Experience

Shanghai Hypers Data Technology Inc. Data Engineer Intern (Hive) May 2018 – Aug. 2018

• Optimized queries on a distributed system, reducing running time by 50% by applying the Parquet format and DAG-based execution engines

• Applied bucketing and partitioning to tables to improve performance. On 100 GB of data, bucketing improved running time by 5%, while partitioning made queries 75% faster

• Learned the end-to-end production data pipeline, including event tracking, ETL, and data applications such as data labeling and BI reporting

Selected Projects

eCommerce Web application (Python, Django, JavaScript, jQuery, HTML)

• Built an eCommerce website, including the product, user management, and payment portions, with the Django framework

• Applied asynchronous requests with Ajax to speed up page loading

• Implemented the credit card transaction portion with Stripe and email marketing with Mailchimp

Text Classification and Hand-written Recognizer System (Python, Keras, spaCy)

• Constructed Naïve Bayes, Decision Tree, and kNN classifiers to classify 10,000 text documents

• Implemented deep neural network and convolutional neural network (CNN) modules

• Applied various activation and error functions, reaching 98% accuracy on average

Fall-detection Android application (Java)

• Designed an Android application to protect elderly and disabled people from being left without help after a fall

• Used SensorManager to access gyroscope and accelerometer data

• Implemented a 4-phase detection algorithm. Sensor data was used for accuracy in the first two phases; thresholds and timeouts were set in the later phases to reduce false alarms

Information Retrieval System based on Vector Space Model (Java)

• Built a retrieval system that returns the top k articles related to a given query

• Implemented both the Term Proximity Score and the Vector Space Model to analyze the query phrase in terms of word distance, word order, and word frequency in an article

• Optimized the algorithm by accessing only the available (stored) elements of the sparse vectors


