Engineer Data

Washington, District of Columbia, United States
March 08, 2019



He Cheng

Arlington, VA ***** 571-***-****


George Washington University, Washington, DC

Master of Science in Electrical Engineering, GPA: 3.5 / 4.0, Aug 2016 - May 2018

Northeastern University, Shenyang, China

Bachelor of Science in Communications Engineering, Sep 2011 - June 2015

Relevant Coursework: Big Data and Cloud Computing, Design & Analysis of Algorithm, Network, etc.

TECHNICAL SKILLS

Programming: Python, Java, C, Linux Shell, HTML, CSS, Shell Scripting, Hive, Pig

Frameworks and Tools: Hadoop, Spark, Kafka, Spring MVC, Hibernate, AWS, React, Maven, Git, Docker

Databases: MySQL, HBase, MongoDB


OnlineShop: Spring- and Hibernate-Based Shopping and Ordering System Jan 2018 – May 2018

• Built a web application based on Spring MVC to support users, item searching and listing.

• Enhanced authentication flows by using Spring Security to implement OAuth2.

• Utilized Hibernate to conduct database operations and developed Spring Web Flow to support item ordering.

Event Search and Ticket Recommendation Aug 2017 – Dec 2017

• Developed an interactive web page to provide users with an event search engine and personalized recommendations based on their location and collection history. (HTML5, CSS, JSP, JDBC, Ticketmaster API)

• Designed content-based recommendation algorithms driven by users’ favorite events.

• Deployed the server on Amazon EC2 to handle 100+ queries per second, as tested with Apache JMeter.

Twitter Analytics Web Service Feb 2017 – May 2017

• Used AWS MapReduce and Hadoop Streaming to run an extract, transform, and load (ETL) process on 300 million raw tweet messages (1 TB) using Java, bash scripts, and the AWS API.

• Optimized and tested a variety of database schemas on MySQL and HBase to improve read/write throughput.

• Developed a fault-tolerant and scalable web server, including back ends built on both MySQL and HBase.

Wikipedia Big Data Analysis Sep 2016 – Dec 2016

• Designed an efficient algorithm to filter a 500 MB Wikipedia traffic log in less than 1 minute.

• Conducted parallel analysis with Hadoop Streaming on the AWS Elastic MapReduce (EMR) framework to process a 340 GB Wikipedia traffic log. Configured, deployed, executed, and debugged a MapReduce job on AWS EMR.

• Wrote efficient Python code to complete the data processing in the cloud quickly and at low cost.

WORK EXPERIENCE

Sharing Mobile Group Co., Ltd., Beijing, China

System Engineer, Sep 2015 – May 2016

• Organized data; for example, wrote Python scripts using BeautifulSoup to scrape information on songs that appeared on the Billboard Top 100 chart from 1958 to 2012.

• Maintained the company's internal personnel information system with Python.

• Designed dynamic data visualizations using Power BI and presented reports to division VPs in weekly meetings.
