YUCHEN PENG
https://github.com/SuperLukedin, C: 862-***-**** *************@*****.***
Skills
• Programming Languages: Java, C/C++, Python, R, SQL, JavaScript
• IDEs: Cloud9, Brackets, RStudio, Spyder, Eclipse, Visual Studio, IntelliJ, Code::Blocks
• Databases: PostgreSQL, MySQL, MongoDB, HBase
• Frameworks/Tools: ReactJS, Hadoop, Express.js, Node.js, HTML, CSS, AWS, Linux, Cloudera, Spark, Pig, Hive, Sqoop, Docker, Tableau
Work Experience
Intern - Software Engineer 09/2018 to 12/2018
New Jersey Institute of Technology
• Developed a data analysis and visualization web platform for the City of Newark using Node.js, Express.js, and D3.js. Set up Redis as a cache layer to support back-end/front-end separation.
• Implemented a distributed data processing and computing system with Spark-Python and Kafka-Python. Created data analysis models in a Tableau-Python environment.
• Created a scalable cloud deployment environment using Docker and the Mesos scheduling framework.
Summer Intern - Software Data Engineer 05/2018 to 09/2018
City of Newark / NJIT Ying Wu College of Computing Newark, NJ
• Optimized the HBase data storage schema to speed up data queries.
• Transformed raw data into a relational database with an ETL application to prepare unruly data for machine learning.
• Built an external data management application based on relational database logic, enabling users to join, search, import, and update data, and to translate addresses into latitude/longitude information for heat map visualization.
• Developed a data processing pipeline with Oozie. Ingested data with Sqoop and stored it in HDFS. Implemented interactive analysis using Impala and Hive in a Cloudera environment.
Projects
Real-time Log Analysis System 03/2018 to 04/2018
• Implemented real-time ingestion of 186 GB of online store access log data using Flink and Pig, storing the data on Hadoop. Queried the data interactively via Presto and visualized it in Superset.
• Processed data using Pig/Spark/Hive, writing custom UDFs, loaders, and storers to implement business logic.
• Implemented a data processing pipeline using Oozie.
Big Data Analysis System of Cryptocurrency 12/2018 to 01/2019
• Implemented a high-performance data processing platform using Apache Kafka, Apache HBase, and Apache Spark to analyze cryptocurrency data.
• Developed a dashboard to visualize real-time cryptocurrency transaction data using Node.js and Redis.
• Optimized payload size using Google Protocol Buffers to improve system throughput by 30%.
Education and Training
Master of Science: Computer Science GPA 3.75/4.00, Jan. 2018 - May 2019
New Jersey Institute of Technology
Bachelor of Science Sep. 2007 - Jun. 2011
Shenzhen University