Hao Cheng
**********@*****.*** 201-***-**** https://www.linkedin.com/in/hao-cheng-54574316b/
SKILLS
• Languages: Node.js, Python, Java, Scala, JavaScript/ES6, C/C++, SQL, HTML5/CSS3, Shell, YAML
• Frameworks/Packages: React.js, Redux, Django, Airflow, Kafka, Keras, NumPy, Pandas, scikit-learn
• Database/Tools: MySQL, MongoDB, Git, AWS, Docker, Jenkins, Spark, Hadoop, Elasticsearch, Kibana
• Knowledge: OOP/OOD, CI/CD, Design Patterns (MVC, Pub/Sub, Microservice), Machine Learning, Big Data
WORK EXPERIENCE
Business Intelligence Analyst, Bling Jewelry & AYLLU, North Bergen, NJ Sept 2019 – Present
• Take ownership of designing ETL pipelines from inventory data sources to the back-end of Amazon using SQL
• Partner with cross-functional teams to generate automated reports to quantify business problems and keep track of key performance metrics for customer behavior insights
• Automate data extraction from flat files and load the data into database tables with SSIS as part of the ETL process
• Design and maintain Tableau interactive dashboards to explore trends & peaks in customer experience, increasing working efficiency by 80%
• Set up A/B tests for Google Shopping images and analyze click-through rates with AdWords to determine the best-performing image
Data Engineer, Auto-Link World Technology Ltd, Beijing, China May 2018 – Aug 2018
• Built a data lake and an ETL pipeline in Spark to load data from S3, process it, and write it to Redshift for analytics
• Created and maintained tables in Airflow and dashboards in Tableau to automatically track daily changes in key metrics
• Built a model-monitoring dashboard in QuickSight with data scientists, based on data in an S3 bucket, and used the Boto3 API to automate dashboard updates and email notices
• Implemented business logic of various reports using complex SQL queries, with user-defined stored procedures, CTEs, UDFs, and triggers. Improved query performance by more than 30%.
Data Engineer Intern, TXZ Adventure LLC, Seattle, WA Dec 2017 – Feb 2018
• Developed an inference API with Python, AWS Lambda, SageMaker, and Docker for matching tourists’ profiles
• Designed the Swagger file for API Gateway and implemented the AWS Lambda function that invokes the ML model
• Implemented a Python Flask API and built the model container with Docker for the inference endpoint in SageMaker
• Used Elasticsearch to search metric logs and visualized them in Kibana to monitor model performance
• Created potential customer segments and shed light on potential investment strategies to improve customer experience
PROJECTS
Stevens Institute of Technology, Hoboken, NJ Jan 2019 – May 2019
Amazon Ratings Prediction NLP Pipeline
• Designed and implemented an NLP pipeline to predict Amazon ratings using Python, PySpark, AWS, and Docker
• Scraped customer review data from Amazon with Python (Requests, BeautifulSoup, etc.) in AWS SageMaker
• Built an ETL process in PySpark to clean and normalize the text data collected by the scraper
• Predicted star ratings with a softmax classifier and a Random Forest model, tuned hyperparameters on the preprocessed data in Python, and flagged potentially fake reviews by comparing true labels with predicted ones
Django-based Online Real Estate Listing and Agency Web Service Oct 2018 – Dec 2018
• Implemented the web service with Django, MySQL, React, and Redux for an online real estate consulting service
• Designed and built the website frontend with Bootstrap and React.js, using Redux for middleware and reducers
• Developed an MVC backend with Django on MySQL and automated system deployment with Nginx and DigitalOcean
EDUCATION
M.S. in Business Intelligence & Analytics, Stevens Institute of Technology, New Jersey Aug 2017 – May 2019
B.S. in Information and Computer Science, Beijing University of Technology, China Sept 2013 – June 2017