Python, Java, C, C++, C#, SQL

Location:
Kingston, ON, Canada
Posted:
June 11, 2024

Resume:

Emma Yu

******.**@*****.*** · 437-***-**** · emmacyu · Toronto, Canada (open to relocation)

Highlights

• 3 years of hands-on experience in end-to-end data processing and analysis; owner of production data pipelines.

• Efficient time management and organization skills; capable of multitasking and working well in a team environment.

• Solid academic background and industry experience in computer science.

Education

Master’s in Computer Science, Queen’s University, Kingston, Canada (GPA - 3.8) 2021 – 2023
Courses: Machine Learning/Deep Learning, Foundations of Neural Networks, Software Engineering and Foundation Models, Engineering AI-Based Software Systems

Master’s in Computer Science, Syracuse University, Syracuse, USA (GPA - 3.5) 2014 – 2016

B.S. in Computer Science, Liaoning Normal University, Dalian, China (GPA - 3.7) 2009 – 2013

Experience

Data Engineer, Tuhu Technology Inc., Wuhan, China 2019.11 – 2020.12

• Design data models and migrate data warehouse from MySQL to Hadoop, following the Star Schema.

• Design, build and launch new data extraction, transformation and loading processes (ETL) in production.

• Manage and revamp data dictionaries to include a more robust history for better data consistency.

• Support analytical data marts and warehouses; handle day-to-day issues and fine-tune data models and data pipelines for enhanced performance.

• Mentor team members by giving/receiving actionable feedback.

Data Engineer, Aetna HealthCare Inc., Hartford, CT, USA 2017.03 – 2018.01

• Executed complex HiveQL queries for required data extraction from Hive tables and wrote Hive UDFs.

• Developed ETL pipelines to integrate and standardize large volumes of data from various sources in multiple formats to help generate insights and address reporting needs.

• Involved in the scaling of the current SAS platform and data movement to Hadoop environment.

• Involved in data collection from the external interface using REST APIs with Python scripting, followed by data cleansing, standardizing and loading into the Hadoop system.

Data Engineer, Huawei North America Research Center, Santa Clara, CA, USA 2015.05 – 2017.03
Full-time employee: 2016.5 – 2017.3; Summer Intern: 2015.5 – 2015.8

• Applied the Raft consensus protocol (similar to Paxos) to Postgres-XC on SUSE Linux 11 to build a more fault-tolerant distributed database system.

• Honed critical thinking and communication skills through discussing research results in weekly seminars.

Skills

• Programming Languages: Python (e.g. pandas, NumPy, scikit-learn), Java, SQL, HiveQL

• Big data & Machine Learning: Hadoop, TensorFlow, PyTorch

• Database: Postgres, Oracle, MySQL, Presto, Netezza

• Engineering practice: Bash, Git, Jira, Travis-CI, Sqoop


