Data Engineer Machine Learning

Location: Houston, TX
Salary: 160000
Posted: April 21, 2025

Resume:

Yang Liu Kunz

**** ****** ***** *** ***, Houston, Texas • 757-***-**** • *******@*****.*** •

https://www.linkedin.com/in/yang-liu-kunz/

Data Engineer/Tech Consultant

Databricks Certified Data Engineer and GenAI Engineer with expertise in ETL development using Delta Lake, machine learning, feature engineering, and data visualization. Passionate about building robust data architecture foundations that enable actionable business insights. Industry experience spans retail marketing analytics, healthcare, and banking services.

WORK EXPERIENCE

Publicis Sapient Houston 01/2024 - Present

Senior Data Engineer (Full time) Houston, Texas

• Supported the migration of a leading healthcare client's marketing data platform from a traditional data warehouse to a modern Azure Databricks ecosystem.

• Transformed payment data for a consumer bank's EDL, ensuring accuracy through system-level reconciliation and delivering 18 data products related to loans and credit cards within 3 months.

• Led the redesign and migration of model score tables to the new Azure Databricks platform, transitioning from a legacy data warehouse.

• Designed scalable data pipelines on Azure Databricks, orchestrated with Azure Data Factory (ADF).

• Led unit testing, CI/CD deployment, documentation, and onboarding/training of new team members.

• Implemented data quality checks and governance best practices to ensure data accuracy and reliability.

EY 04/2021 - 01/2024

Senior Consultant - Data & AI (Full time)

• Designed and coded all aspects of data solutions using Databricks on AWS for consumer journey analysis automation, shortening delivery time from 1 month to 2 days.

• Optimized data processing workflows using Apache Spark DataFrames and performance tuning, resulting in an 80% reduction in processing time.

• Created and proposed technical design documentation, which includes ETL functionality, specifications, data flows, and diagrams to detail the proposed implementation.

Springboard 06/2020 - 02/2021

Fellow San Francisco, California

• 500+ hours of hands-on course material, with 1:1 industry expert mentor oversight, and completion of 2 in-depth portfolio projects.

• Mastered skills in Python, SQL, data wrangling, data visualization, hypothesis testing, and machine learning.

Adobe Tokyo Japan 10/2018 - 05/2019

AdCloud DSP Account Manager (Full time) Tokyo, Japan

• Display and video ad campaign management and optimization for advertisers, including campaign setup, KPI analysis, performance optimization, reporting, and communicating with agencies or advertisers.

• Relationship cultivation with clients and partners across the APAC, US, and EU regions.

• Internal operation support including workflow optimization, documentation creation and reorganization, monthly billing adjustment, invoice creation, etc.

Criteo Tokyo Japan 05/2016 - 09/2018

Publisher Partnership Manager (Full time) Tokyo, Japan

• Managing existing accounts/relationships with publishers in Japan and occasionally overseas (US, UK), negotiating and optimizing traffic acquisition for Criteo.

• Analyzing KPIs and metrics for publishers to enhance their ad revenue.

• Led Criteo publisher educational seminar in May 2017 with the goal of expanding Criteo header bidding integration among major Japanese publishers.

EDUCATION

Master of Curriculum for Japanese Education in Japanese Education

Yamaguchi University

Bachelor in Japanese Language

Wuhan University of Technology Wuhan

CERTIFICATIONS

Databricks Certified Generative AI Engineer Associate 09/2024 - Present Databricks

Databricks Certified Data Engineer Associate 03/2024 - Present Databricks

Databricks Certified Associate Developer for Apache Spark 3.0 - Python 06/2022 - Present Databricks

PROJECTS

Domain-Aware Retrieval-Augmented Generation (RAG) Prototype 07/2024 - 10/2024

Developed a proof-of-concept RAG system that integrates a Large Language Model (LLM) with a Milvus vector database (deployed via Docker) to enhance response relevance within a specific business domain.

Automated Animal Profile Generation for Local Shelter Using GenAI 02/2024 - 02/2024

Sales Analyzing and Weekly Sales Forecasting for Drug Store Rossmann 12/2020 - 02/2021

Detecting Potential Candidates Who are Looking for New Jobs for Training Institute 09/2020 - 11/2020

SKILLS

Adobe Experience Platform, Azure DevOps, Big Data, Chinese, Databricks, deep learning (TensorFlow), English, feature engineering, Git, GitHub, Heroku, Japanese, JIRA, Korean, machine learning, Matplotlib, MySQL, Pandas, Postman API, PySpark, Python, Salesforce, scikit-learn, Seaborn, Snowflake, Spanish, SQL, supervised learning (classification, regression), Tableau, TensorFlow, unsupervised learning (clustering, PCA)
