
Quality Assurance Data Warehouse

Location:
Brooklyn, NY
Posted:
April 19, 2025


Resume:

Archie (Yuqiang) Zhang

***- ***-**** *************@*****.*** LINKEDIN WASHINGTON, DC

Professional Skills

Programming and Data Management: Python, SQL, PostgreSQL, MySQL, Hadoop, PySpark

ETL & Data Pipeline: Data Integration and Migration, Data Quality Assurance, Workflow Automation, Data Modeling, Data Warehouse Design, Data Security and Compliance, Data Manipulation

Cloud & Platform: Snowflake, AWS (Lambda, EC2, S3, CloudWatch, Airflow, SES), GCP (GA, BigQuery)

Visualization & Reporting: Power BI, Tableau, R Markdown, Metabase, Streamlit

System Administration & Development: Linux, Bash scripting, Git, Jira, Docker, Cursor, Vercel

Education

University of Michigan, Ann Arbor, MI, May 2023

Master of Science: Biostatistics

Liverpool University, Suzhou, China, June 2021

Bachelor of Science: Applied Mathematics

Work Experience

Peblla LLC, Data Engineer, Sep 2023 – Present, Rockville, Maryland

Designed and implemented a multi-layer data warehouse architecture (ODS, DWD, DWS, ADS, DIM) using the Snowflake Schema optimized for OLAP workloads, enabling efficient and scalable analytical reporting

Integrated AWS (Lambda, EC2, S3, CloudWatch, Airflow, SES), Google Cloud (GCP) APIs, and Snowflake to design, build, and optimize ETL processes, significantly enhancing pipeline reliability, scalability, and efficiency

Optimized Snowflake architecture through strategic data governance, systematically reducing credit usage and storage costs by 25%, while enhancing query performance by 40%

Implemented authentication (MFA, SSO), monitoring mechanisms, data security policies (RBAC, data masking), and Snowflake features (Time Travel, ENCRYPT_RAW, etc.), ensuring compliance with industry regulations

Created and maintained documentation of data models, ETL processes, and data security policies, reducing new team members’ onboarding time by 50% and ensuring consistent data governance practices

Overture LLC, Data Analyst, May 2022 – Aug 2022, Fairfax, Virginia

Developed 8 Power BI dashboards for sales analytics, enabling real-time KPI tracking

Designed Python ETL pipelines using pandas, NumPy, and SQL connectors (PostgreSQL/ODBC), implementing data validation rules that reduced errors by 25%

Built machine learning-ready datasets and implemented a time series model for predictive analytics initiatives

Collaborated cross-functionally to translate business requirements into technical specifications, aligning data architecture with sales operations

Projects

Automated Reputation Data Integration Pipeline, Chatmeter, Peblla LLC, 2024

Designed a serverless ETL pipeline in Python using AWS Lambda to orchestrate weekly batch ingestion of semi-structured reputation data from the Chatmeter API, ensuring pipeline reliability

Integrated Snowflake with Amazon S3 using External Stages for seamless access to JSON files

Transformed semi-structured data into structured Snowflake data models using PARSE_JSON and FLATTEN

Processed reputation data to support weekly reports, providing actionable insights to stakeholders

Customer Behavior Analysis Pipeline, GCP (Google Analytics), Peblla LLC, 2025

Designed, built, and optimized a scalable data pipeline using the Python SDK for the Google Analytics Data API to ingest, process, and transform behavioral metrics across 300+ GA4 properties

Developed Python scripts to ingest JSON-based API responses into normalized DataFrames; applied regex-based cleaning to standardize inconsistent session_medium and source fields for downstream analysis

Authored data governance documentation for the development team and collaborated cross-functionally with product and marketing teams to support A/B testing and data-driven campaign strategies

Enterprise Report for Clients, Peblla LLC, 2023

Utilized user-defined functions, procedures, and tasks to perform operations on dynamic tables and views

Implemented a role-based access control (RBAC) policy in Snowflake to control user access to sensitive data

Maintained a custom permission table and developed stored procedures and tasks with streams to automatically configure roles based on client ID, enhancing the efficiency and accuracy of access control

Managed secure batch data loading using row-level security (RLS) and developed client-specific dashboards in Power BI


