Archie (Yuqiang) Zhang
***- ***-**** *************@*****.*** LINKEDIN WASHINGTON, DC
Professional Skills
Programming and Data Management: Python, SQL, PostgreSQL, MySQL, Hadoop, PySpark
ETL & Data Pipeline: Data Integration and Migration, Data Quality Assurance, Workflow Automation, Data Modeling, Data Warehouse Design, Data Security and Compliance, Data Manipulation
Cloud & Platform: Snowflake, AWS (Lambda, EC2, S3, CloudWatch, Airflow, SES), GCP (GA, BigQuery)
Visualization & Reporting: Power BI, Tableau, R Markdown, Metabase, Streamlit
System Administration & Development: Linux, Bash scripting, Git, Jira, Docker, Cursor, Vercel
Education
University of Michigan, Ann Arbor, MI, May 2023
Master of Science: Biostatistics
Liverpool University, Suzhou, China, June 2021
Bachelor of Science: Applied Mathematics
Work Experience
Peblla LLC, Data Engineer, Sep 2023 – Present, Rockville, Maryland
Designed and implemented a multi-layer data warehouse architecture (ODS, DWD, DWS, ADS, DIM) using the Snowflake Schema optimized for OLAP workloads, enabling efficient and scalable analytical reporting
Integrated AWS (Lambda, EC2, S3, CloudWatch, Airflow, SES), Google Cloud (GCP) APIs, and Snowflake to design, build, and optimize ETL processes, significantly enhancing pipeline reliability, scalability, and efficiency
Optimized Snowflake architecture through strategic data governance, systematically reducing credit usage and storage costs by 25%, while enhancing query performance by 40%
Implemented authentication (MFA, SSO), monitoring mechanisms, data security policies (RBAC, Data Masking), and Snowflake features (Time Travel, ENCRYPT_RAW, etc.), ensuring compliance with industry regulations
Created and maintained documentation of data models, ETL processes, and data security policies, reducing new team members’ onboarding time by 50% and ensuring consistent data governance practices
Overture LLC, Data Analyst, May 2022 – Aug 2022, Fairfax, Virginia
Developed 8 Power BI dashboards for sales analytics, enabling real-time KPI tracking
Designed Python ETL pipelines using pandas, NumPy, and SQL connectors (PostgreSQL/ODBC), implementing data validation rules that reduced errors by 25%
Built machine learning-ready datasets and implemented a time series model for predictive analytics initiatives
Collaborated cross-functionally to translate business requirements into technical specifications, aligning data architecture with sales operations
Projects
Automated Reputation Data Integration Pipeline Chatmeter Peblla LLC 2024
Designed a serverless ETL pipeline using AWS Lambda to orchestrate weekly batch ingestions of semi-structured reputation data from the Chatmeter API in Python, ensuring pipeline reliability
Integrated Snowflake with Amazon S3 using External Stages for seamless access to JSON files
Transformed semi-structured data into structured Snowflake data models using PARSE_JSON and FLATTEN
Processed reputation data to support weekly reports, providing actionable insights to stakeholders
Customer Behavior Analysis Pipeline GCP (Google Analytics) Peblla LLC 2025
Designed, built, and optimized a scalable data pipeline using the Python SDK (Google Analytics Data API) to ingest, process, and transform behavioral metrics across 300+ GA4 properties
Developed Python scripts to ingest JSON-based API responses into normalized DataFrames; applied regex-based cleaning to standardize inconsistent session_medium and source fields for downstream analysis
Authored data governance documentation for the development team and collaborated cross-functionally with product and marketing teams to support A/B testing and data-driven campaign strategies
Enterprise report for Clients Peblla LLC 2023
Utilized user-defined functions, procedures, and tasks to perform operations with dynamic tables and views
Implemented a Role-Based Access Control (RBAC) policy in Snowflake to control user access to sensitive data
Maintained a custom permission table and developed stored procedures and tasks with streams to automatically configure roles based on client ID, enhancing the efficiency and accuracy of access control
Managed secure batch data loading using row-level security (RLS) and developed client-specific dashboards in Power BI