Ash Malik
Senior Data Engineering & Architecture Expert
Cloud, Big Data, and Modern Data Stack Specialist
Lima, OH 45801 | 571-***-**** | ***.*****.***@*****.***

Professional Summary

Experienced data engineering and architecture professional with a strong track record of designing, building, and optimizing scalable, reliable, and secure data ecosystems across diverse industries, including healthcare. Demonstrated expertise in data architecture, ETL/ELT development, data modeling, and modern cloud data platforms (AWS, GCP, Azure). Highly skilled in big data frameworks (Apache Spark, Hadoop, Kafka), real-time streaming, batch processing, and data lakehouse architectures. Proficient in Python, SQL, and modern data stack tools including Airflow, dbt, Snowflake, Redshift, and BigQuery. Adept at integrating data from diverse sources (APIs, databases, streaming) and ensuring data quality, governance, lineage, and compliance with standards such as HIPAA, GDPR, and HITECH. Experienced in architecting end-to-end data engineering solutions, implementing data quality frameworks, and optimizing systems for performance, fault tolerance, and cost efficiency. Recognized for technical leadership, best-practice definition, and team mentorship. Proven ability to collaborate cross-functionally with data engineers, analysts, scientists, and business stakeholders to deliver analytics-ready datasets powering BI, AI/ML, and decision-making. Committed to building automated, governed, enterprise-grade data infrastructures that enable scalable, compliant, and business-aligned data strategies.

Skills

• Data Engineering & ETL/ELT Development: Building scalable ETL/ELT pipelines with Airflow, dbt, Spark, and Informatica; automating workflows, data cleaning, transformation, and orchestration
• Big Data & Real-Time Processing: Expertise in Apache Spark, Hadoop, Kafka, Flink, and streaming pipelines; handling batch and real-time data (IoT, logs, clickstreams)
• Cloud Data Platforms: Hands-on with AWS (S3, Glue, EMR, Redshift), GCP (BigQuery, Dataflow), and Azure (Synapse, Data Factory); designing and deploying cloud-native data architectures
• Data Modeling & Architecture: Designing star, snowflake, Data Vault, and dimensional models; building data lakes and lakehouse architectures
• Programming & Querying: Proficient in Python (Pandas, scripting) and SQL (joins, CTEs, window functions); knowledge of Scala and Java for distributed data systems
• Data Governance, Quality & Compliance: Implementing HIPAA, GDPR, and SOC 2 standards; using Great Expectations, Collibra, and Alation for validation and lineage
• DevOps & Automation: CI/CD with Jenkins and GitHub Actions; infrastructure as code using Terraform, containerization with Docker, and orchestration with Kubernetes
• Monitoring & Observability: Tracking data pipeline health using Prometheus, Grafana, the Airflow UI, and CloudWatch; setting up alerts, logging, and performance tuning
• Data Integration & APIs: Integrating data from APIs, FHIR, HL7, JSON, CSV, Parquet, and other sources; combining data across structured and unstructured systems
• Leadership & Collaboration: Leading teams, mentoring junior engineers, and driving best practices; collaborating with data scientists, analysts, business, and compliance teams
Work History

Data Architect, 01/2023 - Current
1up Health, Inc – United States
Key Technologies: AWS, GCP, Snowflake, BigQuery, Airflow, dbt, Apache Spark, Kafka, Data Governance Tools, Python, SQL
• Designed and implemented enterprise-scale data architectures, enabling unified, governed, and secure data ecosystems aligned with HIPAA, GDPR, and HITECH standards.
• Defined data strategy, architecture blueprints, and governance frameworks to support BI, AI/ML, and analytics initiatives.
• Led the integration of diverse EHR, EMR, and clinical data sources, ensuring compliance, lineage, and data quality across systems.
• Architected cloud-native data platforms on AWS and GCP with real-time ingestion, data lakehouse, and warehouse layers.
• Established data governance policies, access controls, and metadata management to ensure security and transparency.
• Collaborated with engineering, compliance, and analytics teams to deliver analytics-ready, reliable, and auditable data pipelines.
• Provided technical leadership, mentored teams, and defined best practices for data modeling, orchestration, and automation.
Senior Big Data Engineer, 07/2018 - 12/2022
Census - a Fivetran Company – United States
Key Technologies: Apache Spark, Hadoop, Kafka, Airflow, dbt, AWS (EMR, Glue, Redshift), GCP (Dataproc, BigQuery), Azure (HDInsight, Synapse)
• Led the design and implementation of large-scale distributed data systems using Apache Spark, Hadoop, and Kafka for real-time and batch processing at petabyte scale.
• Architected and optimized data lakehouse solutions across AWS, GCP, and Azure, enabling scalable and cost-efficient data storage and processing.
• Developed and managed streaming data pipelines, integrating data from IoT devices, logs, and transactional systems for real-time analytics.
• Implemented data quality frameworks with automated validation, lineage tracking, and error handling to ensure data integrity.
• Drove performance tuning and cost optimization strategies for compute clusters, reducing cloud spend.
• Mentored junior engineers and established coding standards, CI/CD practices, and orchestration workflows with Airflow and Glue.
• Partnered with cross-functional teams (data scientists, analysts, product) to deliver machine learning-ready datasets and advanced analytics.
Data Engineer, 04/2014 - 06/2018
Databand.ai – United States
Key Technologies: Python, SQL, Airflow, dbt, Snowflake, BigQuery, APIs, Git, Linux
• Designed, developed, and maintained scalable ETL/ELT pipelines using Python, SQL, and Airflow, ensuring high-quality, reliable data ingestion from APIs, databases, and flat files.
• Built and optimized data models (star/snowflake schemas) for analytics and reporting, enabling self-service BI and accurate business insights.
• Integrated data from multiple sources into data warehouses such as Snowflake and BigQuery, ensuring consistency, quality, and governance.
• Automated data workflows and implemented monitoring and alerting systems to ensure pipeline reliability and performance.
• Collaborated with analysts and business teams to deliver analytics-ready datasets supporting dashboards, KPIs, and decision-making.
• Contributed to data quality assurance, documentation, and code reviews, fostering best practices in data engineering.
Education

Bachelor of Science: Computer Science
Punjab University