Hassaan Arsh
Senior/Principal Data Engineer | Cloud Data Architect
***********.**@*****.*** | +1-226-***-**** | Lucan, ON N0M 2J0
Professional Summary
Highly experienced Data Engineer with 10 years of proven expertise in designing, developing, and operating
scalable, cloud-native data platforms across analytics, product, and operational systems. Skilled in end-to-end
data architecture, including ETL/ELT pipelines, lakehouse & data warehouse design, streaming & event-driven
architectures, cloud platforms, and governance frameworks.
Proven track record of leading data platform initiatives from design through production, improving data
quality, reliability, and performance while enabling self-service analytics and ML pipelines. Recognized for
technical leadership, mentoring engineers, and translating business needs into robust, secure, and highly
available data solutions. Experienced across startup, scale-up, and enterprise environments, making data
platforms enterprise-ready, cost-efficient, and future-proof.
Skills
Data Engineering & Architecture
• End-to-End ETL/ELT Pipeline Design (Batch & Streaming)
• Data Warehousing & Lakehouse Architecture
• Dimensional & Analytical Data Modeling (Star, Snowflake, 3NF)
• Event-Driven & Near Real-Time Data Systems
• Change Data Capture (CDC) & Data Ingestion
• Scalable, Fault-Tolerant, Multi-Region Systems
Cloud & Data Platforms
• AWS (S3, Redshift, Glue, Athena, Lambda, EMR)
• GCP (BigQuery, Dataflow, Pub/Sub)
• Azure (Data Factory, Synapse, Blob Storage)
• Cloud-Native Architectures, Cost Optimization, Auto-Scaling
Streaming & Real-Time Processing
• Apache Kafka, Kafka Connect, Spark Structured Streaming
• AWS Kinesis, GCP Pub/Sub
• Event Stream Processing, Windowing, Transformation
Databases & Storage
• PostgreSQL, MySQL, MongoDB, DynamoDB, Cassandra
• Snowflake, Databricks
Leadership & Collaboration
• Technical Ownership & Platform Roadmaps
• Mentorship & Team Leadership
• Cross-Functional Stakeholder Communication
• Agile, Scrum, and Sprint-Based Development
Data Observability & Monitoring
• Automated monitoring, anomaly detection, lineage tracking, and proactive alerting for reliable pipelines.
Data Quality, Governance & Observability
• Data Reliability Engineering (DRE), SLAs & SLOs
• Data Validation, Lineage & Traceability
• Data Governance & Security (GDPR, SOC 2, IAM, RBAC)
• Observability: Prometheus, Grafana, Alerts & Dashboards
Orchestration & DevOps
• Apache Airflow, Dagster, Prefect (Advanced DAGs)
• CI/CD Pipelines & GitHub Actions
• Containerization & Infrastructure as Code (Docker, Terraform)
• Monitoring, Logging, Backfills, and Failure Recovery
Programming & Query Optimization
• Python, SQL (Advanced, Optimization, Stored Procedures)
• Spark, Scala, Java
• Performance Tuning for Large-Scale Analytical Workloads
Analytics & BI Enablement
• Analytics-Ready Datasets, Data Marts, Semantic & Metrics Layers
• Integration with BI Tools (Looker, Tableau, Power BI)
• Support for Data Science & ML Pipelines
Machine Learning Data Pipelines
• Feature engineering, training & inference datasets, and ML-ready data platform support.
Professional Experience
Senior/Principal Data Engineer, Adastra Corporation 01/2022 – Present
• Led the design and development of a modern cloud data platform, leveraging AWS
S3, Redshift, Lambda, and Snowflake to support analytics, reporting, and ML
workloads.
• Built high-throughput ETL/ELT pipelines processing multi-terabyte daily datasets,
improving freshness, availability, and reliability by 40%+.
• Architected real-time streaming pipelines using Kafka and Spark Structured
Streaming, enabling near real-time dashboards and alerts.
• Implemented data quality and observability frameworks, reducing production
data incidents by 45% and maintaining SLA compliance.
• Optimized warehouse and lakehouse performance, achieving ~30% annual
infrastructure cost reduction through partitioning, clustering, and query tuning.
• Mentored 4+ mid-level and senior data engineers, conducted architecture reviews, and
led platform roadmap planning.
• Collaborated with product, analytics, and ML teams to deliver feature-ready
datasets, supporting AI/ML initiatives.
• Designed and enforced data governance, lineage, and compliance standards,
ensuring secure, auditable, and trusted data across the organization.
• Led initiatives to standardize analytics-ready datasets and semantic layers,
improving BI adoption and reducing inconsistencies across reporting tools.
Senior Data Engineer, Ataccama Corporation 06/2017 – 12/2021
• Developed and maintained batch ETL pipelines ingesting structured and
semi-structured data from multiple sources.
• Designed and optimized cloud data warehouses and analytics-ready data models
for finance, operations, and product teams.
• Implemented data validation, quality checks, and monitoring, improving trust
and reducing downtime.
• Collaborated with stakeholders to create self-service BI datasets, dashboards, and
operational reports.
• Refactored SQL queries and optimized ETL workflows, reducing pipeline run
times by 25%+.
Junior Data Engineer, TechnoData Analytics Services 07/2014 – 05/2017
• Developed batch ETL pipelines for business and operational data ingestion.
• Assisted in building dimensional models and data marts to support analytics and
reporting.
• Wrote Python scripts and SQL queries to transform and clean large datasets.
• Partnered with analytics and product teams to deliver actionable datasets.
• Maintained documentation and knowledge bases for data pipelines and models.
Key Projects
Modern Cloud Data Platform
• Designed a centralized data lakehouse and warehouse on AWS/Snowflake.
• Integrated batch and streaming pipelines, supporting real-time dashboards, ML features, and
analytics-ready datasets.
• Reduced data delivery times by 40% and improved governance & lineage.
Data Warehouse Optimization & Migration
• Refactored legacy warehouses and pipelines during migration to a modern cloud stack.
• Optimized queries, reduced compute costs, and enabled near real-time reporting.
• Improved platform reliability and maintainability, enabling enterprise-grade analytics.
Certificates
• Databricks Lakehouse Fundamentals: Lakehouse architecture, Delta Lake, and Spark-based data pipelines.
• AWS Certified Data Analytics - Specialty: Advanced cloud data analytics and ETL/ELT expertise on AWS.
• Google Professional Data Engineer: Designing, building, operationalizing, and securing data processing systems on GCP.
• Microsoft Azure Data Engineer Associate (DP-203): Implementing cloud data solutions, pipelines, and governance on Azure.
Education
Bachelor of Science in Computer Science (BSCS)