DHYEY CHAUHAN
Ottawa, ON +1-437-***-**** **************@*****.***
Summary
Results-driven Cloud and Data Engineer with 3+ years of hands-on experience designing, developing, and optimizing cloud-native data pipelines and infrastructure on AWS. Proven expertise in building scalable ETL workflows using AWS Glue, Lambda, Athena, and Redshift for real-time and batch data processing. Skilled in managing large datasets across S3, RDS, and Snowflake, while implementing CI/CD pipelines with GitHub Actions, Terraform, and CloudFormation. Adept at leveraging Apache Spark, Airflow, and Kafka for big data transformation and orchestration. Strong background in data modeling, governance, and compliance frameworks including HIPAA and GDPR, with a consistent focus on cost optimization, automation, and business impact.
Skills
Programming & Scripting: Python, SQL, Bash, Java, PySpark
Data Engineering & ETL: Apache Airflow, dbt, AWS Glue, Azure Data Factory, Google Dataflow, Talend, Informatica, SSIS, SFTP, ETL Automation
Big Data & Distributed Systems: Apache Spark, Apache Hadoop, SparkSQL, Kafka, Apache Flink, Databricks
Data Warehousing: Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse Analytics
Cloud Platforms & Services: AWS (S3, Redshift, Lambda, Glue, EC2, RDS), Azure, GCP
Databases: MySQL, PostgreSQL, SQL Server, Oracle, MongoDB, DynamoDB
Containerization & Orchestration: Docker, Kubernetes, Amazon ECS
Infrastructure as Code (IaC): Terraform, AWS CloudFormation
Monitoring & Logging: CloudWatch, Prometheus, Grafana, ELK Stack
DevOps & CI/CD Tools: Git, GitHub Actions, Jenkins, Azure DevOps, GitLab CI
Data Modeling & Governance: Star/Snowflake Schema, Data Lineage, Data Quality Management, Metadata Management, OLTP, OLAP, GDPR, HIPAA
Analytics & Visualization: Power BI, Tableau, Looker, SQL-based Dashboards
Experience
Wawanesa Group Aug 2024 – Present
Cloud Engineer
Architected and deployed a fault-tolerant AWS infrastructure using EC2, S3, Lambda, Redshift, RDS, and CloudFront, supporting 1.5M+ monthly enterprise data transactions with 99.99% uptime.
Automated infrastructure provisioning via Terraform and CloudFormation, cutting manual configuration effort by over 70% and reducing deployment errors by 60%.
Designed and orchestrated ETL pipelines using AWS Glue, Step Functions, and Lambda, enabling daily ingestion and transformation of 12M+ records from multiple sources.
Containerized and deployed over 20 microservices using Docker and Amazon ECS with Fargate, increasing deployment efficiency by 45% and improving scalability.
Developed and maintained CI/CD pipelines using GitHub Actions and AWS CodePipeline, reducing release cycles from 2 weeks to 2 days.
Monitored cloud workloads with CloudWatch, Prometheus, and SNS, reducing MTTR (Mean Time to Resolution) by 35% for production issues.
Enforced security controls including KMS encryption, IAM policies, and VPC segmentation, achieving 100% HIPAA compliance in quarterly audits.
Optimized S3 storage using S3 Intelligent-Tiering and lifecycle policies, cutting monthly costs by 22% (~$1,200/month savings).
Creative Newtech Ltd Feb 2020 – Dec 2022
Data Engineer
Built and maintained over 30 automated ETL pipelines using Apache Spark (PySpark), Kafka, and Apache Airflow, processing 500GB+ daily from 20+ sources.
Migrated data workflows from on-prem to AWS S3, Redshift, and Athena, reducing report latency by 40% and saving analysts 8 hours/week.
Engineered transformation logic in SQL and Python to clean and enrich 10TB+ of healthcare datasets, increasing data quality scores by 28%.
Created high-performance data models in Redshift using star/snowflake schemas, improving dashboard load times by 50%.
Implemented serverless ETL transformations with AWS Glue and Lambda, decreasing processing costs by 30%.
Developed and published 12+ dashboards in Power BI and Tableau, supporting real-time KPI monitoring for 7 clinical departments.
Established data governance controls including encryption, masking, and audit logs, ensuring full HIPAA and GDPR compliance across 3 annual audits.
Integrated ETL jobs into CI/CD pipelines via GitHub and AWS CodeBuild, reaching 90% test coverage and reducing job rollbacks by 25%.
Projects
Real-Time Data Pipeline for Financial Transactions
Tools & Technologies: AWS (S3, Lambda, Kinesis, Glue, Redshift), Apache Spark, Airflow, Python, SQL, Terraform, GitHub Actions
Description: Developed a real-time data ingestion and transformation pipeline to process high-volume financial transactions for fraud detection and reporting.
Key Contributions:
Built streaming ingestion using AWS Kinesis Firehose and processed events via AWS Lambda and Glue Jobs.
Transformed data using PySpark on AWS Glue and loaded insights into Amazon Redshift for business dashboards.
Scheduled pipeline orchestration with Apache Airflow, ensuring >99% SLA compliance.
Enabled schema validation, logging, and S3 backup versioning for audit compliance and fault tolerance.
Automated infrastructure with Terraform and integrated deployment via GitHub Actions.
Impact: Reduced data latency from 30 minutes to <5 minutes and improved fraud detection response times by 45%.
Education
University of Guelph-Humber
PG Certificate in Cloud Computing (2024)
PG Certificate in AI & ML (Dean’s List, 2023)
GLS University, Gujarat
Bachelor of Science in Information Technology (2019)
Certifications
AWS Certified Cloud Practitioner (2025)