Harsha Vardhan
Email: *****************@*****.***
Mobile: 253-***-****
Senior Data Engineer
PROFESSIONAL SUMMARY
●Data Engineer with 6+ years of experience designing, building, and optimizing cloud-native data platforms and large-scale ETL/ELT pipelines across Azure, AWS, and GCP ecosystems.
●Specialized in Azure Data Factory, Microsoft Fabric, Synapse, and Databricks, enabling metadata-driven frameworks, delta lakehouse implementations, and bronze–silver–gold architectures for enterprise data management.
●Expertise in AWS (Glue, Redshift, S3, Lambda, Kinesis) and GCP (BigQuery, Dataflow, Pub/Sub, Composer) to deliver real-time streaming pipelines, analytics platforms, and machine learning–ready datasets.
●Strong proficiency in SQL, Python, and PySpark for advanced transformations, query optimization, and performance tuning on structured, semi-structured, and unstructured data.
●Experienced in data governance, security, and compliance frameworks (IAM, Key Vault, KMS, Purview, Cloud Monitoring) ensuring data quality, lineage, and adherence to GDPR, HIPAA, and industry standards.
●Adept at data analysis and BI solutions using Power BI, Tableau, and Looker Studio, collaborating with business teams to deliver actionable insights, KPIs, and predictive models that drive decision-making.
TECHNICAL SKILLS
●Cloud - Microsoft Fabric, Azure (ADF, ADLS, Databricks, Synapse, Event Hub, Logic Apps), AWS (Glue, Redshift, S3, Lambda, Kinesis), GCP (BigQuery, Dataflow, Pub/Sub, Composer, Looker Studio)
●Languages - Java, Python, PySpark, SQL, Scala, Shell Script
●Databases - SQL Server, Azure SQL DB, Teradata, Oracle, Cosmos DB
●Big Data Tools - Apache Spark, Hadoop, Event Hub, Kafka, Stream Analytics
●Visualization - Power BI, Tableau
●ETL Tools - SSIS, Informatica
●CI/CD - Azure DevOps, Git, Jenkins, Terraform
●Modeling - Star Schema, Snowflake Schema
PROFESSIONAL EXPERIENCE
March 2023 – Present
Senior Azure Data Engineer
●Engineered scalable ETL/ELT pipelines using Azure Data Factory (ADF) and Microsoft Fabric, integrating metadata-driven frameworks that improved data processing efficiency by 45%.
●Built lakehouse solutions on ADLS Gen2 with Delta Lake and Databricks, implementing bronze–silver–gold architectures for governance, lineage, and optimized analytical performance.
●Developed and optimized PySpark transformations in Azure Databricks, handling multi-terabyte datasets with advanced partitioning, caching, and performance tuning for faster processing.
●Integrated real-time streaming pipelines with Azure Event Hub, Databricks Structured Streaming, and Fabric Dataflows, enabling fraud detection, operational monitoring, and real-time analytics.
●Implemented data governance and security controls with Azure Purview, RBAC, and Key Vault, ensuring compliance with GDPR, HIPAA, and enterprise data privacy policies.
●Collaborated with business teams to deliver Power BI dashboards connected to Fabric and Synapse, providing predictive insights, KPIs, and self-service analytics that reduced reporting time by 35%.
AWS
January 2021 – March 2023
AWS Data Engineer
●Designed and developed scalable ETL/ELT pipelines using AWS Glue, EMR (Spark), and Step Functions, processing multi-terabyte structured and semi-structured datasets with high reliability.
●Built and optimized data lakes and warehouses on Amazon S3, Redshift, and Athena, leveraging partitioning, compression, and Spectrum to reduce query costs and improve performance by 30%.
●Implemented real-time streaming pipelines with Kinesis Data Streams, Firehose, and Lambda, enabling low-latency ingestion and analytics for fraud detection and transaction monitoring.
●Automated infrastructure provisioning and deployment using Terraform and CloudFormation, improving environment setup efficiency and ensuring consistency across accounts.
●Integrated machine learning workflows by connecting SageMaker, Redshift, and S3, streamlining model training, validation, and deployment into production-ready pipelines.
●Applied data governance, monitoring, and security frameworks using AWS IAM, CloudWatch, CloudTrail, and Lake Formation, ensuring compliance with HIPAA, GDPR, and enterprise security standards.
Oracle
April 2019 – January 2021
Data Engineer
●Designed and implemented scalable ETL/ELT pipelines using Cloud Dataflow and Dataproc (Spark) to process structured and semi-structured datasets for analytics and reporting.
●Built and optimized BigQuery data warehouses with clustering, partitioning, and materialized views, reducing query execution time by 40% and cutting storage costs by 25%.
●Developed real-time streaming pipelines leveraging Pub/Sub and Dataflow, enabling near real-time insights for customer behavior, fraud detection, and operational monitoring.
●Automated workflow orchestration with Cloud Composer (Airflow), streamlining compliance reporting and ensuring data pipelines met SLAs across multiple business domains.
●Partnered with business analysts to design interactive dashboards in Looker and Looker Studio, providing KPIs, forecasting, and predictive insights that improved decision-making speed by 30%.
●Applied data governance, monitoring, and security controls using IAM, Cloud Monitoring, Data Catalog, and KMS, ensuring adherence to GDPR and enterprise data privacy standards.
EDUCATION
●Master's in Computer Science – University of Cincinnati
●Bachelor's in Electrical Engineering – MVSR College of Engineering