Senior Data Engineer with Cloud-Based ETL Expertise

Location:

Houston, TX

Salary:

80000

Posted:

April 06, 2026


Resume:

Somasekhar Mamidipaka

Email: ***************@*****.***

Mobile: 401-***-****

LinkedIn: linkedin.com/in/somasekhar-mamidipaka1210/

Senior Data Engineer

PROFESSIONAL SUMMARY

Deliver 5+ years of data engineering experience building scalable ETL and ELT pipelines across Azure, AWS, and GCP for analytics, governance, reporting, and integration.

Specialize in Python, SQL, Spark, Databricks, Airflow, dbt, and cloud warehousing to develop trusted data products, reusable models, resilient integrations, and orchestration at scale.

Strengthen data quality, lineage, metadata, observability, and security controls while enabling governed datasets for business intelligence, operational reporting, advanced analytics, and compliance across environments.

Collaborate with analysts, engineers, and business teams to automate ingestion, optimize performance, and deliver analytics-ready datasets that support decision making in regulated, enterprise-scale organizations.

Guided teams to achieve project milestones, enhancing collaboration and boosting overall productivity.

Mentored junior staff to develop leadership skills, resulting in improved team performance and morale.

TECHNICAL SKILLS

Cloud Platforms - AWS (EC2, Lambda, Glue, S3, Kinesis, IAM, EKS, Redshift), Azure (ADF, Synapse, Azure SQL, Entra ID, Key Vault), GCP (BigQuery, GKE, Cloud Storage), OpenShift

Infrastructure as Code (IaC) - Terraform, Ansible, ARM Templates, Bicep, CloudFormation, Jenkins, Azure DevOps

Monitoring and Incident Response - New Relic, AWS CloudWatch, Azure Monitor, ServiceNow, RCA, SLA Management

Security and Compliance - IAM, Encryption, NIST 800-53, CIS Benchmarks, PCI-DSS, RBAC, Key Vault, Audit Logging

CI/CD and DevOps - Jenkins, GitHub Actions, Git, GitLab, CodePipeline, CI/CD Pipelines, Shell Scripting

Programming & Scripting - Python, SQL, Bash, PowerShell

Data Engineering - AWS Glue, Azure Data Factory, dbt, Apache Kafka, Spark, Hive, GCP Dataflow, ETL tools

Databases - Redshift, Snowflake, Azure SQL, PostgreSQL, MongoDB, MySQL

Dashboards and Visualization - Power BI, Tableau, Looker, AWS QuickSight, Tableau Prep

Data Analytics Tools - Alteryx, RapidMiner

Software Architecture - large-scale architecture initiatives

Containers and Containerization - containers, containerized deployments

PROFESSIONAL EXPERIENCE

Hanover Insurance Group March 2025 – Present

Azure Data Engineer

Designed Azure Data Factory and Azure Databricks pipelines to ingest insurance claims data, improving transformation consistency, governed delivery, and reporting reliability for enterprise analysts.

Engineered Azure Synapse, ADLS, Python, and SQL workflows to curate policy datasets, enabling reusable models, secure integrations, and trusted underwriting reporting across teams.

Optimized PySpark and Spark jobs on Azure Databricks, reducing processing bottlenecks, strengthening data quality validation, and accelerating analytics-ready insurance data availability enterprise-wide.

Standardized metadata, lineage, and RBAC controls across Azure platforms, aligning governance requirements with secure ingestion patterns and dependable warehouse assets for compliance reporting teams.

Automated Azure DevOps deployments and observability checks for ETL workflows, minimizing manual failures, improving release consistency, and supporting critical insurance reporting operations for stakeholders.

Engineered data preparation and workflow automation, resulting in a 40% increase in data processing speed and reducing manual intervention by 60% across enterprise-scale data initiatives.

Orchestrated architecture design and large-scale architecture initiatives, achieving 99.9% system uptime and supporting enterprise-level data solutions with enhanced scalability and reliability.

Pioneered data processing automation and automation pipeline management, reducing processing time by 70% and saving 100+ engineering hours monthly in a shared services environment.

Lahey Hospital & Medical Center September 2024 – February 2025

AWS Data Engineer

Integrated AWS Glue, S3, and Redshift workflows for patient, encounter, and billing data, improving governed availability for clinical, financial, and operational reporting enterprise-wide.

Configured EMR, Spark, and Lambda processing to streamline HL7, FHIR, and CCDA transformations, increasing pipeline resilience and supporting secure, timely downstream consumption.

Validated schema quality, lineage, and CDC controls across APIs and DynamoDB integrations, reducing reconciliation issues and strengthening trustworthy healthcare datasets for stakeholders enterprise-wide.

Analyzed Snowflake, Tableau, and Power BI requirements with business teams, shaping dimensional models that improved patient service visibility and decision support.

Modernized batch and streaming orchestration through Airflow, dbt, and Terraform on AWS, improving release consistency and accelerating secure onboarding for enterprise data products organization-wide.

Architected performance optimization and code quality improvements, resulting in a 30% reduction in system errors and boosting system performance for containerized deployments.

Revolutionized operational insights and enterprise rollouts, achieving a 25% increase in decision-making speed and improving enterprise-level governance compliance across cross-functional initiatives.

Mentored and guided teams, enhancing leadership skills and boosting team productivity by 40% in scrum teams handling enterprise-scale data initiatives.

BMW Manufacturing September 2021 – July 2023

GCP Data Engineer

Orchestrated BigQuery, Dataflow, and Pub/Sub pipelines to process manufacturing data, improving scalable analytics delivery, curated datasets, and operational reporting readiness across facilities.

Modernized Cloud Composer and dbt workflows with SQL and Python, enabling dependable transformations, reusable models, and faster access to governed analytical data assets.

Established GCP IAM, metadata, and lineage controls across analytics platforms, strengthening secure access, data governance, and confidence in shared reporting assets across functions organization-wide.

Delivered Looker- and Tableau-ready datasets from BigQuery and Spark pipelines, improving dashboard performance, self-service analytics adoption, and business decision support across departments.

Refined data quality monitoring and observability across GCP integrations, resolving anomalies quickly, reducing rework, and supporting reliable manufacturing analytics across business teams.

Modernized ETL tools and Alteryx integration, optimizing data workflows and improving data accuracy by 30% for enterprise-level data solutions.

Improved scheduling scalability, resulting in a 50% improvement in task dependency tuning and efficiency in batch processing tools.

Achieved seamless containerized deployments and resolved performance issues, improving application response time by 45% and enhancing troubleshooting efficiency.

Poorvika Mobiles Pvt. Ltd March 2019 – August 2021

Data Engineer

Designed Azure Data Factory and Databricks pipelines for sales, inventory, and customer datasets, improving governed availability for merchandising, finance, and operations teams enterprise-wide.

Streamlined PySpark and Delta Lake transformations on Azure Synapse, standardizing transactional records that strengthened trusted dashboards for retail planning and forecasting.

Implemented Snowflake, dbt, and Power BI semantic layers, improving self-service analytics adoption and accelerating category performance reporting for business stakeholders.

Monitored pipeline quality, lineage, and observability through Airflow and Purview controls, reducing production incidents and increasing confidence in enterprise retail datasets.

Monitored ETL and ELT workflows through Azure DevOps and observability practices, improving incident response, reducing reruns, and supporting dependable business reporting across functions organization-wide.

Engineered automation with RapidMiner and Tableau Prep, streamlining data workflows and reducing processing time by 60% in enterprise-scale data initiatives.

Orchestrated OpenShift containers and shared services environment integration, achieving 99% deployment success rate and reducing infrastructure costs by 20%.

Pioneered SQL and PL/SQL query optimization, resulting in a 35% increase in query execution speed and enhancing database performance across enterprise-level governance frameworks.

CERTIFICATIONS

Microsoft Certified: Azure Data Engineer Associate

AWS Certified Data Engineer - Associate

Google Professional Data Engineer

Databricks Certified Data Engineer Associate

EDUCATION

Master's in Computer Science - University of Massachusetts

Bachelor's in Electronics & Communication Engineering - Acharya Nagarjuna University
