Somasekhar M
Email: ***************@*****.***
Mobile: 401-***-****
LinkedIn: linkedin.com/in/somasekhar-mamidipaka1210/
Data Engineer
PROFESSIONAL SUMMARY
Designs scalable data platforms with Python, SQL, Spark, and cloud services, delivering reliable pipelines, trusted datasets, analytics-ready architectures, and measurable operational stability across the enterprise.
Builds batch and real-time ETL/ELT workflows with Airflow, Kafka, Databricks, and Snowflake, improving data accessibility, operational resilience, stakeholder alignment, and decision-support outcomes.
Optimizes warehousing, orchestration, and governance processes across AWS, Azure, and GCP ecosystems, strengthening performance, observability, security, compliance readiness, and stakeholder confidence in reporting.
Collaborates with analytics, engineering, and business teams to translate requirements into durable data solutions, enabling faster insights, standardized models, dependable integrations, and platform growth.
Applied strong written and oral communication skills to enhance team collaboration and improve project outcomes.
Applied a passion for automation and continual process improvement to streamline workflows, boosting efficiency by 30%.
TECHNICAL SKILLS
Cloud Platforms - AWS (EC2, Lambda, Glue, S3, Kinesis, IAM, EKS, Redshift), Azure (ADF, Synapse, Azure SQL, Entra ID, Key Vault), GCP (BigQuery, GKE, Cloud Storage)
Infrastructure as Code (IaC) - Terraform, Ansible, ARM Templates, Bicep, CloudFormation, Jenkins, Azure DevOps
Monitoring and Incident Response - New Relic, AWS CloudWatch, Azure Monitor, ServiceNow, RCA, SLA Management
Security and Compliance - IAM, Encryption, NIST 800-53, CIS Benchmarks, PCI-DSS, RBAC, Key Vault, Audit Logging
CI/CD and DevOps - Jenkins, GitHub Actions, Git, GitLab, CodePipeline, CI/CD Pipelines, Shell Scripting
Programming & Scripting - Python, SQL, Bash, PowerShell, Perl
Data Engineering - AWS Glue, Azure Data Factory, DBT, Apache Kafka, Spark, Hive, GCP Dataflow, Informatica
Databases - Redshift, Snowflake, Azure SQL, PostgreSQL, MongoDB, MySQL, Oracle, Oracle Exadata
Dashboards and Visualization - Power BI, Tableau, Looker, AWS QuickSight
System Administration and Infrastructure - Linux-based processes, Linux environment setup, Unix file systems, mount types, permissions, pipes
PROFESSIONAL EXPERIENCE
Hanover Insurance Group March 2025 – Present
AWS Data Engineer
Architected batch ETL pipelines with Python, SQL, and Snowflake, consolidating policy and claims data into governed models that substantially improved downstream reporting reliability.
Orchestrated Airflow workflows across AWS services, automating ingestion, validation, and recovery processes that reduced manual intervention, improved SLA adherence, and strengthened operational continuity organization-wide.
Integrated Kafka and Spark processing for near real-time insurance events, enabling faster data availability, higher pipeline resilience, and more dependable analytics for underwriting stakeholders.
Optimized Redshift and SQL workloads by refining partitioning, joins, and storage strategies, improving query performance, lowering latency, and accelerating recurring business intelligence deliverables.
Standardized CI/CD and Terraform deployment practices for data products, increasing release consistency, environment traceability, governance alignment, operational readiness, and platform stability across enterprise teams.
Engineered data warehouses and ETL/database load/extract processes using Oracle Exadata, improving data flows efficiency by 40% and reducing query processing time by 30%.
Pioneered backend system and architecture improvements in Linux environment setup, enhancing Unix file systems, mount types, and permissions, resulting in 25% faster deployment cycles.
Lahey Hospital & Medical Center September 2024 – February 2025
Azure Data Engineer
Engineered secure ELT pipelines with Azure Data Factory, Databricks, and SQL, transforming clinical and operational datasets into trusted structures supporting timely reporting needs enterprise-wide.
Validated data quality through automated checks, lineage documentation, and exception handling, improving confidence in healthcare dashboards and compliance readiness while significantly reducing recurring reconciliation issues.
Configured Azure Synapse and Power BI integrations, delivering curated datasets that simplified analyst access, improved semantic consistency, and enhanced decision-making across finance operations.
Streamlined PySpark transformations for high-volume records, improving processing efficiency and data standardization and enabling reliable daily publication of standardized metrics for cross-functional hospital teams.
Collaborated with business and technical stakeholders to define data requirements, translate priorities into backlog-ready solutions, strengthen delivery alignment, and improve governance across critical domains.
Orchestrated automation and continual process improvement with orchestration tools and Perl, boosting operational efficiency by 35% and reducing manual intervention by 50%.
Architected ETL tools integration with Informatica and relational databases, achieving 99.9% data accuracy and reducing data transformation time by 20%.
BMW Manufacturing September 2021 – July 2023
GCP Data Engineer
Designed scalable ingestion frameworks with Kafka, Spark, and Delta Lake, centralizing manufacturing telemetry, improving traceability, and enabling consistent analytics across plant operations.
Automated pipeline monitoring with Databricks, Airflow, and logging controls, accelerating issue detection, reducing downtime, and improving reliability for production-facing data workflows.
Established dimensional models in Snowflake and SQL Server, organizing supply chain and quality datasets into reusable structures that significantly improved reporting consistency.
Analyzed performance bottlenecks across ETL jobs and storage layers, applying optimization techniques that shortened runtimes, stabilized throughput, and supported faster operational insights.
Implemented Git-based versioning and CI/CD controls for data engineering assets, improving collaboration, auditability, deployment discipline, release governance, and confidence across distributed delivery teams.
Modernized Linux-based processes using standard tools and pipes, achieving seamless integration and reducing processing overhead by 25%.
Applied practical Agile experience to enhance team collaboration, delivering 15% more features per release cycle.
Poorvika Mobiles Pvt. Ltd March 2019 – August 2021
Data Engineer
Built cloud-based data pipelines with AWS Glue, S3, and Python, integrating sales and inventory sources into accessible datasets for enterprise business reporting.
Developed dbt and SQL transformations that standardized retail metrics, improving dashboard accuracy and model consistency and enabling more reliable trend analysis across product categories.
Maintained data warehouse structures and orchestration schedules, ensuring dependable refresh cycles and strong data availability while minimizing disruptions for downstream analytics consumers.
Provisioned Docker and Kubernetes-ready components for deployment workflows, improving portability, scalability, release consistency, and resilience of data processing applications across environments.
Documented schemas, lineage, and operational runbooks for critical datasets, strengthening knowledge transfer, support readiness, maintainability, and long-term continuity of platform assets.
Leveraged strong written and oral communication to identify key stakeholder needs, resulting in a 30% increase in project approval rates.
CERTIFICATIONS
Microsoft Certified: Azure Data Engineer Associate
AWS Certified Data Engineer - Associate
Google Professional Data Engineer
Databricks Certified Data Engineer Associate
EDUCATION
Master's in Computer Science - University of Massachusetts
Bachelor's in Electronics & Communication Engineering - Acharya Nagarjuna University