
Data Engineer Quality

Location:
Hyderabad, Telangana, India
Salary:
120000
Posted:
September 10, 2025

Resume:

Chinnolla Sathvik Reddy

New Jersey +1-201-***-**** ******************@*****.***

Professional Summary

Data Engineer with over 5 years of experience designing scalable cloud-based data architectures and robust ETL pipelines. Expert in SQL, Python, and T-SQL optimization, with a track record of integrating diverse sources into consistent, machine-readable formats. Proven ability to improve data quality and manage enterprise data warehouses while ensuring compliance with data privacy standards. Skilled in leveraging Azure, Databricks, and CI/CD practices to deliver reliable, high-performance solutions.

Technical Skills

•Cloud Platforms & Services: AWS (S3, Redshift, EMR, Glue, Lambda, EC2, RDS, CloudFront, CloudWatch), Azure (Data Lake Storage, Data Factory, Databricks, Synapse Analytics)

•Big Data Technologies: Apache Spark (PySpark, Spark SQL, Spark Streaming, Databricks, AWS EMR), Kafka (Message Queuing, Stream Processing), Hadoop (Hive, MapReduce), Kinesis

•Data Warehousing & Databases: Snowflake, Amazon Redshift, PostgreSQL, MySQL, MS SQL Server, NoSQL (DynamoDB, MongoDB), Azure Cosmos DB

•Programming & Scripting: Python (Expert - Advanced Data Pipelines, Automation, Infrastructure Scripting), SQL (Expert - Complex Queries, Relational Databases, NoSQL, Performance Tuning), Scala (Proficient), Java (Intermediate), Bash / Shell Scripting, PowerShell

•ETL & Data Engineering: Complex Data Pipeline Development (Batch & Real-time), Data Architecture Design (On-prem, Cloud, Hybrid), Data Migration, Data Wrangling, Data Modeling, Data Quality, Data Governance, dbt, ETL Procedures, DDL, DML, Stored Procedures

•Test Data Management & Privacy Tools: Informatica TDM, IBM Optim, Delphix, CA Test Data Manager (TDM), Synthetic Data Generation, AI-driven test data creation

•Data Privacy & Compliance: GDPR, HIPAA, referential integrity, data masking, data subsetting

•CI/CD & Version Control: CI/CD Pipelines (Jenkins, GitHub, Azure DevOps), Docker, UNIX/Linux, Agile Engineering Practices

•Monitoring & Observability: Splunk (Alerting, Monitoring, Dashboards, Problem Analysis), Dynatrace (Conceptual understanding, APM concepts), Prometheus, Grafana, Log Management

•Business Intelligence: Power BI, Tableau, Microsoft Excel

Experience

Centene Jul 2024 - Present

Data Engineer, Irving, TX

•Built a scalable hybrid data architecture in Azure to support ML/AI workflows, increasing data reliability and enabling processing of petabyte-scale healthcare datasets from 10+ sources.

•Reduced data processing time by 30% by optimizing Spark jobs (PySpark/Scala) in Databricks, significantly improving model training throughput and downstream analytics readiness.

•Engineered custom ETL pipelines using Python and SQL, integrating with enterprise data warehouses to improve data accuracy and reduce transformation errors by 25%.

•Cut manual provisioning effort by 50% by automating cloud infrastructure and workflow orchestration in Python, accelerating deployment cycles and enforcing CI/CD best practices.

•Improved incident response time by 35% through KPI monitoring and observability strategies using Splunk and Prometheus, enhancing system visibility and minimizing downtime.

•Ensured HIPAA compliance across PHI pipelines and supported secure data handling using Informatica TDM and CA Test Data Manager for 100% masked QA/UAT datasets.

•Resolved 95% of ingestion failures within SLA by delivering Level 2 production support, conducting root cause analysis, and maintaining ingestion system health.

•Reduced test data provisioning time by 40% through automated synthetic data generation and enforcement of referential integrity, boosting QA efficiency and ensuring data privacy compliance.

Paychex Nov 2023 - Jun 2024

Data Engineer, Rochester, NY

•Improved pipeline stability by 25% by managing and optimizing high-volume AWS environments using proactive capacity planning and continuous monitoring.

•Reduced deployment errors by 40% through automation of data pipeline deployments and infrastructure tasks with Python, enhancing CI/CD pipeline reliability.

•Migrated 20+ legacy pipelines from on-prem to AWS S3 using Databricks and Python, enabling scalable, cost-effective ETL with robust physical data modeling.

•Engineered PySpark/Scala applications for payroll data validation, transformation, and cleansing, ensuring compliance readiness and accelerating model training cycles.

•Cut incident resolution time by 30% by identifying and fixing system defects and data anomalies, driving increased platform stability and reliability.

•Enabled real-time analytics using Apache Kafka to process streaming payroll events, improving operational visibility and decision-making speed.

•Strengthened GDPR compliance by collaborating with InfoSec and QA teams to implement Delphix-based masking workflows for secure data handling in test environments.

•Enhanced test data readiness by 35% by building TDM pipelines with AI-driven synthetic data, improving test coverage and supporting pet care/childcare scenario simulations.

VMWare Jun 2022 - Jul 2023

Data Engineer, Bangalore

•Reduced downstream data errors by 30% by building and maintaining complex ETL pipelines using Python, Spark, and IBM DataStage, embedding rigorous data quality checks.

•Improved ingestion efficiency by 60% through automation of data collection from external sources to AWS S3, enabling scalable integration and reducing manual intervention.

•Optimized performance of 20+ analytical workflows by managing data movement from AWS S3 into Snowflake and DynamoDB, supporting a high-throughput enterprise data warehouse.

•Delivered ML-ready datasets via robust Python and Spark (Scala) pipelines, ensuring consistent and timely data availability for performance optimization across business units.

•Enabled seamless data migration to AWS Redshift and S3, facilitating the movement of structured and semi-structured data and improving data warehousing scalability.

•Increased query performance by up to 25% through SQL-driven data modeling and storage structure optimization, also ensuring smooth backup and recovery during critical migrations.

•Enhanced QA pipeline accuracy by 40% by maintaining referential integrity during test automation and integrating IBM Optim for secure data masking in non-prod environments.

•Accelerated sprint delivery by 20% through Agile planning, cross-functional collaboration, and coordination across multiple initiatives under tight deadlines.

Verisk Mar 2020 - May 2022

Data Engineer, Hyderabad

•Boosted data processing performance by 35% by developing and optimizing complex data architectures and pipelines using advanced SQL and Python, aligned with enterprise data warehouse goals.

•Reduced data discrepancies by 40% through implementation of automated quality checks and reconciliation across structured (SQL) and NoSQL (MongoDB, Elasticsearch) sources.

•Unified 10+ disparate data sources into machine-readable datasets by engineering Spark-based enrichment and transformation workflows, streamlining analytics readiness.

•Cut manual intervention by 50% by automating batch processing and monitoring tasks with UNIX/Linux scripting, increasing data flow consistency and operational reliability.

•Enhanced platform scalability and integrity by designing custom ETL frameworks and dimensional data models, supporting seamless integration and reporting.

•Improved critical dataset delivery timeliness by 25% by supporting end-to-end data integration with performance-focused design for downstream system reliability.

•Accelerated query performance by 30% via SQL tuning and optimization, improving insight delivery for life insurance analytics and enhancing user experience.

•Strengthened data governance practices through collaboration with engineering teams on documentation, transformation accuracy, and cross-functional standards alignment.

Education

Saint Peter's University

Master of Science, Data Science

Certification

•AWS Certified Data Engineer

•Microsoft Certified: Azure Data Fundamentals
