
Data Engineer with Cloud & ETL Expertise

Location: Dallas, TX
Posted: January 12, 2026


Abhijit Maddineni

*************@*****.*** | +1-214-***-**** | www.linkedin.com/in/abmaddineni/

The Colony, TX 75056 | U.S. Citizen | Public Trust Clearance (Active)

PROFESSIONAL SUMMARY

Results-oriented Data Engineer with 5+ years of experience developing and managing data platforms across AWS, Azure, and GCP. Proficient in building automated ETL pipelines with PySpark, SQL, and Terraform, streamlining data integration and cloud infrastructure management. Experienced in integrating CI/CD and DevOps workflows, improving data reliability, reducing cloud operational costs by 22%, modernizing legacy systems, implementing data quality frameworks, and enhancing analytics scalability within hybrid environments.

EXPERIENCE

Data Engineer: SMX Apr 2025 – Aug 2025

• Developed AWS Glue ETL pipelines within GovCloud, transforming 4TB of legacy HDFS data into bronze, silver, and gold layers using PySpark, S3, and Glue Catalog

• Automated cloud infrastructure deployment via Terraform for GovCloud resources, including Glue jobs, IAM roles, and S3 buckets, maintaining compliant provisioning within Dev, QA, and Prod

• Engineered a YAML-based ingestion framework for FTP, SFTP, API, and Batch inputs with parameterized data quality rules for schema drift and validation, ensuring 99% integrity across data pipelines

• Created AWS Step Functions state machines with retry logic and event triggers to replace legacy workflows, enhancing metadata tracking and lineage documentation for improved audit and compliance monitoring

• Integrated CI/CD pipelines via Azure DevOps to automate configuration deployments in AWS GovCloud, enhancing metadata tracking, lineage documentation, and FedRAMP-compliant data governance by 20%
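
A minimal sketch of how a YAML-driven ingestion check like the one described above could look in PySpark; file names, paths, and rule keys are illustrative placeholders, not the actual framework:

    # Load a hypothetical per-source YAML config and enforce its rules
    import yaml
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("ingestion-dq").getOrCreate()
    config = yaml.safe_load(open("configs/orders.yaml"))
    # e.g. {"source": "s3://bronze/orders/", "required_columns": [...], "not_null": ["order_id"]}

    df = spark.read.parquet(config["source"])

    # Schema-drift check: fail fast if an expected column is missing
    missing = set(config["required_columns"]) - set(df.columns)
    if missing:
        raise ValueError(f"Schema drift detected; missing columns: {missing}")

    # Parameterized null checks on key columns before promotion to silver
    for col in config["not_null"]:
        nulls = df.filter(F.col(col).isNull()).count()
        if nulls:
            raise ValueError(f"{nulls} null values in required column {col}")

    df.write.mode("overwrite").parquet("s3://silver/orders/")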

Data Engineer: CVS Health Sep 2024 – Apr 2025

• Developed large-scale ETL pipelines on GCP using Cloud Composer, Pub/Sub, and Dataflow, managing 5TB+ daily data ingestion across enterprise datasets

• Deployed Python- and SQL-based validation frameworks, achieving 99.9% data accuracy for ingestion stages and automating quality checks for consistency

• Streamlined pipeline monitoring with Stackdriver alerts, SLA dashboards, and failure notifications, enabling detection of 90% of critical job errors

• Constructed BigQuery data models supporting analytics and data science workflows, leveraging clustering, partitioning, and materialized views, minimizing processing costs by 35%

• Established data lineage and metadata documentation using Data Catalog while collaborating with BI teams to develop self-service dashboards, expanding real-time reporting access and governance visibility
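
A hedged sketch of the BigQuery modeling pattern described above (partitioning, clustering, and a materialized view); dataset, table, and column names are hypothetical:

    from google.cloud import bigquery

    client = bigquery.Client()  # assumes application-default credentials

    # Partition by date and cluster by a common filter key to cut scan costs
    client.query("""
        CREATE TABLE IF NOT EXISTS analytics.claims_events (
          claim_id STRING,
          member_id STRING,
          event_date DATE,
          amount NUMERIC
        )
        PARTITION BY event_date
        CLUSTER BY member_id
    """).result()

    # A materialized view keeps frequent aggregates cheap to serve
    client.query("""
        CREATE MATERIALIZED VIEW IF NOT EXISTS analytics.claims_daily_totals AS
        SELECT event_date, SUM(amount) AS total_amount
        FROM analytics.claims_events
        GROUP BY event_date
    """).result()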

Data Engineer: Docpoint Solutions Nov 2023 – Sep 2024

• Engineered Azure data platforms using Data Factory, Databricks, and PySpark, processing over 7TB of data monthly across batch and streaming analytical workflows

• Constructed event-driven ingestion pipelines leveraging Azure Event Hubs and Functions, driving a 95% improvement in near real-time data delivery speed and strengthening reporting accuracy

• Implemented Python-based validation and anomaly detection scripts integrated with Azure Monitor and Log Analytics, detecting 92% of inconsistencies and anomalies before production deployment

• Enhanced Synapse Analytics and Delta Lake queries with indexing, partitioning, and optimized Databricks parallel processing, improving execution speed by 28% and stabilizing high-volume data ingestion
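
A minimal sketch of the event-driven ingestion pattern above, using the azure-eventhub SDK; the connection string, hub name, and handler body are placeholders:

    from azure.eventhub import EventHubConsumerClient

    def on_event(partition_context, event):
        # The real pipeline would validate and land the payload here
        print(partition_context.partition_id, event.body_as_str())
        partition_context.update_checkpoint(event)

    client = EventHubConsumerClient.from_connection_string(
        "<EVENT_HUB_CONNECTION_STRING>",
        consumer_group="$Default",
        eventhub_name="telemetry",
    )
    with client:
        client.receive(on_event=on_event, starting_position="-1")  # read from start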

Data Engineer: Epsilon Aug 2020 – Nov 2023

• Designed AWS Glue, Lambda, and S3 ETL pipelines, automating ingestion and transformation across multi-source enterprise datasets

• Migrated Teradata workloads to BigQuery, scaling data capacity by 40%, optimizing resource utilization, and enhancing structured query execution for data-driven decision systems

• Created Kafka and Spark Streaming architectures for real-time ingestion, maintaining sub-4-second latency and boosting alert accuracy and timeliness for analytics stakeholders enterprise-wide

• Deployed Airflow DAGs orchestrating AWS and GCP workflows, achieving 99% task reliability and automating cross-cloud scheduling for analytics, finance, and compliance departments

• Introduced IAM governance, Terraform automation, and Alation metadata tracking, improving metadata accuracy by 45% while executing cloud audits that reduced infrastructure costs by 22%
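
A minimal sketch of the retry-aware orchestration pattern from the Airflow bullet above; the DAG ID, schedule, and task bodies are illustrative:

    from datetime import datetime, timedelta
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pull from source systems")

    def load():
        print("load into the warehouse")

    with DAG(
        dag_id="cross_cloud_daily",
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
        # Automatic retries underpin the high task-reliability figure cited above
        default_args={"retries": 3, "retry_delay": timedelta(minutes=5)},
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_load = PythonOperator(task_id="load", python_callable=load)
        t_extract >> t_load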

SKILLS

• Languages & Scripting: Python, SQL, PySpark, Shell Scripting

• Cloud Platforms: AWS, GCP, Azure

• Data Engineering: ETL Design, Data Modeling, Schema Evolution

• Orchestration & DevOps: Airflow, Terraform, Azure DevOps, Docker, Kubernetes

• Streaming & Messaging: Kafka, Pub/Sub, Event Hubs, Spark Streaming

• Governance & Security: Glue Catalog, IAM, KMS, FedRAMP Compliance

• Storage & Warehousing: ADLS, Cloud Storage, Snowflake, Redshift, BigQuery, Teradata

• Monitoring & BI: CloudWatch, Power BI, Looker

PROJECTS

Teradata to GCP Modernization and Migration Oct 2024 – Apr 2025

• Directed the modernization of a large-scale enterprise data warehouse, transitioning workloads from Teradata to Google Cloud (BigQuery, Dataflow), enhancing analytics scalability

• Designed ETL pipelines using Cloud Composer (Airflow) and Python, strengthening data movement reliability
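
A hedged sketch of the kind of post-migration reconciliation such a cutover needs, comparing row counts between Teradata (via the teradatasql driver) and BigQuery; hosts, credentials, and the table name are placeholders:

    import teradatasql
    from google.cloud import bigquery

    table = "sales.orders"  # hypothetical migrated table

    # Source-side count from Teradata
    with teradatasql.connect(host="<TD_HOST>", user="<USER>", password="<PW>") as td:
        cur = td.cursor()
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        td_count = cur.fetchone()[0]

    # Target-side count from BigQuery
    rows = bigquery.Client().query(f"SELECT COUNT(*) AS n FROM `{table}`").result()
    bq_count = next(iter(rows)).n

    assert td_count == bq_count, f"{table}: {td_count} (TD) != {bq_count} (BQ)"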

Cross-Cloud Data Platform – GCP to AWS with Azure Monitoring Mar 2022 – Oct 2023

• Directed complete migration of business-critical pipelines and data assets from GCP (BigQuery) to AWS (Redshift), enhancing data accessibility by 40%

• Redesigned ETL workflows using Airflow, AWS Glue, S3, and Lambda, minimizing cross-cloud delays
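
A minimal sketch of the Redshift load step in such a redesign: a COPY from S3 issued through psycopg2; the cluster endpoint, IAM role, and paths are placeholders:

    import psycopg2

    conn = psycopg2.connect(
        host="<REDSHIFT_ENDPOINT>", port=5439,
        dbname="analytics", user="<USER>", password="<PW>",
    )
    copy_sql = """
        COPY analytics.orders
        FROM 's3://cross-cloud-exports/orders/'
        IAM_ROLE 'arn:aws:iam::<ACCOUNT_ID>:role/redshift-copy'
        FORMAT AS PARQUET;
    """
    # Redshift pulls every Parquet file under the S3 prefix in one command
    with conn, conn.cursor() as cur:
        cur.execute(copy_sql)
    conn.close()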

EDUCATION

Vignan University

Bachelor of Technology in Computer Science (Hons)

CERTIFICATIONS

• Azure Security Engineer (AZ-500) and Azure Data Fundamentals (DP-900)

• GCP Certified Professional Data Engineer


