ARUN SUMANTH POLINENI
• Data Engineer • 913-***-**** • Plano, TX – 75024 • ***************@*****.***
PROFESSIONAL SUMMARY
Data Engineer with 4+ years of experience building scalable data platforms and high-volume ETL pipelines across Azure, AWS, and Snowflake. Proven track record of improving pipeline performance, reducing cloud costs, and enabling business teams with reliable, analytics-ready data. Strong expertise in ADF, Databricks, PySpark, SQL, and data modeling, with experience supporting financial analytics, regulatory reporting, and enterprise BI workloads.
EDUCATION
Master of Science, Computer Science - 3.63/4.0 GPA Jan 2023 – May 2024
University of Missouri - Kansas City, Kansas City, MO
TECHNICAL SKILLS
• Programming Languages: Python, SQL, PySpark, Spark SQL, DAX
• Data Modeling and ETL: ETL Processes, Data Warehousing, Data Modeling, Informatica PowerCenter, SSIS, Alteryx, Apache Airflow, Medallion Architecture
• Cloud Technologies: Microsoft Azure (Data Factory, Databricks, Synapse, Data Lake Storage, Logic Apps, Cosmos DB, Azure Key Vault), AWS (S3, EC2, Redshift, Glue, Lambda, RDS), Google BigQuery, Microsoft Fabric
• Databases & Warehouses: MySQL, SQL Server, Azure SQL Database, PostgreSQL, MongoDB, Snowflake, Amazon Redshift, Azure Synapse
• Big Data Technologies: Apache Spark, Apache Hadoop, Apache Kafka
• Python Libraries: NumPy, Pandas, Matplotlib
• DevOps & Infrastructure as Code (IaC): Azure DevOps, Jenkins, Kubernetes, Terraform
• Tools: SSMS, Power BI, Visual Studio, Jupyter, Microsoft Word, Excel
• Project Management Methodologies: Agile and Waterfall
• Other Technologies: Version control (Git & GitHub), Linux, UNIX, Data Governance (Collibra DGC), APIs and Web Services
WORK EXPERIENCE
Data Engineer Jul 2024 – Present
AppWorks, USA
●Designed and deployed 40+ scalable ETL pipelines using Python, PySpark, Azure Data Factory, SSIS, and Databricks, cutting data processing time by up to 30%.
●Implemented Medallion Architecture within Azure Databricks and Snowflake, ensuring high-quality, analytics-ready data layers.
●Orchestrated complex data workflows with Apache Airflow, improving pipeline reliability and automation while reducing manual intervention by 40%.
●Engineered secure, scalable data platforms on Azure Data Lake Storage, Synapse Analytics, and Snowflake, optimizing storage, query performance, and cost-efficiency by 35%.
●Processed large-scale datasets with PySpark, improving performance for high-volume data workloads.
●Conducted advanced data analysis using Python (Pandas, NumPy) and SQL, yielding actionable insights for strategic decisions.
●Developed dynamic Power BI dashboards integrated with Snowflake and Azure, boosting real-time business intelligence capabilities.
●Implemented robust data validation frameworks, increasing pipeline reliability and data accuracy by over 20%.
●Built reusable ETL components for ingesting data (APIs, files, databases) into Snowflake and Azure Data Lake, reducing new pipeline development time by 30%.
●Worked in the finance domain, supporting market intelligence and analytics platforms with secure, scalable data pipelines to power financial insights.
Data Engineer Mar 2021 – Dec 2022
Accenture Solutions, Hyderabad, India
●Designed and deployed 50+ parameterized Azure Data Factory (ADF) pipelines for automated ETL, improving efficiency by 30%.
●Implemented ETL in Azure Databricks using Spark and PySpark, boosting performance by 25%.
●Monitored pipeline executions, resolving discrepancies and ensuring robust data processing.
●Built automated workflows in ADF and Databricks with Azure Logic Apps and Azure Functions for alerting and logging, speeding up issue resolution by 50%.
●Established CI/CD pipelines in Azure DevOps, reducing ADF component deployment time across environments by 50%.
●Performed SQL analysis in Snowflake, uncovering insights to support data-driven decisions.
●Collaborated with stakeholders to align data solutions with business needs, generating ad-hoc reports using Snowflake and Power BI.
●Implemented data governance and compliance frameworks (GDPR, CCPA) using Collibra and data classification tools to enforce policies and maintain audit readiness.
Data Engineer Feb 2020 – Mar 2021
Capgemini, Pune, India
●Developed ETL pipelines using AWS Glue, Informatica PowerCenter, and PySpark, enhancing data integration and processing efficiency by 25%.
●Optimized real-time streaming pipelines with Kafka and Spark Streaming, reducing latency and accelerating high-frequency data processing by over 25%.
●Managed data storage solutions using MongoDB, Amazon Redshift, Google BigQuery, and Snowflake, reducing compute costs by 20% and improving analytics performance.
●Automated scalable data workflows via Apache Airflow and Spark, reducing AutoML model training time by 30%.
●Engineered cloud-native data platforms leveraging AWS services (EC2, S3, Lambda, Glue, Redshift) and Snowflake for secure, scalable analytics.
●Provisioned and managed AWS data environments using Terraform, enabling repeatable and secure infrastructure deployment.
●Containerized data processing applications using Docker and deployed them on Kubernetes clusters for scalable, portable execution environments.
CERTIFICATIONS
● Microsoft Certified: Azure Fundamentals (AZ-900)
● Microsoft Certified: Azure Data Fundamentals (DP-900)
● Microsoft Certified: Fabric Data Engineer (DP-700)
ACHIEVEMENTS
● Awarded “Rookie of the Month” for outstanding contributions to pipeline automation and performance optimization.
● Recognized with “Monthly Grammy Award” for exceptional teamwork and successful project deliveries.