Data Engineer: Cloud, PySpark, SQL, Airflow, DBT

Location:
Dayton, OH
Posted:
February 13, 2026

Resume:

Sai Swaroop Morampudi

Email: **************@*****.*** LinkedIn: linkedin.com/in/saiswaroopmorampudi2 Mobile: +1-551-***-****

SUMMARY

Data Engineer with hands-on experience designing, building, and maintaining scalable data pipelines in cloud-based environments. Proficient in Python, SQL, and PySpark, with experience developing batch and streaming data workflows using Databricks, Airflow, and Kafka. Hands-on experience with Azure and AWS data platforms, including ETL/ELT development, data modeling, and delivery of analytics-ready datasets for reporting and machine learning use cases. Strong focus on data quality, performance optimization, and collaboration with cross-functional teams.

TECHNICAL SKILLS

• Programming & Querying: Python, SQL, PySpark

• Big Data & Processing: Apache Spark, Azure Databricks

• Streaming Technologies: Apache Kafka, Spark Structured Streaming

• Cloud Platforms: Azure (ADF, ADLS Gen2, Databricks, Synapse), AWS (S3, Glue)

• Data Warehousing: Snowflake, BigQuery

• Orchestration & ELT: Apache Airflow, dbt

• DevOps & Tools: Git, Azure DevOps, Docker, Linux

• Visualization: Power BI

PROFESSIONAL EXPERIENCE

Data Engineer Intern – Artificial Inventions, New Jersey, USA May 2024 – November 2024

• Supported development of cloud-based data pipelines using Python and SQL for analytics and reporting use cases.

• Built and maintained batch ETL workflows using Azure Data Factory to ingest data from Azure SQL Database, REST APIs, and ADLS Gen2.

• Developed PySpark transformation jobs on Azure Databricks to clean, join, and aggregate large datasets.

• Implemented incremental load logic using watermark columns to improve processing efficiency and reduce cloud costs.

• Loaded curated datasets into Azure Synapse and Snowflake for downstream analytics and BI consumption.

• Created and maintained Airflow DAGs to schedule, monitor, and manage data workflows.

• Performed data validation and quality checks prior to production deployments.

• Used Git and Azure DevOps for version control, code reviews, and CI/CD processes.

Data Engineer – IVIS, Guntur, India August 2021 – July 2022

• Designed and maintained batch ETL pipelines using Python and SQL to ingest data from relational databases and flat files.

• Developed PySpark transformations for data cleansing, joins, aggregations, and enrichment of large datasets.

• Loaded curated datasets into cloud-based data warehouses to support analytics and reporting requirements.

• Implemented data quality checks including null validation, duplicate detection, and row count reconciliation.

• Assisted with pipeline monitoring, issue resolution, and documentation of data flows and transformations.

• Collaborated with senior data engineers to optimize SQL queries and improve pipeline performance and reliability.

ACADEMIC PROJECTS

E-Commerce Customer Data Engineering & Recommendation Pipeline:

• Designed and implemented an end-to-end data engineering pipeline using Python, PySpark, and Apache Airflow to ingest customer, order, and clickstream data.

• Built incremental ELT workflows and analytics-ready fact and dimension tables to support machine-learning-based product recommendations and customer behavior analysis.

Retail Sales Analytics with dbt and Cloud Data Warehouse:

• Built a modular analytics data warehouse using dbt and Snowflake / BigQuery, implementing advanced SQL models, CTEs, and data quality tests.

• Designed star and snowflake schemas, automated documentation with dbt Docs, and improved query performance by 30% through partitioning, clustering, and SQL refactoring.

NYC Taxi Trip Data Engineering & ML Forecasting:

• Engineered a batch data pipeline using Python and PySpark to ingest and transform large-scale NYC taxi trip datasets.

• Developed machine learning forecasting models to predict trip volume and demand trends, supporting analytics and capacity planning use cases.

EDUCATION

Master of Science in Data Science

Saint Peter's University, New Jersey, USA September 2022 – March 2024

Bachelor of Technology, Electronics and Communication Engineering

Ramachandra College of Engineering, Eluru, India July 2017 – August 2021

CERTIFICATIONS

• Microsoft Certified: Azure Data Engineer Associate (DP-203)

• Microsoft Certified: Azure AI Engineer Associate (AI-102)


