Post Job Free

Data Engineer

Location:
Nashua, NH
Posted:
October 03, 2025

Contact this candidate

Resume:

Shravya Reddy Pullagurla

Data Engineer / Data Analyst

*****************@*****.*** +1-603-***-****

Nashua, NH

PROFESSIONAL SUMMARY

Data Engineer with 6 years of experience specializing in data engineering, data visualization, and data-driven decision-making. Skilled in SQL, Python, Azure (ADF, Databricks), Snowflake, and AWS (Athena, Glue, S3). Recently delivered a data lakehouse POC using Apache Iceberg and Delta Lake to enable versioned tables, cross-cloud readiness, and advanced data processing. Experienced in building scalable ETL workflows that support analytics and reporting at scale.

● Built scalable data pipelines using advanced SQL features like CTEs and window functions to transform and combine data from multiple sources, ensuring accuracy and consistency for reporting
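As an illustrative sketch only (not taken from the resume), a CTE combined with a window function of the kind described might look like the following. Table and column names here are hypothetical, and SQLite stands in for the production engine:

```python
import sqlite3

# In-memory database with a hypothetical orders table (illustrative names).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('alice', '2024-01-05', 120.0),
        ('alice', '2024-02-10', 80.0),
        ('bob',   '2024-01-20', 200.0);
""")

# The CTE filters the raw data; the window function sequences orders per customer.
query = """
WITH recent AS (
    SELECT customer, order_date, amount
    FROM orders
    WHERE order_date >= '2024-01-01'
)
SELECT customer,
       order_date,
       amount,
       ROW_NUMBER() OVER (PARTITION BY customer ORDER BY order_date) AS order_seq
FROM recent
ORDER BY customer, order_seq
"""
rows = conn.execute(query).fetchall()
```

The same pattern scales to combining multiple sources: each source gets its own CTE, and window functions deduplicate or rank before the final join.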

● Developed Python scripts to automate recurring data processing tasks such as file ingestion, data validation, and API integration, improving speed and reliability while reducing manual work.
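A minimal sketch of the kind of validation step described, assuming a CSV feed with hypothetical column names (`id`, `event_ts`, `amount` are invented for illustration):

```python
import csv
import io

# Hypothetical required schema for an ingested CSV feed (illustrative names).
REQUIRED_FIELDS = {"id", "event_ts", "amount"}

def validate_rows(fh):
    """Yield (row, errors) pairs; an empty error list means the row passed."""
    reader = csv.DictReader(fh)
    missing = REQUIRED_FIELDS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    for row in reader:
        errors = []
        if not row["id"]:
            errors.append("empty id")
        try:
            float(row["amount"])
        except ValueError:
            errors.append("non-numeric amount")
        yield row, errors

# One clean row and one row with two problems.
sample = io.StringIO("id,event_ts,amount\n1,2024-01-01,9.5\n,2024-01-02,oops\n")
results = list(validate_rows(sample))
```

In a scheduled job, rows with a non-empty error list would typically be routed to a quarantine location rather than loaded downstream.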

● Worked on cloud-based data solutions across AWS (S3, Glue) and Azure (ADF, ADLS) to support secure, high-volume data transfers and transformation workflows

● Built ELT pipelines in Databricks using PySpark, transforming clickstream and transactional data into structured layers in Snowflake to support real-time dashboards and analytics.

● Designed end-to-end ETL/ELT workflows with PySpark and SQL, orchestrated using Apache Airflow, enabling efficient batch processing and timely data delivery to stakeholders.

● Used CI/CD practices with Git and Jenkins to version, test, and deploy pipeline changes smoothly across environments, improving development velocity and release quality

TECHNICAL SKILLS

Programming Languages : Python, SQL, PySpark

Big Data and Lakehouse : Databricks, Apache Spark, Delta Lake, Apache Iceberg
Cloud Platforms : Azure (ADF, ADLS, Databricks), AWS (S3, Glue, Redshift, Lambda), Snowflake
Data Warehousing : Snowflake, Redshift, Azure Synapse
ETL & Orchestration : Azure Data Factory, Databricks, AWS Glue, Apache Airflow, DBT
Data Processing and Analysis Libraries : Pandas, NumPy, SciPy, Jupyter Notebook
Version Control and Collaboration : Git/GitHub, Jenkins, JIRA, Docker
BI and Visualization Tools : Tableau, Power BI, Birst
Databases : PostgreSQL, MySQL, MongoDB

IDEs and Development Tools : VS Code, Jupyter Notebook
Data Modeling and Documentation : DBT (Data Build Tool)

PROFESSIONAL EXPERIENCE

Intuit, Data Engineer 10/2024 – Present

● Designed and optimized PySpark pipelines in Databricks to process clickstream and transactional data, delivering insights for 500K+ monthly user sessions.

● Extensive experience with Azure Data Factory (ADF), orchestrating dynamic and parameterized pipelines, integrating with Databricks notebooks, REST APIs, and handling complex data movement into Azure SQL DB and Data Lake.

● Implemented Delta Lake features (schema evolution, time travel, merge/upsert) to support versioned and reliable datasets
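The merge/upsert behavior mentioned above can be illustrated without a Spark cluster. This plain-Python sketch mimics what a Delta Lake MERGE does (update rows whose key matches, insert the rest); it is an analogy of the semantics, not the Delta API, and all names are invented:

```python
def merge_upsert(target, updates, key="id"):
    """Mimic MERGE semantics: update rows whose key matches, insert new keys."""
    merged = {row[key]: dict(row) for row in target}
    for row in updates:
        merged.setdefault(row[key], {}).update(row)  # upsert by key
    return sorted(merged.values(), key=lambda r: r[key])

target = [{"id": 1, "status": "old"}, {"id": 2, "status": "old"}]
updates = [{"id": 2, "status": "new"}, {"id": 3, "status": "new"}]
result = merge_upsert(target, updates)
```

In Delta Lake itself the same outcome comes from `MERGE INTO ... WHEN MATCHED THEN UPDATE ... WHEN NOT MATCHED THEN INSERT`, with schema evolution and time travel letting the table's prior versions remain queryable.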

● Developed real-time streaming pipelines using Databricks Autoloader, enabling checkpointing and incremental data ingestion for continuous processing.

● Built metadata-driven ingestion frameworks, SCD Type 2 pipelines, and rule-based validation layers using Python, SQL, and PySpark.
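The core SCD Type 2 move — expire the current row for a changed key and append a new current version — can be sketched in plain Python. Column names (`key`, `attr`, `start_date`, `end_date`, `is_current`) are illustrative placeholders, not the actual pipeline schema:

```python
from datetime import date

def scd2_apply(dim_rows, change, today):
    """Apply one change record to a dimension using SCD Type 2 semantics."""
    out = []
    for row in dim_rows:
        if row["is_current"] and row["key"] == change["key"] and row["attr"] != change["attr"]:
            # Expire the old version instead of overwriting it.
            out.append(dict(row, end_date=today, is_current=False))
        else:
            out.append(dict(row))
    # Insert a new current version if the key has no open row left.
    currents = [r for r in out if r["key"] == change["key"] and r["is_current"]]
    if not currents:
        out.append({"key": change["key"], "attr": change["attr"],
                    "start_date": today, "end_date": None, "is_current": True})
    return out

dim = [{"key": "c1", "attr": "NH", "start_date": date(2023, 1, 1),
        "end_date": None, "is_current": True}]
updated = scd2_apply(dim, {"key": "c1", "attr": "MA"}, date(2025, 1, 1))
```

In a PySpark implementation the same logic is usually expressed as a single MERGE against the dimension table, driven by metadata describing the tracked columns.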

● Proficient in implementing Unity Catalog for centralized data governance, fine-grained access control, three-level namespace management (catalog > schema > table), and lineage tracking across Databricks workspaces.

● Conducted data quality validation and UAT with stakeholders using SQL/Postman, and implemented a compliance-focused data validation framework that reduced data errors by 15%.

● Optimized Snowflake queries and clustering strategies, improving performance of high-volume analytical workloads by 35%.

● Developed and maintained data pipelines using Python and Azure Databricks for extracting, transforming, and loading data from various sources into Azure Data Lake Storage and Azure SQL Database.

● Implemented incremental partitioning strategies that cut pipeline job duration by 45% and improved SLA adherence.

● Built pipeline health dashboards to monitor SLA breaches, volume spikes, and processing delays through log aggregation.

Marlabs, Data Engineer 08/2022 – 09/2023

● Built and orchestrated AWS Glue pipelines to integrate large datasets into S3 data lakes, supporting downstream analytics

● Automated serverless workflows with Lambda and Step Functions, cutting deployment effort by 50% and improving reliability

● Designed and implemented source-to-target (S2T) mappings and end-to-end lineage for core financial data pipelines across AWS Glue and Redshift.

● Collaborated with cross-functional teams (analytics, product) to gather requirements and translate them into scalable data solutions.

● Developed and executed SQL queries to validate data at various checkpoints within the data pipeline, ensuring the accuracy and reliability of migrated data

● Monitored Glue job executions through CloudWatch metrics and logs, proactively diagnosing failures, optimizing runtimes, and consistently maintaining SLA adherence for critical data pipelines.

● Created Tableau dashboards to monitor data quality metrics, with daily and weekly refresh cycles to provide real-time insights and track data quality trends

Infor, Data Engineer 07/2019 – 07/2022

● Designed complex SQL queries leveraging advanced joins, subqueries, and window functions to efficiently retrieve, transform, and analyze high-volume datasets

● Built and managed interactive Power BI dashboards with advanced DAX functions and Power Query, enabling insightful visual reporting and business intelligence

● Applied data warehouse principles, including dimensional modeling and star schemas, to design scalable data structures and improve reporting and analytics in SQL Server and cloud platforms

● Researched and presented solutions to the Valley Water client by analyzing reports in Lawson, reducing report processing time by 20%

● Partnered with SMEs and finance stakeholders to gather requirements and align data solutions with business goals

● Special expertise in developing General Ledger detail reports, balance sheets, income statements, drill-down reports, and interactive dashboards

● Developed BI solutions (dashboards, embedded KPIs, custom filters, analytics reports) for business users to surface opportunities for new revenue growth based on sales, customer, and industry data

● As part of data modeling, performed the following steps: enabled data sources, set primary keys for each source, specified joins between sources, defined hierarchies, levels, and grain of data sources, defined column properties, validated the model, and processed the data

KEY ACCOMPLISHMENTS

● CY21 Q2 Global Infor Value Award in the category “Sense of Urgency”

● GDS UTSAAD award for maximum billable utilization in the team, Jan 2020

● Received a Silver badge for receiving multiple raves from project managers.
