Post Job Free

Data Engineer

Location:
Denton, TX
Salary:
70000
Posted:
September 10, 2025

Contact this candidate

Resume:

Navyasri Anegouni

Email: ******************@*****.***

Mobile: +1-469-***-****

Data Engineer

PROFESSIONAL SUMMARY:

4+ years of data engineering experience, with strong analytical skills and attention to detail in designing scalable data systems; works effectively both independently and as part of a team.

Expertise in developing robust ETL pipelines, with the ability to connect applications end to end and ensure seamless data integration and transformation.

Skilled in data ingestion, with strong communication and presentation skills for conveying technical concepts to both technical and non-technical audiences across the organization.

Strong programming proficiency in Python, PySpark, SQL, and shell scripting, used to automate complex data workflows, improve data reliability, and manage competing priorities.

Adept at designing and managing cloud-based architectures, aligning solutions with performance and scalability goals, and proactive in asking questions and seeking assistance when needed.

Solid experience in optimizing Spark jobs and SQL queries, implementing caching, partitioning, and indexing strategies to boost performance and reduce latency in large-scale environments.

Deep understanding of data warehousing principles, including dimensional modeling and schema design, with practical experience integrating structured and semi-structured data sources.

Proficient in leveraging Power BI and Tableau to develop interactive dashboards and reports, translating complex datasets into actionable insights for stakeholders and executive decision-makers.

Familiar with data governance tools, ensuring compliance with organizational policies and maintaining data lineage, quality, and integrity throughout the data lifecycle.

Proven experience executing large-scale data migration initiatives with minimal business disruption, using automated scripts and orchestration tools, including effort and cost estimation.

Demonstrated excellence in building CI/CD pipelines using Azure DevOps, Git, and Jenkins to automate deployment, version control, and testing of data pipelines in Agile development environments.

Created reusable, modular components for data validation, transformation logic, and quality checks, significantly reducing ETL development time and increasing consistency across multiple data products.

Experience with Oracle Exadata and Oracle 10g and above, proficient with query tools for data analysis, including strong PL/SQL skills for writing and analyzing complex queries.

Collaborated with cross-functional teams, including data architects, analysts, and business users, to define data strategies and build scalable solutions aligned with evolving enterprise needs.

TECHNICAL SKILLS:

Languages - Python, SQL, Scala, Shell Scripting, PL/SQL

Big Data Technologies - Apache Spark, Hadoop, Kafka, Hive, Databricks

Cloud Platforms - Microsoft Azure (ADF, ADLS, Synapse, SQL DB, Event Hubs)

Databases - SQL Server, MySQL, PostgreSQL, Oracle, Oracle Exadata

ETL Tools - Azure Data Factory, Informatica, SSIS

Visualization Tools - Power BI, Tableau

CI/CD & DevOps - Git, Azure DevOps, Jenkins

Other Tools - Azure Monitor, Azure Purview, Postman, JIRA, Confluence, Microsoft Office Suite

Processes - Agile, Scrum

PROFESSIONAL EXPERIENCE:

CVS Health July 2024 – Present

Data Engineer

Responsibilities:

Designed scalable data pipelines using Azure Data Factory and Databricks, ingesting high-volume healthcare data from diverse sources and ensuring data accuracy and timeliness for executive dashboards and analytics, in line with compliance and auditing requirements.

Collaborated with data scientists and analysts to develop analytics-ready datasets using PySpark, ensuring data was transformed, validated, and structured for machine learning workflows.

Integrated Kafka and Azure Event Hubs to enable real-time data streaming into Databricks environments, allowing near-instant analysis and reporting for operational and patient-related metrics.

Implemented Azure Monitor to track pipeline health metrics and created alerting frameworks that reduced incident response times and improved system reliability across production environments.

Applied data governance standards by integrating Azure Purview to track lineage, enforce data quality, and catalog enterprise datasets, in line with compliance and auditing requirements.

Automated deployment of Azure data pipelines using Azure DevOps and Terraform, streamlining build and release processes and cutting manual deployment effort by over 40%.

Engineered a metadata-driven ETL framework supporting parameterized, reusable pipelines, reducing development redundancy and improving maintainability across business domains.

Enabled incremental data loads using watermarking and Change Data Capture (CDC) logic within ADF, supporting efficient processing and up-to-date reporting across downstream systems.
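
The watermark pattern behind incremental loads like these can be sketched in plain Python. This is a minimal illustration, not the ADF implementation itself: the `modified_at` column name is hypothetical, and in ADF the comparison would normally live in the source query of a copy activity.

```python
from datetime import datetime

# Minimal sketch of a watermark-based incremental load. Each source row is
# assumed to carry a last-modified timestamp (the "modified_at" field here
# is a hypothetical name). Only rows newer than the stored watermark are
# picked up, and the watermark advances to the newest timestamp seen.

def incremental_load(rows, last_watermark):
    """Return rows changed since last_watermark and the new watermark."""
    new_rows = [r for r in rows if r["modified_at"] > last_watermark]
    # If nothing changed, keep the old watermark so no rows are skipped.
    new_watermark = max(
        (r["modified_at"] for r in new_rows), default=last_watermark
    )
    return new_rows, new_watermark

rows = [
    {"id": 1, "modified_at": datetime(2024, 1, 1)},
    {"id": 2, "modified_at": datetime(2024, 2, 1)},
    {"id": 3, "modified_at": datetime(2024, 3, 1)},
]
changed, wm = incremental_load(rows, datetime(2024, 1, 15))
print([r["id"] for r in changed])  # → [2, 3]
```

Each run persists the returned watermark (ADF typically stores it in a control table) so the next run processes only later changes.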

Participated in Agile sprint planning, retrospectives, and stand-ups, contributing to iterative delivery cycles and continuous improvement within a cross-functional engineering team.

Authored architecture documentation, data flow diagrams, and standard operating procedures (SOPs), improving knowledge transfer and onboarding efficiency for new team members.

UPS Feb 2023 – June 2024

Data Engineer

Responsibilities:

Engineered robust batch pipelines in Azure Data Factory for logistics analytics, migrating legacy ETL jobs into scalable Databricks notebooks that improved runtime performance and long-term maintainability.

Developed modular PySpark code templates for repeatable transformations and validation logic, streamlining development and reducing code duplication and onboarding time by over 30%.

Implemented advanced data quality checks and anomaly detection using PySpark UDFs, safeguarding the integrity and consistency of high-volume shipment and tracking data.
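
A validation check of this kind can be sketched as a plain Python predicate; in PySpark it would be wrapped with `pyspark.sql.functions.udf` and applied in a DataFrame filter. The tracking-number format below is hypothetical, chosen only to make the example concrete and testable.

```python
import re

# Hypothetical tracking-number rule: "1Z" followed by 16 alphanumerics,
# loosely modeled on UPS-style identifiers. Keeping the predicate as plain
# Python lets it be unit-tested directly before being registered as a
# PySpark UDF for use in data quality filters.
TRACKING_RE = re.compile(r"^1Z[0-9A-Z]{16}$")

def is_valid_tracking(number):
    """Return True if the value looks like a well-formed tracking number."""
    if not number:
        return False
    return bool(TRACKING_RE.fullmatch(number.strip().upper()))

print(is_valid_tracking("1Z999AA10123456784"))  # → True
print(is_valid_tracking("ABC123"))              # → False
```

In Spark, rows failing the predicate would typically be routed to a quarantine table rather than dropped, so anomalies remain auditable.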

Collaborated with architects to design resilient workflows using checkpointing, retry logic, and fallback mechanisms, ensuring zero data loss and high fault tolerance across all production pipelines.

Optimized Azure Synapse query performance by applying strategic indexing and partitioning, significantly reducing data retrieval times and query execution costs across reporting workloads.

Built Azure Functions to enable event-triggered pipeline execution, dynamically orchestrating downstream processing tasks and improving responsiveness to data availability.

Led schema evolution design using JSON-based schema registry logic in Databricks, allowing flexible ingestion of datasets with changing structures without manual intervention.

Contributed to CI/CD automation via GitHub and Azure DevOps pipelines, embedding testing, deployment, and rollback procedures to ensure reliable production releases.

Designed and deployed Power BI dashboards to track delivery KPIs, on-time shipment percentages, and SLA violations, enabling real-time visibility for operational leadership.

Participated in Agile ceremonies including stand-ups, sprint planning, demos, and retrospectives, collaborating with cross-functional teams to iteratively deliver data-driven capabilities.

Wells Fargo Mar 2021 – Dec 2022

Software Engineer Trainee

Responsibilities:

Supported development of ETL pipelines to meet financial reporting and compliance standards, leveraging SQL Server, SSIS, and Azure SQL to automate and streamline regulatory data submissions.

Authored advanced SQL queries and stored procedures for data cleansing, validation, and transformation, ensuring accuracy and consistency in financial and operational datasets.

Assisted in migrating legacy on-premises SQL Server workloads to Azure SQL Database and Synapse Analytics, improving scalability and aligning with cloud modernization goals.

Built automated data ingestion scripts in Python and shell to fetch and preprocess third-party financial data on a schedule, reducing manual workload.
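
The preprocessing half of such an ingestion script can be sketched as follows. The feed layout (symbol and price columns) is hypothetical, and the fetch step is stubbed out: in production the raw text would come from an HTTP or SFTP pull driven by a scheduler such as cron.

```python
import csv
import io

# Sketch of the preprocessing step in a scheduled ingestion job: parse a
# third-party CSV feed, drop malformed rows, and normalize types before
# loading downstream. Column names are hypothetical.

def preprocess_feed(raw_csv):
    """Parse raw CSV text into clean records, skipping bad rows."""
    records = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        try:
            records.append({
                "symbol": row["symbol"].strip().upper(),
                "price": float(row["price"]),
            })
        except (KeyError, ValueError, TypeError, AttributeError):
            # Skip rows with missing columns or non-numeric prices;
            # a production job would log these for review.
            continue
    return records

raw = "symbol,price\naapl,189.5\nbad_row,not_a_number\nmsft,402.1\n"
print(preprocess_feed(raw))
```

Skipping rather than failing on bad rows keeps a scheduled run from stalling on a single malformed record; the trade-off is that rejects must be logged and monitored.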

Participated in Agile sprint planning and design sessions to gather business requirements and translate them into scalable, testable technical specifications for ETL solutions.

Designed, maintained, and tested SSIS packages and Power BI dashboards for internal data processing and financial metric visualization.

Collaborated with QA teams to define test strategies and resolve data anomalies, contributing to higher reliability and reduced defect leakage in production.

Documented data dictionaries, ETL mappings, and workflow logic to support cross-functional collaboration and stakeholder transparency across data lifecycle processes.

Provided production support for ETL workflows, performing root cause analysis and optimizing job schedules and queries to reduce failure rates and performance issues.

Engaged in internal learning programs and certification workshops focused on Azure and DevOps, contributing to innovation initiatives and continuous team improvement.

EDUCATION:

Master of Science in Information Technology and Management - Lindsey Wilson University

Bachelor of Technology in Electronics & Communication Engineering - Sreyas Institute of Engineering and Technology


