Data Engineer Quality Assurance

Location: Parsippany, NJ
Posted: September 10, 2025

Resume:

Navya Nagendla

Phone: +1-571-***-**** | Email: **********@*****.*** | VA

PROFESSIONAL SUMMARY

A passionate Data Engineer with around three years of experience building and optimizing data pipelines, automating workflows, and working with cloud platforms such as AWS and GCP. Skilled at turning complex data into valuable insights and collaborating with teams to drive better business decisions. Practical experience with ETL processes and data warehousing (Redshift, BigQuery). Strong command of Python, SQL, and data pipeline automation (AWS Lambda, Hive, Apache Airflow), with familiarity in big data frameworks (Hadoop, Apache Spark) for processing massive volumes of data. Demonstrated capacity to streamline data workflows, improving data delivery and performance. Competent in query optimization, data modeling, and data quality assurance for faster reporting and analytics. Adept at delivering meaningful insights while working with business teams, data scientists, and analysts in Agile environments. Proficient in using Snowflake to build data pipelines and Tableau to create interactive dashboards.
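
The pipeline automation described above can be illustrated with a minimal Apache Airflow sketch; the DAG name, schedule, and task bodies below are hypothetical placeholders rather than code from any specific engagement.

# Minimal Airflow DAG sketch -- names and task logic are illustrative assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Placeholder: pull raw records from a source system.
    pass

def load():
    # Placeholder: write transformed records to the warehouse.
    pass

with DAG(
    dag_id="example_etl",            # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",               # Airflow 2.4+ keyword; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task        # extract runs before load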

SKILLS

Programming Languages: Python, SQL, Scala, Bash, R, PySpark

Data Processing: Apache Hive, Apache Flink, AWS Glue

Data Warehousing: Snowflake, Amazon Redshift, Google BigQuery

Big Data & ETL Tools: Hadoop, Apache Spark, Hive, Apache Airflow, Talend, Kafka, Informatica, Terraform

Data Visualization: Tableau, QlikView

Cloud Platforms: AWS (S3, Redshift, Lambda), Google Cloud Platform (BigQuery, Dataflow)

CI/CD & DevOps Tools: GitLab, Docker, Kubernetes, Jenkins, Maven, Terraform, CloudFormation, Azure DevOps

Version Control: Git, GitHub, GitLab

Testing & Monitoring: JUnit, Mockito, Swagger, Postman

Libraries/Tools/Technologies: JDBC, CI/CD Pipeline, Kafka, Linux, Splunk, SDLC, JMeter, Gradle, JSON, Tomcat

WORK EXPERIENCE

Client: UPS, Parsippany, NJ | Dec 2024 – Present

Role: Data Engineer

Roles & Responsibilities:

Developed ETL pipelines integrating Azure SQL Database and Azure Data Factory, significantly enhancing data warehousing capabilities in healthcare analytics.

Automated end-to-end data processes to improve operational efficiency and patient data management, utilizing Azure Synapse and Azure Machine Learning.

Employed Azure Active Directory for implementing robust data security measures, safeguarding sensitive healthcare data.

Managed large-scale data analytics projects efficiently using Azure Synapse, ensuring timely insights for healthcare decision-making.

Utilized scripting within an agile project framework to streamline deployments and operational tasks, leveraging Git for version control.

Enhanced data processing efficiency and reliability by streamlining ETL processes with Azure Databricks.

Orchestrated data migration and integration projects, ensuring seamless data flows across multiple healthcare systems using Azure Data Factory.

Implemented data quality initiatives to maintain accuracy and integrity of healthcare datasets, applying SQL and Python scripts.

Designed and maintained healthcare data models in Azure SQL Database, optimizing them for performance and scalability.

Executed data security protocols using Azure Active Directory and encryption techniques to comply with healthcare regulations.

Facilitated agile project management practices to enhance team productivity and meet tight project deadlines.

Leveraged Docker containers to ensure consistent environments across development, testing, and production phases.

Applied Azure Machine Learning to develop predictive models that enhanced patient care outcomes and operational efficiencies.

Configured and managed Azure Data Lake Storage (ADLS) for scalable data storage solutions, accommodating growing data needs in the healthcare sector.

Integrated Azure Data Factory with existing healthcare systems to automate data ingestion and improve real-time data availability.

Utilized Python for data cleansing and preprocessing tasks, ensuring high-quality data for analytics and reporting (a representative sketch follows this role's Environment list).

Conducted comprehensive testing of data solutions to ensure compliance with both technical and healthcare industry standards.

Utilized Terraform for infrastructure as code practices, automating the setup and maintenance of cloud resources.

Developed dashboards and reports using Power BI, providing actionable insights to healthcare administrators and decision-makers.

Enhanced system monitoring and maintenance using Azure monitoring tools, ensuring high availability and performance of data solutions.

Collaborated with cross-functional teams to align data engineering solutions with broader healthcare IT strategies.

Mentored junior data engineers and analysts, providing guidance on best practices in data management and Azure cloud technologies.

Reviewed and updated data governance policies to ensure compliance with new healthcare data regulations and best practices.

Environment: Azure SQL Database, Azure Data Factory, Azure Synapse, Azure Machine Learning, Azure Active Directory, Git, Azure Databricks, SQL, Python, Docker, ADLS, Terraform, Power BI, Azure Monitor.
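
The Python cleansing and preprocessing work noted in this role can be sketched roughly as follows; the column names and validation rule are hypothetical stand-ins for the actual healthcare fields, which are not specified here.

# Rough pandas cleansing sketch -- column names and rules are assumed, not actual.
import pandas as pd

def clean_records(df: pd.DataFrame) -> pd.DataFrame:
    # Remove duplicate records on a hypothetical key column.
    df = df.drop_duplicates(subset=["patient_id"]).copy()
    # Normalize identifier formatting.
    df["patient_id"] = df["patient_id"].str.strip().str.upper()
    # Keep only rows that pass a basic sanity rule.
    valid = pd.to_datetime(df["admission_date"]) <= pd.to_datetime(df["discharge_date"])
    return df[valid].reset_index(drop=True)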

Infosys Ltd, Hyderabad, India | Oct 2021 – Aug 2023

Role: Data Automation Engineer

Roles & Responsibilities:

Designed and implemented real-time data processing pipelines using Apache Flink, enabling immediate insights into market trends for venture capital decisions.

Configured and managed AWS Data Pipeline and SSIS to automate data flows, significantly improving efficiency in data ingestion and integration.

Utilized AWS Glue for scalable, serverless data transformation, streamlining the processing of complex datasets from multiple sources.

Optimized data storage and analysis using AWS Redshift, enhancing query performance for high-volume financial datasets.

Established robust data orchestration workflows with Apache NiFi, facilitating better data governance and traceability across the enterprise.

Developed containerization strategies with Docker, standardizing development environments and simplifying deployments across cloud platforms.

Employed Terraform to manage cloud infrastructure, ensuring consistent and reproducible environments for venture capital data operations.

Crafted sophisticated data cleansing and preparation scripts in Python, ensuring high data quality for critical financial analysis.

Implemented AWS S3 for secure and scalable storage, managing vast amounts of venture capital data effectively.

Integrated Apache Kafka for efficient data streaming, facilitating real-time data collection and analysis.

Orchestrated complex ETL workflows with AWS Data Pipeline, enhancing data availability and accessibility for analysts.

Automated data quality checks using Python, maintaining the integrity and accuracy of financial models and reports (illustrated in the sketch after this role's Environment list).

Deployed AWS EMR for distributed data processing, enabling efficient handling of big data workloads and complex computations.

Configured SQL databases on AWS Redshift, optimizing performance for data-intensive applications in the venture capital sector.

Developed comprehensive data backup and recovery strategies using AWS S3, ensuring business continuity and data safety.

Leveraged AWS CloudWatch for monitoring and alerting, enhancing operational visibility and proactive incident management.

Conducted performance tuning of data processing jobs, reducing latency and increasing throughput for analytics applications.

Facilitated cross-functional team collaboration, aligning data engineering practices with venture capital analytical needs.

Provided training and mentorship on the use of AWS services and Python programming to enhance team capabilities.

Architected and maintained secure data exchange channels using AWS services, protecting sensitive financial information during transactions.

Environment: Apache Flink, AWS Data Pipeline, SSIS, Hive, AWS Glue, AWS Redshift, Apache NiFi, Docker, Terraform, Python, AWS S3, Apache Kafka, AWS EMR, SQL, AWS CloudWatch.
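
The automated quality checks mentioned in this role can be sketched as a simple threshold-based routine; the column name and the 5% null-rate threshold are illustrative assumptions only.

# Sketch of an automated data-quality check -- column and threshold are assumed.
import pandas as pd

def run_quality_checks(df: pd.DataFrame) -> list[str]:
    failures = []
    if df.empty:
        return ["dataset is empty"]
    # Flag a hypothetical column whose null rate exceeds an assumed 5% threshold.
    null_rate = df["deal_value"].isna().mean()
    if null_rate > 0.05:
        failures.append(f"deal_value null rate {null_rate:.1%} exceeds 5%")
    # Flag implausible negative values.
    if (df["deal_value"].dropna() < 0).any():
        failures.append("negative deal_value rows found")
    return failures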
