
Data Engineer with 6+ Years of ETL and CDC Expertise

Location:
Lewisville, TX
Salary:
95000
Posted:
December 01, 2025


Resume:

Prasanth Goggela Tagoor

Email: *************@*****.*** | Mobile: 913-***-**** | Location: Texas, United States

PROFESSIONAL SUMMARY

Over 6 years of experience as a Data Engineer, specializing in data hydration and ETL transformations for analytics platforms.

Proven expertise in setting up Change Data Capture (CDC) using Debezium and other tools, ensuring efficient data flow into data lakes.

Skilled in Apache Spark, with hands-on experience in DataFrames, Spark SQL, and Spark Streaming for both batch and streaming data processing.

Proficient in orchestrating ETL jobs and streaming data pipelines, transforming raw CDC data into queryable formats for analytics.

Extensive knowledge of AWS services, including S3, EMR, Glue Data Catalog, and Lambda functions, optimizing data storage and processing.

Strong background in performance tuning and implementing Big Data concepts, enhancing data processing efficiency.

Familiar with Apache Airflow for workflow management, ensuring reliable data pipeline orchestration.

Committed to continuous learning and professional development, holding multiple certifications in data engineering and analytics.

SKILLS

Programming Languages: Java, Python, Scala

Operating Systems: Windows, Linux

Cloud Platforms: AWS (S3, EMR, Glue Data Catalog, Lambda, Step Functions, MWAA), Google Cloud Platform, Azure

DevOps & CI/CD: Apache Airflow, AWS Batch

Development Tools: Apache Spark (DataFrames, Spark SQL, Spark Streaming), Databricks, TensorFlow

Reporting Tools: Power BI, Tableau, Excel

Frameworks & Libraries: Apache Hudi, Apache Griffin

Databases & Data Warehousing: SQL, NoSQL

Big Data & Streaming: Change Data Capture (CDC), ETL Pipelines, Streaming Data Processing

Testing & QA: Unit Testing, Integration Testing

Security & Compliance: Data Governance, Compliance Standards

Monitoring & Observability: AWS CloudWatch, Apache Kafka

Collaboration Tools: JIRA, Confluence

Documentation Tools: Markdown, Microsoft PowerPoint

CERTIFICATIONS

AWS Certified Data Analytics – Specialty

Azure Data Engineer Associate (DP-203)

Databricks Certified Data Engineer Professional

Google Cloud Professional Data Engineer

TensorFlow Developer Certificate

Machine Learning Specialization – Coursera (Andrew Ng)

EDUCATION

Master of Science in Big Data Analytics, University of Central Missouri (GPA: 4.0)

Bachelor of Technology in Electrical and Electronics Engineering, KL University (92%)

WORK EXPERIENCE

NVIDIA Corporation – Austin, TX

Senior Data Engineer - AI & Analytics Platforms – Nov 2023 to Present

Spearheaded the design and implementation of a scalable data pipeline using Apache Spark, enhancing data processing speed by 40%, which significantly improved analytics capabilities for AI-driven projects.

Optimized ETL processes by integrating AWS Glue and Apache Airflow, resulting in a 30% reduction in data latency and improved data availability for real-time analytics.

Collaborated with cross-functional teams to develop machine learning models, utilizing TensorFlow and PySpark, which increased predictive accuracy by 25% in customer behavior analysis.

Engineered a robust data lake architecture on AWS S3, leveraging EMR for batch processing, which streamlined data access and reduced storage costs by 20%.

Automated data quality checks using Deequ (the AWS Labs data quality library), ensuring data integrity and consistency across various data sources, which enhanced trust in analytics outputs.

Delivered comprehensive documentation and training sessions for team members on best practices in data engineering and analytics, fostering a culture of knowledge sharing.

Implemented performance tuning strategies for Spark SQL queries, resulting in a 35% improvement in query execution times, thereby enhancing overall system efficiency.

Developed and maintained CI/CD pipelines for data workflows using AWS Lambda and Step Functions, improving deployment frequency and reducing rollback incidents by 15%.

Conducted regular code reviews and performance assessments, mentoring junior engineers and promoting adherence to coding standards and best practices.

Engaged in continuous learning and professional development, obtaining certifications in AWS Data Analytics and Google Cloud Data Engineering to stay updated with industry trends.

Technologies Used: Java, Python, Apache Spark, AWS Glue, Apache Airflow, EMR, S3, TensorFlow, AWS Lambda, Deequ

Humana Inc – Louisville, KY

Big Data Engineer – Mar 2021 to Jul 2023

Automated data ingestion processes using Apache Kafka and Spark Streaming, achieving a 50% increase in data throughput and enabling real-time analytics for healthcare applications.

Engineered ETL pipelines that processed over 5TB of data daily, utilizing Apache Spark DataFrames and SQL, which improved reporting accuracy and reduced processing time by 30%.

Collaborated with data scientists to develop predictive models for patient health outcomes, leveraging machine learning techniques that improved patient care strategies by 20%.

Optimized data storage solutions on AWS S3, implementing lifecycle policies that reduced storage costs by 25% while maintaining compliance with healthcare regulations.

Led the migration of legacy data systems to a cloud-based architecture, enhancing scalability and reliability of data access for over 1,000 users across the organization.

Conducted performance tuning and optimization of Spark jobs, resulting in a 40% reduction in resource consumption and improved job execution times.

Developed comprehensive data governance policies and procedures, ensuring data quality and compliance with HIPAA regulations across all data engineering processes.

Implemented monitoring and alerting systems using AWS CloudWatch, significantly reducing downtime and improving system reliability for critical data pipelines.

Provided mentorship to junior data engineers, fostering a collaborative environment and enhancing team productivity through knowledge sharing and skill development.

Engaged in cross-departmental projects to enhance data accessibility and usability, resulting in improved decision-making processes across the organization.

Technologies Used: Java, Python, Apache Spark, Apache Kafka, AWS S3, AWS CloudWatch, SQL, Data Governance, Machine Learning, ETL

Hexaware Technologies – Chicago, IL

Data Engineer – Jun 2019 to Feb 2021

Developed and maintained ETL processes for data integration from multiple sources, utilizing Apache Spark and Python, which improved data availability for analytics by 35%.

Collaborated with business analysts to gather requirements and translate them into technical specifications, ensuring alignment between data solutions and business needs.

Implemented data validation and cleansing processes, enhancing data quality and reducing errors in reporting by 20%, which improved stakeholder confidence in analytics.

Participated in the design and deployment of a data warehouse solution, leveraging Amazon Redshift, which facilitated advanced analytics capabilities for business intelligence teams.

Automated reporting processes using Python scripts, reducing manual effort by 50% and enabling timely insights for strategic decision-making.

Conducted performance tuning of SQL queries, resulting in a 30% improvement in report generation times and enhancing user experience for data consumers.

Engaged in Agile methodologies, participating in sprint planning and retrospectives, which improved project delivery timelines and team collaboration.

Developed comprehensive documentation for data processes and workflows, ensuring knowledge transfer and continuity within the data engineering team.

Assisted in the migration of on-premises data solutions to cloud-based platforms, enhancing scalability and reducing operational costs by 15%.

Provided support for data-related inquiries and troubleshooting, ensuring timely resolution of issues and maintaining high levels of service for internal stakeholders.

Technologies Used: Python, Apache Spark, SQL, Amazon Redshift, ETL, Data Warehousing, Agile, Data Quality, Automation, Reporting


