SHAMANTHAKA VEERAREDDY
Email: ***********.**********@*****.***
Location: Austin, TX 78613 Phone: +1-937-***-****
TECHNICAL SKILLS
• Big Data Technologies: Apache Spark, Kafka, Hadoop, Databricks, Hive, HDFS
• Programming Languages: Python, Scala, SQL
• Data Engineering Tools: Apache Airflow, Kubernetes, Docker
• Cloud Platforms: AWS (S3, Athena, Glue), Azure Blob Storage
• Database Systems: MySQL, Snowflake, Postgres, Trino Query Engine
• Version Control: Git, GitHub
• Other: Agile frameworks
CERTIFICATION
Databricks Certified Associate Developer for Apache Spark 3.0
PROFESSIONAL EXPERIENCE
Client: PrismHR Inc, Role: Data Engineer, Date: October 2024 – Present
• Led the migration of client data in structured batches, ensuring feature compatibility and data integrity throughout the process.
• Designed and implemented ETL and validation pipelines, enabling seamless migration of client data into Prism's database.
• Engineered scalable ETL workflows for processing large datasets, leveraging Apache Airflow for orchestration, Kafka for real-time pipeline development, and Spark for distributed data processing.
• Collaborated with engineering and product analytics teams to rigorously test, troubleshoot, and optimize migration pipelines across multiple environments, ensuring operational excellence.
• Utilized Lenses.io for real-time monitoring of Kafka topics, ensuring accurate and efficient data flow into both raw and clean history layers of the data lake.
• Executed comprehensive data validation and issue resolution, addressing pipeline failures, data mismatches, and missing records to ensure high-quality data delivery.
• Worked with Change Data Capture (CDC) pipelines, defining and managing table configurations to enable seamless updates into the data lake (see the sketch following this role).
• Authored detailed technical documentation and facilitated knowledge-sharing sessions to ensure clarity in migration processes and effective troubleshooting practices.
Technologies: Python, Spark, SQL, Scala, Kafka, AWS (S3, Athena), Postgres, Databricks, Airflow
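Illustrative of the CDC ingestion pattern above, a minimal PySpark Structured Streaming sketch that lands a Kafka change stream in the raw history layer; the broker, topic, and path names are hypothetical placeholders, not PrismHR's actual configuration:

```python
# Hypothetical sketch: stream a CDC topic from Kafka into the raw layer of a data lake.
# Broker, topic, and paths are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("cdc-raw-ingest").getOrCreate()

# Read the CDC change stream from Kafka.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "client_migration.cdc")       # placeholder topic
    .option("startingOffsets", "earliest")
    .load()
)

# Keep the message payload and metadata; downstream jobs parse and validate it.
events = raw.select(
    col("key").cast("string"),
    col("value").cast("string"),
    col("timestamp"),
)

# Land the stream in the raw history layer as Parquet, with checkpointing for recovery.
query = (
    events.writeStream.format("parquet")
    .option("path", "s3a://data-lake/raw/client_migration/")        # placeholder path
    .option("checkpointLocation", "s3a://data-lake/_chk/cdc_raw/")  # placeholder path
    .start()
)
```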
Client: Comcast LLC, Role: Data Engineer, Date: March 2021 – Sept 2024
• Designed scalable data pipelines for Xfinity network analytics teams to track and predict network performance.
• Built and deployed ETL pipelines to feed data into Xfinity’s product analytics systems.
• Developed Spark applications to process data from external sources (AWS) into in-house data lakes.
• Worked in a hybrid environment that included on-prem Hadoop and AWS cloud systems.
• Implemented and automated Big Data infrastructure using the AWS ecosystem (S3, Athena) and Spark.
• Worked with Apache Airflow for scheduling workflows: developing DAGs, scheduling jobs, and streamlining data pipelines (see the DAG sketch following this role).
• Performance tuning: optimized Databricks Spark jobs to reduce memory consumption and processing time.
• Collaborated with business users to understand requirements, analyze data, and provide solutions.
• Partnered with engineering teams through design, architecture, deployment, and production support phases.
• Worked within an Agile framework, participated in sprint planning, and conducted code reviews.
Technologies: Databricks, Python, SQL, Spark, Kubernetes, Trino, MinIO (S3), AWS (S3, Athena)
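A minimal sketch of the kind of daily Airflow DAG described above; the DAG name, task names, and stand-in Python callable are illustrative assumptions, not the actual pipeline code:

```python
# Hypothetical sketch of a daily ETL DAG; names and the callable are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_spark_etl(**context):
    # Placeholder: in practice this step would submit a Spark job
    # (e.g., via SparkSubmitOperator or a Databricks job trigger).
    print(f"Running ETL for {context['ds']}")


with DAG(
    dag_id="network_analytics_etl",  # placeholder name
    schedule_interval="@daily",
    start_date=datetime(2021, 3, 1),
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=run_spark_etl)
    transform = PythonOperator(task_id="transform", python_callable=run_spark_etl)
    load = PythonOperator(task_id="load", python_callable=run_spark_etl)

    extract >> transform >> load  # linear dependency chain
```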
Cloud Ingest Inc, Role: Data Engineer, Date: Aug 2020 – Feb 2021
• Developed ETL pipelines for data cleansing, integration, and transformation.
• Built Spark applications to process structured and unstructured data (JSON, CSV, Parquet) into HDFS (see the sketch following this role).
• Wrote Spark jobs with Spark SQL queries for efficient data processing and manipulation.
• Collaborated with cross-functional teams to identify critical business problems and develop data-driven solutions.
• Participated in peer reviews to ensure code correctness and efficiency.
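A minimal batch-ingestion sketch in the spirit of the Spark applications above; the paths, key column, and cleansing rules are illustrative assumptions:

```python
# Hypothetical sketch: cleanse JSON and CSV inputs and land them in HDFS as Parquet.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("file-ingest").getOrCreate()

# Read semi-structured inputs; paths and column names are placeholders.
events = spark.read.json("hdfs:///landing/events/*.json")
legacy = spark.read.option("header", True).csv("hdfs:///landing/events_legacy/*.csv")

# Basic cleansing: drop rows missing the key field and remove duplicates.
def cleanse(df, key):
    return df.dropna(subset=[key]).dropDuplicates([key])

# Land each cleansed dataset in HDFS as Parquet for downstream querying.
cleanse(events, "event_id").write.mode("overwrite").parquet("hdfs:///warehouse/events/")
cleanse(legacy, "event_id").write.mode("overwrite").parquet("hdfs:///warehouse/events_legacy/")
```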
Sagarsoft Inc, Role: Software Engineer, Date: May 2017 – July 2018
• Developed a robust system for reading, validating, and inserting product information from CSV files into MySQL databases, ensuring data integrity.
• Automated the data processing workflow with daily cron jobs for consistent and timely updates.
• Created a notification service to send automated emails to administrators after successful data insertion.
• Utilized Kafka to publish product availability data for real-time monitoring (see the sketch following this role).
• Built features to read customer cart and order data and send personalized product notifications via email using Kafka.
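A minimal sketch of the availability-publishing step above, using the kafka-python client; the broker address, topic name, and payload shape are illustrative assumptions:

```python
# Hypothetical sketch: publish a product-availability event to Kafka after a DB insert.
import json

from kafka import KafkaProducer  # kafka-python client

producer = KafkaProducer(
    bootstrap_servers="broker:9092",  # placeholder broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Publish one availability event per product; payload fields are placeholders.
event = {"sku": "ABC-123", "in_stock": True, "qty": 42}
producer.send("product-availability", event)  # placeholder topic
producer.flush()  # block until the event is delivered
```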
EDUCATION
• Master of Computer Science, Wright State University, Fairborn, OH
• Bachelor of Technology in Computer Science, JNTU, Hyderabad, India