SHAMANTHAKA VEERAREDDY
Email: ***********.**********@*****.***
Location: Austin, TX 78613 Phone: +1-937-***-****
TECHNICAL SKILLS
• Big Data Technologies: Apache Spark, Kafka, Hadoop, Databricks, Hive, HDFS
• Programming Languages: Python, Scala, SQL
• Data Engineering Tools: Apache Airflow, Kubernetes, Docker
• Cloud Platforms: AWS (S3, Athena, Glue), Azure Blob Storage
• Database Systems: MySQL, Snowflake, Postgres, Trino Query Engine
• Version Control: Git, GitHub
• Other: Agile frameworks
CERTIFICATION
Databricks Certified Associate Developer for Apache Spark 3.0
PROFESSIONAL EXPERIENCE
Client: PrismHR Inc, Role: Data Engineer, Date: October 2024 – Present
• Led the migration of client data in structured batches, ensuring feature compatibility and data integrity throughout the process.
• Designed and implemented ETL and validation pipelines, enabling seamless migration of client data into Prism's database.
• Engineered scalable ETL workflows for processing large datasets, leveraging Apache Airflow for orchestration, Kafka for real-time pipeline development, and Spark for distributed data processing.
• Collaborated with engineering and product analytics teams to rigorously test, troubleshoot, and optimize migration pipelines across multiple environments, ensuring operational excellence.
• Utilized Lenses.io for real-time monitoring of Kafka topics, ensuring accurate and efficient data flow into both raw and clean history layers of the data lake.
• Executed comprehensive data validation and issue resolution, addressing pipeline failures, data mismatches, and missing records to ensure high-quality data delivery.
• Worked with Change Data Capture (CDC) pipelines, defining and managing table configurations to enable seamless updates into the data lake (see the sketch following this role).
• Authored detailed technical documentation and facilitated knowledge-sharing sessions to ensure clarity in migration processes and effective troubleshooting practices.
Technologies: Python, Spark, SQL, Scala, Kafka, AWS (S3, Athena), Postgres, Databricks, Airflow
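Illustrative of the CDC ingestion pattern above, a minimal PySpark Structured Streaming sketch that lands a Kafka change stream in the raw history layer; the broker, topic, and path names are hypothetical placeholders, not PrismHR's actual configuration:

```python
# Hypothetical sketch: stream a CDC topic from Kafka into the raw layer of a data lake.
# Broker, topic, and paths are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("cdc-raw-ingest").getOrCreate()

# Read the CDC change stream from Kafka.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "client_migration.cdc")       # placeholder topic
    .option("startingOffsets", "earliest")
    .load()
)

# Keep the message payload and metadata; downstream jobs parse and validate it.
events = raw.select(
    col("key").cast("string"),
    col("value").cast("string"),
    col("timestamp"),
)

# Land the stream in the raw history layer as Parquet, with checkpointing for recovery.
query = (
    events.writeStream.format("parquet")
    .option("path", "s3a://data-lake/raw/client_migration/")        # placeholder path
    .option("checkpointLocation", "s3a://data-lake/_chk/cdc_raw/")  # placeholder path
    .start()
)
```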
Client: Comcast LLC, Role: Data Engineer, Date: March 2021 – Sept 2024
• Designed scalable data pipelines for Xfinity network analytics teams to track and predict network performance.
• Built and deployed ETL pipelines to feed data into Xfinity’s product analytics systems.
• Developed Spark applications to process data from external sources (AWS) into in-house data lakes.
• Worked in a hybrid environment that included on-prem Hadoop and AWS cloud systems.
• Implemented and automated Big Data infrastructure using the AWS ecosystem (S3, Athena) and Spark.
• Worked with Apache Airflow for scheduling workflows: developing DAGs, scheduling jobs, and streamlining data pipelines (see the DAG sketch following this role).
• Performance tuning: optimized Databricks Spark jobs to reduce memory consumption and processing time.
• Collaborated with business users to understand requirements, analyze data, and provide solutions.
• Partnered with engineering teams through design, architecture, deployment, and production support phases.
• Worked within an Agile framework, participated in sprint planning, and conducted code reviews.
Technologies: Databricks, Python, SQL, Spark, Kubernetes, Trino, MinIO (S3), AWS (S3, Athena)
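A minimal sketch of the kind of daily Airflow DAG described above; the DAG name, task names, and stand-in Python callable are illustrative assumptions, not the actual pipeline code:

```python
# Hypothetical sketch of a daily ETL DAG; names and the callable are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_spark_etl(**context):
    # Placeholder: in practice this step would submit a Spark job
    # (e.g., via SparkSubmitOperator or a Databricks job trigger).
    print(f"Running ETL for {context['ds']}")


with DAG(
    dag_id="network_analytics_etl",  # placeholder name
    schedule_interval="@daily",
    start_date=datetime(2021, 3, 1),
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=run_spark_etl)
    transform = PythonOperator(task_id="transform", python_callable=run_spark_etl)
    load = PythonOperator(task_id="load", python_callable=run_spark_etl)

    extract >> transform >> load  # linear dependency chain
```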
Cloud Ingest Inc, Role: Data Engineer, Date: Aug 2020 – Feb 2021
• Developed ETL pipelines for data cleansing, integration, and transformation.
• Built Spark applications to process structured and unstructured data (JSON, CSV, Parquet) into HDFS (see the sketch following this role).
• Wrote Spark jobs with Spark SQL queries for efficient data processing and manipulation.
• Collaborated with cross-functional teams to identify critical business problems and develop data-driven solutions.
• Participated in peer reviews to ensure code correctness and efficiency.
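A minimal batch-ingestion sketch in the spirit of the Spark applications above; the paths, key column, and cleansing rules are illustrative assumptions:

```python
# Hypothetical sketch: cleanse JSON and CSV inputs and land them in HDFS as Parquet.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("file-ingest").getOrCreate()

# Read semi-structured inputs; paths and column names are placeholders.
events = spark.read.json("hdfs:///landing/events/*.json")
legacy = spark.read.option("header", True).csv("hdfs:///landing/events_legacy/*.csv")

# Basic cleansing: drop rows missing the key field and remove duplicates.
def cleanse(df, key):
    return df.dropna(subset=[key]).dropDuplicates([key])

# Land each cleansed dataset in HDFS as Parquet for downstream querying.
cleanse(events, "event_id").write.mode("overwrite").parquet("hdfs:///warehouse/events/")
cleanse(legacy, "event_id").write.mode("overwrite").parquet("hdfs:///warehouse/events_legacy/")
```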
Sagarsoft Inc, Role: Software Engineer, Date: May 2017 – July 2018
• Developed a robust system for reading, validating, and inserting product information from CSV files into MySQL databases, ensuring data integrity.
• Automated the data processing workflow with daily cron jobs for consistent and timely updates.
• Created a notification service to send automated emails to administrators after successful data insertion.
• Utilized Kafka to publish product availability data for real-time monitoring (see the sketch following this role).
• Built features to read customer cart and order data and send personalized product notifications via email using Kafka.
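A minimal sketch of the availability-publishing step above, using the kafka-python client; the broker address, topic name, and payload shape are illustrative assumptions:

```python
# Hypothetical sketch: publish a product-availability event to Kafka after a DB insert.
import json

from kafka import KafkaProducer  # kafka-python client

producer = KafkaProducer(
    bootstrap_servers="broker:9092",  # placeholder broker
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Publish one availability event per product; payload fields are placeholders.
event = {"sku": "ABC-123", "in_stock": True, "qty": 42}
producer.send("product-availability", event)  # placeholder topic
producer.flush()  # block until the event is delivered
```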
EDUCATION
• Master of Computer Science, Wright State University, Fairborn, OH
• Bachelor of Technology in Computer Science, JNTU, Hyderabad, India