Post Job Free

Data Engineer Processing

Location:
Dallas, TX, 75203
Posted:
September 10, 2025


Resume:

Harshith Gaddam Data Engineer

+1-513-***-**** ***************@*****.***

SUMMARY

Data Engineer with around 4 years of experience specializing in ETL development, data extraction, modification, validation, and reporting. Proficient in Python, SQL, Scala, and Shell scripting for comprehensive data engineering tasks. Expertise in NumPy, Pandas, SciPy, and Scikit-Learn for advanced data processing and machine learning. Proven ability with Big Data technologies such as Hadoop, Apache Spark, Databricks, Hive, Sqoop, Oozie, and HDFS for robust data processing pipelines. Skilled in managing databases (MySQL, PostgreSQL, Oracle, MongoDB) and using visualization tools (Tableau, Power BI, MS Excel) for effective communication of insights. Adept at leveraging cloud technologies, particularly AWS (EC2, EBS, S3, EMR, AWS Glue, DMS) and Azure, to design, implement, and optimize scalable data solutions. Experienced in Agile and Waterfall methodologies, adapting to varying project requirements.

TECHNICAL SKILLS

Programming Language: Python, SQL, R, Scala

Database: MS SQL Server, PostgreSQL, MongoDB, MySQL

Big Data Ecosystem: Hadoop, MapReduce, Databricks, Apache Spark, Pig, Hive, HDFS

Reporting & ETL Tools: Tableau, Power BI, SSRS, SSIS, SSAS

Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, Seaborn, TensorFlow

Frameworks: Kafka, Airflow, Snowflake, Django, Docker

Cloud Technologies: AWS, GCP, Azure

Other Tools: Anaconda, Kubernetes, Sitecore, CMS, Jira, Confluence, Jenkins, Git, MS Office

PROFESSIONAL EXPERIENCE

Verizon, USA Nov 2024 - Current

Data Engineer

• Develop Python scripts for efficient data processing, utilizing Pandas and NumPy, enhancing analytical capabilities.
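
As an illustrative sketch only (not code from the resume), the kind of Pandas/NumPy processing such work typically involves might look like this; the dataset and column names are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical sales records; column names and values are illustrative only.
df = pd.DataFrame({
    "region": ["east", "west", "east", "west"],
    "revenue": [120.0, np.nan, 80.0, 200.0],
})

# Impute missing values with the column mean, then aggregate per region
# using Pandas' NumPy-backed vectorized operations.
df["revenue"] = df["revenue"].fillna(df["revenue"].mean())
summary = df.groupby("region")["revenue"].sum()
```

Vectorized fills and group-by aggregations like these replace row-by-row loops, which is where most of the efficiency gain in Pandas-based processing comes from.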

• Manage MySQL databases, ensuring seamless data retrieval and storage and contributing to increased database efficiency.

• Leverage Amazon Web Services, employing AWS Glue, EC2, and PySpark for computing and S3 for storage, leading to improved data processing speed.

• Design and implement robust ETL workflows using SSIS, ensuring accurate and efficient data extraction, transformation, and loading across diverse systems.

• Implement Hadoop and PySpark for efficient big data processing, reducing processing time.

• Design and manage DAGs in Apache Airflow to orchestrate ETL workflows and data pipeline execution.

• Design and implement interactive Power BI dashboards with dynamic visualizations and user-driven parameters for data exploration.

• Implement CI/CD pipelines using Jenkins, develop Ansible templates for automated code deployment, and configure SonarQube to enhance code quality, reducing deployment time.

• Apply Agile methodologies in collaborative environments, facilitating streamlined project development and delivery.

MAQ Software, India Sep 2020 - Jul 2023

Data Engineer

• Developed and maintained high-performance Spark-based pipelines for processing large-scale batch and streaming data, using advanced techniques like partitioning and broadcast joins to enhance efficiency.
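
The broadcast-join idea mentioned above can be sketched without a Spark cluster: the small (dimension) table is shipped to every worker as an in-memory lookup, so each row of the large table joins with an O(1) lookup instead of a shuffle. A minimal pure-Python analogue, with hypothetical data:

```python
# Small dimension table; in Spark this would be broadcast to all executors.
countries = {1: "US", 2: "IN"}

# Large fact table is streamed once; each row joins via a dict lookup,
# avoiding the shuffle that a regular (sort-merge) join would require.
events = [(1, "click"), (2, "view"), (1, "view")]
joined = [(countries[cid], action) for cid, action in events]
```

This is the same trade-off Spark's `broadcast()` hint makes: it only pays off when one side of the join is small enough to fit in each executor's memory.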

• Developed Python applications, scripts, and automation tools, automating repetitive tasks to reduce manual effort and improve workflow efficiency.

• Designed and implemented scalable ETL pipelines using Apache Spark on Databricks, improving data processing efficiency and enabling real-time analytics for business decision-making.

• Developed dynamic and interactive reports using SSRS, enabling real-time insights, improving data-driven decision-making.

• Proficient in using dbt to build modular, testable data transformation pipelines with data modeling and version-controlled deployments.

• Skilled in working with Snowflake, focusing on scalable data warehousing and seamless integration with modern data stacks.

• Experienced in distributed data computing tools, with a focus on Apache Kafka for real-time data streaming, event-driven architecture, and scalable data integration across microservices.

• Experienced in working with Docker for containerizing applications and using Kubernetes for orchestrating deployments.

• Executed SQL queries and optimized database performance, reducing query execution time.

• Executed data visualization using Tableau, providing actionable insights and improving decision-making processes.

• Employed Git for version control, enhancing collaboration and code management during software development.

• Employed SCRUM methodology, leading daily stand-ups and sprint planning, resulting in on-time project delivery.

EDUCATION

Master's, Management Information Systems

Lamar University, Beaumont, TX, USA

Bachelor's, Information Technology

Jawaharlal Nehru Technological University, Hyderabad, India

CERTIFICATIONS

AWS: Cloud Practitioner Essentials

Microsoft: Power BI Data Analyst Associate
