Rakesh Duguri
Data Engineer
+1-848-***-**** | ************@*****.*** | United States | https://www.linkedin.com/in/rakesh-duguri-0a5810261/
SUMMARY
Experienced Data Engineer with 7+ years of expertise in designing and implementing robust data pipelines and infrastructure across diverse industries. Proficient in Python, Java, and Scala, with a strong focus on optimizing data movement efficiency and processing times. Adept at leveraging a variety of data storage solutions, from SQL and NoSQL databases to cloud platforms such as Amazon Redshift, Google BigQuery, and Snowflake.
TECHNICAL SKILLS
• Programming Languages: Python, Java, Scala
• Data Storage: SQL, NoSQL databases, Amazon Redshift, Google BigQuery, Snowflake, Data Lakes
• Big Data Technologies: Apache Hadoop, Apache Spark, Databricks, Apache Flink
• Data Processing Frameworks: Apache Kafka, Apache NiFi
• ETL Tools: Apache Airflow, Talend, Pentaho, Informatica
• Cloud Platforms: Amazon Web Services, Microsoft Azure, Google Cloud Platform
• Containerization and Orchestration: Docker, Kubernetes
• Data Integration: Apache Camel, MuleSoft
• Version Control: Git
• Database Management Systems: MySQL, PostgreSQL, Oracle, Snowflake
• Data Quality and Profiling: Apache Griffin, Trifacta, Talend Data Quality
• Collaboration Tools: Jira, Confluence
• Monitoring and Logging: ELK Stack, Prometheus
• Scripting and Automation: Shell scripting, PowerShell
• Continuous Integration/Continuous Deployment: Jenkins, GitLab CI
• Data Governance: Apache Atlas, Collibra
• Machine Learning
PROFESSIONAL EXPERIENCE
UPS March 2023 – Present
Sandy Springs, Atlanta, GA
Data Engineer
• Designed and implemented robust data pipelines and infrastructure using Python and SQL, optimizing data movement efficiency and processing times.
• Executed comprehensive data quality measures and monitoring systems, ensuring compliance with regulations and enhancing data governance.
• Collaborated cross-functionally to translate complex business needs into innovative data solutions, improving data-driven decision-making accuracy.
• Engineered and optimized data pipelines using Airflow, Kafka, and Luigi, streamlining data flow and reducing latency.
• Implemented data quality tools such as Great Expectations and Trifacta, reducing errors and accelerating the identification of data anomalies.
• Implemented and automated complex data workflows using Pentaho’s graphical interface, improving data processing efficiency and reducing manual intervention.
• Utilized cloud technologies, including AWS as the primary platform, to implement cost-effective and scalable solutions.
• Utilized Pentaho for designing and deploying ETL processes, enhancing data integration and processing efficiency.
• Engaged in full systems life cycle management activities, providing key insights for analyses, technical requirements, and coding.
• Implemented version control using Git, ensuring seamless collaboration and tracking of code changes.
• Executed strategic optimizations across data warehousing solutions, including BigQuery, Snowflake, and Redshift.
• Played a pivotal role in the adoption of additional cloud platforms like GCP, expanding the company's cloud capabilities.
• Conducted training sessions for the team on Python and SQL integration for enhanced data manipulation.
• Implemented automated monitoring systems for IoT devices, reducing response time to device issues.
• Developed and maintained documentation for data pipelines and infrastructure, streamlining knowledge transfer.
• Collaborated with external vendors to seamlessly integrate APIs, reducing data integration time.
• Actively participated in industry conferences and forums, staying abreast of the latest trends and technologies.
• Assisted in the onboarding of new team members, providing mentorship and support.
Samsung January 2019 – December 2020
Data Engineer
• Spearheaded the development and implementation of scalable data pipelines, enhancing data ingestion, transformation, and storage processes and optimizing overall data processing efficiency.
• Designed and implemented ETL/ELT workflows on data warehouses and data lakes using technologies such as Snowflake and Redshift, improving data processing times and ensuring timely delivery of insights.
• Orchestrated complex data flows using tools like Airflow, ensuring seamless automation and monitoring of data pipelines, enhancing data pipeline reliability and quality.
• Collaborated closely with stakeholders to understand and translate business needs into technical requirements, fostering effective communication between data engineers, analysts, data scientists, and software engineers.
• Utilized Python and Java to develop and optimize data processing algorithms, reducing code execution time.
• Played a pivotal role in establishing infrastructure to strategically analyze big data, fostering a data-driven culture within the organization.
• Integrated SQL for efficient database management, improving data retrieval times.
• Created interactive reports and dashboards using Pentaho BI tools, enabling data-driven decision-making and enhancing business insights.
• Leveraged cloud platforms such as AWS, Azure, and GCP, demonstrating familiarity with relevant data services to enhance data processing capabilities.
• Leveraged Pentaho for building and optimizing ETL processes, facilitating efficient data extraction and transformation.
• Implemented Apache Airflow, Luigi, Prefect, Kafka, and NiFi for data pipeline orchestration, streamlining workflows and reducing manual intervention.
• Maintained version control using Git, ensuring code integrity and collaboration efficiency within the data engineering team.
• Streamlined data pipelines and improved performance by configuring and tuning Pentaho transformations and jobs for high efficiency and minimal latency.
• Optimized Linux-based systems to support data engineering tasks, improving system performance.
• Successfully collaborated with cross-functional teams, providing technical leadership and support to achieve common data goals and deliver valuable insights.
• Trained and mentored junior data engineers, fostering skill development and knowledge transfer within the team.
• Participated in performance tuning and optimization initiatives, optimizing resource utilization and costs.
• Demonstrated a keen focus on continuous learning and staying updated on the latest industry trends, contributing to the implementation of cutting-edge technologies in data engineering processes.
Cardinal Health Inc April 2015 – December 2018
Data Engineer
• Spearheaded the design and implementation of robust data pipelines, ensuring seamless ETL processes from diverse sources into data warehouses.
• Orchestrated the development and maintenance of data models, guaranteeing data consistency and accuracy, and enhancing accessibility for analysts and scientists.
• Automated critical data workflows, implementing cutting-edge code and tools that streamlined data pipelines and optimized overall data processing speed.
• Integrated Pentaho to automate and manage complex data workflows, enhancing pipeline reliability and performance.
• Implemented stringent data quality checks and security measures, ensuring compliance with industry regulations.
• Troubleshot and debugged data pipeline issues promptly, minimizing disruptions and ensuring continuous and smooth data flow.
• Collaborated seamlessly with analysts and scientists, understanding and addressing their data needs, fostering a cohesive team environment.
• Designed and executed cloud-based data solutions, leveraging AWS, Azure, and GCP for enhanced scalability.
• Developed tools and scripts for data ingestion, processing, and manipulation, enhancing automation and reducing data processing time.
• Authored comprehensive documentation for data pipelines and models, facilitating knowledge transfer and ensuring smooth onboarding of new team members.
• Contributed significantly to performance optimization and scalability efforts, enhancing system responsiveness.
• Leveraged BigQuery, Snowflake, and Redshift for data warehouses and data lakes, optimizing data storage and retrieval processes.
• Employed Airflow and Luigi for data processing automation, improving workflow efficiency.
• Demonstrated proficiency in Git for version control, ensuring effective collaboration and code management within the data engineering team.
• Applied database administration knowledge, including MySQL and PostgreSQL, for efficient query optimization.
EDUCATION
• Master of Science in Branch, Wilmington University, Delaware
• Bachelor of Technology in Branch, DRK College of Engineering and Technology, India
CERTIFICATIONS
• Google Cloud Certified Professional Data Engineer