Data Engineer Processing

Location:

Euless, TX

Posted:

February 03, 2025

Resume:

Sai Krishna Nama

+1-316-***-**** ****************@*****.*** LinkedIn SAI KRISHNA NAMA

Professional Summary

A dedicated Data Engineer with expertise in Big Data technologies and cloud platforms, skilled in creating scalable ETL workflows using Databricks and PySpark. Proven track record in optimizing data systems and automating CI/CD pipelines with Jenkins and Docker. Experienced in leveraging AWS services like S3 and Redshift for high-performance solutions. Committed to translating complex business requirements into effective technical solutions to drive organizational success.

EDUCATION

University of Webster, San Antonio, Texas

Master's, Computer Science 2024

WORK EXPERIENCE

Sonata One Software, Dallas, Texas

Data Engineer Jun 2024 - Present

• Migrated legacy data systems to AWS Glue and Lambda, reducing operational costs by 20%.

• Designed and optimized scalable ETL workflows using Databricks and PySpark to enhance data processing performance.

• Developed robust backend data models and pipelines to support advanced analytics and real-time decision-making.

• Collaborated with cross-functional teams to analyze data sources and implement scalable solutions.

• Automated ETL workflows using Talend for seamless data integration into cloud environments, leveraging AWS services like S3, Lambda, and RDS.

• Created advanced SQL scripts to ensure data consistency, accuracy, and integrity for analytics and reporting.

• Built and maintained dashboards using Tableau and Amazon QuickSight, delivering actionable insights.

• Automated CI/CD pipelines with Jenkins and containerized applications using Docker for streamlined deployments.

• Leveraged Kubernetes to enhance scalability and fault tolerance within microservices architectures.

• Optimized data workflows in cloud environments to improve performance and resource utilization.

Ever seen Pvt Ltd, Hyderabad, India

AWS Data Engineer Aug 2019 - Aug 2022

• Implemented a multi-tiered data validation process, improving data integrity by 25% and enhancing reporting accuracy across 7 business units.

• Enhanced Spark job performance by efficiently configuring and managing AWS EMR clusters, reducing data processing time by 30%.

• Configured and managed Databricks environments for seamless integration with AWS services.

• Automated and optimized data pipelines using Databricks Delta Lake, processing 10 TB of batch data daily while integrating AWS S3, Hive, and Redshift.

• Designed and developed pipelines for managing HR and Payroll data using Apache Spark, resulting in improved data accuracy and streamlined payroll processing.

• Built and deployed Airflow DAGs with Spark, EMR, and Hive operators to automate data workflows, resulting in streamlined data processing and reduced manual intervention.

• Configured and managed AWS EMR clusters to run Spark jobs efficiently, enhancing data processing speed and reliability.

• Optimized Spark applications by caching data, partitioning, and performance tuning for improved processing times.

• Automated CI/CD workflows for Spark applications using Jenkins and Docker, ensuring faster deployments.

• Deployed applications on Kubernetes, enabling scalability and efficient resource management.

Skills & Interest

Data Warehousing, ETL Processes, SQL, Python, Data Modeling, Big Data Technologies, Apache Kafka, Apache Spark, Apache Flink, Apache HBase, Data Pipeline Development, Data Quality Assurance, Cloud Computing, AWS, GCP, Microsoft Azure, Data Visualization, Business Intelligence, Financial Data Analysis, Snowflake, NoSQL Databases, MySQL, Microsoft SQL Server, Oracle SQL, PostgreSQL, Presto/Trino, Dagster, MLflow, Helm, Terraform, Informatica, Talend, Data Governance, Machine Learning Basics, Data Integration, Data Architecture, Version Control (Git).
