Suresh Kumar Reddy Danada
******@***********.*** Atlanta, Georgia +1-571-***-**** LinkedIn
PROFESSIONAL SUMMARY
Data Engineer with 4 years of experience and a proven track record in designing, implementing, and maintaining robust data infrastructure and pipelines. Proficient in Python, SQL, and Scala; specializes in ETL processes, leveraging tools such as Apache Spark and Hadoop for distributed data processing, with additional expertise in NLP, deep learning, and machine learning models for analysis.
TECHNICAL SKILLS
Programming & Scripting: Python, R, SQL, PL/SQL, Core Java, JavaScript, DAX, HTML, CSS, XML, C (Basics)
Data Analytics & Visualization: Power BI, Tableau, Tableau Prep, PowerApps, SSRS, Microsoft Excel, Power Query, Ad-Hoc Analytics, Data Visualization, Statistical Analysis, Forecasting
Big Data & Cloud Technologies: Apache Spark, PySpark, Hadoop, Hive, Pig, MapReduce, HDFS, Kafka, Sqoop, Databricks, Snowflake, AWS (EC2, S3, Glue, Redshift, EMR), Azure Data Factory, Google Dataflow
Databases & Data Engineering: MySQL, PostgreSQL, MS SQL Server, Oracle, Teradata, Redshift, Data Warehousing, ETL Pipelines, Data Transformation, Data Modeling, Data Quality Assurance, Data Lineage, EDW Architecture, Performance Tuning
Tools & Platforms: Docker, Jenkins, REST APIs, Git, JIRA, Microsoft Office Suite (Word, Excel, Access, PowerPoint, Outlook, Teams, OneDrive), Dataverse, Windows, Linux (Ubuntu), macOS
PROFESSIONAL EXPERIENCE
Data Engineer, Wells Fargo Aug 2024 – Present Remote, USA
Developed a real-time ETL pipeline with Apache Kafka and Spark (Scala) to analyze customer behavior data on an e-commerce platform, resulting in a 15% increase in sales conversions.
Utilized Hadoop and MapReduce to optimize large-scale historical data processing, implementing data mapping and transformation logic to ensure consistency and integrity across diverse sources.
Architected and optimized the Snowflake data warehouse for efficient storage and retrieval of structured and unstructured data, ensuring scalability and performance for large datasets.
Automated comprehensive ETL workflows using Python and SQL for data extraction, transformation, and loading into the warehouse, while leveraging MS Excel for ad-hoc analysis and reporting.
Developed interactive dashboards using Power BI, enabling real-time visualization of key business metrics and empowering data-driven decision-making across departments.
Integrated predictive models using Amazon SageMaker for customer churn prediction and personalized product recommendations, deploying models and data pipelines in the cloud using AWS Glue, Amazon S3, and Amazon Redshift for secure, scalable infrastructure.
Automated CI/CD using AWS CodePipeline and AWS CodeBuild for data infrastructure deployment, while optimizing Spark job performance and implementing monitoring tools like Amazon CloudWatch for continuous improvement.
Data Engineer, Genpact Mar 2020 – Jul 2023 Hyderabad, India
Developed a highly scalable data integration and real-time analytics platform using Python and PySpark, streamlining data processing and improving the organization’s ability to derive actionable insights for better decision-making.
Architected and optimized ETL pipelines with Python, leveraging PySpark on Azure Databricks to enhance data processing by 40% and ensure high data quality across 10TB+ of structured, semi-structured, and unstructured data.
Utilized Pandas, NumPy, and SQLAlchemy for efficient data manipulation and transformation within the pipelines, improving data management workflows.
Integrated machine learning models with scikit-learn and TensorFlow into data pipelines, automating forecasting and customer behavior prediction, achieving a 30% increase in accuracy and saving over 100 man-hours per month.
Implemented CI/CD pipelines with Azure DevOps for continuous integration and deployment, reducing deployment time by 50% and doubling the frequency of deployments to meet evolving business demands.
Managed Azure-based data warehouse infrastructure utilizing Azure Blob Storage, Azure Synapse Analytics, Azure Data Factory, and Azure Data Lake, ensuring 90% uptime and enabling real-time data access for 50+ global stakeholders.
Leveraged Tableau to build interactive dashboards and visualize key performance indicators, enabling real-time data analysis and sharper insights for stakeholders.
Conducted data management tasks, including data cleansing, validation, and enrichment, using Python and SQL to ensure accurate and up-to-date data for business analysis and reporting.
EDUCATION
Master of Science in Information Systems, Auburn University at Montgomery Aug 2023 – Dec 2024 Montgomery, USA
Bachelor of Technology in Mechanical Engineering, Pulla Reddy Engineering College Aug 2016 – Sept 2020 Kurnool, India
CERTIFICATIONS
AWS Certified Data Engineer – Associate (Link)
Python for Data Science – IBM (Link)