Data Engineer Processing

Location:
Rolla, MO
Posted:
May 05, 2025

Resume:

Alekhya Reddy Guntaka

********************@*****.*** 573-***-**** UNITED STATES

PROFESSIONAL SUMMARY

Data Engineer with over 4 years of experience in designing, building, and optimizing scalable data pipelines using big data technologies, cloud platforms, and advanced analytics. Proficient in Python, Java, SQL, Spark (PySpark), and Kafka, with hands-on expertise in AWS, Snowflake, and real-time data processing. Skilled in developing fault-tolerant, low-latency data solutions and writing highly optimized queries for large datasets. Adept at leveraging open-source tools, implementing CI/CD pipelines, and collaborating with cross-functional teams to drive data-driven decision-making and deliver impactful business solutions.

EXPERIENCE

Data Engineer Raysoft Global Inc Oct 2024 – Mar 2025 Texas, USA

Responsibilities

•Developed and implemented scalable data solutions using Python in an Agile/Scrum environment, integrating Electronic Health Record (EHR) systems with patient management platforms to improve patient interaction tracking and enhance operational efficiency by 10%.

•Utilized expertise in cloud platforms (Azure) to optimize healthcare data workflows, achieving a 15% improvement in processing speed and ensuring seamless scalability for patient-facing applications like telemedicine portals.

•Configured integrations between AWS S3 and EHR systems to establish fault-tolerant, cloud-based architectures, enabling low-latency data access for patient analytics and clinical reporting.

•Worked with cross-functional teams using Agile methodologies to deploy real-time data processing for healthcare solutions, increasing diagnostic and analytical accuracy by 15% through precise data handling and attention to detail.

•Improved CI/CD pipelines with GitHub Actions, applying strong organizational skills to streamline deployments, minimize integration errors, and support rapid iteration cycles in a cloud-based healthcare ecosystem.

Software Engineer Vishtik LLC Mar 2024 – Oct 2024 Texas, USA

Responsibilities

•Developed and tuned ETL pipelines using Python and SSIS to process large datasets from diverse sources (Excel, XML, SQL Server), ensuring scalability and data integrity.

•Utilized PySpark for distributed data processing, improving performance of data workflows across cloud environments.

•Created parameterized dashboards in Power BI to visualize real-time data insights, supporting business stakeholders in decision-making.

•Implemented error-handling mechanisms in data pipelines, enhancing fault tolerance and reliability of data processing systems.

Graduate Assistant Jan 2023 – Dec 2023 Hyderabad, India

Cloud Computing and Big Data Management (COMP SCI 6304)

•Supported research and grading for cloud computing and big data coursework, focusing on distributed systems and data pipeline optimization.

•Assisted in evaluating student projects involving Spark, AWS, and real-time data processing techniques.

Data Engineer RMSI Pvt Ltd Oct 2020 – Jan 2022 Hyderabad, India

Responsibilities

•Built event-driven data pipelines using Apache Kafka and SQL, enabling real-time analytics and improving operational insights by 15%.

•Designed and optimized complex SQL queries and stored procedures in SSMS for large-scale datasets, ensuring efficient data retrieval.

•Developed interactive Power BI dashboards to track KPIs, driving data governance and visibility into production data trends.

•Conducted POCs to evaluate open-source tools for data processing, contributing to scalable and cost-effective solutions.

SKILLS

•Programming/Scripting: Python, Java, SQL, PySpark, Scala (basic)

•Big Data Technologies: Apache Spark, Kafka, Spark Streaming, Presto

•Cloud Platforms: AWS (S3, EC2), Azure (Data Studio), Snowflake

•Database Management: MySQL, MS SQL, PostgreSQL, Oracle RDBMS

•Tools: Power BI, Tableau, SSIS, SSRS, GitHub, Jenkins, Docker, JIRA

•Data Engineering: ETL/ELT, CI/CD pipelines, real-time data processing, data governance

•Libraries: Pandas, NumPy, Scikit-learn

ACADEMIC PROJECTS & ACTIVITIES

Fraud Detection in Banking Transactions

•Implemented ETL processes using Python and Spark to cleanse and process large financial datasets, reducing data size by 20%.

•Developed predictive models achieving an AUC of 0.80 and used Tableau to visualize transaction patterns.

Diabetes Prediction Using ML Algorithms

•Built and optimized data pipelines with PySpark for exploratory data analysis, producing a fully cleaned dataset for model training.

•Validated supervised ML models through A/B testing, enhancing prediction accuracy for healthcare applications.

Movie Review Sentiment Analysis

•Preprocessed and analyzed large review datasets using Python and SAS, reducing anomalies by 20% and achieving 60% clustering accuracy.

EDUCATION

Master of Science in Information Science and Technology Dec 2023

Missouri University of Science and Technology, Rolla, MO, US

Bachelor of Technology in Computer Science and Engineering Sep 2020

JNTU, Hyderabad, India


