Computer Science Data Engineering

Location:

St. Louis, MO

Posted:

July 14, 2025


Resume:

Prabhath Pasula

940-***-**** *******@****.*** www.linkedin.com/in/prabhath-pasula

Education

Southern Illinois University Edwardsville Edwardsville, IL, USA

Master of Science in Computer Science Aug. 2024 – current

Sreenidhi Institute of Science and Technology Hyderabad, TS, India

Bachelor of Technology in Computer Science Engineering Aug. 2017 – May 2021

Experience

Data Engineering Analyst Dec 2022 – Aug 2024

Accenture Solutions Pvt Ltd Hyderabad, India

• Automated daily reporting on steel production throughput using Tableau, Spark, Python, and shell scripting. Built a modular framework of scripts that fetched equipment status and applied domain-specific computation rules (furnace availability, ladle usage, steel grade, cooling time, and chemical composition) to estimate projected 24-hour throughput, which was then compared against actual output to track production efficiency, cutting manual effort by 60%.

• Designed and deployed ETL pipelines into Azure Data Lake, handling batch and real-time ingestion via Kafka, enabling scalable analytics for manufacturing operations.

• Migrated internal APIs from Flask to FastAPI, leveraging asynchronous programming to reduce latency and improve concurrency; implemented OAuth 2.0 for secure, token-based user and service authentication.

• Built reusable Python-based data quality validation layers to automate consistency and integrity checks, reducing manual QA effort by 40% and increasing data reliability for reporting and decision-making.

• Led the deployment and monitoring of critical data workflows in Azure, ensuring robust handling of streaming and batch data jobs with integrated alerting and recovery mechanisms.

Data Engineering Associate Nov 2021 – Dec 2022

Accenture Solutions Pvt Ltd Hyderabad, India

• Developed and optimized large-scale PySpark workflows to process multi-GB datasets from steel industry production systems; enhanced Hive query performance and job orchestration, resulting in a 30% reduction in processing time for backend pipelines.

• Ingested large datasets from Oracle, MySQL, and SQL Server databases into Hadoop HDFS using Sqoop and HQL, enabling scalable, high-throughput analytics for steel production metrics and downstream BI tools.

• Gained hands-on experience in Azure Synapse Analytics, designing optimized data warehouse queries and improving read performance on large fact tables via proper partitioning and indexing.

• Built and maintained Airflow DAGs to schedule PySpark jobs, achieving a 20% performance gain in HDFS-level processing through code refactoring and resource tuning.

• Contributed to successful Hadoop–Oracle migration projects, supporting data lake development and ensuring integrity across pipeline stages.

• Standardized data analytics protocols and documentation practices, implemented rigorous data integrity checks, and enforced consistent schema mappings across heterogeneous systems during batch migrations, ensuring compliance, clarity, and on-time, reliable project delivery.

Projects

Hospital Management Web System: Java Servlets, SQL, HTML/CSS, REST APIs

Developer Intern – Cantilever Labs May 2020 – Aug 2020

• Designed and developed a data-driven full-stack web system for Jeevan Hospitals to manage patient-doctor consultations, symptom tracking, prescriptions, and medicine delivery workflows.

• Implemented SQL-backed backend logic for structured storage and efficient retrieval of consultation records, transaction data, and appointment history.

• Integrated secure payment and pharmacy APIs, enabling real-time ingestion and synchronization of transactional data for remote healthcare operations.

• Delivered a responsive, validated UI, ensuring consistent and clean input of medical records and improving data integrity across the platform.

• Earned an Industry Readiness Certificate, demonstrating hands-on system design and data architecture capabilities aligned with real-world production standards.

Real-Time Face Recognition on Edge Devices: Python, YOLOv3, OpenCV, Raspberry Pi, Deep Learning

Developer Intern – Central Institute of Tool Design (CITD) May 2020 – Aug 2020

• Engineered a real-time face recognition system using YOLOv3 and OpenCV, optimized for Raspberry Pi to support security use cases on low-power, edge devices.

• Implemented machine learning-based object detection and facial recognition pipelines for live video feeds, including bounding box extraction and identity tagging modules.

• Tuned model inference and frame processing pipelines to achieve real-time responsiveness, while minimizing false positives and ensuring consistent detection accuracy.

• Demonstrated practical deployment of deep learning on constrained hardware, integrating Python-based optimizations for inference efficiency.

• Validated use-case applicability for real-time surveillance and biometric monitoring in resource-limited environments, blending ML, IoT, and computer vision domains.

Certifications

Microsoft Certified: Azure Data Engineer Associate (DP-203) – 2024

Google Cloud Certified: Professional Data Engineer

Power BI: Microsoft Power BI Desktop and Service (Udemy, 2023)

NPTEL: Database Management Systems

NPTEL: Joy of Computing Using Python – Silver Medalist

Technical Skills

Programming Languages: Python, SQL, Java, C++, R

Big Data & Distributed Systems: Hadoop, Spark, Hive, HDFS, PySpark, Kafka, Airflow

Cloud Platforms: Azure, Google Cloud Platform (GCP)

Databases: Oracle SQL Developer, SQL Server, MongoDB, Spark SQL, HQL

ETL & APIs: FastAPI, Flask, REST APIs, Apache Kafka

DevOps & Containers: Kubernetes, Docker, Git, CI/CD

Data Visualization: Tableau, Power BI, Matplotlib, Pandas

Machine Learning & Data Science: Scikit-learn, OpenCV, Deep Learning, Computer Vision
