Sai Prerana Mandalika
*************@*****.*** +1-240-***-**** Washington, DC https://www.linkedin.com/in/mandalikap/
WORK EXPERIENCE
Medequip, Inc Jun 2024 – Aug 2024
Data Science Intern Washington, USA
Designed and implemented modular data models & ETL pipelines using DBT and Snowflake, transforming raw Medisoft EMR billing data into clean, analytics-ready tables for Billing teams, reducing manual data prep by 50%.
Built automated data quality checks and anomaly alerts using DBT tests, orchestrated through Dagster, which improved pipeline reliability and reduced claim processing errors by 40%.
Developed dynamic Tableau dashboards and KPI reports to monitor patient collections, denial codes, and reimbursement trends, enabling real-time insights and reducing revenue cycle turnaround time by 35%.
Georgetown University Nov 2023 – Jan 2024
Graduate Research Assistant Washington, USA
Built a deep learning pipeline using structured EHR data to predict Type 2 diabetes onset, attaining an AUC-ROC score of 0.80 and improving early risk detection.
Developed a stratification framework using stepwise logistic regression (AUC-ROC: 0.83) and KMeans clustering (Silhouette Score: 0.62), segmenting EHR data into clinically distinct subgroups.
Micron Technologies, LLC Jun 2021 – Feb 2023
Software Developer Hyderabad, IN
Ran ad hoc SQL queries to clean, transform, and render transactions between MySQL and external systems.
Developed scalable ETL pipelines using DBT and Snowflake, orchestrated via Apache Airflow, to automate processing and data quality checks for 1M+ daily sensor data records used in manufacturing analytics.
Designed Tableau dashboards for real-time monitoring of sensor data, enabling proactive issue detection.
RELEVANT PROJECTS
GenAI Powered Data Engineering Agent Jan 2025 – May 2025
Built a data engineering agent using OpenAI LLMs to automate ETL workflows, generating Python code for serialization/deserialization and enabling seamless interaction via Streamlit and FastAPI, boosting interoperability.
Built end-to-end monitoring using AWS CloudWatch, capturing metrics across Glue, S3, and OpenAI APIs, including job success rates, model latency (1.2s), and schema inference accuracy (90%), and reducing ETL failure diagnostics by 40%.
Smart City End-to-End Data Pipeline Apr 2025 – Present
Designed and implemented a real-time Smart City data streaming pipeline, integrating Apache Kafka, ZooKeeper, and Spark within Dockerized environments to enable scalable data ingestion and processing.
Leveraged AWS Glue for ETL, Athena for querying, and Redshift for centralized storage, and configured IAM roles for secure access control.
State Analysis Big Data Project Jul 2024 – Dec 2024
Developed a scalable sentiment analysis system across US states by processing 12M subreddit records with AWS S3, Glue, and Redshift, applying a PySpark pipeline and fine-tuned NLP models to boost sentiment classification accuracy by 25%.
Analyzed large-scale regional data in Redshift to uncover socio-political sentiment trends.
EDUCATION
Georgetown University Aug 2023 – May 2025
Master of Science in Data Science and Analytics Washington, USA
Current GPA: 3.95
Achievements: Returning Student Scholarship
Chaitanya Bharathi Institute of Technology Aug 2017 – Jun 2021
Bachelor of Science in Computer Science and Engineering Hyderabad, IN
Achievements: National Level Smart India Hackathon Winner
TECHNICAL SKILLS
Data Analysis & Visualization: Python, R, SQL, C, PostgreSQL, Tableau, Looker, JavaScript, Power BI, Airflow, Git, DBT
Cloud Technologies: AWS SageMaker, EC2, Bedrock, CloudWatch, Kafka, PySpark, Glue, Athena, Redshift, FastAPI
AI/Machine Learning & Statistical Modeling: Pandas, NumPy, Regression, TensorFlow, PyTorch, NLP, Hypothesis Testing