Machine Learning Researcher and Data Scientist Advocate

Location:
Boston, MA
Posted:
January 05, 2026

Resume:

Sai Pranavi Jeedigunta

617-***-**** **********.*@************.*** linkedin.com/in/pranavijs/ github.com/SaiPranaviJeedigunta Open to Relocate

Education

Master of Science in Information Systems Dec. 2025 Northeastern University Boston, MA

Relevant Courses: Data Science Engineering Methods and Tools, Program Structure and Algorithms, Big Data Systems and Intelligence Analytics, Data Management and Database Design, User Experience Design and Testing

Bachelor of Technology in Computer Science Engineering Jun. 2023 Gokaraju Rangaraju Institute of Engineering and Technology Hyderabad, India

Work Experience

Machine Learning Research Assistant Mar. 2025 – Dec. 2025 Northeastern University Boston, MA

• Contributing to a clinical research project on early sepsis prediction using multi-stage assessment data, benchmarking white-box (Random Forest, Logistic Regression, Decision Trees) and black-box (ANN, LSTM) models.

• Supporting publication preparation by analyzing predictive accuracy, model interpretability, and clinical relevance.

• Collaborating on a fall detection system for elderly care using IoT sensor data (acceleration, angular velocity), applying residual networks (ResNet) to predict falls 3–6 seconds in advance.

Data Science and AI Development Intern Jul. 2025 – Aug. 2025 SteamIQ Boston, MA

• Developed statistical and ML models on IoT steam trap telemetry using Pandas, Scikit-learn, and XGBoost to predict performance and detect anomalies, improving fault detection accuracy by 25% compared to baseline statistical thresholds.

• Queried and maintained PostgreSQL databases to organize analytics data, enabling model training and reporting.

• Engineered a scalable embedding upload pipeline for the RAG workflow, orchestrated with Airflow, that batch-processes documents with metadata into Pinecone, preventing duplicate processing and reducing latency by 30%.

• Deployed LightRAG via Docker on DigitalOcean with a React.js chatbot interface, enabling scalable, context-aware retrieval across integrated vector and knowledge-graph systems.

Research Assistant Jan. 2022 – May 2023 Gokaraju Rangaraju Institute of Engineering and Technology Hyderabad, India

• Automated data collection and annotation via web scraping (BeautifulSoup) and MakeSense.ai.

• Built PowerBI dashboards and Python visualizations to analyze violation trends and model performance metrics.

• Developed real-time violation detection models using YOLOv5, OpenCV (computer vision), and TensorFlow, improving accuracy by 30% on a dataset of 10,000+ images and videos. Published at IEEE ICACCS 2023.

Skills

Programming & Frameworks: Python, Java, FastAPI, Docker, REST APIs, Keras, Jupyter Notebook, Terraform

Database Systems & Data Warehousing: SQL, MySQL, Oracle SQL, BigQuery, Snowflake, NoSQL, Pinecone

Big Data & ETL Tools: PySpark, Kafka, GCP Dataflow, Apache Airflow, LlamaIndex, Azure Databricks, Talend

Data Mining & Machine Learning: PyTorch, Scikit-learn, NLP, spaCy, NLTK, BeautifulSoup, Selenium, TensorFlow

Data Analysis and Visualization: Power BI, Tableau, Matplotlib, Seaborn, Streamlit, Pandas, GeoPandas

Cloud Platforms & Version Control: Google Cloud Platform (GCP), AWS, Azure, Git, GitHub, GitLab, Kubernetes

Projects

Multi-Agent RAG Application (LangGraph, Pinecone, Streamlit, Docling) Jan. 2025

• Built a multi-agent RAG system with semantic search, summarization, and document Q&A; automated ingestion with Airflow + Docling; and deployed via CI/CD with 100+ test cases for benchmarking.

Multi-Modal Document Analysis Platform (BeautifulSoup, Selenium, Snowflake) Dec. 2024

• Developed a Streamlit platform for 100+ users to query/summarize research docs, automated ETL of HTML/PDF sources with Airflow, built a FastAPI RAG using metadata + OpenAI/NVIDIA models, and deployed with Docker + CI/CD.

Impact of Information Revelation in Book Popularity (Python, Scikit-learn, NLP) Oct. 2024

• Analyzed 5,000+ books from Project Gutenberg using NLP and regression models (LASSO, Ridge) to quantify narrative complexity via Kullback–Leibler divergence, achieving an R of 0.85 and identifying key predictors of popularity.

Clinical Trial Data Pipeline (RAG, FastAPI, Pinecone, OpenAI, PostgreSQL) Sep. 2024

• Built an end-to-end pipeline using web scraping and EDGAR APIs, with a RAG workflow leveraging Pinecone + OpenAI APIs to extract baseline measures and endpoints from unstructured documents into structured PostgreSQL.

Interactive Model Evaluation Framework (Streamlit, GAIA, OpenAI, BigQuery) Sep. 2024

• Built a Streamlit app to evaluate GPT-4 with 100+ GAIA test cases, enabling query validation + stepwise feedback, and engineered real-time logging with Dataflow + BigQuery to generate performance insights and visual reports.

Publications

AI-Powered Early Detection of Sepsis in Emergency Medicine. Link at: mdpi.com/2075-1729/15/10/1576

Traffic Rules Violation Detection using YOLOv5 and Haar Cascade. Link at: ieeexplore.ieee.org/document/10112954

AI-based Indoor and Outdoor Fall Detection Infrastructure - Under review, 2025.


