DEEKSHA Tolapu *************@*****.*** 214-***-**** LinkedIn GitHub
SUMMARY: AI Engineer with 3 years of experience architecting production-grade AI systems and Big Data pipelines. Currently at Sri Tech, I specialize in agentic frameworks, building autonomous AI agents with LangGraph as stateful systems that plan, execute, and validate tasks beyond simple LLM calls. My background includes production experience at Alba Life Science, where I built and deployed real-time inference microservices using Python and FastAPI, managing strict production SLAs for manufacturing. Additionally, at Hippo Tech, I established strong data engineering foundations working with Databricks and petabyte-scale data lakes on Azure and Spark.
WORK EXPERIENCE
Sri Tech, Dallas Oct 2025-Present
AI Engineer
• Engineered an agentic RAG system using LangChain and LangGraph to automate contract analysis, reducing manual review time by 90% through a stateful workflow of intent classification, planning, and slot filling.
• Designed document preprocessing pipelines with chunking, metadata tagging, and embeddings to improve search relevance and contextual responses.
• Built a scalable retrieval pipeline integrating AWS Textract for OCR and MongoDB Atlas for vector search and session persistence, enabling the system to retain context and maintain accurate multi-turn conversations.
• Developed a full-stack application featuring a React frontend and FastAPI backend, utilizing AWS Bedrock for natural language generation and a human-in-the-loop mechanism to dynamically resolve query ambiguities.
Alba Life Science, Dallas Aug 2024-May 2025
Data Scientist
• Integrated an anomaly detection model on Vertex AI with a FastAPI service on Cloud Run to support biomedical imaging workflow monitoring, enabling faster predictions and improved anomaly detection.
• Engineered a 200+ feature dataset from biomedical sensor data, high-resolution lab imagery, and experimental test outputs; trained XGBoost and Random Forest ensemble models using scalable cloud pipelines, achieving ~0.93 F1 score while maintaining low false-positive rates critical for research accuracy.
• Enhanced vector database indexing strategies and document chunking methods to improve retrieval accuracy and response relevance in AI-driven knowledge systems.
• Built a RAG pipeline with LangChain that indexes SOPs, research documentation, and experimental logs in a vector database with hybrid retrieval, improving retrieval precision by ~15% and helping teams resolve issues faster.
• Developed and containerized Streamlit and FastAPI apps with Docker, orchestrated via Kubernetes, to automate daily tasks (visualization, queries, reporting, anomaly detection), reducing manual effort by 25%.
Hippo Tech, India July 2021-July 2023
Data Scientist
• Managed a large-scale healthcare data platform on Azure Synapse integrating patient records, claims, billing, insurance, and operational data, improving query performance and enabling cross-functional analytics for revenue cycle optimization.
• Developed demand forecasting and revenue analytics pipelines using PySpark on Databricks to analyze claims trends and patient billing patterns, improving forecasting accuracy and supporting better financial planning.
• Optimized ETL pipelines to enhance data processing efficiency and reduce latency, enabling near real-time reporting and analytics for faster, data-driven decisions.
• Enhanced AI-driven support workflows by applying domain-specific embeddings and LLM fine-tuning, improving intent recognition accuracy for healthcare queries and helping automate patient and billing support interactions.
• Built automated Azure Data Factory pipelines feeding real-time Power BI dashboards to monitor claims processing, billing KPIs, and operational metrics, improving visibility and supporting faster decision-making.
EDUCATION
The University of Texas at Dallas May 2025
Master of Science, Information Technology and Management GPA 3.5/4
Stanley College of Engineering and Technology June 2022
Computer Sciences GPA 3.3/4
TECHNICAL SKILLS
Programming Languages: Python, C++, C, Java, R, SQL, MATLAB, HTML, CSS, JavaScript, Rust, Hack, C#
Platforms: Databricks, Airflow, dbt, DBeaver, Jupyter Lab, RStudio, MySQL, GCP, Git, Linux, Microsoft Azure, AWS
Frameworks: TensorFlow, PyTorch, OpenCV, NLTK, Pandas, Matplotlib, scikit-learn, Tableau, Snowflake, Spark, Hadoop