AI / ML Engineer

Location:

Overland Park, KS

Posted:

June 16, 2026

Resume:

Sai Kashyap Kaligotla

+1-307-***-**** ****************@*****.*** Overland Park, KS

LinkedIn: linkedin.com/in/kashyap-kaligotla GitHub: github.com/Kashyap-84 Portfolio: kashyap-portfolio-one.vercel.app

CAREER PROFILE

AI/ML Engineer with 5+ years of experience designing, training, and deploying production-grade AI and machine learning systems across financial services, healthcare, insurance, and telecommunications, with deep expertise across the full ML lifecycle from data engineering through model serving and observability. Strong builder of Generative AI and agentic AI applications using LangChain, LangGraph, LlamaIndex, and CrewAI, including RAG pipelines, multi-agent orchestration, prompt engineering, and LLM fine-tuning on foundation models from OpenAI, Anthropic, AWS Bedrock, and Google Vertex AI. Hands-on Python engineer comfortable with PyTorch, TensorFlow, and Hugging Face Transformers, distributed training on multi-GPU clusters, vector databases including FAISS, Milvus, Qdrant, and Pinecone, and modern MLOps tooling such as MLflow, Kubeflow, Docker, and Kubernetes. Cloud-fluent across AWS, Azure, and GCP, with practical experience building scalable APIs and microservices using FastAPI and Flask, integrating ML models into enterprise workflows, and partnering closely with data engineering, product, and platform teams to ship reliable AI solutions. Effective communicator who translates complex model behavior into clear business impact, mentors junior engineers, and consistently delivers measurable improvements in latency, accuracy, and operational efficiency.

SKILLS

• Languages: Python, SQL, Java, Bash, Scala

• Generative AI & LLMs: LangChain, LangGraph, LlamaIndex, CrewAI, AutoGen, OpenAI APIs, Anthropic Claude, AWS Bedrock, Google Gemini, prompt engineering, RAG pipelines, LLM fine-tuning, hallucination mitigation

• Agentic AI: Multi-agent orchestration, agent design patterns, tool invocation, autonomous workflows, agent evaluation, Google Agent Development Kit (ADK), MCP integrations

• ML & Deep Learning: PyTorch, TensorFlow, scikit-learn, XGBoost, LightGBM, Hugging Face Transformers, Keras, distributed training (DeepSpeed, FSDP, Accelerate)

• Foundation Models & Architectures: Transformers, encoder-decoder models, auto-regressive models, vision-language models, embedding models, model fine-tuning

• Vector Databases & Search: FAISS, Milvus, Qdrant, Chroma, Pinecone, semantic search, hybrid retrieval, chunking strategies

• MLOps & Production: MLflow, Kubeflow, model versioning, CI/CD pipelines, drift detection, observability, A/B testing, model monitoring

• Cloud Platforms: AWS (SageMaker, Bedrock, Lambda, S3, EKS), Azure (ML, OpenAI, AKS), GCP (Vertex AI, Gemini, GKE)

• APIs & Serving: FastAPI, Flask, REST, gRPC, microservices, Docker, Kubernetes, model inference optimization

• Data Engineering: Apache Spark, PySpark, Databricks, Delta Lake, Kafka, Airflow, ETL pipelines

EXPERIENCE

Client: Cboe Global Markets AI/Machine Learning Engineer Overland Park, KS Aug 2025 – Present

• Designed and deployed production-grade Generative AI applications including RAG pipelines and LLM-powered financial document intelligence using LangChain, LangGraph, and AWS Bedrock foundation models integrated with enterprise data sources.

• Built agentic AI systems with multi-agent orchestration, tool invocation, and autonomous workflows using LangGraph and CrewAI, automating multi-step financial analysis and research workflows.

• Developed and fine-tuned deep learning models in PyTorch and Hugging Face Transformers for prediction, classification, and embedding generation, applying transfer learning and parameter-efficient techniques across foundation model architectures.

• Engineered RAG retrieval layers with FAISS and Milvus vector databases, optimized chunking strategies and embedding models, and implemented hallucination mitigation through grounding and citation patterns.

• Built scalable FastAPI microservices on Kubernetes for low-latency model serving, integrated with Kafka for event-driven inference, and applied Redis caching for sub-100ms response targets.

• Implemented end-to-end MLOps pipelines with MLflow for experiment tracking and model versioning, deployed CI/CD automation via GitHub Actions, and set up production monitoring with drift detection and alerting.

• Partnered with data engineering, product, and platform teams to translate ambiguous business needs into clear AI roadmaps, and presented model behavior and tradeoffs to executive stakeholders.

• Mentored junior engineers on LLM design patterns, prompt engineering, and responsible AI practices, and contributed to internal AI engineering playbooks and architectural review forums.

Client: McKesson Corporation Machine Learning Engineer Overland Park, KS Feb 2025 – Jul 2025

• Built Generative AI and RAG applications for healthcare document automation and clinical knowledge retrieval using LangChain, LlamaIndex, and Azure OpenAI on Azure cloud infrastructure.

• Developed agentic workflows orchestrating multi-step clinical and supply chain operations, and integrated tool-use patterns with healthcare APIs and enterprise data systems.

• Fine-tuned transformer models on healthcare-specific corpora using PyTorch and Hugging Face, and applied parameter-efficient methods to balance accuracy and inference cost in production settings.

• Designed Qdrant-backed vector retrieval pipelines for healthcare policy and procedural document search, and applied chunking and embedding strategies tuned for clinical text characteristics.

• Deployed scalable FastAPI services on Azure Kubernetes Service with CI/CD automation, MLflow-based model lifecycle management, and production monitoring with drift detection.

• Built distributed PySpark data pipelines on Databricks processing terabyte-scale healthcare datasets, feeding model training, evaluation, and feature stores.

• Collaborated with clinical SMEs, product, and engineering teams to align AI capabilities with regulatory and operational requirements, and mentored junior team members on LLM evaluation best practices.

Client: Kshema General Insurance Data Scientist Hyderabad, TS Jul 2022 – Nov 2023

• Built production AI/ML systems for insurance fraud detection, claims classification, and risk scoring using Python, PyTorch, and scikit-learn, deployed on AWS with SageMaker and Lambda.

• Developed early-stage Generative AI document automation using LangChain and LLM APIs to extract structured fields from unstructured insurance policy and claims documents.

• Engineered FAISS-based retrieval and embedding pipelines for policy similarity search and underwriting decision support, integrated with FastAPI microservices.

• Built end-to-end ML pipelines with MLflow versioning, Docker and Kubernetes deployment, and monitoring dashboards tracking model performance and data drift in production.

• Designed PySpark ETL pipelines processing large-scale claims and policy datasets, and applied feature engineering and statistical analysis to improve model lift and stability.

• Partnered with claims operations, underwriting, and IT teams to deliver measurable business outcomes including reduced fraud loss and faster claims processing.

• Contributed to internal AI guidelines and presented model findings to senior leadership using clear visualizations and business-focused storytelling.

Client: T-Mobile Data Scientist Hyderabad, TS Jul 2020 – Jun 2022

• Designed and deployed customer analytics and network intelligence models using Python, PyTorch, and TensorFlow on AWS, supporting churn prediction, customer lifetime value, and propensity scoring.

• Built early LLM-powered customer service assistants and semantic search components using LangChain and Hugging Face Transformers, integrated with internal CRM systems.

• Developed FAISS-based embedding retrieval over telecommunications knowledge bases, and applied NLP techniques for intent classification and topic modeling on customer interactions.

• Built end-to-end MLOps pipelines with MLflow experiment tracking, Docker and Kubernetes container orchestration, and CI/CD automation via GitHub Actions on AWS infrastructure.

• Designed PySpark distributed data pipelines processing high-volume customer and network telemetry datasets, supporting training, evaluation, and production inference.

• Collaborated with marketing, customer experience, and engineering teams to translate analytics into actionable strategies, and presented results to non-technical audiences with clear business framing.

• Mentored interns and junior analysts on ML fundamentals, model evaluation, and Python engineering best practices in a fast-paced Agile environment.

PROJECTS

DocOps AI: Multimodal Document Intake and Exception Review

• Built multimodal document intelligence pipeline benchmarking OCR-first, layout-aware, and vision-language approaches with RAG retrieval using LangGraph and FAISS, drift monitoring, and human-review feedback loop.

• Shipped containerized FastAPI inference API on Kubernetes with MLflow experiment tracking, data validation, and production latency profiling delivering sub-second response times.

OpsPulse: Ticket Demand Forecasting and SLA-Risk Monitoring

• Built multi-horizon forecasting platform combining transformer-based sequence models and gradient boosting with probabilistic evaluation metrics, rolling backtests, and fairness audits.

• Implemented drift detection, retraining triggers, and SLO-based alerting in a containerized MLOps pipeline integrating Kafka events and Kubernetes-based inference services.

EDUCATION

University of Central Missouri, USA Master's in Computer Science Jan 2024 – Dec 2025

Andhra University, India Bachelor's in Computer Science Aug 2017 – Jun 2021

CERTIFICATIONS

• AWS Certified AI Practitioner

• NVIDIA Certified Associate – Generative AI and LLMs
