Senior GenAI & Agentic ML Engineer (LLMs, MLOps)

Location:

San Francisco, CA, 94102

Posted:

April 09, 2026

Resume:

SAI CHARAN J

Senior Generative AI & Agentic Data Scientist · LLMs · MLOps · AWS · FinTech & HealthTech

+1-469-***-**** · ****************@*****.*** · linkedin.com/in/sai-charan-reddy-22345ai · github.com/Saicharanjanga · San Diego, CA

PROFESSIONAL SUMMARY

Senior AI/ML Engineer with 5+ years delivering production GenAI and ML systems across fintech and healthcare. Deep expertise in end-to-end LLM pipelines — from fine-tuning (LoRA/QLoRA/PEFT) and RAG architecture to agentic workflows with LangChain and LangGraph. Experienced in MCP (Model Context Protocol) for tool-connected agent systems, and LLM agent benchmarking and testing frameworks for evaluating agent reliability, tool-use accuracy, and task completion. Built and deployed scalable inference services on AWS (SageMaker, ECS, Lambda) processing 50M+ daily transactions at sub-100ms latency. Proven track record in regulated environments (PCI-DSS, HIPAA): 34% fraud detection improvement, automated retraining pipelines, guardrails against prompt injection, and PHI/PII data governance at scale. MS in Computer Science (GPA 3.9).

CORE COMPETENCIES

GenAI & LLM Systems

LangChain · LangGraph · AutoGen · RAG · Fine-tuning (LoRA/QLoRA) · Prompt Engineering · LLM-as-Judge Eval

Agentic & MLOps

MCP · Tool-use Agents · Memory · Agent Benchmarking & Testing · CI/CD · Monitoring · Rollback

Cloud & Deployment

AWS SageMaker · ECS · EKS · Lambda · Docker · Kubernetes · Terraform · PyTorch · Hugging Face · FastAPI

TECHNICAL SKILLS

GenAI / LLMs

LangChain, LangGraph, AutoGen, CrewAI, OpenAI API, Anthropic Claude API, Hugging Face Transformers, RAG, PEFT, LoRA, QLoRA, SFT, Prompt Engineering, MCP (Model Context Protocol), LLM-as-Judge, Offline Eval Harnesses

ML / Deep Learning

PyTorch, TensorFlow, Scikit-learn, XGBoost, LightGBM, SHAP, LIME, NLP (spaCy, BERT), MLflow, Feature Engineering, Model Monitoring, Drift Detection

Agentic Systems

MCP (Model Context Protocol), Tool-use Agents, Memory Architectures, Agent Benchmarking & Testing (AgentBench, ToolBench, GAIA), Safety Guardrails, Prompt Injection Defense, Multi-agent Orchestration

Cloud & Infra

AWS (SageMaker, ECS, EKS, Lambda, EC2, S3, RDS, DynamoDB, OpenSearch), Azure (AKS, Functions, Cosmos DB), GCP basics, Docker, Kubernetes, Terraform

MLOps & Observability

CI/CD (Jenkins, GitLab), Model Versioning, A/B Testing, Prometheus, Grafana, ELK Stack, CloudWatch, Automated Retraining, Rollback Strategies

Data & SQL

Python, SQL (PostgreSQL, Oracle, MySQL), PySpark, Apache Spark, Kafka, ETL Pipelines, Redis, MongoDB, Feature Stores

Backend & APIs

FastAPI, Flask, Spring Boot, REST, GraphQL, Docker, Microservices

Security & Compliance

PCI-DSS, HIPAA, PHI/PII Governance, OWASP, AES-256, RBAC, Audit Logging, Guardrails

PROFESSIONAL EXPERIENCE

Senior ML Engineer, GenAI & Fraud Intelligence · Citibank · San Francisco, CA · Aug 2024 – Present

Real-Time Fraud Detection & GenAI Platform · Python, XGBoost, LangChain, LangGraph, FastAPI, AWS SageMaker, Kafka, Docker, Kubernetes

• Architected end-to-end GenAI pipeline using LangChain and LangGraph to build MCP-compatible tool-use agentic workflows for fraud investigation, integrating memory, guardrails against prompt injection, and safety filters for PCI-DSS compliance.

• Fine-tuned LLMs using LoRA/QLoRA (PEFT) on transaction metadata and fraud labels; managed experiment tracking with MLflow, achieving a 34% improvement in fraud detection accuracy over legacy rule-based systems.

• Deployed inference services on AWS SageMaker (GPU-accelerated) and Lambda with Docker/Kubernetes, achieving sub-100ms latency across 50M+ daily transactions through batching, Redis caching, and quantization.

• Established LLM agent benchmarking and testing framework to evaluate tool-use accuracy, task completion rates, and reasoning reliability across agent workflows; built offline and online evaluation pipelines using LLM-as-judge metrics and custom test harnesses; implemented CI/CD for model and prompt updates via Jenkins with automated rollback on performance degradation.

• Implemented PHI/PII data governance, AES-256 encryption, and RBAC access controls; applied SHAP-based explainability to model decisions, ensuring regulatory auditability.

• Engineered scalable feature pipelines (Kafka + FastAPI) extracting behavioral, temporal, and NLP signals (spaCy, BERT) from 50M+ daily transactions, cutting feature extraction latency from 2.5s to 400ms.

• Migrated legacy Java batch processing to Python microservices on AWS ECS/EKS, reducing processing time from 4 hours to 12 minutes; achieved 82% test coverage with pytest across all ML pipeline components.

Machine Learning Engineer · MetLife · New York, NY · Sep 2021 – Nov 2023

Claims Fraud Detection & RAG-Powered Knowledge Platform · Python, Scikit-learn, PySpark, LangChain, Hugging Face, FastAPI, Azure, Kafka

• Built RAG-powered claims-summary application using vector embeddings, PyTorch, and Amazon OpenSearch Service with MCP-based tool integrations for real-time data access; designed and scoped the full GenAI solution from data prep and chunking strategy through evaluation and production deployment.

• Developed end-to-end ML pipelines for claims fraud detection and customer churn prediction: EDA, feature engineering, model training (Scikit-learn, PyTorch), deployment as FastAPI REST APIs, and automated MLOps monitoring.

• Applied NLP (Hugging Face Transformers, spaCy) to extract clinical entities from unstructured claims notes and implemented an OCR pipeline (Tesseract + PyTorch) for scanned medical documents, improving categorization accuracy by 22%.

• Configured CI/CD with Jenkins triggering automated model retraining when performance degraded below threshold; implemented drift detection and anomaly detection pipelines using Scikit-learn and Pandas.

• Designed HIPAA-compliant data architecture across Oracle, Azure Cosmos DB, MongoDB, and Azure SQL; enforced PHI/PII protections with immutable audit logs and encryption at rest and in transit.

• Collaborated with Deloitte consultants and MetLife stakeholders in Agile sprints delivering compliant ML solutions; reduced false positives by 15% through continuous evaluation and feedback loops.

Python Developer · CVS Health · Hyderabad, India · Jun 2020 – Aug 2021

E-Clinic Telehealth Platform · Python, Flask, NLP, Pandas, Azure SQL, Redis, React

• Built ETL pipelines using Pandas to extract patient vitals and EHR records from multiple clinic systems into Azure SQL; processed unstructured consultation notes with NLP (text extraction, summarization), reducing documentation time by 20%.

• Developed Flask REST APIs for telehealth platform (patient registration, appointments, virtual consultations); implemented Redis caching, reducing DB queries by 35%; achieved 75%+ pytest coverage across all endpoints.

• Designed HIPAA-compliant MySQL schema with encryption and Azure Blob Storage for secure medical document storage; implemented data retention policies with automated archiving.

KEY PROJECTS

Neural Chat: Multi-LLM Agentic Chatbot · Python, LangChain, OpenAI API, Claude API, ReactJS

• Built full-stack AI chatbot integrating Anthropic Claude and OpenAI GPT APIs with MCP-based tool connectivity, topic-based context switching, session memory, and dynamic system prompt architecture simulating expert-level AI/ML knowledge across 5 domains.

• Implemented LLM agent benchmarking suite to evaluate response quality, tool-use accuracy, and multi-turn reasoning; deployed as zero-build static application on GitHub Pages/Vercel with Streamlit UI for RAG chunking strategy experiments and multimodal inference.

Smart Autocomplete Engine · Python, Trie, LFU Cache, Min-Heap

• Designed product-grade autocomplete engine with composite relevance scoring: frequency (40%) + recency decay (35%) + personalization (25%); O(L) prefix search using Trie + O(1) LFU Cache. 28/28 unit tests passing.
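
A minimal sketch of how the composite score in the bullet above might be computed. Only the three weights (40/35/25) and the use of a Trie and a heap come from the resume. The class and method names, the one-hour recency half-life, the frequency normalization, and the definition of personalization as the user's share of a term's selections are all illustrative assumptions, and the LFU cache is omitted for brevity.

```python
import heapq
from dataclasses import dataclass, field

# Weights from the resume bullet; everything else below is an assumption.
W_FREQ, W_RECENCY, W_PERSONAL = 0.40, 0.35, 0.25
HALF_LIFE_S = 3600.0  # hypothetical recency half-life, in seconds


@dataclass
class Node:
    children: dict = field(default_factory=dict)
    is_term: bool = False
    freq: int = 0            # global selection count for this term
    last_used: float = 0.0   # timestamp of the most recent selection


class Autocomplete:
    """Hypothetical engine: Trie prefix lookup plus weighted ranking."""

    def __init__(self):
        self.root = Node()
        self.max_freq = 1
        self.user_counts = {}  # (user, term) -> selection count

    def record(self, term, user, now):
        """Register that `user` selected `term` at time `now`."""
        node = self.root
        for ch in term:
            node = node.children.setdefault(ch, Node())
        node.is_term = True
        node.freq += 1
        node.last_used = now
        self.max_freq = max(self.max_freq, node.freq)
        key = (user, term)
        self.user_counts[key] = self.user_counts.get(key, 0) + 1

    def _score(self, term, node, user, now):
        freq = node.freq / self.max_freq                      # in (0, 1]
        recency = 0.5 ** ((now - node.last_used) / HALF_LIFE_S)
        personal = self.user_counts.get((user, term), 0) / node.freq
        return W_FREQ * freq + W_RECENCY * recency + W_PERSONAL * personal

    def suggest(self, prefix, user, now, k=5):
        """Return up to k completions of `prefix`, best score first."""
        node = self.root
        for ch in prefix:          # O(L) walk to the prefix node
            node = node.children.get(ch)
            if node is None:
                return []
        scored, stack = [], [(prefix, node)]
        while stack:               # collect every term under the prefix
            term, cur = stack.pop()
            if cur.is_term:
                scored.append((self._score(term, cur, user, now), term))
            for ch, child in cur.children.items():
                stack.append((term + ch, child))
        return [t for _, t in heapq.nlargest(k, scored)]
```

In this sketch, timestamps are passed in explicitly rather than read from the clock, which keeps ranking deterministic and easy to unit-test; `heapq.nlargest` performs the top-k selection with a bounded heap.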

EDUCATION

Master of Science, Computer and Information Sciences · GPA: 3.9 / 4.0 · Dec 2023 – May 2025

Southern Arkansas University, Magnolia, AR

CERTIFICATIONS

AWS Certified Solutions Architect – Associate · 2024

Oracle Cloud Infrastructure AI Foundations Associate · 2025
