Sai Krishna Kokkula
Jersey City, NJ ***** 848-***-**** **********************@*****.*** LinkedIn GitHub Website
PROFESSIONAL SUMMARY
Full-Stack & AI Engineer with 3+ years of experience building production-grade LLM applications, RAG pipelines, and scalable backend systems. Proven track record at Walmart and Centene delivering measurable performance gains: a 40% reduction in hallucination rates, a 3x throughput improvement, and sub-200ms semantic search over 500K+ documents. Expert in Python (FastAPI), React/Next.js, LangChain, and cloud-native deployments.
TECHNICAL SKILLS
AI / GenAI: LangChain, OpenAI API, RAG, Prompt Engineering, LLM Fine-tuning (LoRA), FAISS, ChromaDB, Qdrant, Semantic Search
Backend: Python, FastAPI, Flask, Node.js (Express), REST APIs, Microservices, SQLAlchemy, PostgreSQL, MongoDB, Redis
Frontend: React.js, Next.js, TypeScript, Redux Toolkit, Tailwind CSS, HTML5/CSS3
DevOps & Cloud: Docker, Docker Compose, GitHub Actions, GCP (GCS, Cloud Logging), Vercel, CI/CD
Security & Testing: JWT, OAuth2, RBAC, Rate Limiting, Jest, Mocha, Cypress
PROFESSIONAL EXPERIENCE
Full-Stack AI Developer
Walmart – Bentonville, AR
Oct 2025 – Present
AI-Powered Product Search & Recommendation Engine
Designed production RAG pipelines (LangChain + OpenAI API + FAISS/ChromaDB), cutting chatbot hallucination rates by 40%
Built semantic search over 500K+ documents using embedding models + FAISS, achieving <200ms query latency
Implemented hybrid dense + sparse retrieval with reranking, improving top-5 search accuracy by 35%
Fine-tuned Llama 2/3 and Mistral on domain data using LoRA; served via FastAPI with streaming responses
Architected async FastAPI services handling 5K+ concurrent requests with PostgreSQL connection pooling; reduced DB load by 60% via Redis caching
Migrated a Node.js/Express + MongoDB API to FastAPI + PostgreSQL, achieving a 3x throughput improvement
Optimized SQL queries (indexing, partitioning, CTEs), reducing p95 latency from 2s to 150ms
Built prompt playground with versioned templates and real-time token counting, used daily by 20+ AI engineers
Containerized full RAG stack with Docker Compose; set up GitHub Actions CI/CD, reducing deployment time from 30 to 8 min
Full-Stack Developer
Centene Corporation, USA
May 2024 – Sep 2025
Intelligent Customer Support Chatbot (RAG + LLM)
Built an end-to-end RAG-powered customer support assistant using LangChain + OpenAI API over a company knowledge base
Developed scalable full-stack applications with React, Next.js (SSR/App Router), FastAPI, and TypeScript with Redux Toolkit
Designed and maintained versioned RESTful APIs with async processing, proper validation, and JWT/OAuth2 security
Implemented Redis caching and PostgreSQL/MongoDB query optimizations, significantly reducing API response latency
Built and deployed microservices with Docker on GCP; automated CI/CD pipelines using GitHub Actions
Created document ingestion pipelines with embedding generation for knowledge base indexing using FAISS and ChromaDB
Front-End Engineer
Nisum – Hyderabad, India
Dec 2022 – Nov 2023
Web Application Development (Full Stack with Python & React)
Built reusable React/Angular/Vue component libraries integrated with Python backends (Django, Flask, FastAPI) via REST APIs
Implemented state management (Redux, Context API), JWT/OAuth authentication, and responsive cross-browser UI
Improved page load performance via lazy loading, code splitting, and Webpack/Vite optimization
Wrote unit and integration tests (Jest, Mocha, Cypress); participated in Agile sprints, code reviews, and design handoffs
EDUCATION
Pace University, New York – Master's degree in Computer Science, Jan 2024 – Dec 2025