
Senior AI Engineer with 9+ Years in ML/LLM Deployment

Location:
Edison, NJ
Posted:
January 05, 2026


Surender Reddy

AI ENGINEER

+1-732-***-**** | ********.***.*@*****.*** | LinkedIn

PROFESSIONAL SUMMARY:

• Senior AI / GenAI Engineer with 9+ years of overall software engineering experience and 7+ years of hands-on Machine Learning and Deep Learning expertise.

• Specialized in designing, building, and deploying Large Language Model (LLM) systems in production environments.

• Proven experience delivering Retrieval-Augmented Generation (RAG) solutions for enterprise-scale search, reasoning, and decision support.

• Strong background in agentic AI architectures, including multi-agent workflows using LangChain, LangGraph, and CrewAI.

• Extensive experience working with healthcare, financial services, and retail data, including in regulated, compliance-heavy environments.

• Expert in end-to-end AI system lifecycle, from data ingestion and model training to deployment, monitoring, and optimization.

• Demonstrated success optimizing model performance, latency, and cloud cost efficiency for large-scale AI workloads.

• Deep expertise in Python-based AI development, leveraging PyTorch, TensorFlow, and Hugging Face Transformers.

• Hands-on leader in implementing MLOps, CI/CD, and cloud-native AI platforms using Kubernetes and modern DevOps practices.

• Strong collaborator with product managers, data scientists, and engineering teams to translate business problems into production-ready AI solutions.

TECHNICAL SKILLS

• Artificial Intelligence & ML: Machine Learning, Deep Learning, NLP, Transformers, Generative AI, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG)

• LLMs & Prompt Engineering: GPT-4, Gemini-class models, LLaMA-3, Mixtral, Prompt Engineering, Tool Calling, Context Engineering

• Agentic AI Frameworks: LangChain, LangGraph, Semantic Kernel, CrewAI, Multi-Agent Orchestration

• NLP & Document AI: Text Classification, Summarization, Entity Extraction, Question Answering, OCR, Document Understanding

• Vector Databases & Search: Vertex AI Vector Search, Pinecone, FAISS, Weaviate, Embeddings, Semantic Search

• MLOps & Model Lifecycle: MLflow, Model Versioning, Experiment Tracking, Model Monitoring, A/B Testing, Regression Testing

• Deployment & APIs: FastAPI, REST/gRPC APIs, Microservices Architecture, Docker, Kubernetes

• Cloud & AI Platforms: GCP Vertex AI, BigQuery, AWS SageMaker, AWS Bedrock, Azure ML

• Programming & Data: Python, SQL, TypeScript/Node.js, Data Pipelines, Feature Engineering

• DevOps & Governance: GitHub Actions, Jenkins, Terraform, Helm, Responsible AI, Explainability (SHAP, LIME), HIPAA, GDPR

CERTIFICATIONS

• NVIDIA Certified Professional – Agentic AI

• DeepLearning.AI – AI Agents in LangGraph

• DeepLearning.AI – Multi-Agent Systems with CrewAI

PROFESSIONAL EXPERIENCE

Elevance Health – Indianapolis, IN

GenAI Engineer Jul 2024 – Present

• Acted as the primary technical owner for enterprise GenAI initiatives, defining architecture standards, engineering best practices, and production readiness criteria for healthcare AI platforms.

• Designed and deployed LLM-powered healthcare systems supporting evidence extraction, claims analytics, and clinical reasoning used by cross-functional business and clinical stakeholders.

• Architected high-accuracy RAG pipelines over ICD-10 claims, clinical notes, lab results, medications, and problem lists, ensuring reliable, grounded responses in regulated use cases.

• Built AI evidence engines to support CMS HCC risk adjustment and clinical review workflows, improving data accessibility and decision consistency.

• Designed embedding strategies, chunking logic, hybrid retrieval mechanisms, and re-ranking pipelines to improve recall, relevance, and factual accuracy of LLM outputs.

• Developed LLM evaluation frameworks to measure groundedness, factual accuracy, consistency, and response stability across model and prompt versions.

• Implemented hallucination detection and regression testing pipelines that act as mandatory quality gates before promoting models to production.

• Optimized LLM inference performance and cloud costs through batching strategies, GPU utilization tuning, latency optimization, and caching techniques.

• Deployed fault-tolerant, scalable AI microservices using FastAPI, Docker, and Kubernetes to support high-availability production workloads.

• Designed and maintained CI/CD pipelines for AI systems, enabling automated validation, deployment, rollback, monitoring, and incident response.

• Conducted A/B testing and controlled experiments to evaluate prompt strategies, retrieval configurations, and model variants prior to release.

• Championed Responsible AI practices, ensuring explainability, bias mitigation, auditability, and compliance with HIPAA and GDPR regulations.

• Led the translation of clinical and business requirements into GenAI system designs, ensuring alignment between model behavior and real-world healthcare workflows.

• Partnered with data engineering teams to standardize data ingestion and preprocessing pipelines for structured and unstructured clinical datasets.

• Designed fallback and confidence-scoring mechanisms to safely handle low-relevance or low-confidence LLM responses in production.

• Implemented prompt lifecycle management practices, including versioning, rollback, and controlled rollout across environments.

• Built observability dashboards tracking LLM latency, token usage, retrieval hit rates, and hallucination trends in production.

• Conducted failure-mode analysis on GenAI outputs to identify root causes of incorrect or incomplete responses.

Environment:

Python, PyTorch, TensorFlow, Hugging Face, Gemini-class LLMs, LangChain, LangGraph, RAG Pipelines, Embeddings, Vertex AI Vector Search, GCP Vertex AI, BigQuery, FastAPI, Docker, Kubernetes, MLflow, GitHub Actions, Jenkins, Terraform

Edward Jones – St. Louis, MO

AI / ML Engineer Feb 2022 – Jun 2024

• Designed and deployed machine learning and deep learning models supporting NLP-driven analytics and decision-support platforms used by financial advisors.

• Built LLM-assisted enterprise knowledge systems that enabled natural language querying of research documents, compliance policies, and portfolio data.

• Architected RAG-based retrieval platforms over financial research, regulatory documentation, and transactional datasets with strict compliance requirements.

• Designed semantic search and embedding pipelines to support high-recall, low-latency retrieval across large document corpora.

• Developed NLP pipelines for entity extraction, summarization, and sentiment analysis using transformer-based architectures such as BERT and T5.

• Built scalable inference services using FastAPI, Docker, and Kubernetes, ensuring secure and reliable integration with core brokerage platforms.

• Integrated AI services into enterprise systems via REST APIs, supporting real-time and batch decision workflows.

• Collaborated with domain experts to encode financial knowledge and business rules into retrieval and prompting strategies.

• Designed document ingestion and indexing pipelines to support large-scale knowledge bases with frequent updates.

• Implemented ranking and scoring logic to prioritize authoritative sources in LLM-generated responses.

• Supported model retraining and re-evaluation cycles as new financial data became available.

• Built internal tooling to allow non-technical users to test and validate AI-driven insights.

• Conducted performance tuning of inference services to meet strict latency requirements for advisor-facing applications.

• Engineered fraud detection and anomaly detection models using supervised and unsupervised ML techniques to identify suspicious activity.

• Implemented model monitoring, explainability, and A/B testing frameworks to improve trust, transparency, and adoption of AI solutions.

• Built CI/CD pipelines for ML workflows, enabling reproducible training, validation, and deployment across environments.

• Collaborated with compliance and risk teams to ensure regulatory-aligned AI deployments in financial services environments.

Environment:

Python, PyTorch, TensorFlow, Scikit-learn, Hugging Face, BERT, T5, RAG, FastAPI, Docker, Kubernetes, Vertex AI, AWS SageMaker, MLflow, Jenkins

Kroger Retail – Cincinnati, OH

Data Scientist Apr 2020 – Jan 2022

• Developed predictive ML models for demand forecasting, pricing optimization, and customer segmentation across large-scale retail datasets.

• Designed and deployed recommendation engines using collaborative filtering and clustering techniques to drive personalized customer experiences.

• Designed customer behavior features using transactional, behavioral, and temporal data.

• Evaluated model performance across different customer segments to identify bias and drift.

• Built forecasting models to support inventory planning and supply chain optimization.

• Implemented data validation checks to ensure model inputs met quality standards.

• Applied NLP techniques such as sentiment analysis and topic modeling to extract insights from customer feedback and survey data.

• Built robust feature engineering pipelines to support large-scale structured and unstructured data modeling.

• Implemented ensemble learning methods and hyperparameter optimization to improve model accuracy and robustness.

• Developed real-time streaming analytics pipelines using Kafka and Spark Streaming for operational insights.

• Created executive dashboards and reports using Tableau and Power BI to communicate insights to leadership.

• Automated ETL workflows using Airflow, Spark, and Databricks to ensure reliable data ingestion.

• Deployed ML models as REST APIs using FastAPI and Docker for enterprise integration.

• Implemented MLOps best practices, including CI/CD pipelines, MLflow tracking, and model versioning.

• Ensured data governance, security, and compliance across analytics and ML systems.

Environment:

Python, Scikit-learn, XGBoost, TensorFlow, Spark, Databricks, Airflow, Kafka, AWS, GCP, BigQuery, FastAPI, Docker, Tableau, Power BI

AgFirst – Columbia, SC

Data Scientist / Data Analyst Oct 2018 – Mar 2020

• Built automated ETL pipelines for structured and unstructured data using Python, SQL, and Spark.

• Developed predictive models for churn analysis, forecasting, and customer retention initiatives.

• Conducted statistical analysis and hypothesis testing to support business decision-making.

• Built time-series models to support planning and forecasting use cases.

• Conducted root-cause analysis on performance anomalies and unexpected trends.

• Created self-service dashboards to reduce ad-hoc reporting requests.

• Applied NLP techniques to analyze customer feedback, surveys, and textual datasets.

• Designed and executed A/B testing and uplift modeling for marketing and pricing strategies.

• Built recommendation systems to support personalization and cross-sell opportunities.

• Developed geospatial analytics dashboards to support regional and location-based insights.

• Deployed containerized ML workflows using Docker and Kubernetes for scalable execution.

• Designed analytical data models and star schemas for reporting and BI use cases.

• Ensured data governance, security, and regulatory compliance across analytics pipelines.

• Supported business stakeholders with ad-hoc analysis and actionable insights.

Environment:

Python, SQL, Spark, Docker, Kubernetes, MLflow, Tableau, AWS, GCP

S&P Global – Bangalore, India

Data Analyst Sep 2016 – May 2018

• Designed interactive dashboards and KPI reports using Tableau and Power BI.

• Developed complex SQL queries and stored procedures for large-scale relational datasets.

• Conducted exploratory data analysis using Python and R to identify trends and insights.

• Automated ETL workflows using Informatica and Talend.

• Integrated data from ERP and CRM systems for enterprise reporting.

• Performed data validation and quality checks to ensure reporting accuracy.

• Supported business intelligence initiatives across sales and operations teams.

• Optimized database performance through indexing and query tuning.

• Collaborated with stakeholders to translate requirements into analytical solutions.

• Delivered executive-ready reports supporting strategic decision-making.

EDUCATION

Bachelor of Technology – Computer Science

St. Martin’s Engineering College, India


