Suresh Dasari
+1-409-***-**** ***********@*****.*** Atlanta, GA https://www.linkedin.com/in/sdasari9704/
Professional Summary
Generative AI and Machine Learning Engineer with nearly 5 years of experience designing, deploying, and optimizing LLM-driven solutions for large-scale enterprise applications.
Proven expertise in Retrieval-Augmented Generation (RAG) systems, vector databases, and semantic search for delivering accurate, context-aware responses.
Skilled in LLM fine-tuning and prompt engineering, applying advanced reasoning, context optimization, and safety mechanisms to ensure reliable model behavior.
Hands-on experience with multi-agent orchestration frameworks like LangChain, CrewAI, and AutoGen to enable autonomous, goal-driven AI workflows.
Strong background in cloud-native AI infrastructure, including AWS (Bedrock, ECS, Lambda, DynamoDB) and GCP (Vertex AI, Cloud Run, Composer).
Proficient in MLOps and CI/CD automation using Kubernetes, Docker, Terraform, Kubeflow, and MLflow for scalable and reproducible model deployment.
Experienced in developing RESTful API services with Python and FastAPI to integrate AI inference pipelines and enable event-driven automation.
Deep understanding of data engineering and analytics pipelines, leveraging PySpark, Databricks, and Azure Data Factory for large-scale data processing.
Applied Responsible AI and governance frameworks, ensuring explainability, fairness, and compliance across multi-cloud environments.
Strong research and implementation background in transformer-based NLP models, including BERT, GPT-3.5, and custom fine-tuned LLMs for summarization, entity extraction, and content generation.
Technical Skills:
Programming & Frameworks: Python, SQL, R, Bash, TensorFlow, PyTorch, Keras, Scikit-learn, XGBoost, LightGBM, Pandas, NumPy, FastAPI, Flask.
Generative AI / NLP & Search: GPT-4, GPT-3.5 Turbo, BERT, Llama 2, Hugging Face Transformers, LangChain, CrewAI, AutoGen, LangGraph, FAISS, Pinecone, OpenSearch, Elasticsearch, BM25, Dense Embeddings, Semantic Search.
Data Engineering, Analytics & Visualization: PySpark, Databricks, Azure Data Factory, Streamlit, Tableau, Power BI, Matplotlib, Seaborn, ARIMA, Prophet, Data Lakes, ETL Pipelines.
MLOps, Deployment & Cloud Platforms: MLflow, Kubeflow, Jenkins, GitHub Actions, Tekton, Docker, Kubernetes, OpenShift, Terraform, AWS (Bedrock, SageMaker, Lambda, ECS, API Gateway, OpenSearch, DynamoDB, Glue, S3), GCP (Vertex AI, Cloud Run, DataProc, Composer), Azure (Databricks, Data Factory, Blob Storage).
Databases, Computer Vision & Governance: MongoDB, DynamoDB, PostgreSQL, MySQL, OpenCV, Amazon Textract, OCR, CNN, Object Detection, Image Segmentation, AI Guardrails, Explainable AI (XAI), RLHF, Model Drift Detection, Bias Mitigation.
Tools & Environments: Linux, Git, Jupyter, VS Code, Cursor AI, CI/CD Pipelines, Cloud Infrastructure Monitoring.
Education
Master of Science (M.S.) in Computer Science
Lamar University, Beaumont, TX, USA May 2024
Bachelor of Technology (B.Tech.) in Computer Science and Engineering
Bharath University, Chennai, India
Professional Experience
Tempus AI Jun 2024 – Present
GenAI/ML Engineer Austin, TX
Designed and implemented generative AI solutions by integrating Large Language Models (LLMs) with retrieval and reasoning pipelines deployed on cloud-based architectures, enabling intelligent automation and context-aware system interactions.
Built Retrieval-Augmented Generation (RAG) systems using Amazon OpenSearch for semantic indexing, vector databases for dense embedding storage, and custom retrieval mechanisms to deliver accurate and knowledge-grounded responses.
Applied advanced prompt engineering strategies such as Chain-of-Thought reasoning to enhance logical flow, context optimization to preserve multi-turn memory, and guardrails to maintain safe, explainable, and compliant model outputs.
Developed comprehensive AI safety and compliance frameworks by combining content moderation pipelines to detect sensitive material, intent classification models to interpret user goals, and authentication mechanisms to ensure secure access control.
Created automated AI-powered content generation pipelines that performed data scraping, entity extraction, and document summarization using transformer-based NLP models, embedding techniques, and custom text generation algorithms.
Engineered RESTful API services with Python to handle business logic, utilized FastAPI for high-performance API orchestration, and integrated AWS Lambda for event-driven execution and auto-scaling of AI inference workloads.
Leveraged AWS cloud infrastructure — including Bedrock for foundation model integration, ECS for containerized deployment, API Gateway for secure endpoint management, S3 for data storage, DynamoDB for fast key-value retrieval, and OpenSearch for vector-based search — to deliver low-latency, high-availability AI services.
Designed data logging and analytics systems using DynamoDB for chat interaction tracking, OpenSearch Vector DB for contextual knowledge retrieval, and reinforcement learning from human feedback (RLHF) techniques to iteratively fine-tune model behavior.
Developed evaluation and benchmarking pipelines that used automated scoring metrics, structured prompt evaluation frameworks, and Streamlit dashboards to measure LLM quality, track response accuracy, and visualize performance over time.
Implemented multi-agent orchestration frameworks such as CrewAI, LangChain, and AutoGen, allowing multiple AI agents to coordinate reasoning, manage complex workflows, and execute goal-driven decision-making processes autonomously.
Fine-tuned and optimized GPT-3.5 Turbo and other large-scale LLMs using domain-relevant datasets, applying LangGraph for persistent conversation state management and adaptive, context-aware response generation.
Adopted Responsible AI, MLOps, and observability best practices by using Docker for environment consistency, Kubernetes for scalable orchestration, and CI/CD pipelines for continuous integration, automated testing, and production-grade model deployment.
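The retrieval step behind the RAG work described above can be sketched in miniature: rank documents by embedding similarity to the query, then assemble a grounded prompt for the LLM. This is an illustrative toy only — bag-of-words counts stand in for dense embeddings, and names like `retrieve` and `build_prompt` are invented for the sketch, not the production system:

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a dense embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    # RAG retrieval step: return the k documents most similar to the query.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs, k=2):
    # Ground the downstream LLM call in the retrieved context.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Refund requests require an order number.",
]
print(build_prompt("How do refunds work?", docs))
```

In a real deployment the `embed` function would call an embedding model and the ranking would run inside a vector store such as OpenSearch or FAISS; the control flow, however, is the same.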
Citadel Feb 2023 – May 2024
Data Scientist/AI/ML Engineer Austin, TX
Engineered Generative AI architectures powered by GPT-3 and GPT-3.5 Turbo, focusing on conversational intelligence, contextual understanding, and natural language summarization for large-scale enterprise use cases.
Designed cloud-native ML ecosystems using Google Cloud Platform (GCP) — orchestrating data pipelines through Cloud Composer, deploying containerized inference workloads via Cloud Run, and managing distributed processing using PySpark.
Implemented Retrieval-Augmented Generation (RAG) frameworks by connecting Large Language Models with vector databases such as FAISS and Pinecone, enabling semantic search and context-aware responses based on stored embeddings.
Automated infrastructure provisioning and environment setup through Terraform, integrating GitHub Actions and Jenkins to establish robust CI/CD pipelines that supported secure, continuous deployment of AI models and backend services.
Fine-tuned and optimized transformer-based NLP models using Hugging Face Transformers, applying BERT and GPT-3.5 for sentiment classification, entity recognition, and text summarization tasks through transfer learning.
Leveraged AWS SageMaker and Google Vertex AI to deploy, monitor, and scale production-grade ML models — optimizing cost, latency, and throughput through model endpoints, auto-scaling strategies, and cloud-based experiment tracking.
Developed semantic and hybrid search solutions that integrated OpenSearch and Elasticsearch for keyword-based retrieval with FAISS for vector similarity ranking, improving document discovery and relevance in large-scale datasets.
Built end-to-end MLOps pipelines with Kubeflow for workflow orchestration and MLflow for experiment tracking and model registry, ensuring reproducibility, observability, and automated retraining upon data drift detection.
Created modular AI orchestration workflows with LangChain, combining memory management, contextual retrieval, and reasoning steps to enable multi-turn, adaptive automation across complex conversational systems.
Advocated Responsible AI and governance principles, ensuring model explainability, fairness, and auditability, while collaborating with DevOps and data engineering teams to deliver secure, production-ready AI systems across multi-cloud environments.
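The hybrid search approach described above — fusing keyword-based retrieval with vector similarity ranking — can be illustrated with a minimal score-fusion sketch. The normalization scheme, `alpha` weighting, and function names here are assumptions chosen for clarity, not the production ranking logic:

```python
def normalize(scores):
    # Min-max normalize so lexical and dense scores are on a comparable scale.
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_rank(doc_ids, bm25_scores, dense_scores, alpha=0.5):
    # Linear fusion of a BM25-style lexical score and a dense similarity
    # score, then rank document ids by the fused score, descending.
    b = normalize(bm25_scores)
    d = normalize(dense_scores)
    fused = {doc: alpha * bs + (1 - alpha) * ds
             for doc, bs, ds in zip(doc_ids, b, d)}
    return sorted(fused, key=fused.get, reverse=True)

print(hybrid_rank(["a", "b", "c"], [2.0, 8.0, 5.0], [0.9, 0.1, 0.6]))
```

Tuning `alpha` trades lexical precision against semantic recall; engines like OpenSearch expose similar hybrid scoring natively.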
EY (Ernst & Young) May 2020 – Aug 2022
Machine Learning Engineer Hyderabad, India
Designed and implemented a complete end-to-end machine learning data pipeline that automated model training, monitoring, and evaluation, ensuring reliable model deployment and consistent performance tracking.
Built an internal ML platform that enabled data scientists to develop, test, and deploy models seamlessly without requiring DevOps knowledge, leveraging Docker, Kubernetes, and Kubeflow for orchestration and scalability.
Led and managed an offshore data engineering team to extract, curate, and prepare large-scale datasets from data lakes using Azure Data Factory, Databricks, and PySpark for advanced analytics and ML model training.
Executed large-scale data migration from AWS to Azure, designing robust ETL pipelines with Azure Data Factory, Blob Storage, and Databricks Notebooks to streamline data ingestion and transformation workflows.
Developed and deployed a recommendation engine using Python and TensorFlow, implementing A/B and reverse (B/A) testing to assess model performance and improve personalization accuracy across user segments.
Enhanced NLP models by applying preprocessing techniques with NLTK, regex, and SpaCy to clean, tokenize, and structure unstructured text data, improving classification accuracy and interpretability.
Applied advanced model optimization methods such as dropout, word embeddings (e.g., Word2Vec, GloVe), and transfer learning using pre-trained CNN and RNN architectures to improve model generalization and reduce overfitting.
Researched and integrated innovative ML methodologies from academic literature to improve model performance and computational efficiency, including ensemble methods and feature selection techniques.
Evaluated model performance using multiple metrics such as F1-score, AUC/ROC, Precision, Recall, and Rank-1 to Rank-5 accuracy, and visualized results for interpretability and business impact analysis.
Deployed scalable ML solutions in Docker containers managed via Kubernetes, ensuring high availability, fault tolerance, and compliance with production SLA requirements.
Automated CI/CD pipelines using Jenkins for model deployment, version control, and release management with Git, ensuring reproducibility, traceability, and consistent model delivery across environments.