Professional Summary
•Over *+ years of experience designing, building, and deploying end-to-end AI, Machine Learning, and Generative AI solutions across enterprise environments, with strong expertise in Microsoft Azure AI and Azure Machine Learning.
•Proven experience as a Generative AI Engineer, building LLM-powered systems using OpenAI, Hugging Face, LangChain, LangGraph, RAG, embeddings, and vector databases
•Strong background as an AI Software Engineer and Python Developer, developing scalable, modular, and production-ready Python services, libraries, and ML/AI APIs.
•Hands-on experience delivering agent-based AI workflows, enabling reasoning, planning, tool-calling, and multi-step decision-making with controlled safety and reliability
•Extensive experience designing and deploying Azure ML pipelines, including model training, evaluation, deployment, monitoring, and lifecycle management
•Deep understanding of MLOps best practices, including CI/CD pipelines, experiment tracking, model versioning, automated retraining, and rollback strategies
•Expertise in low-latency inference and performance optimization, including prompt optimization, token control, caching, and scalable serving strategies
•Strong foundation in data engineering, building scalable ETL/ELT pipelines using PySpark, Spark SQL, Kafka, and cloud data platforms to support ML and GenAI workloads
•Experience applying advanced ML and deep learning techniques, including NLP, Transformers, time-series forecasting, anomaly detection, and predictive modeling.
•Proficient in cloud-native deployment using Docker, Kubernetes (AKS), Terraform, and Azure DevOps/GitHub Actions for reliable and repeatable releases.
•Solid understanding of AI safety, evaluation, and governance, including hallucination reduction, validation checks, and responsible AI practices.
•Experience building and deploying end-to-end ML solutions on AWS using SageMaker for model training and deployment, S3 for data storage, and Lambda with API Gateway for scalable inference.
•Strong hands-on experience with MLOps practices including Docker, Kubernetes (EKS), CI/CD automation, infrastructure-as-code (Terraform), and monitoring using CloudWatch for performance tracking and model reliability.
•Collaborative team player with experience working in Agile/Scrum environments, partnering with product, engineering, and business stakeholders to deliver production-grade AI solutions.
Technical Skills
Programming Languages : Python, PySpark, Java, C, C++, Bash, HTML/CSS, JavaScript; Web Frameworks: Flask, FastAPI
Python Libraries : Scikit-Learn, NLTK, TensorFlow, PyTorch, Keras, Pandas, NumPy, Seaborn, Matplotlib, Beautiful Soup
Gen AI & LLMs : Hugging Face, LangChain, LangGraph, LangSmith, RAG, Prompt Engineering, OpenAI GPT-3.5/4, Copilot, LlamaIndex, Pydantic AI, Crawl4AI, ChromaDB, FAISS, Azure Cognitive Search, Pinecone.
Machine Learning : Supervised, Unsupervised, and Reinforcement Learning; Classification, Clustering, Linear Regression, Multiple Linear Regression, Logistic Regression, Decision Trees, Support Vector Machines (SVM), Naive Bayes, Ensemble Techniques (Random Forest, XGBoost, Gradient Boosting Machines (GBM)), K-Nearest Neighbors (KNN), K-Means Clustering, DBSCAN Clustering, ARIMA.
Natural Language Processing (NLP) and Deep Learning: ANN, CNN, RNN, LSTMs, Transformers, BERT, OpenAI GPT-3.5/4.
Data & Streaming : Apache Spark, PySpark, Spark SQL, Kafka, ETL/ELT
Databases : SQL Server, MySQL, SQLite, MongoDB.
Azure Cloud Services : Azure Data Lake, Azure Data Factory, Azure SQL, Azure ML, Azure Functions, Azure Kubernetes Service (AKS), Azure App Services, Container Apps, Azure OpenAI, Azure AI Studio, Azure Cognitive Services.
AWS Cloud Services : Amazon S3, Amazon Bedrock, Amazon SageMaker, AWS Lambda, Amazon Elastic Kubernetes Service (EKS), Amazon API Gateway, Amazon CloudWatch, AWS IAM, AWS CodeBuild, AWS CloudFormation, AWS Secrets Manager.
Visualization : Power BI, Tableau, Streamlit
Methodologies : Software Development Life Cycle (SDLC), Agile/Scrum, Waterfall Model; Collaboration: Jira, Microsoft SharePoint.
Tools : Git, GitHub, Linux, Jupyter Notebook, VS Code, Google Antigravity
Experience
AI Software Engineer Caterpillar Peoria, IL Feb 2025 – Present
•Led the modernization of a critical legacy analytics desktop application into DATK+, migrating statistical and signal-processing workflows into a scalable Python-based analytics platform, improving performance, maintainability, and cross-team usability
•Designed simulation-based statistical analysis procedures to support test engineers in validating data, analyzing noise behavior, and evaluating engineering processes.
•Refactored legacy workflows into modular Python libraries and reusable components, enabling high-performance analytics, faster execution, and improved developer productivity across teams.
•Developed backend services and interactive data visualization modules, significantly enhancing analysis speed, responsiveness, and user experience for large test datasets.
•Implemented robust testing strategies including automated regression testing (Autogressor), unit testing, manual testing, and JSON schema validation, detecting anomalies early and strengthening data integrity across all workflows.
•Applied statistical modeling and time-series analysis techniques to large-scale engineering and test datasets, supporting predictive insights and data-driven decision-making.
•Built and maintained CI/CD pipelines using Git, Jenkins, Azure Repos and GitHub Actions, automating testing, validation, and deployment to support reliable and repeatable DATK+ releases.
•Integrated data validation, regression analysis, and anomaly detection techniques to ensure analytical accuracy and data quality across engineering use cases.
•Contributed to early-stage architecture discussions for introducing LLM-powered and agent-based capabilities into the DATK+ platform to enhance automation and user-driven analytics
•Contributed to defining coding standards, testing practices, and architectural guidelines to support long-term maintainability of the DATK+ analytics platform
•Hands-on exposure to LLM tooling and orchestration concepts (agents, prompt engineering and retrieval strategies) while collaborating with senior engineers on AI platform strategy.
•Assisted in assessing Retrieval-Augmented Generation (RAG) solutions to support conversational access to internal knowledge bases and analytical artifacts.
•Collaborated on Azure-based deployment and integration of analytics services, aligning Python applications with enterprise cloud standards for scalability, security, and reliability
•Actively collaborated in an Agile/Scrum environment, participating in sprint planning, reviews, and retrospectives, and partnering with engineering and product teams to deliver high-quality, data-driven features.
AI/ML Engineer Broadcom San Jose, CA June 2023 – Jan 2025
•Designed and deployed an end-to-end conversational AI platform combining Transformer-based NLP models (BERT, intent classification, entity extraction) with GPT-powered, agent-based workflows, reducing response time by 30% and automating 80% of customer queries
•Architected and implemented a Generative AI customer support system using LangChain, LangGraph, Pinecone, and Retrieval-Augmented Generation (RAG) to enable accurate, multi-turn, domain-aware conversations
•Built and fine-tuned deep learning models using PyTorch and TensorFlow for text classification, intent recognition, and entity extraction, achieving high precision, recall, and F1-score across production workloads.
•Implemented NLP pipelines leveraging spaCy, tokenization, embeddings, and Transformer-based encoders, and applied lightweight RAG patterns to ground chatbot responses using internal knowledge sources.
•Designed and optimized vector-based retrieval using embeddings and vector stores to power semantic search in RAG pipelines, grounding chatbot responses in internal knowledge sources, improving answer relevance, and significantly reducing hallucinations in conversational outputs.
•Developed semantic chunking and hybrid text-splitting strategies to improve embedding quality and retrieval accuracy across PDFs, HTML content, and structured SQL data
•Orchestrated agent workflows and prompt flows using LangChain and LangGraph, enabling modular, multi-step reasoning and controlled tool invocation
•Implemented GenAI evaluation and reliability metrics, tracking retrieval relevance, response accuracy, hallucination rates, and tool-call success to improve production quality
•Optimized LLM inference latency and retrieval performance through prompt optimization, chunk sizing, caching strategies, and efficient vector index configuration
•Developed MLOps pipelines using MLflow, Docker, and Azure Kubernetes Service (AKS) to support experiment tracking, model versioning, automated deployment, monitoring, and rollback strategies.
•Built PySpark pipelines in Azure Databricks to process streaming and batch data (Apache Kafka), performing feature engineering and data preparation to support downstream PyTorch model training.
•Streamlined ML pipelines in Databricks using PySpark and SQL, automating data ingestion, transformation, training, and inference to support large-scale forecasting and customer analytics use cases.
•Leveraged chatbot interaction analytics and predictive modeling (XGBoost, Random Forest) to optimize conversational flows and proactive support strategies
•Created Power BI and Tableau dashboards to track model performance, system KPIs, and operational health, contributing to a 15% improvement in operational efficiency.
Data Scientist Unilever Dallas, TX Oct 2022 – May 2023
•Developed and optimized machine learning and deep learning models for demand forecasting and dynamic pricing, leveraging Gradient Boosting, LSTM-based time-series models, and sequence modeling techniques to improve forecast accuracy by 18%.
•Applied deep learning architectures including LSTM and Transformer-based models to capture seasonality, trends, and complex temporal dependencies in large-scale sales and demand datasets.
•Built NLP-based feature pipelines using TF-IDF, word embeddings, and PyTorch to extract signals from product descriptions, promotions, and customer data
•Evaluated forecasting and pricing models using business-aligned metrics (MAPE, RMSE, accuracy), ensuring model outputs aligned with revenue and demand planning goals.
•Implemented cross-validation for XGBoost, Random Forest, and LSTM models to minimize overfitting and ensure robust performance on unseen data.
•Deployed trained ML models on Amazon SageMaker for scalable batch and real-time inference.
•Built REST endpoints using Amazon API Gateway integrated with AWS Lambda for serverless inference workflows.
•Stored training data and model artifacts in Amazon S3 with lifecycle management policies.
•Managed metadata and experiment tracking using SageMaker Experiments.
•Integrated MLOps practices using MLflow for experiment tracking, metric logging, and model versioning, supporting cloud-based training and deployment workflows.
•Applied clustering and segmentation techniques to identify customer purchasing patterns, validating insights through statistical analysis and translating findings into actionable pricing and revenue strategies.
•Collaborated cross-functionally with sales, pricing, finance, and DevOps teams, deploying models to production using Git, CI/CD pipelines, and Agile workflows, ensuring reliability, scalability, and business alignment.
•Packaged and deployed models using Docker, integrated into CI/CD pipelines with Jenkins, and managed infrastructure with Terraform to ensure consistent and scalable deployments across development, staging, and production environments.
•Instrumented model monitoring using Amazon CloudWatch to track latency, throughput, and error rates.
•Performed feature importance and model explainability analysis to help business stakeholders understand key drivers behind forecasting and pricing decisions
•Monitored model performance post-deployment and supported model retraining and drift analysis to maintain long-term accuracy
•Regularly worked with a modern tech stack including Python, PySpark, SQL, Tableau, Databricks, Prophet, LSTM, and ANN, deploying models with Docker and managing infrastructure via Terraform, while driving business insights via EDA and visualization tools.
Data Scientist Data Idols India Jan 2018 – Aug 2022
•Designed and evaluated machine learning models (SVM, Decision Trees, Logistic Regression, Random Forest) for healthcare outcome prediction and fraud detection, achieving a 10% improvement in model accuracy and improved precision, recall, and F1-score metrics.
•Conducted extensive Exploratory Data Analysis (EDA), feature engineering, and statistical analysis (hypothesis testing, correlation analysis) to identify key drivers and validate modeling assumptions
•Built regression, classification, time-series forecasting, and segmentation models to support healthcare analytics, business planning, and operational decision-making
•Utilized XGBoost for advanced regression and classification tasks, optimizing hyperparameters (e.g., learning rate, max depth, subsample) via grid and random search, achieving a 20% boost in model accuracy.
•Conducted thorough Exploratory Data Analysis (EDA) using statistical techniques and visualizations to uncover patient care trends, detect outliers and data quality issues, and ensure high-quality datasets for modeling.
•Developed Random Forest models for classification, fine-tuning parameters like n_estimators, max_features, and min_samples_split through cross-validation, adhering to ensemble best practices.
•Developed and optimized distributed PySpark / Spark MLlib pipelines on Azure Data Lake and Azure Databricks, applying performance techniques such as partitioning, caching, and optimized joins for scalable data processing.
•Supported data ingestion and preprocessing workflows to prepare large healthcare datasets for downstream analytics and modeling
•Used SQL, Python, and Spark SQL to extract, clean, and transform data from large-scale datasets for analytics and reporting
•Implemented model evaluation pipelines to track accuracy, precision, recall, and F1-score across multiple experiments
•Developed modular, reusable Python codebases following object-oriented programming principles and design patterns, improving maintainability of ML systems
•Created Power BI dashboards to communicate insights, trends, and model outputs to business stakeholders
•Collaborated with analysts, engineers, and domain experts to translate business problems into data-driven solutions
•Ensured data quality and reliability through validation checks, statistical sanity tests, and repeatable analysis workflows
Education
Bachelor of Computer Applications, Sri Venkateswara University, India