Shivani Boddu
****************@*****.***
PROFESSIONAL SUMMARY:
5+ years of experience as an AI/ML Engineer designing, deploying, and scaling machine learning solutions on cloud-native platforms, especially Google Cloud Platform (GCP), using Python, Kubernetes, and AWS.
Experienced in designing and automating ML workflows on GCP using Tekton, integrating DevSecOps practices for secure, high-quality model deployment and monitoring.
Expertise in the end-to-end ML lifecycle using Vertex AI, BigQuery ML, Dataflow, and Kubeflow Pipelines. Skilled in processing and integrating JSON-based datasets and API payloads.
Proficient in building CI/CD pipelines, working with MongoDB, Apache SOLR, and LLMs in production environments.
Proficient in Snowflake, Databricks, Python, and SQL, with hands-on expertise in AWS services including Glue.
Strong experience building and orchestrating ETL workflows using Airflow and Databricks Workflows.
Designed and deployed scalable ML pipelines using Kubeflow Pipelines and TensorFlow Extended (TFX) for production-ready workflows.
Integrated BigQuery, Dataflow, Pub/Sub, and Cloud Storage for large-scale data ingestion, and built ML models using GCP Vertex AI for feature engineering and real-time analytics.
Deployed models via Vertex AI Endpoints and Cloud Run, implementing A/B testing, canary deployments, and rollback strategies.
Maintained reproducibility and version control using MLflow, Vertex AI Experiments, and Git-based pipelines.
Automated ML workflows and orchestration tasks using Python, Bash, Cloud Functions, and scheduled cron jobs.
Collaborated with cross-functional teams, including data scientists, ML engineers, and DevOps, to streamline model delivery and monitoring.
TECHNICAL SKILLS:
Programming & Scripting:
Python, SQL, Bash, Java, R, Go, C#, .NET
Cloud Platforms & Services:
GCP (Vertex AI, BigQuery, Dataflow, Pub/Sub, Cloud Storage, Cloud Run, GKE, Cloud Monitoring, Cloud Functions)
AWS EMR, Databricks Workflows, Databricks SQL, Snowflake SQL, Athena, Azure Logic Apps, Azure Web Apps, IBM Watsonx
AWS: SageMaker, Lambda, API Gateway, S3, Redshift, Glue, EKS, CDK, CloudFormation, CloudWatch
Azure (Data Factory, Synapse Analytics, Data Lake, Azure ML)
BigQuery ML, Cloud Scheduler, Cloud Build
NoSQL Databases: DynamoDB, MongoDB, Cosmos DB
MLOps & ML Lifecycle Tools:
Vertex AI, Kubeflow Pipelines, TFX, MLflow, Argo Workflows, Weights & Biases, Vertex AI Experiments
Model training, tuning, A/B testing, canary deployments, and rollback strategies
CI/CD for ML: Git, GitHub, Cloud Build, Docker, Kubernetes, Tekton, Terraform, Helm
Vertex AI Pipelines, Explainable AI, Bias Detection, Model Versioning
Data Engineering & Pipelines:
Apache Airflow, Google Dataflow, Apache Kafka, Apache NiFi, Apache Beam, Pub/Sub, Kinesis
Data preprocessing, feature engineering, and ingestion automation
Modern Data Stack: dbt, Airflow, Dagster
Machine Learning & AI Frameworks:
TensorFlow, Keras, Scikit-learn, PyTorch, Generative AI, Computer Vision, OpenAI API, LLMs
Predictive Modeling, Prompt engineering, Time Series Forecasting, LSTM, YOLO, NLP, XGBoost, Real-time inference systems
Linux Admin: Linux shell scripting, system monitoring, Unix
Architectural Design:
Solution Architecture, Data Modeling, Technical Design Reviews, Data Platform Architecture
Monitoring & Explainability:
Cloud Monitoring, Prometheus, Grafana, Explainable AI, model drift detection
Automated retraining pipelines with Cloud Scheduler, Airflow, Pub/Sub
DevSecOps: Artifact scanning, vulnerability checks, SonarQube, FOSSA
Data Visualization & BI Tools:
Tableau, Power BI, Looker, ggplot2, matplotlib, Apache SOLR (semantic search, document retrieval)
Databases & Warehousing:
PostgreSQL, MySQL, MongoDB, Snowflake, Amazon Redshift, BigQuery
DataOps & Pipeline Engineering:
Kafka, Kafka Connect, Databricks, Delta Lake, Snowpipe, Debezium, Maxwell, AWS DMS, dbt, Airflow, Glue, Pub/Sub, Jenkins, GitHub Actions, CI/CD for DataOps
DevOps & Automation:
Git, Docker, Kubernetes, Jenkins, Cloud Functions, cron jobs
Data Governance & Compliance:
Data Quality Management, DCRM frameworks, GDPR, Basel III, CCAR, SOX compliance, risk analytics workflows, data quality assessments, stakeholder communication
PROFESSIONAL EXPERIENCE:
Bluescape LLC (Kaiser Permanente), Cumming, GA
Jun 2024 - Present
Data Engineer / MLOps Engineer (Hybrid)
Built dashboards and interactive visuals in Power BI and Looker to communicate risk metrics and data quality KPIs.
Developed and maintained robust ETL pipelines using Apache Airflow, AWS Glue, and Python, optimizing data processing for internal teams.
Integrated data across multiple platforms (SQL, NoSQL, cloud) to streamline data flow and improve efficiency using SQL, PostgreSQL, MongoDB, and Amazon Redshift.
Prioritized and executed ad hoc analytics for Data Controls & Risk Management (DCRM) process optimization, providing actionable insights to stakeholders.
Collaborated with governance and compliance teams to enhance DCRM workflows through model-driven risk classification and remediation strategies.
Designed and implemented real-time data streaming pipelines using Apache Kafka and GCP Dataflow, enabling sub-second latency analytics for event-driven applications.
Utilized Azure Data Lake and Azure Machine Learning to scale data ingestion and model deployment workflows, enhancing performance and reproducibility in production environments.
Designed and developed scalable ETL pipelines using AWS Glue, Snowflake, and Databricks to support batch and streaming data workflows across multiple business domains.
Designed and implemented end-to-end MLOps pipelines on AWS SageMaker, integrating feature engineering, model training, registration, deployment via REST APIs, and automated drift monitoring.
Designed RESTful APIs in Python to serve ML predictions, exchanging data in JSON format between Vertex AI models and upstream applications.
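A minimal sketch of this kind of JSON prediction service, assuming a Flask front end and placeholder GCP project/endpoint IDs rather than the actual production values:
```python
# Hypothetical sketch: a JSON REST API that forwards payloads to a
# deployed Vertex AI endpoint. Project, region, and endpoint IDs are
# placeholders, not values from any real deployment.
from flask import Flask, jsonify, request
from google.cloud import aiplatform

app = Flask(__name__)
aiplatform.init(project="my-project", location="us-central1")  # placeholder IDs
endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890"
)

@app.route("/predict", methods=["POST"])
def predict():
    # Upstream applications POST JSON like {"instances": [{...features...}]}
    payload = request.get_json()
    prediction = endpoint.predict(instances=payload["instances"])
    return jsonify({"predictions": prediction.predictions})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```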
Architected and optimized cloud-native data warehouse solutions on Snowflake and BigQuery for analytical workloads and AI integration.
Leveraged AWS CDK and Terraform to define ML infrastructure as code for repeatable deployments.
Integrated Dataflow and Pub/Sub to enable real-time model inference and streaming analytics.
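A hedged illustration of that streaming pattern, assuming Apache Beam with placeholder Pub/Sub topic names and a stand-in scoring function:
```python
# Sketch of a streaming inference pipeline: read JSON events from
# Pub/Sub, score them, and publish the results. Topic names and
# score_event() are illustrative placeholders.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def score_event(event: dict) -> dict:
    event["score"] = 0.0  # placeholder for a real model call
    return event

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (p
     | "ReadEvents" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/events")
     | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
     | "Score" >> beam.Map(score_event)
     | "Serialize" >> beam.Map(lambda ev: json.dumps(ev).encode("utf-8"))
     | "Publish" >> beam.io.WriteToPubSub(topic="projects/my-project/topics/scored"))
```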
Designed CI/CD pipelines using Cloud Build, Tekton, and GitHub Actions for automated model testing, deployment, and rollback.
Configured and maintained IBM Watsonx and Google Cloud Vertex AI environments, enabling enterprise-grade AI/ML platform adoption.
Automated infrastructure provisioning and ML workflows using IaC (Terraform, AWS CDK) and pipeline automation with Tekton and Cloud Build, embedding DevSecOps checks for security and compliance.
Automated ML deployment workflows using Tekton, Cycode, FOSSA, Cloud Build, incorporating code quality checks with SonarQube and security scanning tools to align with DevSecOps best practices.
Applied predictive modeling and advanced analytics on large-scale operational datasets to optimize supply chain processes, uncover trends, and generate actionable recommendations for business stakeholders.
Served models through RESTful endpoints using Cloud Run and Vertex AI Endpoints, implementing A/B testing and canary deployments.
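An illustrative canary rollout against a Vertex AI endpoint; resource names are placeholders, and the traffic_percentage split is the mechanism referenced above:
```python
# Illustrative canary rollout on a Vertex AI endpoint: deploy the new
# model version with a small traffic share, then promote it after
# validation. All resource names below are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890"
)
candidate = aiplatform.Model(
    "projects/my-project/locations/us-central1/models/9876543210"
)

# Canary: route 10% of traffic to the candidate; 90% stays on the
# currently deployed model.
endpoint.deploy(
    model=candidate,
    machine_type="n1-standard-4",
    traffic_percentage=10,
)

# Rollback is the inverse: shift the traffic split fully back to the
# stable deployed model and undeploy the candidate.
```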
Implemented bias detection and explainability using GCP Explainable AI tools, ensuring fairness and transparency in model decisions.
Designed containerized ML pipelines using Docker and Kubernetes for scalable deployment.
Integrated Apache SOLR for implementing semantic search features in risk analytics and LLM pipelines.
Maintained MongoDB for feature storage and model metadata management.
Built an OpenAI-powered summarization service using LLMs, integrated with internal datasets for intelligent document parsing.
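The core of such a summarization call might look like this sketch, assuming the OpenAI Python client with a placeholder model name and prompt wording:
```python
# Hypothetical core of the summarization service described above:
# send document text to an OpenAI chat model and return a summary.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize(document_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {"role": "system", "content": "Summarize internal documents concisely."},
            {"role": "user", "content": document_text},
        ],
    )
    return response.choices[0].message.content
```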
Conducted Linux-based system monitoring, container health checks, and process automation.
Assisted data scientists in building predictive models using Scikit-learn, TensorFlow, and Keras.
Documented ML pipelines and data processes for reproducibility and audit readiness.
Automated model retraining and drift monitoring using Cloud Scheduler, Vertex AI, and Cloud Monitoring.
Used MLflow and Vertex AI Experiments for experiment tracking, model versioning, and reproducibility.
Automated ML tasks using Cloud Functions, cron jobs, and Pub/Sub for real-time processing.
Monitored and troubleshot data pipelines to ensure smooth, continuous data flow with minimal downtime using tools like Datadog, Prometheus, and Grafana.
Environment: Python, SQL, Bash, Apache Airflow, Apache SOLR, AWS Glue, AWS SageMaker, Apache Kafka, Apache Beam, Pandas, NumPy, Scikit-learn, JSON, TensorFlow, Keras, PyTorch, Vertex AI, OpenAI, LLMs, BigQuery, Dataflow, Pub/Sub, MongoDB, Cloud Storage, Cloud Run, GKE, Cloud Monitoring, Cloud Functions, Vertex AI Experiments, CI/CD, Tekton, Kubeflow Pipelines, Cycode, FOSSA, TFX, MLflow, Weights & Biases, Git, GitHub, Docker, Kubernetes, Datadog, Prometheus, Grafana, Tableau, Power BI, Looker, PostgreSQL, MySQL, Redshift, Snowflake, Cloud Scheduler, cron jobs, Explainable AI, A/B Testing, Canary Deployments, Responsible AI Practices
Capgemini (CBD), Hyderabad, India
Aug 2021 - Jul 2023
Data Engineer / MLOps Engineer
Collected and prepared transactional, customer, and risk data from core banking systems, APIs, and third-party sources for analysis and regulatory reporting.
Cleaned, transformed, and standardized raw banking data (loans, payments, fraud logs) using R and Python to ensure accuracy and compliance.
Developed and deployed credit and fraud detection models using R and SageMaker; integrated CI/CD and containerized deployment for production use cases.
Integrated ML models into production APIs using Cloud Functions and monitored them via Cloud Monitoring and Prometheus.
Deployed models with Vertex AI Endpoints and implemented versioning strategies with MLflow and Git.
Built API-driven data integration services to connect upstream core banking systems with ML models deployed on Vertex AI, ensuring low-latency predictions and reliable data flow.
Developed Python-based chatbots and LLM-integrated services for compliance automation.
Contributed to defining and supporting DCRM use cases by aligning ML workflows with regulatory data quality requirements.
Used API Gateway and Cloud Functions to expose ML predictions as secure APIs, reducing latency for customer-facing services.
Delivered ad hoc visualizations and insights on credit risk and fraud patterns to compliance and audit teams for decision-making.
Designed interactive dashboards in Tableau and ggplot2 to visualize KPIs like NPA ratios, liquidity risk, and customer segmentation.
Automated regulatory reports (Basel III, AML) by processing unstructured data into audit-ready formats.
Used Apache SOLR to enhance search and retrieval within fraud detection systems.
Designed SOLR-based pipelines to power contextual semantic search for customer interactions, aligned with NLP-driven fraud detection frameworks.
Designed and deployed CDC architecture using Maxwell and AWS DMS for real-time ingestion of core banking changes into Snowflake and GCP BigQuery, reducing batch latency by 60%.
Implemented Snowflake Tasks and Streams for automated data transformations and alerts, ensuring robust DataOps pipelines with recovery strategies.
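A sketch of the Stream + Task pattern, assuming the Snowflake Python connector and placeholder table, warehouse, and credential names:
```python
# Sketch: a stream captures changes on a landing table, and a task
# merges them downstream on a schedule. All object names are placeholders.
import os
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",
    user="etl_user",
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="ETL_WH",
)
cur = conn.cursor()

# Stream records inserts/updates/deletes on the raw landing table.
cur.execute("CREATE STREAM IF NOT EXISTS raw_txn_stream ON TABLE raw_transactions")

# Task wakes up every 5 minutes and applies pending changes downstream,
# but only when the stream actually has new data.
cur.execute("""
CREATE TASK IF NOT EXISTS apply_txn_changes
  WAREHOUSE = ETL_WH
  SCHEDULE = '5 MINUTE'
WHEN SYSTEM$STREAM_HAS_DATA('RAW_TXN_STREAM')
AS
  INSERT INTO curated_transactions
  SELECT * FROM raw_txn_stream
""")
cur.execute("ALTER TASK apply_txn_changes RESUME")
```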
Defined infrastructure as code (IaC) templates for ML pipelines, ensuring consistent provisioning of AWS services.
Built monitoring dashboards using Looker and Power BI for pipeline health, schema validation, and data latency tracking.
Deployed fraud detection and regulatory compliance models on IBM Watsonx and Vertex AI, integrating CI/CD and AIOps monitoring.
Designed IaC templates for repeatable ML pipeline deployments, aligning with DevSecOps standards and reducing environment setup times.
Built Glue-triggered model retraining pipelines linked to SageMaker and Lambda for scheduled model refresh based on data drift detection.
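One possible shape of that drift-triggered retrain step, assuming a Lambda handler and placeholder SageMaker job parameters:
```python
# Hedged sketch: a Lambda handler that launches a SageMaker training
# job when an upstream drift check fires. The image URI, role ARN, and
# S3 buckets are placeholders, not real account resources.
import time
import boto3

sagemaker = boto3.client("sagemaker")

def lambda_handler(event, context):
    job_name = f"fraud-model-retrain-{int(time.time())}"
    sagemaker.create_training_job(
        TrainingJobName=job_name,
        AlgorithmSpecification={
            "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/train:latest",
            "TrainingInputMode": "File",
        },
        RoleArn="arn:aws:iam::123456789012:role/SageMakerTrainingRole",
        InputDataConfig=[{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://my-bucket/training-data/",
            }},
        }],
        OutputDataConfig={"S3OutputPath": "s3://my-bucket/model-artifacts/"},
        ResourceConfig={
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        StoppingCondition={"MaxRuntimeInSeconds": 3600},
    )
    return {"started": job_name}
```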
Developed GCP-native data pipelines using Python and BigQuery for financial risk and customer segmentation analytics, improving processing speed by 30%.
Implemented automated Airflow DAGs to orchestrate daily and weekly ETL jobs, ensuring timely and accurate delivery of regulatory datasets.
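A minimal Airflow DAG of the kind described, with assumed DAG id, schedule, and task callables rather than the production definitions:
```python
# Illustrative daily regulatory ETL orchestration. Task bodies are
# placeholders; only the extract >> transform >> load structure matters.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():  # placeholder task bodies
    ...

def transform():
    ...

def load():
    ...

with DAG(
    dag_id="regulatory_daily_etl",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2 >> t3
```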
Engineered data pipelines with Snowflake and Databricks to support regulatory analytics and model scoring.
Developed CI/CD pipelines using Jenkins and GitHub Actions for automated testing and production rollout of ETL workflows.
Developed credit risk and fraud detection models with AWS SageMaker and Lambda, enabling batch inference, version control (MLflow), and monitoring with Prometheus and Cloud Monitoring.
Configured CI/CD pipelines for ML model deployment and rollback using Git, Cloud Build, and GKE.
Conducted model performance audits, drift analysis, and data validation checks for production models.
Environment: Python, R, SQL, Bash, Scikit-learn, TensorFlow, MLflow, JSON, Pandas, ggplot2, matplotlib, BigQuery, Dataflow, Pub/Sub, Cloud Monitoring, Azure Data Factory, Azure Synapse Analytics, Azure Data Lake, Azure ML, S3, Redshift, Glue, AWS SageMaker, Lambda, Weights & Biases, OpenAI, Apache SOLR, Git, GitHub, Docker, Kubernetes, Cloud Build, Tableau, Power BI, ETL, GDPR, Basel III, AML, SOX, Responsible AI, Airflow, Apache NiFi, Cloud Run, CI/CD, Tekton, Model Drift Detection, Audit Logging
Wipro (Kellogg's), Hyderabad, India
Apr 2020 - Jul 2021
Data Analyst / Data Scientist
Gathered and prepared sales data from multiple sources for analysis and reporting.
Cleaned, filtered, and transformed data into a specified format using R.
Developed classification, tree map, and regression models in R to evaluate product performance.
Created interactive dashboards using ggplot2 and Tableau (TabPy) for real-time business insights.
Visualized data using matplotlib in Python and ggplot2 in R to enhance understanding and decision-making.
Developed CI/CD pipelines using Jenkins, Tekton, and GitHub Actions for automated testing and production rollout of ETL workflows.
Integrated SageMaker endpoints within Kubernetes (EKS) to serve real-time recommendation models, scaling ML workloads with containerized microservices.
Automated CI/CD-enabled ML pipelines using Jenkins and Tekton, supporting API-based deployment of product recommendation models for analytics dashboards.
Created ETL jobs to transform marketing and sales datasets in JSON format into structured warehouse tables for business insights.
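An illustrative JSON-flattening step for such a job, assuming pandas and a hypothetical nested event shape:
```python
# Sketch: flatten nested marketing/sales event JSON into a tabular
# frame ready for warehouse loading. Input shape and column names are
# illustrative, not the actual source schema.
import json
import pandas as pd

raw = '''[{"order_id": 1, "customer": {"id": "C9", "region": "SE"},
           "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]}]'''
records = json.loads(raw)

# One row per line item, with order and customer fields repeated on each row.
flat = pd.json_normalize(
    records,
    record_path="items",
    meta=["order_id", ["customer", "id"], ["customer", "region"]],
)
flat.columns = ["sku", "qty", "order_id", "customer_id", "customer_region"]
print(flat)
```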
Designed and implemented machine learning pipelines for automating data preparation, model training, and evaluation, resulting in a more efficient workflow.
Integrated SageMaker endpoints within EKS-based microservices for serving real-time product recommendation models in business analytics dashboards.
Contributed to the foundational development of analytics platforms and dashboards, enabling faster marketing insights and sales optimization decisions.
Provided analytical reports and actionable recommendations to business leaders on sales and marketing performance.
Applied probability, distribution, and statistical inference techniques to identify significant patterns.
Developed RESTful APIs for real-time product recommendations, integrating ML inference endpoints hosted on GCP Cloud Run for scalable serving.
Assisted in transforming batch ETL workflows into modular, CI/CD-enabled pipelines using Jenkins and GitHub Actions, laying the groundwork for future DataOps practices.
Contributed to pipeline optimization tasks in Databricks notebooks, supporting SQL and Python-based data transformations and visualizations for reporting teams.
Environment: R, ggplot2, Python, AWS, matplotlib, Docker, Kubernetes, MLOps, DataOps, Tableau, JSON, Tekton, TabPy, Classification models, Tree map models, Regression models, Business Intelligence, CI/CD
Projects:
Smart Hybrid Solar Helmet (AI/ML):
>Designed a solar-powered safety helmet with AI-based hazard detection (YOLO) and health monitoring (LSTM) for outdoor workers.
>Integrated sensors, voice control, and adaptive ML to predict heatstroke and prevent accidents.
>Reduced heat-related risks with real-time alerts and solar-powered cooling, using TensorFlow/PyTorch.
>Designed predictive ML pipeline and deployed YOLO-based hazard detection model on GCP using TensorFlow and Vertex AI.
>Trained a YOLO-based hazard detection model using SageMaker and deployed with integrated monitoring on Vertex AI and SageMaker endpoints.
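A hedged sketch of the hazard-detection loop behind this project, assuming the Ultralytics YOLO runtime with placeholder weights and hazard classes:
```python
# Illustrative hazard-detection loop on camera frames. The weights
# file, hazard class names, and alert logic are all placeholders.
import cv2
from ultralytics import YOLO

model = YOLO("hazard_yolo.pt")  # placeholder fine-tuned weights
cap = cv2.VideoCapture(0)       # helmet camera stream

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, verbose=False)[0]
    for box in results.boxes:
        label = results.names[int(box.cls)]
        if label in {"vehicle", "falling_object"}:  # assumed hazard classes
            print(f"ALERT: {label} detected")  # stand-in for the voice alert
cap.release()
```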
Face Detector Application:
>Developed a real-time face detection system using OpenCV, Haar Cascades, and Dlib, enabling accurate face recognition in live video streams.
>Built a document summarization system using OpenAI's GPT-3.5 and Apache SOLR for intelligent retrieval and contextual analysis.
>Used SOLR to index summarized outputs for contextual retrieval; enhanced pipeline by retraining face detection models using SageMaker Ground Truth and SageMaker Training Jobs.
>Containerized and deployed the pipeline on Kubernetes with real-time inference via REST APIs.
>Managed CI/CD and deployment using GitHub Actions and Cloud Build on AWS.
>Enhanced performance with Python and optimized algorithms for varying lighting conditions.
>Gained expertise in Computer Vision, Deep Learning, and real-time processing.
Certifications:
Full Stack Data Science and AI
Gained hands-on experience in Python, Pandas, NumPy, Scikit-learn, TensorFlow, and Keras
Built and deployed end-to-end machine learning models with real-world datasets
Learned core concepts of data preprocessing, EDA, model evaluation, and deployment using Flask/Docker
Covered key areas such as NLP, deep learning, SQL, and MLOps practices
AWS APAC Solutions Architecture virtual experience program on Forage - July 2025
Designed a simple and scalable hosting architecture based on Elastic Beanstalk for a client experiencing significant growth and slow response times.
Described my proposed architecture in plain language, ensuring my client understood how it works and how its costs would be calculated.
EDUCATION:
Master's in Information Technology and Management, 2024
Concordia University, Saint Paul, MN, USA
Bachelor's, 2021
Osmania University, Hyderabad, India