Ajay
*****.********@*****.***
+1-972-***-****
DevOps/MLOps Engineer
CAREER SUMMARY: DevOps & MLOps Engineer with 6+ years of experience in automating infrastructure, optimizing deployments, and managing multi-cloud environments (AWS, Azure, GCP). Proficient in Linux administration, CI/CD pipeline design, and Infrastructure-as-Code using Terraform, Ansible, and CloudFormation. Expertise in containerization and orchestration with Docker, Kubernetes, and OpenShift, as well as monitoring and observability using Prometheus, Grafana, ELK, and Datadog. Hands-on expertise in MLOps, including ML pipeline automation, model deployment, versioning (MLflow, DVC), and production model monitoring. Strong scripting abilities in Python, Shell, and PowerShell, with deep experience in Git, GitHub, GitLab, Jenkins, and Agile/Scrum methodologies.
PROFESSIONAL SUMMARY:
Experienced engineer specializing in DevOps, Cloud, and MLOps, proficient in AWS, Azure, and GCP, and adept at automating CI/CD processes using Terraform, Ansible, and Kubernetes (EKS, AKS, GKE).
Expert in SRE methodologies, focusing on key metrics and ensuring robust cloud infrastructure that supports high reliability and scalability.
Experienced in utilizing SageMaker, Azure ML, Docker, and Jenkins to construct and manage MLOps pipelines.
Expert in utilizing monitoring and observability tools such as Prometheus, Grafana, ELK, Splunk, CloudWatch, and Azure Monitor for effective system oversight.
Hands-on experience with microservices, Infrastructure as Code, and automation scripting (Python, Shell) in production environments that require high availability and constant uptime.
TECHNICAL SKILLS
Cloud Environment
AWS, Azure, GCP
Container/Orchestration Tools
Kubernetes, Docker, Docker Swarm, OpenShift
CM & Deployment Tools
Chef, Ansible, Ansible Tower, Puppet, Terraform
CI/CD & Version Control
Jenkins, GitLab, Azure DevOps (Repos, Pipelines, Boards, Monitor), GitHub Actions, Bitbucket
Build Tools
Apache Ant, Maven, Gradle, MSBuild
Configuration Management & Infrastructure as Code
Terraform, Ansible, CloudFormation, Puppet, Chef
Repositories
JFrog Artifactory, Nexus, Azure Artifacts
Web/Application Server
Apache HTTP Server 2.x, Apache Tomcat, WebLogic, WebSphere, Nginx.
MLOps & DataOps
MLflow, DVC (Data Version Control), Kubeflow, Apache Airflow.
AWS SageMaker, Azure ML.
Model Deployment & Monitoring (CI/CD for ML, model serving, drift detection).
Data Pipeline Automation & Workflow Orchestration.
Monitoring & Logging
Prometheus, Grafana, ELK Stack, Splunk, Datadog, Dynatrace, CloudWatch
Databases
MySQL, MongoDB, Redis, DynamoDB, Azure SQL, PostgreSQL.
Programming/Scripting Languages
PowerShell, Java, Python, Groovy, YAML, Ruby.
Operating Systems
RHEL, Ubuntu, CentOS, Windows, macOS.
Education:
Master's degree, Information Technology, DePaul University, Chicago, IL (2023-2024)
Bachelor's degree, Computer Science, Jawaharlal Nehru Technological University (JNTUA), India (2016-2020)
Certifications:
AWS Certified DevOps Engineer – Professional, Amazon Web Services (AWS)
Microsoft Certified: DevOps Engineer Expert, Microsoft
Introduction to Generative AI Learning Path Specialization (Google Cloud)
Introduction to Large Language Models (Google Cloud)
Responsible AI: Applying AI Principles with Google Cloud (Google Cloud)
WORK EXPERIENCE
T-Mobile, Seattle, WA June 2024 - Present
DevOps/MLOps Engineer
Responsibilities:
Built CI/CD pipelines using Azure DevOps, GCP Cloud Build, Jenkins, Helm, and Terraform for deployments to AKS, GKE, and EKS, while managing branching, releases, and repo migrations across Azure Repos, GCP Cloud Source Repos, and TFS.
Automated Kubernetes and cloud infrastructure with Terraform; deployed microservices and Docker workloads via ACR, GCR, and GAR; implemented Istio and Nginx Ingress; and resolved multi-cloud networking issues across AWS and GCP (VPC peering, routing, firewalls, ALB/NLB).
Supported data engineering pipelines using ADF, Databricks, ADLS, Azure SQL DW, Dataflow, BigQuery, SnowSQL, SnowPipe, and Pub/Sub for ingestion and processing.
Built full observability stacks with Prometheus, Grafana, ELK, Datadog, and GCP Operations, configuring alert rules, auto-scaling, and Kubernetes self-healing (liveness/readiness probes).
Executed cloud migrations using Azure Migrate, Site Recovery, and GCP Migrate, and applied SRE practices with SLIs/SLOs across Azure/GCP production environments.
Automated MLOps and cloud workflows using Azure ML, Vertex AI, object-oriented Python, Azure Functions, GCP Cloud Functions, and GitOps deployments via ArgoCD; strengthened security with Azure Storage Firewalls, VNet rules, GCP VPC Service Controls, and SSO (AAD, Google Identity Platform).
Treevah, Chicago, IL Feb 2023 - June 2024
Software Engineer & IT Management Intern
Responsibilities:
Served as backend developer, building and maintaining APIs with Java/Spring, integrating databases, optimizing queries, and securing third-party payment workflows.
Contributed as frontend developer, developing responsive components with React.js, JavaScript, and CSS, implementing authentication flows with Azure AD, and improving UX with interactive features such as password-visibility toggles and reset flows.
Handled IT management tasks, including setting up Azure DevOps pipelines, managing access permissions, and registering applications in Azure Active Directory.
Implemented MLOps workflows for model deployment pipelines, ensuring reproducibility with MLflow and DVC and automating rollouts with Docker and Kubernetes.
Collaborated with cross-functional teams under Agile methodology, using Git/GitHub for version control and participating in sprint planning, stand-ups, and retrospectives.
Apollo Hospitals, Hyderabad, India Nov 2019 - Dec 2022
DevOps Engineer
Responsibilities:
Designed and deployed AWS infrastructure using CloudFormation, Terraform, and Ansible, delivering custom VPC/NAT configurations, multi-node EKS clusters, Auto Scaling configurations, and automated IaC workflows that reduced provisioning time by 30%.
Built end-to-end CI/CD pipelines using Jenkins (Groovy), GitHub Actions, Docker, ECR, Helm, ArgoCD, and Ansible Tower, automating deployments of TypeScript/React and microservice applications to EKS and OpenShift across multiple environments.
Developed microservices and APIs using Golang (TDD, Gorilla Mux), Spring Boot, .NET Core, REST, and MQ, and engineered event-driven solutions using AWS Lambda, Aurora, and DynamoDB, including chatbot automation for deployment and rollback workflows.
Enhanced cloud security and governance by implementing IAM RBAC, automating S3 provisioning (PowerShell/Octopus), creating SonarQube quality gates, and standardizing AWS resource provisioning with CloudFormation and Terraform.
Implemented monitoring and observability using Prometheus, Grafana, AppDynamics, and Kibana/ELK, integrating logs and metrics across EKS, OpenShift, Java, and Python applications for faster issue resolution.
Automated data and container platforms with Kafka, Spark Streaming (Scala), HBase, Docker, Terraform, Jenkins, and Ansible to manage pods, storage, and deployments across EKS, OpenShift, and hybrid Kubernetes clusters, improving performance.