Subhash Goud
Cloud DevOps Engineer
E-Mail: ***********.***@*****.***
Phone: +1-206-***-****
Professional Summary
Cloud DevOps Engineer with 8 years of experience designing and automating scalable, cloud-native infrastructure across AWS, Azure, and GCP. Proven expertise in Kubernetes, Docker, Terraform, and Ansible to orchestrate resilient systems, streamline CI/CD pipelines, and accelerate application delivery. Adept at driving end-to-end automation, improving system reliability, and enabling DevOps best practices in dynamic enterprise environments.
Proficient in deploying and scaling AI/ML workloads in Azure environments, with hands-on experience using Azure Data Factory, Azure Synapse Analytics, and Azure Machine Learning for intelligent automation and data engineering pipelines.
Extensive experience with the Microsoft Azure cloud platform, including services such as Azure Kubernetes Service (AKS), Azure Functions, Azure DevOps, and Azure Blob/File storage for modern cloud-native application deployment and management.
Demonstrated expertise in architecting and implementing Azure-based IaaS and PaaS solutions, with secure networking (VNets, NSGs), automation via ARM templates and Terraform, and integration of on-prem systems into the Azure ecosystem.
Skilled in deploying and maintaining AWS infrastructure using services such as EC2, S3, Lambda, RDS, CloudFormation, and IAM, ensuring secure and scalable cloud environments.
Strong background in managing GCP-based environments, deploying microservices on GKE, and leveraging Google Cloud tools for container orchestration, storage, and serverless computing.
Multi-cloud expertise with proven ability to build, scale, and manage containerized workloads across AKS (Azure), EKS (AWS), and GKE (GCP), enabling resilient, hybrid-cloud architecture strategies.
Experience in orchestrating microservices using Kubernetes clusters across Azure, AWS, and GCP with Helm charts, ConfigMaps, and Secrets to standardize deployment practices.
Skilled in managing private container registries, Docker images, and automating deployments into multi-cloud Kubernetes platforms for streamlined DevOps operations.
Hands-on experience with Terraform for infrastructure provisioning in multi-cloud environments, including use of reusable modules, state management, and remote backends.
Proficient in Python scripting for automation, debugging, log analysis, and building lightweight APIs; experienced in integrating Python with infrastructure workflows and monitoring pipelines.
Expertise in developing Infrastructure as Code (IaC) using Terraform and automating workflows with Ansible for consistent and repeatable cloud deployments.
Served as an on-call engineer in production support teams, efficiently resolving critical system issues, maintaining SLAs, and driving RCA and resolution for system outages.
Applied Site Reliability Engineering (SRE) principles to build resilient, scalable systems with automated failovers, chaos testing, and system reliability metrics.
Advanced knowledge of monitoring and observability tools including Prometheus, Grafana, ELK, and Datadog, ensuring real-time visibility and performance tuning across environments.
Deployed custom monitoring solutions with Nagios and integrated Splunk for log management, alerting, and incident diagnostics in production systems.
Built and managed CI/CD pipelines using Jenkins, GitLab CI/CD, and Azure DevOps (ADO) to ensure seamless and automated software delivery processes.
Integrated Git version control with Jenkins and ADO pipelines to support branching strategies, code reviews, artifact repositories (Nexus), and rollback capabilities.
Enabled end-to-end automation of build, test, and deployment stages across environments using CI/CD best practices, ensuring fast feedback loops and reliable delivery.
Education
Master’s from Union Commonwealth University, KY Dec 2023
(Master’s in Information Systems Management)
Skills
Cloud: AWS, Azure, GCP
CI/CD: Jenkins, Azure DevOps Pipelines, GitHub Actions, GitOps, Argo CD
Artifact Repositories: JFrog Artifactory, Nexus
Documentation: Confluence
Operating Systems: Microsoft Windows XP/2000, Linux, UNIX.
Tracking Tools: Jira, ServiceNow, Remedy.
Code Scanning: SonarQube, JFrog Xray, ECR Inspector
IaC Tools: Terraform, Ansible, Chef, Puppet.
Messaging System: Kafka.
Build Tools: Maven, Ant, Gradle, MSBuild
Source Code Management: Git, GitHub, GitLab, Bitbucket, Azure Repos, AWS CodeCommit.
Databases: RDS, Amazon Aurora, Amazon Redshift, Azure PostgreSQL, Cassandra, DynamoDB, MySQL, MongoDB.
Logging: CloudWatch, CloudTrail, Azure Application Insights, Azure Monitor
Container Platforms: Docker, Kubernetes, OpenShift, Helm, EKS, AKS.
Monitoring Tools: Nagios, ELK, Splunk, Prometheus, Grafana, Dynatrace, Datadog, New Relic.
Languages: Python, Java, PowerShell, Shell scripting, YAML, Bash
Certifications
Certified Kubernetes Administrator.
Microsoft Certified Azure Administrator.
Certified AWS Developer.
Certified Terraform Associate.
Work History
Cloud DevOps Engineer
Client: Edible Arrangements, Atlanta, GA. Feb 2024 - Current
Project Summary: At Edible Arrangements, led the design, automation, and deployment of scalable cloud-native solutions to modernize their e-commerce infrastructure. Delivered real-time personalization, fraud detection, and cross-cloud data pipelines using Azure and GCP. Spearheaded CI/CD, AKS microservices orchestration, and infrastructure as code to enhance operational resilience and release velocity.
Roles & Responsibilities:
Implemented a real-time product recommendation engine using Azure Machine Learning Studio and Azure Functions, enhancing user engagement on the e-commerce platform based on live user behaviour.
Built and operationalized automated ML model training and deployment pipelines using Azure Data Factory and Azure DevOps, boosting forecasting accuracy by 25% for seasonal product demand.
Integrated Azure Cognitive Services (Form Recognizer and Text Analytics) into the customer support platform, enabling intelligent routing and auto-classification of support tickets, reducing SLA breaches.
Containerized real-time ML scoring services using Docker and deployed them to Azure Kubernetes Service (AKS) to automate fraud detection for online transactions, improving incident response time.
Architected a secure, multi-tier e-commerce infrastructure using Azure VNets, subnets, NSGs, and Application Gateways, segmenting frontend, application, and data layers with isolated control.
Automated provisioning of Azure Blob Storage and Azure Files using Terraform, reducing image storage setup time by 80% and supporting centralized POS data access across franchises.
Designed and deployed internal and external Azure Load Balancers across regional zones to optimize traffic routing for Edible’s global order processing system.
Migrated legacy on-prem services to Azure, implementing ExpressRoute and site-to-site VPN for low-latency, high-availability hybrid connectivity between data centers and cloud environments.
Deployed 30+ microservices supporting Edible's mobile backend using AKS, Helm charts, and custom namespaces, ensuring high availability and release rollback control.
Built GitHub Actions–based CI/CD pipelines to containerize and deploy services, integrating Docker, SonarQube, and security scans for shift-left testing and rapid release cycles.
Developed backend e-commerce microservices in Java to support real-time personalization and transaction workflows, integrated with Azure ML and AKS for scalable deployment.
Used Java and Maven to build customer-facing features like order tracking and recommendations, deploying artifacts via GitHub Actions pipelines into Docker containers.
Built RESTful APIs using Spring Boot for fraud detection workflows, containerized with Docker, and deployed to AKS via Helm for high availability and resilience.
Engineered a multi-tenant AKS architecture using isolated namespaces, network policies, and resource quotas to support franchise-specific deployments and enforce tenant-level SLAs.
Developed reusable Terraform modules for Azure infrastructure provisioning during new store onboarding, reducing infrastructure setup time from 3 days to 30 minutes.
Wrote custom Python automation scripts to sync order and customer data between Azure SQL and third-party CRM systems, eliminating sync delays and improving data accuracy by 90%.
Implemented monitoring dashboards using Prometheus and Grafana for real-time AKS health metrics, with automated alerts for SLA-bound incidents and resource thresholds.
Configured Azure Monitor and Log Analytics queries to track application latency and error spikes, reducing mean time to resolution (MTTR) by 35% through proactive incident response.
Built a CI/CD pipeline using GitHub Actions to deploy serverless Azure Functions with zero downtime during high-traffic sales periods (e.g., Valentine’s Day, Mother’s Day).
Migrated code repositories from GitLab to GitHub Enterprise, retaining full commit history, branch policies, and existing pipeline compatibility across 70+ microservices.
Authored automation scripts in PowerShell and Python for container tagging, image promotion, and artifact publishing, streamlining the release process with minimal manual steps.
Integrated Snyk and Docker image scanning into GitHub Actions pipelines to identify and remediate vulnerabilities in container builds before promotion to AKS and GKE environments.
Enforced DevSecOps controls using HashiCorp Sentinel policies for Terraform and SonarQube rules to prevent misconfigured Azure RBAC roles and insecure code from being deployed to production.
Provided L2/L3 production support for mission-critical e-commerce systems, maintaining 99.99% uptime during peak shopping seasons with root cause analysis and automated rollback strategies.
Integrated Google Cloud Pub/Sub with Azure Functions for cross-cloud messaging to facilitate real-time updates across loyalty services and customer engagement platforms.
Developed and deployed containerized microservices across GCP GKE and AKS clusters to balance regional failover requirements and optimize latency.
Created automated GCP BigQuery and Cloud Storage pipelines for sales and customer behavior analysis, with outputs feeding into Azure Data Factory for unified reporting.
Environment & Tools: Azure DevOps, Azure Kubernetes Service (AKS), Azure ML Studio, Azure Data Factory, Azure Monitor, Azure App Gateway, Azure Blob/File Storage, Azure Functions, GCP (GKE, BigQuery), GitHub Actions, GitHub Enterprise, Docker, Jenkins, Terraform, Ansible, Helm, Prometheus, Grafana, PowerShell, Python, Kafka, Splunk, Nagios, OpenShift, Maven, Bitbucket, VSTS.
Sr DevOps Engineer
Client: FrankCrum, Tampa, FL Jan 2023 - Dec 2023
Project Summary: At FrankCrum, led the end-to-end DevOps implementation for migrating legacy HR and payroll systems to AWS, enhancing scalability and reducing infrastructure costs by 30%. Built and managed CI/CD pipelines using Jenkins and GitHub Actions for containerized Django microservices deployed on Amazon EKS. Implemented infrastructure as code with Terraform and ensured robust monitoring with Prometheus, Grafana, and CloudWatch for high system availability.
Roles & Responsibilities:
Migrated on-prem HR and payroll systems to AWS using AWS MGN, achieving near-zero downtime and reducing infrastructure and licensing costs by 30%.
Architected secure, fault-tolerant VPC infrastructure with subnets, NAT gateways, route tables, and network ACLs to host critical HR applications.
Deployed scalable employee self-service portal using EC2 Auto Scaling, Application Load Balancers, and S3 for static asset distribution under high load.
Containerized Django-based REST APIs and deployed them to Amazon EKS using Helm, improving reliability and reducing release risk for HR services.
Maintained and modernized legacy .NET Core services powering payroll and HR workflows, containerized with Docker and deployed on Amazon EKS using Helm.
Refactored monolithic .NET applications into microservices, improving release agility and enabling independent scaling via Jenkins CI/CD pipelines and ECR.
Built Jenkins CI/CD pipelines with multi-stage Docker builds and ECR push, enabling automated testing, image validation, and zero-downtime deployments.
Maintained and enhanced Jenkins shared libraries to standardize pipeline logic across microservices, reducing pipeline onboarding time by 50%.
Migrated legacy Jenkins pipelines to GitHub Actions for faster feedback cycles, while retaining Jenkins for heavyweight deployment orchestration.
Provisioned AWS environments (VPC, EC2, IAM, RDS, EKS) using Terraform with reusable modules, ensuring consistent infrastructure across dev, test, and prod.
Developed Python and Bash scripts for log aggregation, API uptime monitoring, and automated regression validation for payroll reporting systems.
Integrated SonarQube with Jenkins to enforce code quality gates, with automatic PR checks and fail-fast logic on technical debt and vulnerabilities.
Configured EKS ConfigMaps and Secrets for secure, environment-specific service configurations and centralized logging via CloudWatch and Fluent Bit.
Instrumented Prometheus and Grafana dashboards to visualize pod-level metrics (CPU, memory, latency), enabling SREs to proactively scale workloads.
Created CloudWatch Alarms tied to Lambda-based alerting mechanisms for SLA enforcement and real-time escalation of payroll failures.
Conducted RCA and remediation of production incidents with detailed postmortems, implementing Jenkins-based rollback workflows for fast recovery.
Supported L2/L3 incident triage and release coordination for high-impact HR and onboarding platforms, maintaining 99.98% uptime during critical periods.
Defined Git Flow branching model for 15+ microservices and integrated policy-based approvals into Jenkins and GitHub Actions workflows.
Automated full-stack health checks and deployment validations using PowerShell scripts triggered within Jenkins post-deployment jobs.
Implemented automated security checks with AWS Config, CloudWatch, and IAM Access Analyzer to detect policy violations and enforce least-privilege access across EKS workloads.
Integrated open-source Trivy scanner and SonarQube into Jenkins pipelines to block high-severity CVEs and code-level security issues in Python and .NET services during PR validation.
Environment & Tools: AWS (EC2, S3, VPC, EKS, ECR, RDS, Lambda, CloudWatch, IAM), Jenkins, GitHub Enterprise, Docker, Terraform, Helm, Python, Prometheus, Grafana, Ansible, Bash, PowerShell, Django, Maven, Jira, ServiceNow, Splunk, ELK Stack.
Cloud DevOps Engineer
Client: Wells Fargo, CA June 2018 - July 2021
Project Summary: At Wells Fargo, led the cloud modernization of legacy banking systems by architecting and migrating infrastructure to Google Cloud Platform (GCP), ensuring security, compliance, and high availability. Built CI/CD pipelines using Jenkins and GitHub to deploy Kubernetes-based microservices across GKE clusters with Helm and Terraform. Enabled site reliability through automated observability with GCP Stackdriver, custom SLIs/SLOs, and proactive alerting for real-time financial analytics.
Roles & Responsibilities:
Led the migration of legacy banking applications to Google Cloud Platform (GCP), architecting secure and scalable infrastructure using GCP Deployment Manager, VPCs, Cloud NAT, and Cloud SQL.
Implemented automated provisioning of GCP resources using Terraform, standardizing infrastructure deployment across environments.
Deployed GCP Cloud Functions for event-driven automation, such as cleanup tasks and log enrichment, reducing manual workflows.
Built and managed microservices using Google Kubernetes Engine (GKE), deploying Helm-based configurations across GCP multi-zone clusters.
Created reusable Terraform modules for provisioning GCP-native services like GKE, GCS, Cloud Pub/Sub, and IAM roles.
Managed Kubernetes ConfigMaps, Secrets, and Ingress controllers on GKE for secure and dynamic workloads.
Integrated GCP Stackdriver (now Cloud Monitoring and Logging) to centralize observability, trace application flows, and enhance incident response.
Developed Python scripts for GCP resource automation (e.g., auto-scaling policies, health checks) and alert remediations using GCP Monitoring APIs.
Designed CI/CD pipelines in Jenkins integrated with GitHub and GCP GKE, automating deployments across dev, test, and prod environments.
Optimized GCP Cloud SQL databases with custom failover settings, backup schedules, and secure access via Kubernetes service accounts.
Leveraged GCP Cloud Spanner for globally distributed transactional systems, supporting real-time banking analytics.
Implemented SRE practices using GCP SLO/SLI monitoring tools, visualized metrics with GCP Cloud Monitoring, and enforced SLAs via alerts and dashboards.
Environment & Tools: GCP (GKE, Cloud SQL, Cloud Spanner, Deployment Manager, Pub/Sub, VPC, Cloud NAT, Stackdriver, Cloud Monitoring, Cloud Logging, Cloud Functions), Docker, Kubernetes, Jenkins, GitHub, GitHub Actions, Helm, Terraform, Bash, Python, Prometheus, Grafana, PostgreSQL, MySQL, Jira, Confluence, Linux, Shell Scripting.
DevOps Engineer
Client: GE Healthcare, Waukesha, WI. Jan 2016 - May 2018
Project Summary: At GE Healthcare, engineered secure and scalable Azure-based infrastructure for patient-centric healthcare applications, leveraging AKS, Azure DevOps, and Terraform to enable fault-tolerant microservice deployments. Automated CI/CD pipelines and implemented SRE practices using Azure Monitor, Log Analytics, and Application Insights to enhance observability and reliability. Integrated hybrid-cloud automation with Azure Functions and AWS Lambda for efficient event-driven workflows and system responsiveness.
Roles & Responsibilities:
Migrated legacy healthcare workloads to Azure Cloud, designing and deploying secure and scalable infrastructure using Azure Virtual Machines, Azure Load Balancer, Azure SQL Database, and Azure Blob Storage.
Automated the provisioning of cloud resources with Terraform and Azure Resource Manager (ARM) templates, creating reusable modules for multi-environment deployment.
Integrated Azure Active Directory (AAD) and implemented role-based access control (RBAC) to enforce least privilege and comply with HIPAA regulations.
Built custom scripts using Azure CLI and PowerShell to automate resource creation, storage management, and VM lifecycle operations.
Containerized legacy healthcare services using Docker and orchestrated deployments using Azure Kubernetes Service (AKS), enabling fault-tolerant microservice architecture.
Designed Helm charts to standardize AKS deployments and managed service rollouts using blue/green and canary strategies for minimal downtime.
Implemented secure image scanning and stored trusted container images in Azure Container Registry (ACR) integrated with CI pipelines.
Developed Python and Shell scripts to automate monitoring alerts, log archiving, and provisioning tasks, reducing manual ops time by 60%.
Created automation utilities to validate Azure VM configurations and perform periodic patch compliance checks using Azure Automation.
Scripted storage account cleanup, scheduled scaling, and cost-saving automation via Azure Functions and Logic Apps.
Implemented observability using Azure Monitor, Log Analytics, and Application Insights, capturing performance metrics, system logs, and dependency maps.
Defined SLIs and SLOs for patient-critical systems, applied SRE practices to improve reliability, and implemented alerting via Azure Alerts and email/webhooks.
Provided L2 production support during critical incident triage, reducing incident response time through automated runbooks and alert correlation dashboards.
Designed and implemented CI/CD pipelines using Azure DevOps, integrating GitHub, Terraform, and AKS for automated build, test, and deploy of .NET and Python apps.
Managed artifact repositories using Azure Artifacts and Nexus, and enabled automated testing and deployment with Maven, SonarQube, and Jenkins shared libraries.
Enabled GitHub Actions for pre-merge checks, infrastructure linting, and container image validation prior to pipeline execution.
Provisioned Azure SQL Database and Cosmos DB for scalable healthcare data solutions and set up automated backups with geo-redundancy.
Integrated on-prem hospital systems with Azure via Site-to-Site VPN and ExpressRoute, ensuring secure and consistent hybrid connectivity with minimal latency.
Built data ingestion pipelines using Azure Data Factory to transfer structured/unstructured data from on-prem to Azure for analytics.
Integrated AWS S3 buckets for secure and scalable backup storage of healthcare data alongside Azure Blob Storage, ensuring data redundancy and compliance.
Automated infrastructure provisioning using AWS CloudFormation templates to complement Azure ARM deployments, enhancing multi-cloud resource management.
Leveraged AWS Lambda functions for serverless event-driven automation to process patient data streams, improving system responsiveness in hybrid cloud environments.
Environment & Tools: Azure (VMs, AKS, ACR, Load Balancer, Azure SQL, Cosmos DB, Blob Storage, App Gateway), AWS (S3, CloudFormation, Lambda), Azure DevOps, Terraform, ARM Templates, Docker, Kubernetes, GitHub, Jenkins, Helm, Azure Monitor, Log Analytics, Prometheus, Python, Bash, PowerShell, Nexus, SonarQube, Jira, Confluence, Azure Data Factory.