Hamza Zaidi
Senior Cloud DevOps Engineer | Azure & Kubernetes Specialist | Cloud Infrastructure Engineer
(917-****-*** ***************@*****.*** Oakland, CA, 94661
Summary
Lead DevOps Engineer and SRE with more than 10 years of experience building enterprise cloud platforms on Azure and AWS. Expert in Terraform IaC, Kubernetes at scale (AKS/EKS), and self-service CI/CD using GitHub Actions, Jenkins, and Azure DevOps. Drove migrations from monoliths to microservices, implemented GitOps with ArgoCD and Helm, and secured regulated environments (HIPAA, SOC 2). Reliability is SLO/SLI-driven: 99.99 percent uptime, automated incident response, DR orchestration, and deep observability with Prometheus, Grafana, and ELK. Optimizes spend via FinOps, right-sizing, and savings plans. Hands-on leader and mentor who aligns platform roadmaps with business goals, accelerates delivery, and elevates security, reliability, and developer productivity.
Skills
Cloud & Platforms
AWS (EKS, Lambda, CDK), Azure (AKS, Bicep), GCP (GKE), Hybrid/Edge (Anthos, Azure Arc, Outposts)
IaC & Config Management
Terraform, Pulumi, CloudFormation, Bicep, AWS CDK, Crossplane, Ansible, Packer
Observability & Logging
Prometheus, Grafana, OpenTelemetry, Jaeger, ELK/EFK, Fluent Bit, Datadog, CloudWatch, Splunk, Sentry
SRE & Resilience
SLO/SLI, Error Budgets, Incident Response, Chaos Engineering (Litmus, Chaos Mesh, Gremlin), DR (Velero/Restic)
Programming & Scripting
Python, Go, Bash, PowerShell, TypeScript/JavaScript, YAML/JSON, HCL
FinOps & Cost Control
Kubecost, AWS Budgets, Azure Cost Management, CloudHealth, Cloud Custodian, Granulate
Serverless & Microservices
AWS Lambda, Azure Functions, GCP Cloud Functions, Knative, Fargate, Firecracker
Containers & Orchestration
Kubernetes, Helm, Kustomize, Istio/Service Mesh, Operators, Docker/Podman, Velero
CI/CD & GitOps
GitHub Actions, Jenkins, GitLab CI, Argo CD, FluxCD, Spinnaker, Tekton, CircleCI, Harness, DroneCI
Security & Compliance
DevSecOps, OPA/Policy as Code, Kyverno, Vault, Trivy, Snyk, Aqua, Checkov, IAM/RBAC, HIPAA/SOC 2
Streaming & Messaging
Kafka, RabbitMQ, NATS, SQS/SNS, EventBridge, Azure Service Bus, Pub/Sub
Testing & Performance
k6, JMeter, Selenium, Playwright, Pact, JUnit, TestNG, Cucumber
Platform Engineering & DX
Backstage, Developer Portals, Platform APIs, Tilt, Skaffold, Telepresence
Architecture Patterns
gRPC, Envoy, GraphQL, Event-Driven, DDD
Professional Experience
Lead DevOps Engineer, Aetna Digital
01/2021 – Present
•Designed and deployed HIPAA/GDPR-compliant Azure environments using AKS, Azure Policy, and secure networking.
•Built multi-zone Apache Kafka clusters for real-time patient data streaming, improving reliability and throughput by 40 percent.
•Drove DevOps transformation with GitOps (ArgoCD) and Terraform, reducing deployment time by 50 percent and eliminating manual provisioning.
•Led and mentored a team of 8 DevOps engineers, fostering automation, collaboration, and continuous improvement.
•Managed AKS clusters with secure CI/CD pipelines (Azure DevOps), enabling zero-downtime blue-green deployments and 99.95 percent uptime.
•Partnered with architects to deliver a cloud adoption roadmap, reducing cloud spend by 30 percent via right-sizing and reserved instances.
•Implemented disaster recovery for healthcare APIs using Azure Site Recovery, achieving RPO less than 5 mins and RTO less than 15 mins.
Senior DevOps Engineer, Palo Alto Networks
06/2016 – 09/2020
•Designed and optimized hybrid cloud infrastructure (Azure & AWS) handling more than 1M transactions/min for global cybersecurity platforms.
•Built scalable Kafka-based pipelines for real-time security telemetry, cutting threat detection latency to less than 10s.
•Automated deployments with Jenkins and Ansible, boosting release frequency from bi-weekly to daily and improving success rates by 25 percent.
•Implemented SRE practices with Prometheus/Grafana, SLIs/SLOs, and automated runbooks, reducing MTTR by 60 percent.
•Containerized monoliths into microservices on Kubernetes (EKS), lowering infra costs by 20 percent and improving utilization.
•Defined and maintained SLOs/SLAs for more than 15 core services, enhancing transparency and customer satisfaction.
•Led incident response and post-mortems for outages, driving systemic fixes and fostering a blameless culture.
Cloud Infrastructure Engineer, SiriusXM Media
02/2015 – 04/2016
•Built and maintained CI/CD pipelines (Jenkins, TeamCity), standardizing releases and cutting deployment errors by 35 percent.
•Drove Docker adoption and supported the first production Kubernetes cluster rollout.
•Provisioned and managed Azure infrastructure (VMs, VNets, Storage) using ARM templates and Terraform (IaC).
•Integrated SonarQube into CI, shifting security left and catching vulnerabilities earlier in the SDLC.
•Automated on-demand dev/stage environments, reducing setup time from 3 days to less than 30 minutes.
•Implemented centralized logging with the ELK Stack, enabling self-service access and reducing debug time by 40 percent.
Projects
Enterprise Healthcare Cloud Migration & Modernization
•Migrated 50 on-prem apps to Azure AKS, refactoring for cloud-native microservices.
•Built the Azure landing zone with Terraform; enforced compliance controls and RBAC.
•Implemented Kafka for event-driven interservice communication; ArgoCD for GitOps deployments.
•Achieved HIPAA-aligned security posture with network segmentation, secrets management, and audit trails.
Enterprise-Wide Data Mesh & Governance (Security Analytics)
•Architected a streaming platform on AWS EKS with Kafka, processing more than 2 TB/day of security logs.
•Wrote Go operators for automated cluster lifecycle and policy enforcement.
•Delivered real-time observability via Prometheus/Grafana dashboards; enabled proactive threat insights.
CI/CD Acceleration & Self-Service Platform
•Centralized pipelines using GitHub Actions and reusable Helm templates.
•Enabled developer self-service to build/test/deploy into scoped namespaces.
•Standardized tooling across more than 30 teams, cutting pipeline setup from days to hours.
Multi-Cloud Disaster Recovery Strategy
•Designed DR across Azure + AWS; infra replicated with Terraform, apps configured via Ansible.
•Orchestrated region failover drills completing in less than 30 minutes, exceeding RTO targets.
•Automated validation and runbooks to keep DR posture continuously testable and auditable.
Certificates
•Certified Kubernetes Administrator (CKA)
•HashiCorp Terraform Associate certified professional
•Confluent Certified Developer for Apache Kafka
•AWS Certified DevOps Engineer - Professional
Education
Bachelor of Computer Science 2014