DevOps Engineer Reliability

Location:
United States
Salary:
65
Posted:
October 30, 2025

Resume:

ALI BANGAS

DevOps Engineer • Site Reliability Engineer • Cloud Infrastructure Engineer • AWS & GCP

Richmond, Texas, United States 77407 • (914) 677-0480 • ****.********@*****.***

SUMMARY

Versatile and results-driven engineer with 9+ years of experience building and managing scalable, reliable, and secure cloud infrastructures across AWS, GCP, and hybrid environments. Proven expertise in DevOps automation, CI/CD pipeline design, Infrastructure as Code (Terraform, CloudFormation, Ansible), and Kubernetes orchestration (EKS, GKE, OpenShift). Adept at applying SRE practices, including defining SLIs and SLOs, managing error budgets, and implementing observability frameworks with Prometheus, Grafana, ELK/EFK, and CloudWatch, to ensure system resilience and performance. Skilled in migrations, security and compliance enforcement (SOC2, HIPAA, CIS), and cost optimization strategies. Recognized for mentoring teams, streamlining SDLC processes, and driving automation-first initiatives that accelerate delivery while reducing operational risks. Experienced in deploying service mesh solutions (Istio, Linkerd, App Mesh) for secure service-to-service communication in Kubernetes environments. Strong background in multi-region fault-tolerant architectures, policy-as-code enforcement, and self-service automation to improve developer productivity and operational excellence.

SKILLS

DevOps Practices: GitOps (ArgoCD, FluxCD), DevSecOps (SAST/DAST, secrets scanning, compliance pipelines), Blue-Green & Canary Deployments, Automated Patch Management, Observability-Driven Development, Incident Response Automation

Cloud Platforms: AWS (EKS, ECS, Lambda, API Gateway, CloudFront, Route 53), Azure (AKS, Functions, App Gateway), GCP (GKE, Cloud Run, Pub/Sub), Hybrid and Private Cloud, Bare Metal Deployments

CI/CD & Automation: Jenkins, GitHub Actions, GitLab CI, Azure DevOps, ArgoCD, Spinnaker, Tekton, CloudBees CI, Buildkite, Bamboo, TeamCity, Harness

Infrastructure as Code: Terraform, CloudFormation, Pulumi, AWS CDK, Ansible, Chef, SaltStack, ARM/Bicep

Containers & Orchestration: Docker, Kubernetes (EKS/AKS/OpenShift), Helm, Istio, Rancher, App Mesh

Monitoring, Logging & SRE: Prometheus, Grafana, ELK/EFK, Splunk, Datadog, OpenTelemetry, Chaos Engineering

Security & Compliance: Vault, AWS Secrets Manager, Aqua, Snyk, SonarQube, HIPAA, SOC2, CIS Benchmarks

Programming & Scripting: Python, Bash, Go, PowerShell, YAML, Groovy, TypeScript, C++, Java, Shell scripting

Collaboration Tools: Git, Bitbucket, Jira, Confluence, Slack, MS Teams, Notion, Miro

PROFESSIONAL EXPERIENCE

Lead DevOps Engineer Aug 2023 – Present

Appinventive

● Led multi-cloud adoption initiatives across AWS and Azure, achieving 30% faster deployments and reducing infrastructure costs by 25%.

● Architected and deployed Kubernetes-based production platforms on Hetzner bare-metal and AWS EKS, ensuring high availability, auto-scaling, and enterprise-grade reliability.

● Implemented progressive delivery strategies (Blue-Green and Canary deployments), reducing release rollback incidents by 40%.

● Automated multi-cloud provisioning and configuration management using Terraform modules and Ansible workflows, standardizing infrastructure delivery across environments.

● Improved observability by deploying Prometheus, Grafana, Loki, and Jaeger, reducing Mean Time to Resolution (MTTR) by 35%.

● Directed CI/CD pipeline optimization with GitHub Actions and Azure DevOps, increasing release frequency while improving deployment success rates by 20%.

● Mentored and guided a team of DevOps engineers, driving best practices in infrastructure automation, container orchestration, and security compliance.

Senior DevOps Engineer June 2020 – July 2023

IBM

● Designed, implemented, and maintained enterprise-scale CI/CD pipelines supporting 300+ microservices across hybrid cloud environments, increasing deployment efficiency and reliability.

● Migrated legacy monolithic applications to AWS EKS and Azure AKS, reducing hosting costs by 30% and improving application scalability.

● Championed DevSecOps practices by automating patch management, compliance scanning, and secrets management, strengthening security and regulatory compliance.

● Built and managed centralized observability platforms (Prometheus, Grafana, Splunk, ELK), enhancing system monitoring and reducing incident detection time by 50%.

● Mentored and coached junior DevOps engineers in Terraform, Kubernetes, and GitOps workflows, fostering a culture of automation and continuous learning.

● Automated disaster recovery (DR) strategies with infrastructure-as-code and cloud-native backup solutions, improving recovery point and time objectives (RPO/RTO).

● Collaborated with cross-functional teams to implement GitOps-driven deployments with ArgoCD, reducing configuration drift and improving release consistency.

Infrastructure Engineer SRE June 2016 – May 2020

Skyline Technologies

● Designed and implemented CI/CD pipelines using Jenkins, GitLab CI, and Azure DevOps, reducing build and release cycle times by 50%.

● Migrated legacy monolithic applications into containerized microservices with Docker and Kubernetes, improving scalability, resilience, and deployment efficiency.

● Automated provisioning and configuration of 500+ cloud resources using Terraform, AWS CloudFormation, and AWS CDK across multi-cloud environments.

● Integrated DevSecOps practices including SAST/DAST scanning, IAM policy enforcement, and Vault-based secrets management to strengthen security and compliance.

● Built and managed monitoring and observability frameworks with ELK Stack, Datadog, and AWS CloudWatch, ensuring 99.9% uptime and proactive incident detection.

● Optimized incident response through on-call rotations, runbooks, and automated alerting, reducing Mean Time to Resolution (MTTR) by 35%.

● Partnered with development teams to conduct chaos engineering experiments and load testing, validating reliability and ensuring system performance during peak traffic.

EDUCATION

University of Houston


