
Senior SRE & DevOps Engineer - Multi-Cloud Infra, IaC, CI/CD

Location:
Denver, CO
Posted:
March 30, 2026

Resume:

Sainithin Sindhe

469-***-**** ********************@*****.*** https://www.linkedin.com/in/sainithin-sindhe-damaragidda-239326174/

PROFESSIONAL SUMMARY

• Results-driven Site Reliability Engineer (SRE) and Senior DevOps Engineer with 11+ years of experience designing, automating, and scaling mission-critical cloud infrastructure across enterprise environments. Known for bridging the gap between development velocity and operational stability — delivering platforms that are secure, observable, self-healing, and built to last.

• Architected and managed multi-cloud environments across AWS, Azure, and GCP — including compute, networking, storage, and managed services — supporting high-availability, enterprise-scale workloads.

• Infrastructure as Code (IaC) expert with deep hands-on experience in Terraform, OpenTofu, Pulumi, AWS CloudFormation, Puppet, and Ansible — enabling consistent, auditable, and repeatable deployments across all environments.

• Expert in containerization and orchestration using Docker, Kubernetes, Podman, and Helm — managing microservices at scale with zero-downtime rollouts, auto-scaling, and self-healing configurations.

• Built and maintained enterprise-grade CI/CD pipelines using Jenkins, GitHub Actions, Azure DevOps, ArgoCD, and GitLab CI/CD — integrating automated testing, security scanning, and compliance gates to accelerate release cycles without sacrificing quality.

• Champion of the SRE philosophy — defined and owned SLIs, SLOs, and error budgets across production services, driving accountability between engineering and operations teams and measurably improving system reliability.

• Designed full-stack observability platforms using Prometheus, Grafana, ELK Stack, OpenTelemetry, Datadog, Dynatrace, and New Relic — providing end-to-end visibility into application performance, infrastructure health, and business-critical service metrics.

• Embedded security into every phase of the software delivery lifecycle — integrating SAST, DAST, container image scanning, secrets management, and policy-as-code into CI/CD pipelines to achieve shift-left security at scale.

• Partnered with engineering, security, product, and business stakeholders to align infrastructure strategy with organizational goals, regulatory requirements, and long-term scalability needs.

• Mentored junior engineers and SREs in DevOps best practices, IaC standards, and reliability engineering principles — fostering a culture of operational excellence and continuous improvement.

EXPERIENCE

Best Buy April 2023 — Present

Site Reliability Engineer Richfield, United States

• Architected mission-critical multi-region AWS infrastructure using EC2 Auto Scaling Groups, Elastic Load Balancers, Route 53, S3, RDS Multi-AZ, ECS, and EBS supporting a $200M+ e-commerce platform — reducing global latency by 40% across 5 regions and sustaining 99.92% uptime through Black Friday and Cyber Monday peak traffic.

• Consolidated infrastructure-as-code strategy using 150+ reusable Terraform modules and AWS CDK constructs (TypeScript) covering VPCs, EKS clusters, RDS instances, and IAM roles — cutting provisioning time from 8 hours to 25 minutes and eliminating 70% of manual configuration tasks through Sentinel policy enforcement.

• Architected enterprise disaster recovery for a $100M+ daily revenue platform using Route 53 health checks, RDS cross-region replication, and S3 replication — achieving RTO under 15 minutes validated through quarterly DR drills with zero revenue loss during two AWS regional degradation events.

• Led end-to-end on-premises to AWS cloud migration for 60+ legacy workloads using the 6R framework — rehosting and replatforming monolithic Java applications to containerized ECS/EKS microservices, decommissioning 4 on-premises data center racks, and reducing infrastructure operational costs by $1.8M annually.

• Drove container platform migration from Docker Swarm to Amazon EKS across 80+ services — redesigning orchestration with Helm charts, Kustomize overlays, and ArgoCD GitOps, reducing deployment failures by 50% and enabling self-service releases for 6 product development teams.

• Modernized secrets management by migrating hardcoded credentials to AWS Secrets Manager and HashiCorp Vault Enterprise — implementing dynamic secret injection, automated rotation, and least-privilege IAM policies achieving zero hardcoded secrets across all environments in compliance with PCI-DSS and SOC2.

• Led database modernization initiative migrating Oracle on-premises workloads to Amazon RDS PostgreSQL and Aurora — executing schema conversion, parallel validation pipelines, and phased cutovers achieving zero data loss with under 30-minute maintenance windows across 12 production databases.

• Standardized multi-environment Kubernetes deployments across 100+ microservices using Kustomize overlays and ArgoCD GitOps — separating base manifests from environment-specific patches across Dev, QA, Stage, and Production, eliminating configuration drift and enabling fully automated progressive delivery.

• Configured EKS autoscaling using HPA, KEDA event-driven scaling, and Cluster Autoscaler — dynamically scaling 50 to 500 pods during peak seasonal traffic, maintaining sub-200ms response times under 10x load, and preventing 25+ capacity outages annually.

• Engineered Istio service mesh across 150+ microservices with mutual TLS, intelligent traffic shaping, and circuit breaker patterns — reducing inter-service latency by 30% and enabling zero-downtime canary deployments that cut rollback incidents by 75%.

• Designed enterprise CI/CD platform integrating Jenkins, GitHub Actions, and AWS CodePipeline across 100+ microservices — automating multi-stage build, security scan, and deployment workflows from Dev through Production, reducing release cycles from 4 days to 6 hours with a 98.5% deployment success rate.

• Engineered Jenkins shared libraries with Groovy DSL standardizing build and deployment patterns across 12 development teams — cutting pipeline build failure rates from 18% to under 3% and eliminating 90% of environment-specific configuration issues.

• Implemented Blue/Green, Canary, and Rolling deployment strategies using Kubernetes, Argo Rollouts, and AWS CodeDeploy — eliminating 100% of customer-facing downtime during releases and enabling 400+ safe monthly production changes.

• Embedded shift-left security into CI/CD pipelines integrating SonarQube SAST, OWASP ZAP DAST, and Trivy container image scanning as mandatory quality gates — blocking 95% of critical vulnerabilities before reaching staging and achieving continuous PCI-DSS and SOC2 audit compliance.

• Architected Internal Developer Platform (IDP) using Backstage — publishing a centralized service catalog covering 100+ microservices with owner metadata, runbooks, SLO dashboards, and one-click Terraform-backed environment provisioning, reducing new engineer onboarding from 6 weeks to 2 weeks.

• Designed golden path infrastructure templates standardizing Terraform module patterns, Dockerfile conventions, Helm chart structures, and CI/CD pipeline configurations into reusable GitHub repository templates — enabling teams to deploy net-new services to production in under 2 hours.

• Established DORA metrics tracking (deployment frequency, lead time, change failure rate, MTTR) using Datadog dashboards and engineering scorecards — providing weekly reliability snapshots to leadership that drove a 3x improvement in deployment frequency over 18 months.

• Built enterprise observability platform using Datadog APM, CloudWatch, and Prometheus/Grafana across 300+ services — implementing distributed tracing, log correlation, custom dashboards, and SLA burn-rate alerting achieving 96% monitoring coverage and reducing MTTD by 70%.

• Integrated OpenTelemetry instrumentation and Jaeger distributed tracing across 150+ EKS microservices — standardizing W3C trace context propagation, surfacing 40+ critical bottlenecks in payment and order management flows, and reducing MTTR for performance incidents by 60%.

• Defined SRE framework with SLIs, SLOs targeting 99.95% uptime, and error budgets — enabling data-driven reliability decisions that reduced production incidents by 65% and balanced feature delivery velocity against platform stability commitments.

• Led end-to-end incident management program using PagerDuty and ServiceNow — conducting blameless post-mortems, incident trend analysis, and automated runbook remediation, reducing MTTR from 4.5 hours to 45 minutes and repeat incident rates by 40%.

• Architected AWS data factory platform using EMR, Apache Kafka, Databricks, and Terraform — enabling daily production-ready pipeline delivery, reducing data ingestion latency from 24 hours to under 30 minutes, and improving inventory forecasting accuracy by 28%.

• Built real-time analytics pipelines using Amazon Kinesis Data Streams and Kafka processing 5TB of daily retail data — reducing business intelligence latency from 24 hours to under 30 seconds and contributing to a 12% increase in checkout conversion rates.

• Architected highly available Amazon RDS (PostgreSQL, MySQL) and Aurora infrastructure across Multi-AZ deployments — implementing automated backups, CloudWatch alerting, and read replicas achieving 60% query performance improvement and 99.95% availability SLA.

• Managed DynamoDB for high-throughput e-commerce workloads powering customer profiles and session management — achieving sub-10ms read latency at peak traffic through on-demand capacity mode and DAX caching layer integration.

• Supported Java/Spring Boot microservices powering order management, payments, loyalty, and identity services — performing JVM heap tuning, GC analysis, and thread dump diagnostics during Black Friday peak traffic events to prevent OOM-related production degradations.

• Conducted performance testing using JMeter and chaos engineering (pod termination, network latency injection, CPU throttling) across payment and order management microservices — validating system resilience and preventing 20+ potential production failures before peak season launches.

• Built 200+ reusable automation scripts using Python, Bash, and Golang covering infrastructure health checks, alert auto-remediation, deployment orchestration, and capacity reporting — reducing manual engineering workload by 40% and saving 1,200+ engineering hours annually.

• Drove cloud cost optimization through EC2 rightsizing, Spot Instance migration, S3 lifecycle policies, and RDS reserved capacity planning — reducing annual AWS infrastructure spend by $2.4M (32% reduction) while maintaining all production uptime SLAs.

• Collaborated within Agile Scrum teams across sprint planning, backlog grooming, and retrospectives — maintaining a prioritized DevOps reliability backlog aligned to platform health and contributing to a 3x improvement in deployment frequency over 18 months.

Environment: AWS, Kubernetes (EKS), Red Hat OpenShift, Helm, Terraform, CloudFormation, AWS CDK, Ansible, Jenkins, GitHub Actions, GitLab CI, AWS CodePipeline, Tekton, Harness, Octopus Deploy, Prometheus, Grafana, Datadog, New Relic, AppDynamics, OpenTelemetry, ELK Stack, F5 AspenMesh, Istio, JFrog Artifactory, HashiCorp Vault, Databricks (DABS), AWS CloudWatch, SonarQube, Python, Bash, PowerShell, Golang, .NET, Node.js, Hadoop EMR, Cloudera Hadoop, Spark, Hive, Kafka, JIRA, Confluence, ServiceNow, OpsRamp.

Bank of America June 2020 — March 2023

Site Reliability Engineer / Cloud Infrastructure Engineer Charlotte, United States

• Directed an Agile SRE team architecting cloud-native digital banking services across Azure and GCP — implementing PCI-DSS/SOX-compliant secure architectures supporting 5,000+ transactions/second across retail banking, commercial lending, and payment processing platforms serving 2M+ daily active users.

• Implemented Microsoft Entra ID (formerly Azure Active Directory), Azure Key Vault, Azure Storage, Azure SQL Database, Azure Service Bus, and Azure App Service to secure customer identity, manage secrets and certificate lifecycle, enable encrypted storage, and support high-availability PaaS hosting, guaranteeing 99.99% availability during peak financial events including quarter-end reporting, tax season surges, and high-volume trading days.

• Managed enterprise Terraform state using Azure Blob Storage remote backends with state locking via blob leases, enforcing workspace isolation, version-controlled infrastructure repositories, and Azure Policy-backed IaC governance across dev, UAT, and production banking environments spanning 40+ Azure subscriptions, and implementing compliance guardrails for PCI-DSS and SOX controls across all resource deployments.

• Architected secure GCP VPC networks with custom subnets, firewall rules, Cloud NAT, and VPC Service Controls — isolating workloads handling payment processing, fraud detection, and customer PII data across 40+ microservice environments to meet zero-trust and Federal Reserve network segmentation requirements.

• Deployed and operated 120+ banking microservices on Google Kubernetes Engine using Helm charts, Terraform automation, KEDA event-driven autoscaling, and Horizontal Pod Autoscaler — enabling rolling upgrades, zero-downtime deployments, and sub-100ms response times for customer-facing banking APIs (account inquiry, fund transfers, bill payments).

• Established SRE framework for digital banking platform defining SLIs, SLOs targeting 99.95% availability, and error budget policies across 120+ microservices — enabling data-driven reliability decisions that reduced production incidents by 60%, balanced feature velocity with platform stability, and provided measurable compliance evidence for Federal Reserve operational risk assessments.

• Designed multi-cloud CI/CD ecosystem integrating Jenkins, Azure DevOps, GitLab CI, GitHub Actions, Tekton (GKE-native), and AWS CodePipeline across 40+ banking application repositories — automating build, test, SAST gate enforcement, and deployment workflows for online loan origination, credit card processing, and Apache Spark ML credit risk scoring models with 98%+ deployment success rate.

• Pioneered progressive delivery for regulated banking APIs by migrating from Spinnaker to Harness for automated Canary and Blue/Green rollouts — reducing deployment risk by 85% and eliminating customer-facing downtime. Extended with Octopus Deploy for mobile/internet banking portals and ArgoCD GitOps workflows for Kubernetes compliance services with automated rollbacks and version-controlled delivery pipelines.

• Built unified observability platform across Azure and GCP integrating Azure Monitor, Log Analytics (custom KQL queries), Dynatrace SaaS distributed tracing, AppDynamics APM, and OpsRamp — monitoring .NET core banking apps, Java trading platforms, and Node.js fraud detection services, reducing MTTD from 12 minutes to under 90 seconds and detecting transaction flow anomalies before customer impact.

• Configured end-to-end observability for Red Hat OpenShift and GCP GKE clusters using Prometheus, Grafana, OpenTelemetry distributed tracing, and Splunk SIEM — enabling real-time monitoring of critical banking transactions, canary release validation for credit card processing apps via Datadog, and PCI-DSS operational SLA compliance dashboards across 120+ microservices.

• Led enterprise incident management program integrating PagerDuty and ServiceNow — establishing on-call runbooks, escalation policies, and automated incident creation for core banking outages via OpsRamp integration, reducing MTTR from 3 hours to 35 minutes and conducting post-incident reviews that eliminated 30+ repeat failure patterns across payment and authentication services.

• Implemented chaos engineering and resilience testing program using Gremlin across payment processing, fund transfer, and fraud detection microservices — simulating network latency, dependency failures, and pod terminations to validate system behavior under failure conditions, preventing 15+ potential production outages and meeting Federal Reserve operational resilience testing requirements.

• Led service mesh migration from F5 AspenMesh to Istio across 120+ banking microservices — implementing mutual TLS encryption, policy-based access control, circuit breakers, and distributed tracing to achieve zero-trust networking architecture that passed Federal Reserve security assessments and reduced inter-service security incidents by 75%.

• Centralized secrets management using HashiCorp Vault Enterprise across Azure and GCP environments — securing API keys, service accounts, database credentials, and encryption keys for payment and compliance systems with dynamic secret injection, automated rotation, and audit logging meeting PCI-DSS and SOX control requirements.

• Embedded Checkmarx SAST and OWASP ZAP DAST into Azure DevOps and Jenkins CI/CD pipelines across 40+ banking repositories — enforcing automated quality gates blocking high-severity vulnerabilities pre-deployment, reducing critical post-deployment security findings by 85%, and ensuring continuous PCI-DSS and SOX compliance for core banking, loan origination, and payment processing systems.

• Enforced compliance governance across 40+ Azure subscriptions using Azure Policy initiatives and Microsoft Defender for Cloud — automating security posture assessments, remediating misconfigurations, and maintaining 95%+ secure score across all banking workloads meeting PCI-DSS and SOX regulatory standards.

• Integrated Dynatrace with Grail-powered analytics for compliance log analysis — correlating CloudTrail and Azure Activity Logs to detect unauthorized access attempts across banking systems, generating real-time security alerts supporting SOC operations, PCI-DSS quarterly audits, and regulatory examinations, reducing audit preparation time by 45%.

• Centralized structured logging with ELK Stack (Elasticsearch, Logstash, Kibana) across Azure and GCP environments — ingesting and indexing 30TB+ monthly from banking microservices to enable sub-minute troubleshooting of payment failures, transaction errors, and authentication anomalies with Kibana dashboards mapped to PCI-DSS log retention requirements.

• Developed 30+ PowerShell automation runbooks in Azure Automation for resource provisioning, scheduled maintenance windows, encryption key rotation, and compliance evidence collection — eliminating 40+ hours of monthly manual operational overhead and enabling self-service infrastructure operations for banking platform teams.

• Managed JFrog Artifactory repositories for Docker images, Node.js, and .NET binaries across 40+ critical banking applications — integrating with Jenkins, Azure DevOps, and GitHub Actions pipelines for secure artifact versioning, controlled promotion workflows, Xray vulnerability scanning, and full audit compliance supporting regulated release processes.

• Architected Spark and Kafka streaming pipelines on Cloudera and GCP Dataflow for real-time fraud detection and transaction scoring — provisioned via Terraform and Ansible, reducing cluster setup time by 55% and improving fraud prevention accuracy by cutting false positives, alongside managing hybrid Hadoop Cloudera clusters for customer segmentation and financial analytics models.

• Managed Azure SQL Databases and GCP Cloud SQL for high-availability OLTP banking systems, implementing zone-redundant configurations, automated backups, point-in-time recovery, and cross-region disaster recovery with RTO and RPO under 10 minutes through automated multi-region failover, validated quarterly DR testing, and PCI-DSS compliant recovery runbooks.

• Drove multi-cloud cost optimization across Azure and GCP banking infrastructure through VM rightsizing, committed use discount planning, GKE node auto-provisioning, Azure Reserved Instances, and storage lifecycle tiering — reducing annual cloud spend by $1.8M (28%) while maintaining all PCI-DSS compliance and 99.99% availability SLAs.

• Led capacity planning and performance engineering for digital banking APIs — conducting JMeter and k6 load testing simulating 10x traffic surges for tax season and trading day peaks, tuning Linux server configurations and JVM settings for Java trading platforms to achieve ultra-low latency, and establishing autoscaling thresholds validated against historical transaction volume patterns.

• Profiled and optimized .NET core banking applications and Java wealth management systems using AppDynamics — analyzing response times, throughput, and error rates, developing Golang utilities to automate reconciliation checks between payment microservices and database records, reducing manual audit time by 50% and improving transaction accuracy reporting.

• Strengthened ITSM governance by integrating ServiceNow change management with CI/CD pipelines for automated change request creation, CAB approval workflows, and deployment tracking — collaborating with risk, compliance, and audit teams using JIRA and Confluence to document infrastructure decisions, release notes, and compliance evidence for Federal Reserve examiners and external auditors.

• Established internal developer platform on GKE using Backstage and Terraform module registry — providing self-service infrastructure provisioning, standardized golden path templates for banking microservices, and automated compliance guardrails, reducing new service onboarding from 3 weeks to 4 days and eliminating 80% of DevOps team toil on repetitive provisioning requests.

Environment: Azure, GCP, Kubernetes (AKS/GKE), Red Hat OpenShift, Helm, Terraform, ARM, Ansible, Chef, Jenkins, Azure DevOps, GitLab CI, GitHub Actions, ArgoCD, Octopus Deploy, Prometheus, Grafana, Datadog, New Relic, Dynatrace, OpenTelemetry, ELK Stack, F5 AspenMesh, Istio, JFrog Artifactory, HashiCorp Vault, Checkmarx, Microsoft Defender, Python, Bash, Golang, .NET, Node.js, Cloudera Hadoop, Spark, Hive, Kafka, JIRA, Confluence, ServiceNow, OpsRamp.

UnitedHealth Group August 2017 — May 2020

Senior DevOps Engineer Minnetonka, United States

• Architected and executed HIPAA-compliant cloud migration of legacy healthcare applications from on-premises IBM WebSphere and JBoss to Azure cloud-native infrastructure: containerizing workloads with Docker, deploying onto AKS, automating configuration management with Puppet and Ansible, and provisioning core services (VMs, AKS, Azure Functions) via ARM templates and Terraform, reducing infrastructure provisioning time by 50% and enabling secure cloud operations for critical patient data systems.

• Developed Puppet scripts and Ansible playbooks for automated server installation, OS patching, configuration management, and security hardening on Azure — implementing RBAC and NSGs to enforce HIPAA-compliant security posture and eliminating manual configuration drift across healthcare application environments.

• Provided hands-on middleware administration and 24/7 support for IBM WebSphere and JBoss enterprise application servers — managing JVM parameters, connection pools, thread configurations, and performance tuning for critical healthcare services, and developing Puppet scripts for automated patching reducing manual deployment time by 50%.

• Built multi-stage CI/CD pipelines using Azure DevOps (integrated with Git and Bitbucket), Jenkins with Terraform for Azure infrastructure provisioning, and Harness deployment workflows for AKS-based healthcare microservices — implementing traffic splitting, rolling updates, health checks, and automated gate approvals, cutting release time by 60% and reducing infrastructure setup time by 40%.

• Developed UNIX Shell, Perl, and Python automation scripts for code deployments, scheduled maintenance, and self-healing recovery systems on Red Hat Linux and Solaris UNIX — automating detection and remediation of common infrastructure failures using Ansible, reducing manual intervention and improving system resilience for production healthcare workloads.

• Designed Kafka streaming architectures for real-time healthcare data ingestion — building Kafka Streams API pipelines and Kafka-to-Hadoop/Spark batch integration with data engineering teams, enabling real-time patient data processing and supporting ML model deployments on Spark clusters for clinical analytics and claims processing.

• Deployed comprehensive observability stack using Splunk, Grafana, Azure Monitor with Log Analytics, Datadog APM, New Relic, AppDynamics, and Dynatrace — providing real-time visibility into healthcare system performance, identifying bottlenecks in patient data workflows on AKS and Azure App Services, and reducing production incident response time by 50%.

• Implemented HashiCorp Vault for encryption of healthcare data at rest and in transit — configuring key rotation, dynamic secret injection, and HIPAA-compliant access policies; supplemented with Azure Security Center policy enforcement and Ansible-driven security automation to maintain continuous compliance posture.

• Engineered highly available healthcare platforms with automated backup, recovery mechanisms, and HIPAA-aligned resilience architecture — authoring infrastructure runbooks, operational procedures, and architecture documentation in Confluence to standardize deployments and accelerate knowledge sharing across engineering teams.

• Built healthcare data solutions using SOA and event-driven architecture patterns, leveraging Django, Spring, and Azure Functions to develop scalable serverless APIs and microservices for patient data processing and claims integration, and collaborated with data scientists to deploy ML models on Spark clusters using secure VNets, subnets, and VNet peering.

Environment: Azure, Kubernetes, Docker, Ansible, AppDynamics, New Relic, Azure Monitor, Dynatrace, Splunk, Grafana, Red Hat, CentOS, Jenkins, Terraform.

AT&T January 2016 — July 2017

DevOps Engineer Dallas, United States

• Architected and executed cloud migration of AT&T telco workloads from on-premises network management systems to AWS (EC2, VPC, RDS, Route 53), building reusable Terraform modules, CloudFormation templates, and automated Jenkins pipelines that standardized the migration playbook, reduced per-service migration time by 40%, and ensured high availability of telecom network monitoring platforms.

• Managed and optimized core AWS services (S3, EBS, ELB, Auto Scaling, API Gateway, Network Load Balancers, IAM, DynamoDB, SES, SQS, SNS, Lambda) for AT&T telco application backends — supporting network management platforms, billing services, and customer-facing systems with high-availability architecture and automated scaling policies.

• Combined Chef, Puppet, and Terraform to automate provisioning and configuration of AT&T's cloud infrastructure — creating reusable Terraform modules for network components, IAM policies, and containerized workloads, enforcing configuration consistency and compliance across all telecom environments.

• Managed container-based deployments using Docker, Amazon ECS, and Kubernetes on EC2 via kops — orchestrating containerized telco microservices, maintaining Docker images and Jenkins pipeline integration with Docker Hub, and streamlining delivery of network management services without EC2 overhead using AWS Fargate for serverless container workloads.

• Engineered CI/CD pipelines using Jenkins (master-slave configuration via EC2 Container Service plugin) and GitLab CI/CD — automating build, test, and zero-downtime deployment of Java/J2EE web applications to Apache Tomcat clusters and containerized microservices, supporting scalable builds for AT&T's software development and network engineering teams.

• Implemented monitoring and logging using CloudWatch, ELK Stack (Elasticsearch, Logstash, Kibana), Prometheus, Alertmanager, and Grafana — enabling centralized visibility into AT&T network performance, proactive incident detection for telco services on EC2 and Kubernetes, and reducing alert noise through tuned Alertmanager routing and inhibition rules.

• Deployed serverless workloads using AWS Lambda and configured SQS/SNS event-driven architectures for AT&T billing and messaging services — managing AWS API Gateway and Network Load Balancers for low-latency, high-volume traffic routing to Apache Tomcat clusters, ensuring resilient service delivery and cloud resource cost optimization.

• Configured and managed Elastic Load Balancing (ELB and NLB) to eliminate single points of failure in AT&T's service delivery architecture, implementing Auto Scaling groups with dynamic policies and CloudFormation-automated provisioning for new network services to support high-availability telecom workloads.

• Administered GitHub and GitLab repositories for enterprise source code management — enforcing branching strategies, code review workflows, and JIRA-automated release tracking for large-scale Java/J2EE application deployments across AT&T's internal and customer-facing platforms, improving release cycle predictability and audit traceability.

Environment: AWS, CloudFormation, Chef, Terraform, Puppet, Python, Jenkins, GitLab CI/CD, GitHub, Bitbucket, Docker, Kubernetes, Prometheus, Grafana, JIRA, Apache Tomcat.

Zoho Corporation August 2014 — November 2015

Build and Release Engineer Chennai, India

• Engineered end-to-end CI/CD pipelines using AWS CodePipeline and Jenkins, automating build, test, and deployment stages for consistent multi-environment releases.

• Managed code integration via Git and Bitbucket, configuring GitHub webhooks to trigger automated builds, eliminating manual errors and accelerating delivery.

• Developed Maven and Gradle build scripts for Java-based microservices, optimizing performance and minimizing build failures.

• Automated configuration management with Ansible Playbooks and Roles, ensuring consistent environment provisioning and streamlining IaC deployments on AWS.

• Containerized applications using Docker,


