Sowmya Jampala
***********@*****.***
LinkedIn: https://www.linkedin.com/in/sowmya-jampala-57b086b7/
Senior AWS DevOps / SRE Engineer with 9+ years of experience in designing, automating, and operating highly available cloud-native and hybrid infrastructure. Strong background in Infrastructure as Code, CI/CD automation, Kubernetes platforms, observability, and reliability engineering. Experienced in defining SLOs and SLIs, leading incident response, and embedding security-first practices into delivery pipelines. Known for hands-on ownership of production systems, strong troubleshooting skills, and collaboration across engineering, security, and operations teams to deliver resilient, cost-efficient platforms.
CORE COMPETENCIES
•Cloud Infrastructure & Migration: AWS, Azure, GCP, Pivotal Cloud Foundry (PCF); on-premises to cloud migration strategies; VMware vSphere virtualization
•DevOps & Automation: CI/CD pipeline development (Jenkins, GitHub Actions, GitLab CI, AWS CodePipeline); Infrastructure as Code (Terraform, CloudFormation, Ansible, Chef)
•Containerization & Orchestration: Docker, Kubernetes (EKS, AKS), ECS Fargate, Harbor, Helm charts, container vulnerability management
•Site Reliability Engineering: 24/7 production support, incident/problem management, SLO/SLI tracking, error budget management
•Scripting & Development: Python, Bash, Shell scripting
•Monitoring & APM: CloudWatch, Datadog, AppDynamics, Prometheus/Grafana, ELK Stack, Splunk, New Relic
•Infrastructure Management: Linux/Windows administration, NGINX, Apache Tomcat, WebLogic, WebSphere
•Database Administration: PostgreSQL, MongoDB, RDS, DynamoDB, Redshift
•Security & Compliance: IAM controls, secrets management, SAST/DAST integration, vulnerability scanning
•Storage: NetApp, Hitachi USP, HP 3PAR, SAN, NAS
•Virtualization: VMware vSphere, ESXi, vCenter, vMotion, DRS, Site Recovery Manager (SRM)
PROFESSIONAL EXPERIENCE
AWS DevOps Engineer
Capital One – Plano, TX
Dec 2024 – Present
Responsible for architecting and operating reliable, secure, and scalable AWS infrastructure supporting business-critical platforms. Designed and implemented automation, CI/CD pipelines, and Infrastructure as Code to improve deployment consistency and reliability. Applied DevOps and SRE practices to enhance system availability, monitoring, and incident response. Worked closely with engineering and security teams to ensure platforms met performance, scalability, and compliance requirements.
Designed and supported highly available AWS environments using EC2, VPC, ALB, Auto Scaling, Route 53, S3, Lambda, RDS, DynamoDB, SNS, and SQS.
Implemented Infrastructure as Code using Terraform and AWS CloudFormation to enable repeatable, auditable, multi-environment deployments.
Built reusable Terraform modules and managed remote state using S3 and DynamoDB state locking.
Developed and maintained CI/CD pipelines using Jenkins and GitHub Actions, supporting automated build, test, and deployment workflows.
Implemented deployment strategies including blue-green and canary releases to minimize risk and enable zero-downtime production deployments.
Designed and supported AWS Lambda functions for event-driven and API-based workloads, integrating with API Gateway, SQS, and DynamoDB to enable scalable serverless processing.
Built and managed Docker images, published artifacts to AWS ECR, and deployed workloads to ECS and EKS.
Provisioned and operated Amazon EKS clusters, including IAM roles, security groups, networking, and cluster scaling.
Supported Kubernetes platform operations by managing cluster upgrades, node scaling, and workload health to ensure long-term platform stability.
Managed containerized workloads using ECS Fargate, eliminating server management while improving scalability, availability, and operational efficiency.
Supported applications deployed on AWS Elastic Beanstalk, managing environment configurations, scaling policies, and deployment updates for web-based services.
Supported event-driven architectures leveraging Kafka for asynchronous data processing, ensuring reliable message flow and observability across producer and consumer services.
Integrated AWS CodePipeline and CodeBuild into CI/CD workflows for AWS-native deployments, supporting consistent and automated release processes.
Defined and tracked SLIs and SLOs for critical services and used error budgets to guide deployment velocity and reliability decisions.
Integrated security-first practices into CI/CD pipelines, including secrets management, dependency scanning, and policy-based controls.
Implemented centralized observability using CloudWatch, Datadog, Prometheus, Grafana, ELK Stack, and Splunk.
Developed Python scripts to automate cloud infrastructure provisioning, configuration validation, and operational workflows across environments.
Supported application data layers using PostgreSQL and MongoDB, working closely with application teams on schema changes, performance considerations, and operational stability.
Implemented NGINX as a reverse proxy and load balancer, integrating with AWS ALB for SSL termination and traffic routing.
Participated in capacity planning and performance analysis to ensure systems scale reliably under peak load.
Enhanced application observability using cloud APM tools such as Datadog and New Relic to track latency, error rates, and customer impact across distributed systems.
Designed and managed cloud infrastructure using AWS services with exposure to multi-cloud concepts across Azure and GCP environments.
Supported cloud-native and microservices-based applications running on container platforms, ensuring scalability and resilience.
Used Python-based tooling to support Kubernetes operations, including health monitoring and deployment validation.
Worked with 5G network data, including RAN KPIs and performance metrics, supporting ingestion and processing pipelines within AWS environments.
Provided deep Linux systems support, including performance tuning, resource analysis, and troubleshooting production issues on container hosts and EC2 instances.
Site Reliability Engineer
Charter Communications – Maryland Heights, MO
May 2022 – Nov 2024
Supported cloud modernization initiatives and large-scale infrastructure migrations from on-premises environments to AWS. Assisted in planning and executing migration activities while ensuring minimal disruption to production workloads. Played a key role in driving automation and container adoption to improve deployment consistency and scalability.
Led enterprise workload migrations from on-premises environments to AWS using SMS, DMS, Snowball, and Direct Connect, reducing migration time and ensuring zero downtime
Designed AWS infrastructure with EC2, VPC, IAM, RDS, S3, Auto Scaling, and Load Balancers, delivering a scalable and secure environment for critical applications
Automated infrastructure provisioning using Terraform and CloudFormation to standardize environments.
Built and maintained CI/CD pipelines using Jenkins and GitLab CI, enabling faster code releases and improving deployment reliability
Supported production Kubernetes platforms by managing deployments, scaling, and workload health to ensure high availability and operational stability.
Built and operated containerized applications using Docker, standardizing application packaging and improving portability, scalability, and deployment reliability.
Used Ansible and Chef for configuration management and OS standardization, achieving consistent server configurations across the fleet
Used AWS SAM to package, deploy, and version serverless applications, simplifying Lambda deployment and environment management.
Implemented monitoring with CloudWatch, Datadog, Prometheus, and Grafana, providing real-time visibility and early detection of issues
Built centralized logging using ELK Stack, CloudTrail, and VPC Flow Logs, streamlining troubleshooting and audit compliance
Supported FinOps initiatives by analyzing cloud usage and implementing rightsizing and cost optimization strategies.
Built Python-based utilities to standardize deployment processes and improve consistency across cloud platforms.
Applied cloud security best practices, including IAM controls, network segmentation, and secure configuration standards.
Leveraged Python scripting to support incident response by automatically collecting logs, metrics, and system diagnostics, speeding up root-cause analysis
Used Ansible to automate configuration management and application deployments, ensuring consistent, repeatable environments across development and production.
Assisted in managing cloud database services, including backups, monitoring, and availability planning for relational and NoSQL databases.
Contributed to operational playbooks, troubleshooting guides, and deployment standards, enhancing team knowledge and reducing mean time to resolve
Mentored junior engineers on cloud automation, reliability, and operational best practices.
Applied analytical and problem-solving skills to troubleshoot infrastructure issues and improve platform stability.
Build and Release Engineer
Metamor Software Solutions – Hyderabad, India
Jul 2019 – Nov 2021
Owned build, release, and deployment pipelines across multiple environments to ensure stable and repeatable application delivery for enterprise platforms. Designed and maintained CI/CD workflows to automate build, test, and deployment processes. Continuously improved pipeline reliability and performance to support frequent and predictable deployments.
Managed CI/CD and release pipelines across Development, QA, UAT, and Production environments, ensuring consistent delivery across all stages
Built Jenkins pipelines using Maven, Ant, and Gradle, which streamlined builds and shortened build time
Automated build and deployment processes using Shell, Python, and Perl scripts, reducing manual effort and errors
Managed Git and SVN repositories, branching strategies, and release tagging, enabling clear version control and smoother releases
Integrated automated testing using Selenium, JUnit, and Cucumber, increasing test coverage and catching defects early
Supported deployments on Apache Tomcat, WebLogic, and WebSphere, ensuring successful application launches with minimal downtime
Implemented monitoring for CI/CD infrastructure using Prometheus, Grafana, and ELK Stack, providing visibility that helped quickly identify pipeline issues
Supported production releases, hotfix deployments, and incident resolution, maintaining system stability and meeting service-level expectations
Storage Engineer
CTRLS Data Centers – Hyderabad, India
Jun 2016 – Jun 2019
Provided infrastructure support for data center environments, covering compute, virtualization, storage, and operating systems with a focus on reliability and performance.
Managed enterprise infrastructure across Linux, Windows, VMware, SAN, and NAS platforms, resulting in stable operations that met service-level targets
Administered VMware HA, DRS, vMotion, and Site Recovery Manager (SRM), enabling seamless failover and minimizing service interruptions
Provisioned and managed storage using Hitachi USP, NetApp, and HP 3PAR, delivering reliable capacity for critical applications
Performed P2V and V2V migrations, capacity planning, and performance tuning, which accelerated workload consolidation and improved resource utilization
Automated routine administration with Shell scripts and cron jobs, cutting manual effort and speeding up daily tasks
Supported infrastructure incidents and production troubleshooting, restoring services promptly and reducing mean time to resolution
Collaborated with application teams to ensure system reliability and performance, fostering smoother deployments and higher user satisfaction
EDUCATION
Bachelor of Technology (B.Tech) in Computer Science Engineering
Jawaharlal Nehru Technological University Hyderabad (JNTUH)
June 12, 2012 – May 16, 2016
PROFESSIONAL CERTIFICATIONS
•AWS Certified Solutions Architect – Associate, April 2019 (Expired)
•Microsoft Certified: Azure Administrator Associate, July 2022 (Expired)
•ITIL v3 Foundation, August 2017 (Expired)
•Linux Essentials Certification, February 2020 (Active)