Ram Polarapu ***************@*****.***
470-***-**** https://www.linkedin.com/in/ram-polarapu-316a911b1
Summary
Senior AWS DevOps & Site Reliability Engineer with 5+ years of experience delivering incident management, platform support, and reliability engineering for large-scale AWS environments hosting high-volume applications. Strong expertise in AWS (EC2, S3, VPC, EKS, IAM, CloudWatch, CloudFormation), Terraform-based infrastructure automation, and Kubernetes operations. Proven ability to manage production incidents, perform root cause analysis (RCA), optimize operational processes, and implement automation to improve scalability, security, and efficiency. Experienced in on-call rotations, ticket-based support models, and cross-functional collaboration with developers and stakeholders across organizational levels.
Education
Master of Science, Indiana Wesleyan University (Sep 2023 – May 2025)
Bachelor of Technology (B.Tech), JNTUA (2013 – 2017)
Certifications: AWS Certified Solutions Architect - Associate; Certified Kubernetes Administrator (CKA)
Core Skills
AWS: EC2, S3, VPC, EKS, IAM, CloudWatch, CloudFormation
Infrastructure as Code: Terraform
Containerization: Kubernetes (EKS), Docker
Incident & Operations: Incident Management, RCA, Ticketing Systems, SLA/SLO Governance
Automation & CI/CD: GitHub Actions, Jenkins
Monitoring: CloudWatch, Prometheus, Grafana
Experience
Senior DevOps Engineer
SoftTechers LLC (Client: KPMG US) May 2025 – Present
• Deliver advanced-level incident management and operational support for AWS platforms hosting distributed financial applications.
• Serve as the initial point of contact for application teams via ticketing systems, troubleshooting infrastructure, networking, and Kubernetes-related issues.
• Operate and optimize AWS infrastructure, including EC2, S3, VPC, IAM, CloudWatch, and EKS clusters, supporting workloads with 99.9%+ SLAs.
• Implement Terraform automation to standardize infrastructure provisioning, reducing manual configuration errors by 40%.
• Conduct deep-dive root cause analysis (RCA) for production incidents and implement preventive controls to eliminate recurring issues.
• Automate monitoring, alerting, and remediation workflows using CloudWatch and scripting, reducing MTTR by 25%.
• Optimize operational processes to improve reliability, scalability, and cloud security posture.
• Train developers on AWS best practices, self-diagnosis techniques, and troubleshooting methodologies to accelerate issue resolution.
• Strengthen IAM least-privilege policies and enforce secure access controls aligned with compliance standards.
DevOps Engineer
Infosys Limited (Client: Molina Healthcare) May 2023 – Aug 2023
• Supported production cloud workloads in regulated healthcare environments with a focus on reliability and incident response.
• Provisioned and maintained infrastructure using Terraform and CloudFormation following Infrastructure-as-Code principles.
• Managed Kubernetes-based containerized applications and resolved performance and deployment-related incidents.
• Enhanced monitoring and alerting processes to improve operational visibility and system uptime.
• Collaborated with cross-functional teams to minimize downtime and meet compliance-driven operational standards.
AWS DevOps Engineer
Wipro Limited (Client: US Bank) Nov 2021 – May 2023
• Provided AWS platform support for Linux-based applications running on EC2 and EKS in banking environments.
• Designed and managed AWS networking (VPC, subnets, route tables, ALB) for secure and scalable microservices deployments.
• Handled production incidents, event management, and change coordination in Agile delivery environments.
• Built CI/CD pipelines using Jenkins and Git workflows, reducing deployment cycle time by 35%.
• Implemented centralized logging and monitoring using CloudWatch and ELK to support incident investigations.
• Performed proactive capacity planning and operational tuning to improve system reliability and reduce cost.
Cloud Security Engineer
Microland Limited (Client: Qualcomm) Aug 2019 – Nov 2021
• Supported secure Linux-based distributed systems with a focus on operational resilience and vulnerability remediation.
• Automated compliance and security controls using scripting and infrastructure automation tools.
• Improved monitoring dashboards and alerting accuracy, reducing incident response time by 50%.
• Conducted investigations into security and performance issues and documented preventive strategies.