AWS Cloud Production Support Engineer (L3) Resume Summary Title

Location:
Glen Allen, VA
Posted:
February 26, 2026

Resume:

Ishaaq Sameer Mohammed

AWS Cloud Production Support Engineer (L3) | AWS Operations & Incident Response

E-mail: ***********@*****.*** Phone: +1-217-***-****

Professional Summary

AWS Cloud Production Support Engineer (L3) with 8+ years of experience supporting mission-critical, cloud-based production systems. Specializes in real-time incident response, outage triage, root cause analysis (RCA), and stabilizing distributed applications in high-availability environments. Strong in diagnosing infrastructure, application performance, and networking/dependency failures, with a focus on improving reliability, reducing repeat incidents, and restoring service within SLAs.

Technical Skills

Cloud Platforms:

AWS EC2, VPC, IAM, S3, RDS/Aurora, DynamoDB, ELB/ALB, Route 53, Lambda, CloudTrail, Multi-AZ architectures, distributed system support

Incident Response & Reliability:

L2/L3 production support, incident management, outage triage, root cause analysis (RCA), post-incident reviews (PIR), SLA/MTTR improvement, escalation handling, runbooks

Monitoring & Observability:

Datadog Incident Management, AWS CloudWatch, Splunk, New Relic, Kibana, log and metric analysis, alerting and diagnostics

Automation & Scripting:

Python, Bash, AWS CLI, JSON, SQL, automated diagnostics, recovery scripts, health checks

Containers & Orchestration:

Kubernetes (EKS), pod and container troubleshooting, resource utilization analysis, application log inspection

Data Platforms:

Databricks, Snowflake, AWS EMR, batch and analytics workload support

CI/CD & ITSM:

GitHub Actions, Jenkins, AWS CodePipeline, ServiceNow, JIRA / JSM, PagerDuty, Opsgenie, Confluence

Work Experience

Senior AWS Cloud Production Support Engineer Feb 2024 – Present

Capital One Bank – Richmond, VA

●Owned L3/L4 production incident response for AWS-hosted applications, restoring service within defined SLAs during high-severity outages.

●Led real-time outage bridges, coordinating cross-functional recovery efforts and stabilizing distributed cloud systems under incident conditions.

●Diagnosed complex failures across compute, networking, identity, load balancing, database, and application dependency layers to identify root causes.

●Performed root cause analysis (RCA) and post-incident reviews (PIR), driving corrective actions that reduced recurrence and improved platform reliability.

●Analyzed logs, metrics, and alerts using enterprise monitoring tools to accelerate detection and resolution of production issues.

●Supported Kubernetes (EKS) workloads by troubleshooting pod failures, resource constraints, and service connectivity issues during incidents.

●Investigated data pipeline and batch processing failures involving Databricks, Snowflake, and EMR, ensuring timely recovery of downstream reporting systems.

●Automated diagnostic checks and recovery tasks using scripting and cloud tooling, reducing manual effort and improving MTTR.

●Monitored deployments and coordinated rollback activities during production releases and incident scenarios.

●Authored and maintained incident runbooks and operational documentation to improve on-call readiness and knowledge sharing.

AWS Cloud Infrastructure Engineer Oct 2020 – Jan 2024

Capital One Bank – Richmond, VA

●Supported high-availability AWS production environments across EC2, S3, EFS, RDS, and VPC, diagnosing infrastructure and application issues as part of cloud escalation and recovery efforts.

●Utilized CloudFormation and Terraform to support infrastructure changes and assisted with deployment validation and recovery during high-impact production releases.

●Built Lambda-based automation for proactive alerting, scheduled recovery jobs, and data validation tasks used during production incident investigation.

●Supported IAM, CloudTrail, and S3 policy configurations during production incidents, investigating access, audit, and encryption-related issues impacting cloud workloads.

●Investigated EKS/Kubernetes failures, analyzing pod logs, resource constraints, and service dependencies to stabilize workloads during infrastructure-related outages.

●Leveraged GitHub Actions, Jenkins, and AWS CodePipeline to monitor deployments and coordinate rapid rollback during production incidents and recovery efforts.

●Analyzed logs, metrics, and alerts using AWS CloudWatch and Splunk to identify performance anomalies and early indicators of production service degradation.

AWS Platform Support Engineer Sept 2019 – Sept 2020

Capital One Bank – Richmond, VA

●Provided L2/L3 production support for AWS environments including EC2, EMR, S3, and IAM, triaging and resolving incidents impacting big-data pipelines and analytics workloads.

●Investigated EMR-based ETL failures using Python and Lambda automation, contributing to root cause analysis and recovery during data platform production incidents.

●Monitored large-scale distributed systems using logs, metrics, and health checks to detect anomalies and support timely incident response.

●Created SOPs and operational runbooks to standardize technical response during incident escalations and reduce recovery time.

Software Engineer – AWS May 2019 – June 2019

Magtech Solutions – Jersey City, NJ

●Executed deployments of AWS services including IAM, EC2, S3, Lambda, RDS, VPC, and SNS using CloudFormation and Terraform, supporting consistent and repeatable infrastructure setup.

●Automated deployment and configuration tasks using Ansible to ensure consistency and reduce manual errors during environment setup.

●Performed deployment validation and integration testing to verify infrastructure readiness and prevent post-deployment issues.

Software Analyst – Salesforce Aug 2018 – May 2019

Veridic Solutions LLC – Jersey City, NJ

●Developed and customized Salesforce Lightning and Visualforce components, implementing application-level enhancements to support business workflows.

●Built Apex triggers and classes to automate workflows and ensure consistent data updates across Salesforce applications.

●Executed bulk data migrations involving 10,000+ records using Data Loader and REST APIs, validating data integrity and successful integration.

●Created reports and dashboards to monitor application data consistency and support operational visibility.

Software Systems Analyst – Salesforce May 2017 – July 2018

ITDEA Technologies LLC – Tampa, FL

●Designed and implemented Salesforce data models using custom objects and fields, supporting application logic and structured data relationships.

●Automated business processes using workflow rules, validation rules, and Apex triggers to ensure data consistency and process reliability.

●Developed Salesforce Lightning components to support application functionality and improve system usability.

Business Analyst Aug 2016 – Apr 2017

ASTA CRS – Greenbelt, MD

●Documented system workflows and requirements using UML and BPMN, translating business needs into technical specifications for development teams.

●Supported UAT and regression testing cycles, validating system behavior and assisting QA teams during release readiness activities.

Project Engineer June 2012 – Jan 2015

Wipro Technologies – Hyderabad, India

●Performed manual and automated testing for enterprise applications (Siebel CRM and JDE ERP), validating functionality, integrations, and data accuracy prior to production releases.

●Prepared technical documentation and supported project delivery activities by tracking defects, test outcomes, and release readiness metrics.

Education

●M.S. in Management Information Systems May 2016

University of Illinois at Springfield

●B.E. in Electrical & Electronic Engineering April 2012

GITAM University, India

Certifications & Professional Development

●Salesforce Platform Developer I (PD-401)

●Salesforce Administrator (ADM-201)

●Certified IT Project Management

●Certified Business Process Management


