Suresh S
Email: **********.***@*****.***
Phone: +1-646-***-****
Professional Summary:
Accomplished DevOps/CloudOps Engineer with 15+ years of IT experience, specializing in on-premises and cloud infrastructure, automation, and reliability engineering. Expert in AWS, Azure DevOps, Kubernetes, and SRE practices, with a proven track record of architecting scalable, secure, and highly available systems. Adept at driving automation, implementing CI/CD pipelines, and leading cross-functional teams to deliver robust cloud-native solutions.
Core Competencies:
Strong technical expertise in DevOps and CloudOps
AWS Cloud Architecture and Operations
Terraform & Infrastructure as Code
CI/CD (Azure DevOps, Jenkins, uDeploy, GitHub Actions)
Monitoring (Datadog, Dynatrace, Splunk, Grafana)
Automation (Ansible, Shell/Python scripting)
ITIL Processes (Incident, Change, Problem, Release Management)
Containerization (Docker, Kubernetes)
CIAM through FusionAuth Cloud
Configuration Management
Hands-on Linux administration
Load Balancing & Security (F5 DCS, SSL/TLS, IAM, AWS WAF)
Middleware & Web Servers (WebSphere ND, Tomcat, IHS, Nginx, Apache, JBoss)
Certifications:
AWS Certified Solutions Architect – Associate
DevOps SRE Foundation Certified
Professional Experience:
NRG Energy, Inc.
Sr. DevOps Engineer
Oct 2022 – Present
Developed Terraform modules for key AWS infrastructure components, including VPC, S3 buckets (hosting static front-end apps), CloudFront (CDN for distribution and SSL), APIs and API Gateways, and IAM security policies, and enabled AWS WAF policies for both CloudFront and API Gateways.
Leveraged Terraform Infrastructure as Code (IaC) modules to automate the integration and management of FusionAuth Cloud-based CIAM solutions, enhancing secure authentication and authorization workflows for frontend applications.
Architected scalable and secure AWS infrastructure using Terraform modules, including VPC, EC2, IAM, AWS WAF, S3, RDS, Lambda, CloudFront, Application Load Balancer (ALB), and API Gateways, ensuring resilient and compliant cloud environments.
Deployed and integrated monitoring solutions, including Dynatrace, to deliver end-to-end observability; implemented automated alerting and troubleshooting workflows, reducing Mean Time to Recovery (MTTR) and enhancing system reliability.
Implemented and managed CI/CD pipelines in Azure DevOps integrated with GitHub, leveraging branch protection policies to enforce code quality and secure collaboration workflows.
Developed automated Azure Pipelines for building Docker images, pushing them to artifact repositories such as Azure Container Registry, and deploying containers to various environments using infrastructure-as-code practices.
Configured code scanning tools and secure gateway enforcement within DevOps pipelines to enhance security operations, alongside applying advanced branching strategies including feature branching, pull requests, and environment-specific branch policies to streamline CI/CD processes.
Led cross-functional release management efforts, coordinating infrastructure changes and Go-Live operations to ensure smooth deployments, minimize downtime, and align teams for successful production rollouts.
Amphora Technologies Pvt Ltd.
DevOps Lead
Mar 2022 – Aug 2022
Designed, developed, and maintained Jenkins CI/CD build pipelines using both Declarative and Scripted Groovy pipelines to automate containerized application deployments, improving release efficiency and reducing manual intervention.
Authored and optimized Groovy scripts and shared libraries in Jenkins pipelines to perform complex build logic, environment validations, and deployment orchestration while following best practices like Pipeline as Code and Version Control.
Created reusable shell scripts to manage middleware services, automating routine tasks including start, stop, restart operations, log rotations, and health checks, resulting in reduced manual errors and faster incident response.
Developed and maintained Ansible playbooks for SSL/TLS certificate deployment, middleware configuration, and day-to-day service management (start, stop, restart), enabling consistent and repeatable environment configuration and patching.
Led incident, change, and problem management processes in production and pre-production environments, conducting root cause analyses (RCA), coordinating cross-team resolutions, and implementing preventive measures to enhance system stability.
Worked extensively with IaaS and SaaS cloud models, deploying and managing infrastructure and application workloads on AWS and Azure, aligning environments with compliance and security standards.
Integrated and utilized Splunk and Datadog for centralized logging, metrics collection, and alerting, establishing automated monitoring dashboards and proactive alert mechanisms to reduce Mean Time to Detection (MTTD).
Provided production release support and assurance, coordinating pre-release validation, risk assessments, rollback planning, and post-release monitoring to ensure seamless and reliable deployments.
Citigroup Inc.
Senior Infra Tech Analyst (AVP) / DevOps Lead
Mar 2011 – Mar 2022
Engineered and optimized CI/CD pipelines across Azure DevOps, Jenkins, and GitHub Actions for containerized and cloud-native applications, automating build, test, and deployment stages using integrated workflows with Nexus for artifact storage, Docker manifests, and advanced versioning strategies for reliable releases.
Implemented comprehensive automated testing in deployment pipelines by orchestrating Cucumber (BDD) and Selenium test cases, directly embedding them into Azure DevOps and Jenkins workflows to ensure code quality through seamless test integration and faster feedback during the delivery lifecycle.
Developed and maintained Helm charts and Kubernetes manifest files to standardize and automate application deployments on Kubernetes, supporting scalable, reproducible, and secure infrastructure as code practices; managed configuration rollouts and environment promotions using GitOps methodologies.
Authored production-grade Ansible playbooks and shell scripts to automate middleware service management (start/stop/restart), SSL configuration, environment provisioning, and day-to-day operations, including version upgrades and patch management for DevOps tools (Jenkins, Nexus, Helm, and Kubernetes); standardized upgrade procedures, minimized downtime, and ensured platform reliability.
Managed and monitored systems using Splunk and Datadog for log analysis, metrics visualization, and automated alerting, enabling proactive incident management, root cause analysis (RCA), and rapid Mean Time to Recovery (MTTR).
Supported production releases and change management by coordinating deployments, performing risk assessments, leading incident response, and driving post-release assurance activities, with a focus on continuous improvement and customer satisfaction.
AMEX (TCS)
Assistant System Engineer
May 2010 – Mar 2011
Installed and configured WebSphere Application Server (WAS), WebLogic, and IBM HTTP Server on AIX, Linux, and Windows.
Configured WAS v8.5, including backup configurations, profile management, and upgrades.
Installed and upgraded fix packs and migrated installations to the latest versions.
Deployed enterprise applications from the admin console and defined virtual hosts and environment variables.
Deployed applications across PAT, production, and DR environments.
Performed load test reviews on development and stage servers using HP Performance Center.
Configured JDBC providers, data sources, virtual hosts, global security, SSL, and LDAP at the application server level.
Analyzed heap dumps and core dumps to diagnose application server problems; investigated memory leaks in WAS v7.0/6.x and obtained appropriate fixes from IBM to upgrade the installation.
Assisted team members with problem determination and provided extended support whenever needed.
Integrated monitoring tools with WebSphere Application Server to support troubleshooting of environment issues.
Implemented SSL, SSO (SiteMinder), and monitoring (WILY, AppDynamics).
Technical Skills
Cloud & DevOps: AWS, Azure, GCP, Terraform, Ansible, Docker, Kubernetes, Helm, Jenkins, Azure DevOps, GitHub Actions
Monitoring & SRE: Grafana, Prometheus, Datadog, Dynatrace, Splunk, CloudWatch, SLOs, Error Budgets, Runbooks
Programming/Scripting: Python, Shell, YAML, Groovy
Middleware & Servers: WebSphere, Tomcat, Nginx, WebLogic, GemFire, F5
Other Tools: Git, Nexus, JFrog, SonarQube, Maven, Gradle, Kafka, Octopus Deploy
Education
Master of Computer Applications (MCA) – JNT University, 2006
B.Sc (Computers) – Nagarjuna University, 2003
Selected Achievements
Migrated on-premises web applications to AWS, with the entire cloud infrastructure provisioned through Terraform.
Led cloud automation initiatives, saving significant manual effort and operational costs.
Reduced MTTR by 30% through end-to-end automation of monitoring, alerting, and incident response.