senior devops engineer

Location:
Ashburn, VA
Posted:
March 20, 2025


Resume:

Likith Popuri

Senior DevOps Engineer

Mail: **********@*****.***

Phone: 445-***-****

LinkedIn:

Professional Summary

Accomplished DevOps and Site Reliability Engineer (SRE) with 12+ years of experience in designing, automating and optimizing cloud infrastructures to deliver highly available, secure and scalable solutions that enhance operational efficiency and business agility.

●Multi-cloud experience specializing in AWS, Azure and Google Cloud Platform (GCP), with a focus on optimizing resource allocation using AWS Cost Explorer, Azure Cost Management and Google Cloud Billing to improve cost efficiency.

●Expertise in Infrastructure as Code (IaC) using Terraform, AWS CloudFormation and Google Cloud Deployment Manager, ensuring automated, consistent and repeatable cloud infrastructure deployments.

●Experienced in containerization with Docker and orchestration using Kubernetes (EKS, AKS, GKE, OpenShift and Docker Swarm) to modernize monolithic applications into microservices architectures.

●Optimized cloud storage and backup solutions using AWS S3 and AWS FSx, ensuring efficient data retention, disaster recovery and cost-effective storage strategies.

●Proficient in building and managing CI/CD pipelines using Jenkins, GitLab CI/CD, Azure DevOps, Argo CD, AWS CodeDeploy and Google Cloud Deployment Manager, enabling automated, zero-downtime deployments across diverse cloud environments.

●Designed and executed progressive deployment strategies such as Canary Deployments, Blue-Green Deployments and Rolling Updates, ensuring controlled releases with minimal disruption.

●Extensive experience in configuration management tools like Ansible, Puppet and SaltStack, enabling seamless provisioning, compliance enforcement and system consistency.

●Developed automation frameworks and system utilities using Python, Go and Shell scripting, streamlining infrastructure management, system monitoring and backup operations.

●Deep knowledge of cloud networking and security, including Amazon VPC, AWS IAM, HAProxy and AWS Secrets Manager, ensuring robust, scalable and secure cloud architectures.

●Expert in DevSecOps and Site Reliability Engineering (SRE), integrating security into CI/CD pipelines and enhancing system reliability using shift-left security, SLIs, SLOs and error budgets.

●Managed FTP and SFTP services, using SFTP for encrypted transfers, to support compliance with ISO 27001, NIST and GDPR in DevOps workflows.

●Implemented monitoring solutions with Prometheus, Grafana, AWS CloudWatch, Google Cloud Operations Suite, ELK Stack and Splunk, enhancing system resilience and proactive issue resolution.

●Experienced in Amazon SQS, Apache Kafka, Google Cloud Pub/Sub and RabbitMQ for asynchronous processing and real-time data streaming.

●Proficient in Git, GitLab, Bitbucket, AWS CodeCommit and Google Cloud Source Repositories, ensuring efficient collaboration, branching strategies and code management.

●Expertise in test automation and code quality analysis, utilizing Selenium, JUnit, TestNG, SonarQube, ESLint and CodeClimate to enforce robust testing and maintain high software reliability.

●Skilled in database management for both relational and NoSQL databases, including MongoDB, Azure Cosmos DB, Cassandra, Google Cloud Firestore, PostgreSQL, Amazon RDS and Microsoft SQL Server, ensuring high availability and optimized performance.

●Proficient in incident response and real-time collaboration, leveraging Slack, Microsoft Teams, AWS SNS and PagerDuty for proactive monitoring, alerting and streamlined incident resolution.

●Led cross-functional initiatives using Confluence, Microsoft Teams, Chef and SaltStack, promoting transparency, learning and continuous DevOps improvement.

●Collaborated with developers, security engineers and product managers to align DevOps strategies with business objectives and enhance team synergy.

●Proficient in SDLC methodologies, including Agile (Scrum, Kanban) and Waterfall, ensuring seamless integration, deployment and collaboration between development and operations teams.

●Committed to continuous learning and research, staying up to date with evolving DevOps technologies, cloud innovations and automation trends to drive digital transformation and operational excellence.

Certifications

●AWS Certified DevOps Engineer - Professional

●Microsoft Certified: Azure DevOps Engineer Expert

●Google Cloud Certified: Professional Cloud DevOps Engineer

Professional Experience

Client: Discover Financial Services, Riverwoods, IL Aug 2021 - Present

Role: Senior DevOps Engineer

Responsibilities:

●Led end-to-end cloud migration initiatives, transitioning the loan management module from on-premises infrastructure to Azure and Google Cloud Platform (GCP). Redesigned cloud architectures to enhance scalability, security compliance and cost efficiency while ensuring regulatory adherence to financial industry standards.

●Developed and maintained serverless architectures using Google Cloud Functions and Azure Functions, automating critical workflows in the loan management system, improving performance and ensuring high availability in cloud-native environments.

●Automated cloud infrastructure provisioning and management using Google Cloud Deployment Manager and Azure Automation, streamlining the deployment of the loan management module and enhancing operational efficiency to support business continuity.

●Optimized cloud spending using Google Cloud Billing and Azure Cost Management, implementing cost controls, monitoring usage trends and identifying savings opportunities.

●Designed, deployed and managed containerized applications using Docker and Kubernetes on GKE and AKS, ensuring scalability, high availability and efficient orchestration of microservices.

●Implemented and maintained configuration management using Chef and SaltStack, automating infrastructure provisioning, enforcing compliance and improving system reliability.

●Facilitated cross-functional collaboration and incident response using Microsoft Teams, streamlining communication for efficient troubleshooting and decision-making in production environments.

●Managed container images using Docker Hub, Google Container Registry (GCR) and Azure Container Registry (ACR), ensuring secure storage, version control and seamless deployment in CI/CD pipelines.

●Managed artifact storage and distribution using JFrog Artifactory, ensuring efficient dependency management, version control and integration with CI/CD pipelines.

●Integrated and enforced code quality standards using ESLint, automating static code analysis to improve maintainability, security and development efficiency.

●Maintained technical documentation and knowledge sharing using Confluence, improving collaboration, onboarding and incident resolution efficiency.

●Implemented Canary Deployment strategies using Azure Traffic Manager, gradually rolling out releases to minimize risk, ensure system stability and enable real-time performance monitoring.

●Designed and automated CI/CD pipelines with Jenkins, GitHub Actions, GitLab CI/CD and Azure DevOps, optimizing build, test, deployment and infrastructure workflows for speed and reliability.

●Integrated continuous testing into CI/CD pipelines using Selenium and TestNG, automating end-to-end testing to ensure application stability, performance and reliability.

●Implemented and managed Infrastructure as Code (IaC) using Terraform, automating provisioning, improving infrastructure scalability and ensuring configuration consistency across cloud environments.

●Developed and maintained monitoring and logging solutions using Prometheus and Grafana, enabling real-time metrics collection, proactive alerting and performance tuning for cloud infrastructure and applications.

●Implemented and managed log management solutions using ELK Stack (Elasticsearch, Logstash and Kibana) and Splunk, enabling centralized logging, real-time analysis and proactive incident detection.

●Managed and optimized data storage solutions using Cassandra and Google Cloud Firestore, ensuring high availability, scalability and performance for distributed systems.

●Implemented monitoring and observability solutions using Google Cloud Operations Suite, enabling real-time performance tracking, log analysis and proactive incident response for cloud infrastructure.

●Implemented container security scanning using Twistlock, identifying vulnerabilities, enforcing compliance policies and strengthening cloud-native application security.

●Managed secrets and sensitive data using Azure Key Vault, ensuring secure storage, access control and compliance with encryption standards.

●Managed security and identity access using Google Cloud IAM and Azure Active Directory (AAD), enforcing least privilege principles, role-based access control (RBAC) and compliance policies.

●Implemented and managed messaging systems using Google Cloud Pub/Sub and RabbitMQ, optimizing event-driven architectures for scalability, reliability and fault tolerance.

●Developed automation scripts and infrastructure tooling using Python, optimizing deployment processes, system monitoring and configuration management.

●Managed version control and collaborative development using GitLab and Google Cloud Source Repositories, ensuring code integrity, CI/CD integration and efficient release management.

Client: Anthem, Indianapolis, IN Sept 2018 - Aug 2021

Role: Senior Site Reliability Engineer (SRE)

Responsibilities:

●Led the migration of infrastructure from Microsoft Azure to AWS, ensuring a seamless transition with minimal downtime while optimizing performance for health insurance applications.

●Designed and implemented AWS-native frameworks to replace Azure components, enhancing scalability, security and cost-efficiency for Eligibility & Benefits, Claims Adjudication and Prior Authorization systems.

●Automated infrastructure provisioning and configuration management using Terraform, CloudFormation and AWS CDK, ensuring compliance, consistency and reducing manual efforts in managing cloud-based Health Plan Administration, Billing & Payments and Utilization Management platforms.

●Established inter-cloud networking frameworks leveraging VPC Peering, Transit Gateway and hybrid connectivity to maintain seamless service communication during the transition.

●Orchestrated deployment and management of applications through AWS ECS, EKS, Lambda and serverless technologies, ensuring resilience and uptime.

●Developed and maintained software delivery pipelines with Jenkins, GitLab CI/CD and AWS CodePipeline, streamlining the release and deployment process.

●Containerized applications using Docker, Kubernetes, Helm and AWS Fargate to improve scalability and resource efficiency.

●Enforced security and compliance automation using AWS IAM, Security Groups, KMS, GuardDuty and AWS Config, aligning with SOC2, HIPAA and GDPR guidelines.

●Established centralized logging and monitoring utilizing AWS CloudWatch, ELK Stack (Elasticsearch, Logstash and Kibana), Fluentd and Prometheus, improving system visibility.

●Designed proactive alerting mechanisms using Grafana, AWS CloudWatch Alarms, SNS and PagerDuty to mitigate incidents efficiently.

●Optimized data storage and retrieval solutions using AWS S3, EBS, EFS and RDS, ensuring data integrity, availability and cost control.

●Integrated messaging and queuing systems with AWS SQS and RabbitMQ to enhance event-driven workflows and asynchronous processing.

●Improved database performance and scalability for Amazon Aurora, PostgreSQL and DynamoDB through indexing, caching and partitioning techniques.

●Automated security compliance monitoring and vulnerability assessments via AWS Inspector, AWS Security Hub and AWS Lambda, proactively reducing risks.

●Managed sensitive data and credentials using AWS Secrets Manager, HashiCorp Vault and SSM Parameter Store, bolstering security.

●Implemented progressive deployment methodologies such as Blue-Green and Canary releases to ensure seamless software updates.

●Architected, managed and optimized cloud environments on AWS and Azure for scalability, cost-effectiveness and high availability. Tracked cloud spending through Azure Cost Management and AWS Cost Explorer to guide resource allocation and cost-saving strategies.

●Administered software artifact repositories through Nexus, JFrog Artifactory and AWS CodeArtifact, ensuring secure and efficient software distribution.

●Monitored and optimized cloud expenditure using AWS Cost Explorer, AWS Budgets and automation scripts to reduce operational costs.

●Developed and maintained automation scripts leveraging Bash, Python and PowerShell to streamline routine DevOps processes and system health checks.

●Documented cloud architectures, migration procedures and automation strategies using Confluence, Notion and Markdown for efficient knowledge sharing.

●Devised disaster recovery and backup mechanisms with AWS Backup, S3 Versioning, RDS Snapshots and multi-region failover strategies.

●Fostered cross-functional collaboration between development, operations and security teams to enhance DevOps, automation and cloud security practices.

Client: NatWest Group, San Francisco, CA Jul 2016 - Sept 2018

Role: DevOps Engineer

Responsibilities:

●Streamlined infrastructure provisioning within the Corporate & Institutional Banking (CIB) domain using CloudFormation templates, optimizing deployment workflows across diverse banking environments and reducing manual intervention.

●Ensured consistent and compliant server configurations by leveraging Ansible for configuration management, minimizing configuration drift and enhancing system stability within CIB’s critical banking operations.

●Automated Cassandra database provisioning, scaling and monitoring by integrating Ansible and Terraform within CIB’s infrastructure, ensuring seamless alignment with CI/CD pipelines for high-performance, scalable data management to support critical banking applications.

●Designed, deployed and managed highly available and secure cloud environments on AWS, utilizing services such as AWS S3 for scalable storage, AWS FSx for high-performance file systems, AWS IAM for access control and VPC for advanced networking solutions.

●Configured robust and secure networking architectures within Amazon VPC, ensuring compliance with industry best practices and enhancing the security of cloud-based applications.

●Developed and implemented advanced security policies with AWS IAM, bolstered by HAProxy integration for secure, high-availability load balancing.

●Enhanced application security by managing sensitive credentials with AWS Secrets Manager, effectively reducing the risk of unauthorized access.

●Established and maintained efficient CI/CD pipelines using Jenkins and Argo CD, integrating with AWS CodeDeploy to streamline application delivery timelines, ensure consistent deployments and enable seamless updates across environments.

●Streamlined the deployment process by utilizing Artifactory to manage and store binaries, build artifacts and dependencies, enabling seamless application rollouts.

●Managed version-controlled repositories in Bitbucket, implementing robust branching strategies, structured merging workflows and rigorous code review processes to foster collaboration and ensure code quality.

●Improved real-time incident response and deployment efficiency by integrating Slack with monitoring tools and CI/CD workflows, fostering team collaboration and reducing downtime.

●Orchestrated and deployed containerized applications using Docker and Kubernetes, enabling scalability, fault tolerance and streamlined management of distributed systems.

●Monitored system health and application performance using Grafana, providing actionable insights to improve system reliability.

●Implemented centralized logging solutions with ELK Stack (Elasticsearch, Logstash and Kibana), enabling efficient log analysis and troubleshooting.

●Automated builds and managed dependencies with Maven to ensure consistent and error-free software builds.

●Conducted code quality and security analysis using CodeClimate, ensuring adherence to best practices and identifying vulnerabilities.

●Designed and implemented efficient backend solutions using Python, ensuring scalable and maintainable code for deployment pipelines.

Client: Health First, New York, NY Dec 2013 - July 2016

Role: DevOps Engineer

Responsibilities:

●Designed, provisioned and managed infrastructure using Terraform, ensuring scalable, reproducible and secure environments.

●Implemented configuration management solutions using SaltStack to automate the deployment and configuration of systems and services, incorporating AWS Systems Manager for centralized management and orchestration.

●Automated database provisioning, monitoring and scaling for MS SQL on AWS RDS using Terraform, integrating it into CI/CD pipelines to ensure reliability, performance and seamless database management.

●Developed and maintained automation scripts and tools using Go and Python to streamline workflows and infrastructure management.

●Designed and managed cloud infrastructure on AWS, leveraging services such as AWS S3 for storage, AWS Lambda for serverless computing and VPC for networking.

●Secured and managed secrets, credentials and tokens using HashiCorp Vault, implementing robust access control mechanisms to ensure compliance and mitigate risks.

●Built and maintained CI/CD pipelines using GitLab CI/CD, integrating with AWS CodeDeploy to enable rapid, reliable and secure deployment of code to production environments across cloud infrastructure.

●Established and documented CI/CD workflows and infrastructure provisioning processes in Confluence, improving collaboration and transparency across teams.

●Automated and secured file transfers using FTP/SFTP protocols, integrating them into CI/CD pipelines for seamless data exchange between systems.

●Deployed, managed and scaled containerized applications using OpenShift, ensuring seamless orchestration and operational efficiency.

●Set up and managed monitoring systems with Prometheus to track system metrics, identify bottlenecks and ensure optimal performance.

●Implemented and maintained logging solutions using Splunk for log aggregation, analysis and troubleshooting across environments.

●Managed source code and versioning using Git, implementing branching strategies and maintaining repositories for efficient collaboration.

●Documented incident response playbooks in Confluence, enabling teams to respond quickly to incidents with clear and structured guidelines.

Client: PB Systems, India Jan 2013 - Oct 2013

Role: Linux Administrator

Responsibilities:

●Set up, configured and maintained Linux servers, including hardware and software installation, initial system configuration and package management with apt and yum.

●Configured and managed network interfaces, firewalls and VPNs to secure Linux systems and ensure network reliability.

●Applied system patches, kernel updates and managed software upgrades using tools such as yum, dnf, apt-get and unattended-upgrades, maintaining system security and performance.

●Monitored system performance using Grafana and implemented performance optimization strategies with utilities such as htop, vmstat and iotop.

●Diagnosed and resolved hardware, software and system-related issues using tools such as dmesg and tcpdump, ensuring minimal downtime.

●Leveraged configuration management tools such as Chef to automate complex and repetitive tasks, streamline system configuration processes and ensure consistency across environments.

●Developed custom scripts in Python to automate system administration tasks such as log rotation, user account management and process monitoring.

●Developed, implemented and managed backup and recovery procedures using Veeam, minimizing data loss and ensuring business continuity.

●Collected and analyzed system logs using Graylog to identify potential security threats or performance bottlenecks.

●Collaborated with development, security and operations teams to support application deployment by optimizing server configurations, managing build tools such as Jenkins and ensuring system scalability for CI/CD pipelines.

●Documented system configurations, standard operating procedures and troubleshooting guides using Confluence, ensuring streamlined operations and team collaboration.

Education

●Bachelor of Computer Science.


