Devops Engineer Cloud Infrastructure

Location:
Dallas, TX, 75207
Posted:
May 29, 2025

Resume:

Manisha Gopu

972-***-**** ********@*****.***

Professional Summary

DevOps Engineer with 9 years of experience designing, automating, and managing highly available, scalable cloud infrastructure. Strong skills in Amazon Web Services (AWS), including IAM policy setup, security groups, VPC, Route 53, and KMS, and in Microsoft Azure services, including Azure Active Directory (AAD), Azure Virtual Networks (VNet), Network Security Groups (NSGs), Azure DNS, Azure Key Vault, and Azure Storage, following best practices for security, networking, and access control. Actively involved in cost optimization, backup planning, and disaster recovery planning in AWS and Azure environments.

Strong experience deploying infrastructure as code (IaC) using Terraform, ARM templates, CloudFormation, and Ansible. Established history of implementing and operating CI/CD pipelines using Jenkins, GitHub Actions, and GitLab CI/CD to enable fast, regular, zero-downtime deployments. Experienced in containerizing applications with Docker and managing multi-container environments with Docker Compose; in building, hardening, and deploying container images; and in Kubernetes orchestration, deploying microservices and workloads on EKS, ECS, Azure Kubernetes Service (AKS), Azure Container Instances (ACI), and self-managed clusters.

Strong experience in Site Reliability Engineering (SRE) practices such as error budgets, incident response, and monitoring, and in designing efficient alerting systems using Prometheus, Grafana, CloudWatch, Azure Log Analytics, and the ELK Stack. Skilled at designing observability and logging systems that support system health and speed up incident resolution. Strong knowledge of Linux system administration on RHEL, CentOS, and Ubuntu, with hands-on experience in performance optimization, system management, security hardening, and system-level troubleshooting. Skilled in scripting with Bash, Python, and Shell to automate tasks and optimize system provisioning and monitoring.

Education Details

●Bachelor's - Kakatiya Institute of Technology and Sciences, India, 2014.

●Master's - University of North Texas, USA, 2019.

Skills

SDLC Methodologies

Waterfall, Agile/Scrum

Operating Systems

Windows, Mac, RHEL/CentOS, Ubuntu, Solaris, AIX

Cloud/IaaS/SaaS/PaaS

Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, VMware

SCM, Build, CI/CD Tools

Git, SVN, Maven, Gradle, Jenkins, Harness, SonarQube

Monitoring & Alert Tools

Datadog, Prometheus, Grafana, Splunk, Nagios and ELK

Containerization/ Orchestration

Docker, Kubernetes, EKS, AKS

Database Servers

Teradata, Oracle, MS SQL Server, MySQL, MongoDB, PostgreSQL

Ticketing Tools

Atlassian JIRA, ServiceNow, Redmine, Confluence.

Scripting Languages

PowerShell, Shell, Java, Python, and SQL

Web Servers

Apache Tomcat, Apache HTTP Server, Nginx, WebSphere, WebLogic

Infrastructure as Code

Terraform, Ansible, Docker

Professional Experience

Optional Clearing Corporation – Dallas, TX Mar 2024 – Present Role: DevOps Engineer

●Designed and configured AWS services: EC2, S3, ELB, ECS, RDS, VPC, Route 53, CloudWatch, Lambda, and IAM.

●Set up secure VPC environments with public/private subnets, NACLs, route tables, and security groups for traffic control.

●Automated infrastructure deployments with Terraform for AWS resources like EC2, VPC, ELB, and IAM.

●Deployed Kubernetes clusters using Terraform scripts and AWS EKS for containerized applications.

●Configured Docker containers, ran serverless container workloads on AWS Fargate, and managed image repositories in ECR.

●Used CloudWatch for event monitoring and SNS for alert notifications. Created Lambda jobs to integrate CloudWatch logs into ELK.

●Managed application rollouts using Kubernetes rolling updates and adjusted pod definitions for deployment.

●Worked with product and development teams to understand business requirements for the Java application and defined technical specifications.

●Developed a Java application using Spring Boot to create RESTful web services with CRUD operations connected to a database (e.g., MySQL or PostgreSQL).

●Designed and implemented end-to-end CI/CD pipelines on AWS using Jenkins, GitHub Actions, and GitLab CI to automate code integration, testing, and deployment.

●Configured Secrets Management using AWS Secrets Manager and HashiCorp Vault in deployment pipelines.

●Automated infrastructure provisioning using Terraform integrated into CI/CD workflows for consistent environment setup on AWS.

●Deployed containerized applications on Amazon EKS and ECS using CI/CD pipelines, ensuring zero-downtime rollouts and blue-green deployments.

●Implemented CI/CD pipeline monitoring with CloudWatch, Prometheus, and Grafana for rapid incident detection and resolution.

●Managed Helm-based Kubernetes deployments using CI/CD pipelines for versioned, repeatable application releases.

●Dockerized the Java application by creating a Dockerfile and pushing the container image to Amazon ECR.

●Set up an EKS cluster using Terraform to manage Kubernetes workloads and deploy Helm charts for standard services like Ingress, Prometheus, and Grafana.

●Set up Jenkins pipelines for continuous integration (build and test) and continuous delivery (deploy to Kubernetes), integrating with GitHub for source code management. Configured automated testing for the Java application during the CI phase in Jenkins.

●Implemented GitOps with Argo CD to synchronize Kubernetes manifests from GitHub to the EKS cluster during deployment.

●Built a pipeline that pulls code from GitHub, builds the Java application using Maven/Gradle, and runs unit and integration tests.

●Used Helm charts to deploy the Dockerized Java application to EKS, integrating with Argo CD for automatic synchronization.

●Configured Kubernetes Horizontal Pod Autoscaler (HPA) to scale the application based on CPU or memory usage and set up AWS Auto Scaling for EKS worker nodes.

●Integrated Prometheus to monitor application metrics (e.g., response time, error rate) and set up Grafana dashboards to visualize performance.

●Set up Fluentd or Logstash to collect and store logs in Elasticsearch, with Kibana for log visualization and troubleshooting.

●Configured Splunk for centralized log collection and monitoring across AWS services and created Splunk dashboards for AWS infrastructure monitoring and alerting.

●Deployed applications using Harness for automated and secure delivery on AWS.

●Implemented Harness workflows for blue-green and canary deployments on AWS EKS/ECS.

●Integrated SonarQube into AWS CI/CD pipelines to enforce code quality and security checks, and managed SonarQube scans in Jenkins pipelines for code analysis in AWS-hosted repos.

Fannie Mae – Dallas, TX Jan 2022 – Mar 2024

Role: Cloud Engineer

●Worked on Azure development for web applications, using App Services, Azure Storage, Azure SQL Database, Virtual Machines, Azure AD, and Notification Hub.

●Deployed Infrastructure as Code (IaC) using ARM Templates via Azure Portal and PowerShell.

●Automated Azure VM provisioning and managed Azure Blob Storage, file blobs, and disks.

●Integrated and troubleshot on-premises AD sync with Azure AD.

●Used Kubernetes for containerized application deployment and scaling, and developed microservices in Java.

●Built CI/CD pipelines using CloudBees, Docker, and Kubernetes, leveraging Helm charts for deployments.

●Implemented secure Docker deployments by integrating image vulnerability scanning into CI/CD pipelines.

●Integrated Git with Azure DevOps for source control and automated CI/CD pipelines.

●Managed Git branching strategies to streamline Azure infrastructure and application deployments.

●Used GitHub Actions to deploy IaC and applications to Azure services like AKS and App Services.

●Automated CI/CD pipelines using Azure DevOps for cloud infrastructure, app deployment, and container management (AKS). Implemented IaC with Terraform for streamlined infrastructure provisioning.

●Created and maintained build/release pipelines for K8S and serverless apps in Azure DevOps.

●Enforced container security with Aqua, Snyk, and SonarQube.

●Integrated security scanning tools (SAST, SCA, DAST) such as Checkmarx and Nexus IQ.

●Automated disaster recovery and backup processes to ensure high availability.

●Deployed Azure Load Balancer, App Gateway, and Traffic Manager for traffic optimization.

●Led migration to Azure using Terraform, automating VM, networking, and storage.

●Designed and deployed Azure services (Web Apps, VMs, SQL, Notification Hub) with ARM templates and Terraform.

●Synced AD with Azure AD, resolving sync issues. Configured secure AKS clusters with Terraform and Azure CNI.

●Implemented AKS security with Key Vault, ACI, and Defender.

●Automated provisioning and scaling using Azure DevOps and Helm. Deployed Istio for traffic management and observability in AKS.

●Optimized AKS clusters with auto-scaling and resource quotas. Mentored teams on AKS, Terraform, and Azure security best practices.

American Airlines – Dallas, TX Jan 2020 – Dec 2021 Role: Site Reliability Engineer

●Implemented end-to-end CI/CD pipelines using Jenkins to deploy microservices into Kubernetes, enabling consistent, zero-downtime deployments.

●Orchestrated Kubernetes infrastructure using KOPS and Helm, including automated cluster provisioning, AMI updates, and configuration templating for seamless multi-environment deployments.

●Built and maintained observability stacks integrating Prometheus, Grafana, ELK, and Datadog for comprehensive monitoring, alerting, and real-time system health visualization.

●Engineered GitOps workflows for infrastructure and application deployments using Terraform, Ansible, and GitLab CI/CD, streamlining environment consistency and rollback capability.

●Implemented highly available architecture on Kubernetes to ensure fault tolerance and seamless failover during deployment and node failures.

●Automated Docker image creation and multi-environment deployment with Dockerfiles, Jenkins pipelines, and private container registries to ensure rapid and secure application delivery.

●Integrated Terraform with CI/CD tools like Jenkins and GitHub Actions to manage and provision infrastructure components across AWS cloud, reducing manual intervention.

●Established performance testing workflows using JMeter and Postman, identifying key bottlenecks and improving response times under production-like load conditions.

●Deployed and managed cloud-native applications on AWS services (EKS, EC2, S3, Route 53, DynamoDB, RDS), with focus on cost optimization, scalability, and disaster recovery planning.

●Integrated Veracode scans into CI/CD pipelines to automate security compliance checks for Java and Node.js applications, enabling secure SDLC practices.

●Utilized DynamoDB Streams and Global Tables to implement real-time event-driven processing and multi-region data replication for high-availability systems.

●Used Ansible and Python to automate cluster creation, software installations (Kafka, Cassandra), and backups, improving efficiency and reducing configuration drift.

●Implemented centralized secrets management using HashiCorp Vault for managing access to sensitive credentials, tokens, and certificates across applications.

●Successfully migrated legacy applications from on-prem data center environments to Kubernetes clusters, ensuring minimal downtime and improved system resilience.

●Developed incident response playbooks and MOPs (Methods of Procedure) for staging and production environments, aligning with SRE best practices for availability and reliability.

●Integrated Splunk for centralized logging, real-time alerting, and automated log correlation to reduce MTTR and improve incident response.

●Configured AWS CloudWatch for monitoring, alerting, and real-time dashboards across EC2, RDS, Lambda, and EKS resources.

●Utilized AWS CloudTrail for auditing API activity, ensuring compliance, and supporting incident investigations.

DXC Technology – Hyderabad, India Mar 2015 – Aug 2018 Role: Linux Administrator

●Installed, configured, and maintained RHEL, CentOS, and Ubuntu servers on physical and virtualized environments.

●Utilized Bash and Python scripting to automate day-to-day admin work, improving time efficiency and reducing human errors.

●Configured system integration with LDAP and Active Directory for centralized user administration.

●Performed periodic system updates, patching, and kernel updates using package managers such as yum, dnf, and apt.

●Implemented and secured essential services such as Apache, MySQL, DNS, and NFS, according to enterprise security standards.

●Monitored system performance using Nagios and Zabbix, and tuned servers to use resources efficiently.

●Conducted security hardening of Linux systems including SELinux configuration, iptables/firewalld, and SSH policy management.

●Managed disk storage using LVM, extended file systems, and supported and troubleshot RAID configurations.

●Automated routine processes such as backups and log rotation using cron jobs to ensure consistency and reliability of processes.

●Deployed and managed virtual machines on VMware and KVM for test and live environments.

●Maintained detailed system and process documentation and troubleshooting guides to support audits and disaster recovery procedures.

●Used Git for version control of configuration files and infrastructure scripts, ensuring trackability and rollback.

●Supported incident, change, and problem management processes as part of a 24/7 production support team.
