Keerthi Ambati
DevOps/Cloud Engineer
Mail Id:*******.**********@*****.*** Phone: 908-***-****
PROFESSIONAL SUMMARY
●10+ years of IT experience optimizing and streamlining the software development and deployment process, including 6+ years of extensive experience with cloud platforms such as AWS and Azure, systems administration, continuous integration, continuous deployment, and Software Configuration Management (SCM), plus 4 years of Linux administration.
●Expertise in performing duties such as monitoring, automation, deployment, documentation, support, and troubleshooting.
●Excellent experience documenting and automating the build and release process; used Stackdriver and AWS monitoring tools extensively to monitor and debug cloud-based AWS EC2 services.
●Experienced in writing CloudFormation templates to automate AWS environment creation, deploying to AWS using build scripts (AWS CLI), and automating solutions using Shell and Python.
●Experienced with AWS Lambda and AWS CodePipeline; used Python (including Boto3) to supplement automation provided by Ansible and Terraform for tasks such as encrypting EBS volumes backing AMIs and scheduling Lambda functions for routine AWS tasks.
●Experienced in using repository managers such as Nexus and JFrog Artifactory, integrating container scanning with JFrog Xray, and implementing Terraform modules for Azure VMs, AKS clusters, and Azure Data Factory, integrated with Azure DevOps.
●Experienced in leveraging cloud technologies to develop scalable and resilient applications for cloud environments, ensuring high performance, reliability, and security.
●Highly skilled in building and releasing microservices enterprise applications for both non-prod and prod environments using CI/CD tools such as Jenkins, Azure DevOps Pipelines, and GitHub Actions, and deploying them to AKS clusters using Helm charts.
●Extensively worked on Jenkins for continuous integration and end-to-end automation of all builds and deployments; created Jenkins pipelines using Groovy scripts to automate the deployment process.
●Extensively used build utilities like Maven, Ant, and Gradle for building .jar, .war, and .ear files.
●Highly skilled in the principles and best practices of Software Configuration Management (SCM).
●Experienced with Ansible playbooks for virtual and physical instance provisioning, configuration management, patching, and software deployment.
●Expertise in server provisioning, automation (Puppet/Chef/Ruby), maintenance, and performance tuning; hands-on experience installing and administering CI tools like Hudson/Jenkins, TeamCity, and Bamboo.
●Experience in monitoring tools like Nagios, Splunk, and Syslog.
●Implemented Terraform modules for deployment of various applications across multiple cloud providers and managing infrastructure.
●Implemented monitoring with Grafana visualization infrastructure in Kubernetes clusters.
●Implemented Ansible playbooks for Linux and Windows host server management and automation of server configurations.
●Team player with excellent interpersonal skills; self-motivated, dedicated, and accustomed to the demands of 24/7 system maintenance, with good customer support experience.
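As an illustration of the CloudFormation automation with Python described above, a minimal sketch that renders a template programmatically (the AMI ID, instance type, and logical resource names are illustrative placeholders, not values from any real environment):

```python
import json

def make_ec2_template(instance_type="t3.micro", ami_id="ami-12345678"):
    """Build a minimal CloudFormation template as a Python dict.

    ami_id and the logical resource name are hypothetical placeholders.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Minimal EC2 instance template",
        "Resources": {
            "AppInstance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "InstanceType": instance_type,
                    "ImageId": ami_id,
                },
            }
        },
        "Outputs": {
            # Expose the instance ID so downstream stacks can reference it.
            "InstanceId": {"Value": {"Ref": "AppInstance"}}
        },
    }

# Serialize for deployment, e.g. via `aws cloudformation deploy --template-file ...`
template_json = json.dumps(make_ec2_template(), indent=2)
```

Generating templates this way keeps instance parameters in one place; the rendered JSON can then be deployed with the AWS CLI as part of a build script.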
EDUCATION:
●Bachelor's in Computer Science from RGUKT Basar, 2013.
AREAS OF EXPERTISE
Cloud Platforms: Microsoft Azure, AWS
AWS Services: RDS, EC2, VPC, IAM, CloudFormation, EBS, S3, ELB, Auto Scaling, CloudTrail, SQS, SNS, SWF, CloudWatch
Azure Services: App Services, Key Vault, Function Apps, Blob Storage, Azure Active Directory (Azure AD), Service Bus, Azure Container Registry (ACR), Azure Kubernetes Service (AKS), Azure SQL, Azure Cosmos DB
Artifact Repositories: JFrog Artifactory, Nexus
Web Servers: Nginx
Documentation: Confluence, SharePoint
Operating Systems: Microsoft Windows XP/2000, Linux, UNIX
Tracking Tools: Jira, Azure Boards, ServiceNow
Code Scanning: SonarQube, JFrog Xray, Amazon ECR/Inspector
Databases: RDS, Cosmos DB, MySQL
Logging: CloudWatch, CloudTrail, Azure App Insights, Azure Monitor
Version Control Tools: Git, Bitbucket, Azure Repos
CI/CD: Jenkins, Azure DevOps (ADO), GitHub Actions
Configuration Management: Ansible
Container Platforms: Docker, Kubernetes, OpenShift
Monitoring Tools: Nagios, Splunk, Dynatrace
Languages: Shell scripting, Java, Python, Selenium
CERTIFICATIONS:
●Microsoft Certified Azure Administrator
●Certified Kubernetes Administrator
●AWS Developer – Associate
WORK EXPERIENCE:
Client: Amex, Arizona Aug 2023 to Present
Role: DevOps/Infrastructure Engineer
●Managed Windows and Linux servers, resolved IP issues, and supported multiple application teams.
●Provided 24/7 on-call support, including weekends, for the migration.
●Installed agents such as the Microsoft Monitoring Agent (MMA) and OMS Linux agents using automation scripts.
●Demonstrated extensive expertise in Azure App Services, including configuration, log management, and effective use of Kudu.
●Proficient in utilizing various Azure cloud services such as Azure Virtual Machines, Azure App Service, Azure Functions, Azure Logic Apps, Azure Storage, and Azure SQL Database.
●Experienced in deploying .NET applications on various platforms and integrating CI/CD pipelines using tools like Azure DevOps and GitHub Actions.
●Proficiently executed Azure (IaaS) migrations, including creating Azure VMs, storage accounts, VHDs, and storage pools, and migrating on-premises servers to Azure. Also, implemented availability sets and ensured VM hardening and disk encryption using the KEK key in MS Azure.
●Utilized Docker and container orchestration platforms like Kubernetes to containerize applications and scale them efficiently.
●Developed and managed infrastructure as code (IaC) using tools such as Terraform, CloudFormation, and Ansible to automate infrastructure provisioning and management.
●Experienced in deploying pipelines to hyperconverged infrastructure, including Azure Local.
●Efficiently collected, stored, and analyzed logs generated by containerized applications across clusters.
●Implemented monitoring and alerting mechanisms for Databricks workloads and resources, including setting up monitoring tools, defining metrics and alerts, and integrating with Azure Monitor and other monitoring solutions.
●Built centralized logging and monitoring systems using Azure Log Analytics, Google Cloud Logging, and the ELK Stack, improving system observability and troubleshooting efficiency.
●Created and maintained clear, comprehensive documentation for infrastructure, release procedures, configurations, and troubleshooting steps to facilitate knowledge sharing and collaboration within the team.
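The kind of Kubernetes deployment work described above can be sketched minimally in Python as a manifest builder (the app name, image, and port are hypothetical; in practice a Helm chart or AKS pipeline would typically render this):

```python
def build_deployment(name, image, replicas=2):
    """Render a minimal Kubernetes Deployment manifest as a dict.

    All names and the container port are illustrative assumptions.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # Selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {"name": name, "image": image,
                         "ports": [{"containerPort": 8080}]}
                    ]
                },
            },
        },
    }
```

Serialized to YAML, a manifest with this shape can be applied with `kubectl apply -f` or templated through Helm values.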
Environment: Azure, Azure DevOps, Active Directory Users and Computers, Azure CLI, ACS, AKS, Datadog, Git, Maven, Splunk, Nagios, Java/J2EE, Linux, Cosmos DB, RedHat, Ubuntu, Microsoft SQL Server, Microsoft Visual Studio, Data Factory, Docker, Kubernetes, Function Apps, Web Apps, Automation Accounts, Logic Apps, IAM, Windows PowerShell, JSON, GitHub Actions, Ansible, YAML
Client: Charter Communications, Denver Jan 2020 To Feb 2022
Role: SRE Engineer
●Managed and hired in-house SREs and contractor resources; handled and prioritized incidents, ensuring timely resolution and effective communication.
●Established and managed key reliability metrics, set up and maintained alerting systems, and automated tasks and infrastructure management using Infrastructure as Code (IaC) tools and techniques.
●Ensured application scalability and identified performance bottlenecks to optimize system performance.
●Integrated GitLab with monitoring tools like Prometheus and Grafana, enabling automated deployments and monitoring in a seamless DevOps pipeline.
●Configured AWS CloudWatch as a telemetry export destination for OpenTelemetry metrics and traces, enabling unified monitoring across cloud-native and legacy systems.
●Implemented secure handling of application secrets by storing them in AWS Secrets Manager, ensuring sensitive data remains protected and accessible only to authorized users and services.
●Developed Python scripts to automate the deployment and management of AWS CloudFormation stacks, streamlining infrastructure as code.
●Deployed and monitored AI/ML models in production using Kubernetes, ensuring real-time scalability and performance.
●Implemented secure file transfer mechanisms and automated backups using rsync, scp, and cron on Debian-based servers.
●Orchestrated the end-to-end deployment process from Dockerization to deployment on EKS using CI/CD pipelines orchestrated with GitLab, enabling automated testing, integration, and deployment of applications with minimal manual intervention.
●Set up and managed monitoring and logging solutions using Prometheus, Grafana, and the ELK stack for OpenShift clusters.
●Developed and optimized shell scripts to manage cron jobs, log rotation, disk utilization, and system performance monitoring across Debian servers.
●Defined, measured, and maintained SLOs and SLAs to meet service performance expectations; ensured application security through best practices and conducted regular penetration tests to identify and mitigate vulnerabilities.
●Utilized Terraform providers to integrate with third-party services such as Datadog, ELK, GitHub, and monitoring tools, enabling end-to-end observability and automation.
●Proficient in infrastructure as code (IaC) practices using Terraform to automate the provisioning, configuration, and management of AWS resources, resulting in increased efficiency and consistency.
●Designed and implemented Terraform scripts to create and manage AWS infrastructure components such as VPCs, subnets, EC2 instances, RDS databases, IAM roles, and security groups, ensuring adherence to best practices and security standards.
●Automated log aggregation and monitoring by setting up Splunk connectors for AWS CloudWatch, S3, and Kubernetes, ensuring comprehensive tracking of infrastructure events and application logs for faster incident response.
●Built and managed ELK pipelines for centralized log aggregation and analysis, supporting multi-environment setups across AWS and OpenShift.
●Led problem tickets and improvements to major software components, systems, and features to improve the availability, scalability, latency, and efficiency of the Symbotic system.
●Experienced in troubleshooting VMware, Kubernetes, custom software, and infrastructure performance incidents.
●Configured custom alerts and thresholds in Dynatrace to proactively detect and address performance issues.
●Created custom Dynatrace dashboards to visualize key performance indicators (KPIs) and provide actionable insights to stakeholders.
●Managed Linux servers via the CLI, including software installation, process monitoring, and network configuration.
●Customized Grafana dashboards using data-source query languages and JSON data sources to create dynamic, interactive visualizations tailored to specific monitoring requirements and user preferences.
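The SLO work described above typically involves error-budget math; a minimal sketch of that calculation (the SLO targets and window length in the examples are illustrative):

```python
def error_budget_minutes(slo_percent, window_days=30):
    """Total allowed downtime (minutes) for an availability SLO over a window.

    E.g. a 99.9% SLO over 30 days allows 0.1% of 43,200 minutes = 43.2 min.
    """
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - slo_percent / 100.0)

def budget_remaining(slo_percent, downtime_minutes, window_days=30):
    """Fraction of the error budget still unspent; negative means the SLO is blown."""
    budget = error_budget_minutes(slo_percent, window_days)
    return (budget - downtime_minutes) / budget
```

Tracking the remaining fraction (rather than raw downtime) makes it easy to alert when, say, half the budget is consumed mid-window.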
Environment: Microsoft Windows Azure, Windows Server 2008/2012R2/2016, RedHat, Ubuntu, Microsoft SQL Server, Microsoft Visual Studio, Data Factory, Jenkins, Docker, Kubernetes, Ansible, Terraform, Function Apps (.Net Core and C#), Web Apps, Automation Accounts, Logic Apps, IAM, Windows PowerShell, JSON, Shell, Pester.
Client: JP Morgan Chase, Plano, TX, USA Feb 2017 To Dec 2019
Role: Software Engineer
●Worked with Agile methodology and tooling areas (code management, build and release automation, and service and incident management).
●Built and maintained APIs, databases, and server-side logic with technologies like Node.js, Python, Ruby, and .NET.
●Designed, deployed, and managed AWS infrastructure using services such as EC2, S3, VPC, RDS, and Lambda.
●Implemented and maintained infrastructure as code (IaC) using tools like AWS CloudFormation and Terraform.
●Automated cloud infrastructure and application deployments using CI/CD pipelines (e.g., AWS CodePipeline, CodeBuild).
●Monitored and troubleshot cloud infrastructure and application performance using Amazon CloudWatch.
●Implemented and maintained security best practices within the AWS environment, including IAM, security groups, and network access control lists (NACLs).
●Collaborated with development teams to optimize applications for cloud deployment and scalability.
●Managed and maintained containerized applications using Amazon ECS and EKS, leveraging EKS's managed Kubernetes control plane to run Kubernetes on AWS without installing or operating the control plane directly.
●Implemented an AWS Elasticsearch cluster using a Terraform module. Developed Python code using the Boto3 and NumPy libraries to onboard team projects from GitLab and check metrics in the Kibana dashboard for log analytics; integrated with AWS Lambda and CloudWatch Logs by setting up cron jobs.
●Integrated AWS DynamoDB with Lambda to scan onboarding and decommission accounts, created cron jobs to initiate daily batch data pulls, executed continuous integration tests under GitLab CI/CD, and backed up DynamoDB streams.
●Used Black Duck to scan for vulnerabilities covering security, license, and operational risk; integrated AWS Lambda with Amazon Kinesis; and used AWS Data Pipeline to configure data loads from S3 to Redshift.
●Worked on AWS API Gateway for custom domains and record sets in Amazon Route 53 for applications hosted in AWS. Utilized CloudWatch to monitor resources such as EC2 (CPU and memory), Amazon RDS DB services, and Elastic Block Store (EBS) volumes, setting alarms for notifications or automated actions and monitoring logs for better understanding and operation of the system.
●Created Ansible roles in YAML, defining tasks, variables, files, handlers, and templates. Configured Ansible for parallel deployment to automate the continuous delivery process, and used Ansible for configuring and managing multi-node configurations over SSH and PowerShell.
●Utilized Azure Data Explorer (formerly Kusto) for real-time analytics and log data analysis, enabling actionable insights.
●Managed security groups on AWS focusing on high availability, fault-tolerance, and auto-scaling using Terraform templates. Utilized TLS/SSL Certificates and Encryption using AWS Certificate Manager, AWS KMS, and Vault.
●Leveraged Terraform and Ansible for provisioning and managing GCP resources, ensuring consistent and scalable infrastructure.
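The DynamoDB-backed account scan described above can be sketched as a Lambda-style handler in pure Python (the record shape and status values are hypothetical; a real deployment would read the items from a DynamoDB scan via Boto3):

```python
def handler(event, context=None):
    """Partition account records by lifecycle status, Lambda-handler style.

    The 'records' key and the ONBOARDING/DECOMMISSION statuses are
    illustrative assumptions, not a documented schema.
    """
    records = event.get("records", [])
    onboarding = [r["accountId"] for r in records
                  if r.get("status") == "ONBOARDING"]
    decommission = [r["accountId"] for r in records
                    if r.get("status") == "DECOMMISSION"]
    # Downstream jobs (e.g. a cron-triggered batch pull) consume this summary.
    return {"onboarding": onboarding, "decommission": decommission}
```

Keeping the partitioning logic free of AWS calls makes it trivially unit-testable in GitLab CI before wiring in the Boto3 scan.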
Environment: AWS (Lambda, DynamoDB, AWS CodePipeline, Code Star, RedShift, CloudWatch, ELK, S3, CloudFormation), Terraform, Kubernetes, Docker, GitLab CICD, Java, .Net, Git, Python, YAML, Nagios, JIRA, Linux, SDLC.
Client: Qualcomm, Hyderabad, India Feb 2015 To May 2016
Role: Cloud DevOps Engineer
●Designed, implemented, and managed scalable, secure, and resilient cloud infrastructure on AWS using services like EC2, S3, RDS, Lambda, and VPC.
●Automated infrastructure provisioning and configuration management using tools like AWS CloudFormation, Terraform, and the AWS CDK.
●Built and maintained CI/CD pipelines using AWS CodePipeline, CodeBuild, CodeDeploy, and GitHub Actions or Jenkins.
●Monitored cloud infrastructure and applications using AWS CloudWatch, CloudTrail, X-Ray, and third-party observability tools like Datadog and New Relic.
●Optimized AWS cloud costs through resource right-sizing, Reserved Instances/Savings Plans, and cost monitoring via AWS Cost Explorer and AWS Budgets.
●Ensured security best practices by managing IAM roles/policies, enabling encryption, configuring security groups, and implementing GuardDuty/Inspector.
●Implemented containerized application workflows using Docker and managed container orchestration on Amazon ECS and EKS (Kubernetes).
●Created and managed scalable serverless architectures using AWS Lambda, Step Functions, API Gateway, and DynamoDB.
●Configured and managed logging and alerting systems using CloudWatch Logs, SNS, and third-party integrations.
●Performed regular backups and implemented disaster recovery strategies using AWS Backup, S3 versioning/lifecycle policies, and cross-region replication.
●Managed and automated deployments using blue-green and canary strategies to minimize downtime and risk.
●Enforced compliance and governance using AWS Config, AWS Organizations, SCPs, and tagging policies.
●Troubleshot infrastructure and application issues in production using diagnostic tools and log analysis.
●Collaborated with development teams to integrate DevOps practices into the SDLC and improve deployment velocity.
●Stayed up to date with the latest AWS services and DevOps tools to continuously improve infrastructure, reliability, and security.
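The canary strategy mentioned above can be sketched as a simple traffic-shift schedule generator (the starting percentage and doubling factor are illustrative defaults, not values from a specific rollout):

```python
def canary_steps(start_percent=10, factor=2, max_percent=100):
    """Generate the traffic percentages for a canary rollout.

    Traffic to the new version grows by `factor` each step until it
    serves max_percent; each step would normally be gated by health checks.
    """
    steps, pct = [], start_percent
    while pct < max_percent:
        steps.append(pct)
        pct *= factor
    steps.append(max_percent)  # final cutover
    return steps
```

With the defaults this yields the schedule 10 → 20 → 40 → 80 → 100; rolling back is just reversing the weights if a step's error rate exceeds the SLO threshold.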
Environment: AWS, EC2, S3, RDS, Lambda, VPC, CloudFormation, Terraform, CDK, CodePipeline, CodeBuild, Jenkins, CloudWatch, X-Ray, Datadog, IAM, EKS, ECS, Docker, API Gateway, DynamoDB, SNS, AWS Backup, GuardDuty, Config, GitHub Actions, Prometheus, Kubernetes, SQS, Step Functions, AWS CLI
Client: TCS, Hyderabad, India July 2013 To Feb 2015
Role: Build & Release Engineer
●Linux system administration and monitoring: managed, upgraded, and optimized Linux-based CI/CD environments; automated system tasks using Bash and Ansible; monitored system health; and troubleshot disk, process, and log-related issues to ensure high availability.
●Used monitoring tools (Splunk, Dynatrace, Prometheus, Grafana) for CI/CD observability.
●Developed and maintained Ansible playbooks (and similar scripting solutions) for automation and system administration.
●Performed build and installation procedures for RHEL, SuSE, and CentOS, and handled multiple Unix environments.
●Worked with virtualization technologies in Unix environments, such as KVM and Docker containers on Linux, and with VMware-hosted environments.
●Developed, executed, and improved documentation for installation, configuration, hardening, and operations and maintenance tasks; documented activities, status, and issues.
●Collaborated with VMware engineers when changes to the virtual environment were needed.
●Managed and implemented end-to-end Linux-based infrastructure hosting critical core components such as databases, grid computing, enterprise job schedulers, and messaging platforms.
●Troubleshot production issues and identified root causes to prevent recurrence.
●Wrote Unix shell scripts in Python, Shell, and Bash.
●Applied extensive knowledge of Linux/Windows-based systems, including hardware, software, and applications.
●Handled project management for various UNIX/Linux/Windows system integration projects.
●Ensured adherence to IT infrastructure standards, policies, and procedures.
●Assisted in performing root cause analysis and resolving faults and errors in systems and applications.
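The disk-utilization troubleshooting described above often boils down to parsing `df` output; a minimal sketch, assuming the standard six-column POSIX `df -P` layout (the sample mounts in the test are invented):

```python
def over_threshold(df_output, threshold=80):
    """Parse POSIX `df -P` output and return (mount, used%) pairs at or
    above a usage threshold.

    Skips the header line and any malformed rows.
    """
    alerts = []
    for line in df_output.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip wrapped or malformed rows
        used_pct = int(fields[4].rstrip("%"))  # 5th column is capacity, e.g. "90%"
        if used_pct >= threshold:
            alerts.append((fields[5], used_pct))
    return alerts
```

A cron job can feed this the output of `df -P` and page on any non-empty result, which is the usual shape of a disk-space health check.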
Environment: UNIX, Linux, Python, Shell, Bash, RHEL, CentOS, VMware, GRID.