Devops Engineer Site Reliability

Location:
Chicago, IL, 60616
Posted:
January 15, 2024

Resume:

Mahesh T

+1-872-***-**** ad2spa@r.postjobfree.com

LinkedIn: www.linkedin.com/in/mahesh-t-4690a7281

PROFESSIONAL SUMMARY: Accomplished Cloud Site Reliability & DevOps Engineer with expertise in AWS cloud, building automated infrastructure, CI/CD pipelines, and resilient applications. Equipped with strong technical knowledge and cross-functional collaboration abilities to boost velocity, reliability, and innovation.

●Strong experience in AWS, including EC2, S3, EBS, VPC, ELB, AMI, SNS, RDS, IAM, Route 53, Auto Scaling, CloudFront, CloudWatch, CloudTrail, CloudFormation, OpsWorks, Security Groups.

●Experience in Google Cloud Platform (GCP) components, container builders, GCP client libraries, and cloud SDKs.

●Worked on OpenStack manuals, Security Guide to the OpenStack Community, and GCP IoT.

●Proficient in essential DevOps tools like Chef, Vagrant, Puppet, Ansible, Docker, Subversion (SVN), Git, Hudson, Jenkins, Ant, Maven.

●Integrated Jenkins with various tools such as Nexus, SonarQube, Puppet, CA Nolio, HP ALM, and HP QTP.

●Experienced in infrastructure automation tools like CloudFormation, Terraform, Chef, Puppet, and Ansible.

●Experience with VS Build Pro, Apache Ant, Apache Tomcat, Subversion, Git, Maven, Jenkins/Hudson.

●Worked with build tools such as Apache Ant, Maven, Atlassian Bamboo, Cruise Control.

●Worked with XML and wrote Ant, Ruby, Shell, Perl, PowerShell, and JavaScript scripts.

●Designed and deployed container-based production clusters using Docker, Kubernetes, and Docker Swarm, with working knowledge of Apache Mesos.

●Collaborated with development support teams to set up a continuous delivery environment using Docker and Kubernetes on AWS.

●Extensive experience in installing and monitoring performance parameters through JON 2.4 and JConsole for JBoss.

●Designed and deployed pipelines in Azure Data Factory (ADF) and debugged pipeline errors.

●Hands-on experience with monitoring tools like Prometheus, Grafana, Dynatrace, Nagios, AppDynamics, Datadog, Splunk.

●Worked with version control systems like GIT, Perforce, and Subversion.

●Experience in writing Dockerfiles to build microservice applications.

●Resolved conflicts during merging branches in Bitbucket.

●Hands-on experience with Selenium, QTP, and HP Load Runner Testing Tools.

●Managed SVN repositories for branching, merging, and tagging, and developed Shell/Groovy Scripts for automation.

●Deployed Active Directory domain controllers to Microsoft Azure using Azure VPN gateway.

●Collaborated efficiently with development, QA, product, and business owner teams for high-quality and timely delivery of builds and releases.

TECHNICAL SKILLS:

Operating Systems

Red Hat Enterprise Linux 4.x/5.x/6.x, Windows NT/XP/2003/2008, macOS, Android

Cloud services

AWS (EC2, VPC, S3, IAM, CloudWatch, CloudFormation), Azure, GCP

Version Control Tools

Subversion (SVN), GIT, GitHub and Bitbucket

Scripts and Languages

Shell Script, ANT Script, Batch Script, Perl Script, C, C++, Java, SQL, Bash, Python, PowerShell, YAML, Groovy

Programming language

C, C++, SAS, SQL, Core Java

Web Technologies

HTML, JavaScript, XML, Servlets, JDBC, Cloud technologies, Nginx, SOAP

Application Server

Oracle WebLogic 8.1/9.x/10.x/12.x, Tomcat 5.x/6.x, JBoss AS 7.0/6.0/5.1

CI Tools

Jenkins/Hudson, Anthill Pro

Web Servers

Apache Web Server 2.x/2.2.x, IIS 5.x/6.x/7.x, Oracle HTTP Server, IBM HTTP Server

SCM Tool

Subversion, Git, TortoiseSVN, Perforce, ClearCase

Project Management Tool

Basecamp, MS Project, Atlassian Tools, Demandware

Build and Automation Tools

Jenkins/Hudson, Bamboo, Kafka, ANT, Docker, Ansible, Chef, Maven, Waterfall, Kubernetes, Selenium, JUnit, GitLab CI, SonarQube, Nexus and Gradle

Defect Tracking Tools

Jira, Rally, Bugzilla, Confluence, HP ALM/Quality Center, Redmine, JUnit

Network Protocols

DHCP, DNS, SNMP, SMTP, POP3, IMAP, Ethernet, netstat, VPN, NFS, NIS, RDP, TCP/IP, TOTP, FTP, TFTP, HTTP & HTTPS

Repository managers

JFrog Artifactory, Nexus RPM, YUM, GitHub, Docker Hub and Bitbucket

Containerization Tools

Docker, Kubernetes, Docker Swarm

Database Systems

Oracle 9i/10g (PL/SQL), MS SQL Server 2005, TOAD, SQL Navigator, SQL*Plus, MS Access, DB2, Oracle Database 12

Platforms

UNIX, Linux 4/5, Ubuntu, Fedora, Windows 98/NT/XP/Vista/7/8, iOS

PROFESSIONAL EXPERIENCE:

Fifth Third Bank, Chicago, IL

Aug 2021 – Present

AWS DevOps/ Site Reliability Engineer

Roles and Responsibilities:

●Managed AWS and Azure cloud infrastructure, optimizing a mission-critical AWS application and achieving a 20% cost reduction.

●Reduced MTTR by 20% through a robust incident management framework aligned with SRE best practices for uptime and reliability optimization.

●Spearheaded IT security and compliance measures, achieving a 15% reduction in security incidents.

●Led deployment and management of Java/Spring-Boot applications on AWS, improving performance by 25%.

●Spearheaded the implementation of an advanced monitoring system using the EFK stack (Elasticsearch, Fluentd, Kibana), significantly improving troubleshooting capabilities and metrics generation.

●Achieved a 20% reduction in incident response time through the efficient correlation of logs from various sources using EFK.

●Worked on Python scripting for automating deployment and configuration tasks, streamlining operational workflows and enhancing team productivity.

●Orchestrated the implementation of distributed storage technologies in AWS, optimizing data accessibility and resilience.

●Spearheaded the integration of dynamic resource management frameworks, enhancing system scalability and performance.

●Demonstrated proficiency in leveraging AWS services to efficiently manage and store vast amounts of data.

●Applied hands-on experience with Hazelcast for efficient caching solutions, enhancing system performance.

●Applied practical expertise in deploying, observing, alerting on, logging, and monitoring systems.

●Demonstrated proficiency in leveraging Databricks for efficient data processing, resulting in improved data analytics capabilities.

●Showcased scripting skills through the implementation of automated data pipelines, streamlining data processing and analysis.

●Led observability initiatives, implementing monitoring solutions with Dynatrace to optimize system performance and ensure reliability, including integration with GCP cloud services.

●Implemented and optimized Dynatrace for performance monitoring, ensuring proactive issue detection and resolution in a complex microservices architecture.

●Spearheaded the integration of synthetic monitoring tools, enhancing the overall reliability and user experience of critical applications.

●Spearheaded the migration of on-premises applications to Azure, ensuring seamless integration with cloud-native services and optimizing performance.

●Implemented and managed CI/CD pipelines using Azure DevOps, streamlining the software development lifecycle and facilitating efficient deployment processes.

●Enhanced security measures specific to Azure environments, implementing best practices and ensuring compliance with Azure security standards.

●Demonstrated proficiency in Java, contributing to the development of applications on GCP cloud infrastructure and adhering to best practices.

●Played a key role in the migration of on-premises applications to Azure, leveraging cloud-native technologies for improved scalability and resource efficiency.

●Developed automation scripts in Java, contributing to the efficiency of routine operational tasks and promoting a culture of automation.

●Developed robust customer data pipelines on the Azure stack, optimizing data engineering processes and improving overall system efficiency.

●Reduced incident response times by 40% through optimal implementation of Dynatrace APM, establishing baselines, triaging alerts and dashboards.

●Achieved 40% better application performance by re-architecting monoliths into containerized microservices.

●Spearheaded the implementation of observability practices, integrating tools like Prometheus and Grafana into the DevOps pipeline, enhancing visibility into system performance.

●Led efforts to establish a robust incident response framework, aligning with Site Reliability Engineering (SRE) principles, resulting in a 20% reduction in mean time to resolution (MTTR).

●Collaborated with cross-functional teams to integrate machine learning models into the DevOps pipeline, optimizing processes for development, testing, and deployment.

●Modernized 3-tier web application from single Java codebase into containerized microservices improving scalability by 50%.

●Automated build and deployment stages of CI/CD pipelines through Jenkins and Spinnaker, reducing release cycles by 40%.

●Designed Aurora RDS clusters with parallel replication for high availability and disaster recovery.

●Leveraged Full Stack experience to contribute to end-to-end development processes.

●Orchestrated containerized environments using AWS ECS, reducing deployment times by 20%.

●Collaborated on the development of a microservices architecture for Java applications.

●Collaborated with cross-functional SRE and DevOps teams to align APM insights and reduce MTTR by 20%.

●Integrated AppDynamics, automated APM tool configurations, and streamlined incident response.

●Orchestrated the integration of comprehensive logging systems, such as ELK Stack, to centralize log data and facilitate in-depth analysis for troubleshooting and optimization.

●Championed the implementation of synthetic monitoring solutions, like Selenium, to simulate user interactions and identify potential performance bottlenecks in critical user pathways.

●Actively participated in incident post-mortems, applying observability data to drive continuous improvement and prevent recurrence of critical issues.

●Leveraged AppDynamics to streamline incident response, reducing MTTR by 20%.

●Led a critical database migration project, improving data retrieval times by 20%.

●Managed the migration from Jira on-prem to Jira Cloud, ensuring a smooth transition and training the team on new features.

●Experienced in using Bitbucket for version control, Kubernetes for container orchestration, and tools like PagerDuty and New Relic.

Environment: AWS (EC2, VPC, ELB, S3, RDS, EBS, IAM, CloudTrail, CloudWatch, CloudFormation, and Route 53), VDI, Linux, Ansible, Git version control, Splunk, AWS CLI, AWS Auto Scaling, Maven, Nagios, Subversion, Jenkins, Unix/Linux, Shell scripting.

Fidelity Investments, Durham, NC.

Jan 2019 – July 2021

Sr. DevOps AWS Engineer

Roles and Responsibilities:

●Oversaw public/private AWS infrastructure, emphasizing EC2, S3, and RDS.

●Utilized Splunk, Dynatrace, and Datadog for system and application performance monitoring.

●Applied advanced Windows optimization techniques, resulting in a 25% improvement in system performance.

●Provisioned infrastructure from ground up using Terraform and Ansible for microservices, decreasing lead time by 40%.

●Engineered solutions for dynamic resource management in Azure, ensuring optimal allocation and utilization of resources.

●Leveraged Azure services to design and deploy scalable storage solutions for diverse project requirements.

●Tracked SLOs/SLIs to quantify system performance per SRE principles, optimizing reliability.

●Led the integration of Tealium within the AWS Cloud, streamlining the collection and management of customer data.

●Led the design and implementation of end-to-end CI/CD capabilities for both infrastructure and applications, leveraging Jenkins for streamlined automation.

●Improved deployment frequency by 30% through the successful implementation of a robust CI/CD pipeline.

●Utilized Databricks to enhance data processing efficiency, creating streamlined pipelines for improved analytics.

●Facilitated secure and efficient payment transactions through expertise in payment gateways.

●Gained practical experience in deploying, observing, alerting on, logging, and monitoring systems.

●Orchestrated end-to-end CI/CD processes using Azure DevOps, automating build, test, and deployment phases to accelerate software delivery.

●Directed observability strategies using Dynatrace, focusing on optimizing system behavior and performance within a GCP cloud environment.

●Served as an integral part of Agile development teams, promoting continuous improvement and iterative development cycles with a GCP-centric approach.

●Leveraged Java expertise for the development and optimization of applications on GCP, ensuring code quality and efficiency.

●Orchestrated the implementation of Dynatrace for comprehensive performance monitoring, enabling real-time insights into application behavior and resource utilization.

●Applied scripting languages for automation, enhancing deployment processes on GCP and contributing to operational efficiency.

●Utilized Python scripting for automating deployment and configuration tasks, streamlining operational workflows and enhancing team productivity.

●Utilized Power BI to create dynamic reporting solutions, offering stakeholders actionable insights into customer behavior and trends.

●Took a lead role in designing and implementing a comprehensive observability platform, leveraging tools such as ELK Stack, New Relic, and Jaeger to monitor, trace, and analyze system behavior.

●Integrated advanced alerting mechanisms, combining Prometheus alerts with custom scripting to provide proactive notification of potential issues, reducing downtime by 15%.

●Designed the target AWS architecture and executed phased migration approach through close stakeholder collaboration.

●Applied Full Stack expertise to streamline development workflows and improve user interfaces.

●Utilized Okta for effective identity and access management, ensuring robust security measures.

●Spearheaded the adoption of Infrastructure as Code (IaC) practices, minimizing manual errors and improving infrastructure reliability.

●Reduced AWS costs by 30% in the first year through Reserved Instances purchases and consolidation of excess capacity.

●Contributed to the design and implementation of a cloud-based data lake solution.

●Oversaw AWS resource usage, leading to a 20% cost saving on cloud expenditure.

●Conducted regular reviews and optimizations of AWS resource usage.

●Integrated security best practices into the DevOps pipeline.

●Proficient in documenting software solutions and infrastructure layouts.

●Played a pivotal role in managing production incidents, introducing streamlined incident response.

●Collaborated closely with Project Managers and other teams to ensure effective communication.

●Leveraged Bitbucket for source code management, optimizing branching and merging strategies.

Environment: Amazon Web Services (AWS), WebLogic Application Server 11x/12c, Apache HTTP Server 2.0, Tomcat 6.0, JBoss, Subversion (SVN), Git, GitHub, Red Hat Linux 5.0, Docker, Kubernetes, Maven, UNIX, Linux, Chef, Jenkins, Shell, ANT, Python, SQL, JUnit, Jira, ClearCase.

ValueLabs, Hyderabad, India.

Jan 2016 – Nov 2017

DevOps Engineer

Roles and Responsibilities:

●Proficient in creating and managing Azure infrastructure, including virtual machines, storage, Azure AD, DNS, and networking components.

●Managed and automated Azure infrastructure components, including AD, DNS, VMs, storage.

●Expertise in Azure Cloud Services (PaaS, IaaS, IoT) and GCP services.

●Pioneered the adoption of a cloud-native microservices architecture, improving system scalability and agility on AWS.

●Automated infrastructure provisioning on GCP for PoC project using Terraform and Ansible playbooks.

●Designed and implemented highly scalable and resilient infrastructure as code for the Ping Identity platform on AWS, enhancing overall system reliability.

●Executed strategic automation initiatives using Terraform, reducing manual intervention by 40% in infrastructure provisioning.

●Monitored availability and latency thresholds as per SLOs defined, ensuring rapid incident response.

●Orchestrated observability initiatives with Dynatrace, implementing monitoring solutions for effective performance management and issue resolution on GCP.

●Collaborated in Agile development teams, emphasizing GCP-specific practices and contributing to a culture of continuous improvement.

●Led APM tool evaluation exercises and implemented New Relic and AppDynamics across 4 critical apps, improving uptime by 20%.

●Implemented Tealium for seamless integration of customer data into the AWS Cloud, ensuring a unified and centralized data management approach.

●Developed Power BI reports and analytics dashboards, providing valuable business intelligence and supporting data-driven decision-making.

●Executed Azure migrations, optimizing workloads for cloud environments and ensuring alignment with Azure architecture best practices.

●Implemented and managed CI/CD pipelines in Azure DevOps, contributing to a DevOps culture and accelerating the delivery of software solutions.

●Applied Azure-specific security measures, including Azure Security Center and Azure Policy, to strengthen the overall security posture of cloud resources.

●Utilized Java for the development and optimization of applications on GCP, adhering to coding standards and best practices.

●Spearheaded the adoption of Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to quantify and measure system performance, aligning with the principles of SRE.

●Integrated distributed tracing tools, providing end-to-end visibility into complex microservices architectures.

●Implemented comprehensive monitoring solutions using AppDynamics, resulting in a 25% reduction in incident response times.

●Collaborated with data science teams to deploy and manage machine learning workflows, improving overall model accuracy.

●Established automated DNS services across availability zones, ensuring 99.95% DNS availability.

●Developed robust CI/CD pipelines for Java/Spring-Boot applications, incorporating automated testing and quality checks.

●Championed adoption of Azure DevOps through customized training programs for 1,200 engineers.

●Created shared team dashboards providing real-time visibility into testing status across projects.

●Integrated Azure DevOps with Grafana to correlate infrastructure metrics with application health.

●Integrated academic knowledge into practical problem-solving, applying a systematic and analytical approach to SRE responsibilities.

●Fine-tuned Windows environments for optimal performance, resulting in a 20% reduction in system-related incidents and improved user satisfaction.

●Standardized security controls and DNS configurations on AWS through infrastructure-as-code.

●Integrated AppDynamics alerts into AWS CloudWatch, enhancing proactive monitoring in a scalable cloud environment.

●Utilized AppDynamics alongside Datadog for comprehensive performance monitoring, resulting in a 25% reduction in critical incidents.

●Played a critical role in the integration of AWS Lambda for serverless functionalities, enhancing application scalability and reducing operational overhead.

●Automated build processes using Azure PowerShell, Ansible, Jenkins, and Terraform.

●Integrated container management using Docker, Kubernetes, OpenShift, and GCP.

●Established logging/monitoring using Elasticsearch, Kibana, Prometheus, Grafana.

Environment: GCP, Azure, Maven, Nexus, Terraform, Jenkins CI/CD, Jira, Shell, GIT, Docker, Kubernetes, Splunk, ServiceNow, GitHub, SVN, Bitbucket, Autoscaling, Load Balancers.

Accenture, Hyderabad, India.

March 2014 – Dec 2015

Build and Release Engineer / DevOps Engineer

Roles and Responsibilities:

●Implemented Jenkins for automated builds and deployments with Git support.

●Managed CI/CD using Jenkins, Maven, and Git for UNIX and Windows deployments.

●Automated ticket creation from Azure Monitor alerts using Azure DevOps APIs.

●Orchestrated a major infrastructure modernization project, migrating legacy systems to AWS ECS.

●Initiated proactive monitoring and alerting systems using AWS CloudWatch.

●Completed hands-on labs on infrastructure provisioning on GCP showcasing ability to administer GCP environments.

●Played a key role in setting up secure infrastructure for various projects, ensuring compliance with industry standards and contributing to a robust security posture.

●Configured Jenkins jobs for CI/CD pipelines with plugins like the Job DSL plugin and the Parameterized Trigger plugin.

●Participated actively in Agile development teams, promoting iterative development cycles and incorporating GCP best practices.

●Demonstrated Java proficiency in the development and optimization of applications on GCP, ensuring code quality and efficiency.

●Automated backup and recovery jobs for an 80 TB PostgreSQL database migrated from a private cloud to AWS.

●Demonstrated expertise in CDP tools like Tealium, enabling effective customer data management within the AWS Cloud environment.

●Engineered customer data pipelines on Azure, optimizing data flows and enhancing overall data processing efficiency.

●Created Power BI reports and dashboards, offering stakeholders visualizations that facilitated strategic decision-making.

●Led the design and implementation of a robust log aggregation and analysis system using tools like Splunk and Logstash, enabling proactive identification and resolution of potential issues.

●Applied methodologies to enhance the reliability and scalability of containerized applications, utilizing Kubernetes for orchestration and Prometheus for monitoring.

●Implemented a comprehensive incident post-mortem process, analyzing system failures to extract valuable insights and drive continuous improvement in reliability practices.

●Leveraged Full Stack capabilities to optimize user experiences and overall system functionality.

●Optimized AWS cloud spend by 25% through collaboration with the finance department and implementation of cost-control measures.

●Conducted successful disaster recovery drills, validating the effectiveness of cloud backup and recovery strategies.

●Managed container replicas deployment on Kubernetes clusters, ensuring high availability and scalability of applications.

●Orchestrated deployment, scaling, and management of Docker Containers using Kubernetes.

●Automated container orchestration with Kubernetes and packaged apps in Docker for deployment.

●Utilized Ansible Playbooks for configuration management on OpenStack and AWS.

●Specialized in Windows OS optimization, achieving a 30% reduction in system-related incidents.

●Evangelized Azure DevOps best practices, enhancing collaboration across multiple teams.

●Collaborated with finance to optimize cloud spend, reducing overall costs by 25%.

●Managed successful disaster recovery drills, confirming the effectiveness of cloud backup strategies.

●Developed custom AppDynamics dashboards for actionable insights into application health.

●Integrated ML workflows seamlessly into the DevOps pipeline for optimized model development.

●Collaborated with the security team to integrate AppDynamics with security monitoring tools.

●Developed robust strategies for handling diverse data sets in secure, scalable pipelines.

●Tech stack: Jenkins, Git, Docker, Kubernetes, Ansible, AWS EC2, JBoss, Maven.

Environment: Jenkins, Python, Chef, JIRA, JUnit, Maven, Artifactory, Ansible, Ubuntu, CentOS, AWS EC2, VPC, S3, RDS, IAM, SNS, CloudWatch, CloudTrail, Route 53, Ruby, Kubernetes, Subversion, Git, MySQL, Perl scripts, Shell scripts, Bamboo, PowerShell.

EDUCATION

Masters: WESTERN ILLINOIS UNIVERSITY Computer Science Jan 2018 – Dec 2018

Bachelors: DVR & DR HS MIC COLLEGE OF TECHNOLOGY Computer Science Aug 2010 – Mar 2014

Certifications: Certified Kubernetes Administrator (CKA), Certified CloudBees Jenkins Engineer, Red Hat Certified Engineer (RHCE).


