Location:
New York, NY
Salary:
55
Posted:
February 26, 2021

Resume:

Ram Kumar

Email: adkijr@r.postjobfree.com Phone: 703-***-****

Site Reliability Engineer

SUMMARY

7+ years of IT experience as a System Administrator and DevOps/Site Reliability Engineer, and in production support of various applications on Red Hat Enterprise Linux, CentOS, Debian, and Ubuntu; skilled with Windows Server 2012 R2/2016 on various hardware platforms, AWS cloud, and OpenStack. Expert knowledge of cloud computing strategies (IaaS, PaaS, SaaS).

Responsible for OpenStack project core infrastructure including code review, continuous integration systems, and developer tools.

Managed Amazon Web Services such as EC2, S3, RDS, EBS, ELB, Auto Scaling, AMI, IAM, VPC, Route 53, Lambda functions, and CloudFormation, using Python/Boto, the AWS CLI and Console, and API integration.

Configured Amazon Virtual Private Cloud (Amazon VPC), created required security groups, network ACLs and firewall rules, AWS Identity and Access Management (IAM) to manage users, groups, roles and policies.

In-depth knowledge of AWS cloud services such as compute, network, storage, and identity and access management. Hands-on experience configuring network architecture on AWS with VPC, subnets, internet gateways, NAT, and route tables.

Hands-on experience writing Ansible playbooks, roles, and inventory files from scratch, sharing roles on Ansible Galaxy, managing roles, reviewing playbook language features, and creating reusable playbooks.

Experience working with source control tools such as SVN, CVS, Bitbucket, and Git. Strong knowledge of SCM concepts such as branches, merges, and tags.

Expertise in migrating and upgrading WebLogic/Tomcat/WebSphere servers, updating JDK versions, applying patches, and installing service packs for WebLogic/Tomcat servers.

Experience working on several Docker components like Docker Engine, Docker Hub, volumes, creating Docker images, Compose, Docker Registry and handling multiple images primarily for middleware installations and domain configurations.

Created a custom Kubernetes cluster from scratch; design and preparation involved creating nodes, networks, and security models, preparing certificates and credentials, setting up Docker, bootstrapping the cluster (API server, controller manager, scheduler), and starting cluster services.

Created pods for new applications and controlled pod scaling, troubleshot pods through SSH and logs, and wrote and modified BuildConfigs, templates, image streams, etc.

Expertise in writing scripts in ksh, Bash, and Python for connecting to databases and applications, backups, and scheduling. Expert in setting up SSH, SCP, and SFTP connectivity between Linux hosts.

Experienced in DNS, NFS, CIFS, FTP, NIS, Samba Server, Tomcat & Apache servers, LDAP, remote access, security management, and system troubleshooting skills.

Good understanding of major professional audit frameworks and standards (NIST, ISO 27001, ITIL, COBIT, PCI-DSS, etc.)

Proficient in configuring Kickstart servers to initiate installation of Red Hat Linux on several machines at once.

Experience in Storage, Disk Management, Logical Volume Management (LVM) and logical partitioning and maintaining file systems and creating NFS.

Expert in monitoring and log analysis with CloudTrail to monitor resources such as EC2 CPU and memory, Amazon RDS DB services, and EBS volumes, and to configure alarms.

Involved in the SOX auditing process, evaluating internal control measures and procedures and reporting on areas of noncompliance.

Monitored server performance, network, CPU, memory, and health checks using Nagios, Splunk, and Sensu.

Involved in setting up JIRA as the defect tracking system and configured various workflows.

Assisted with the design of core scripts to automate Splunk maintenance and alerting tasks. Supported Splunk on UNIX, Linux, and Windows-based platforms. Assisted with automation of processes and procedures.

EDUCATION

Bachelor’s Degree: EEE, 2013, JNTUH, Hyderabad

TECHNICAL SKILLS

Operating Systems

RHEL/CentOS, Ubuntu/Debian/Fedora, Windows Server

Build/Automation Tools

Ansible, Puppet, Ant, Maven, Jenkins, Hudson & Bamboo

Languages

Shell, Bash, Ruby and Python scripting

RDBMS Databases

MySQL, PostgreSQL, SQL Server

NoSQL Databases

Cassandra, MongoDB, Redis, RabbitMQ, Elasticsearch

Web/App Server

Apache, IIS, IHS, Tomcat, WebSphere Application Server, JBoss

Bug Tracking Tools

JIRA, Fisheye, Crucible, Rally, Remedy, HP Quality Center.

Version Control Tools

GIT, SVN, Bitbucket

Web Technologies

JDBC, JSP, JavaScript, Java/J2EE, Python

Cloud technologies

AWS EC2, VPC, EBS, AMI, SNS, RDS, Aurora, Redshift, CloudWatch, CloudFormation, AWS Config, S3, Lambda, CloudTrail, IAM, VMware.

Monitoring Tools

Kibana, Sensu, Grafana dashboards, Splunk, Nagios, PagerDuty, VictorOps

PROFESSIONAL EXPERIENCE:

Domino’s Pizza (Ann Arbor, MI) Oct 2019 – Present

Role: Site Reliability Engineer

Resolved build and deployment issues. Successfully delivered all major builds as expected. Provided strong support for STAGE and PROD deployments.

Provided centralized software configuration management for enterprise application projects in a multi-tiered high-availability environment. Created and maintained Shell and Python scripts for building applications.

Worked closely with developers, project managers, and product owners to set up the roadmap and resolve issues related to merging Subversion code.

Extensive experience installing, configuring, and administering the Jenkins CI tool on Linux machines. Used Jenkins Pipeline to drive all microservices builds out to the Docker registry and then deploy them to Kubernetes; created pods and managed them using Kubernetes.

Release planning and coordination: worked with other members of the assigned Value Stream to ensure that production releases for in-scope applications/platforms were properly planned and coordinated. This included holding change/release implementation reviews to ensure thorough and appropriate implementation plans, and providing review and sign-off/approval of change tickets for the assigned Value Stream.

Implemented a production-ready, load-balanced, highly available, fault-tolerant Kubernetes infrastructure.

Managed Kubernetes charts using Helm. Created reproducible builds of the Kubernetes applications, managed Kubernetes manifest files, and managed releases of Helm packages.

Created a continuous delivery process that supports building Docker images and publishing them to a private repository.

Used the CI/CD tools Jenkins, Git/GitLab, Jira, and the Docker registry/daemon, with Ansible for configuration management and automation.

Built and maintained on-prem Docker container clusters managed by Kubernetes, using Linux, Bash, Git, and Docker. Utilized Kubernetes and Docker as the runtime environment of the CI/CD system to build, test, and deploy.

Automated the installation of relational database clusters (PostgreSQL, MySQL) in on-premise environments by developing Ansible playbooks.

Updated certs, created pools, iRules, and VIPs, and onboarded new applications with their endpoints.

Developed a Jenkins pipeline to build VMs, with downstream jobs passed parameters to run Terraform, shell scripts, and Ansible, which in turn ran the Puppet bootstrap on the vSphere client. Created DNS entries using PowerShell.

Updated properties of new certs in Akamai, moving them to staging and production after validation.

Set up Nginx and Apache for onboarding new web applications. Set up log rotation modules in Puppet.

Involved in the complete Puppet migration from version 2.7 to the newer version 6.8, carried out in stages. Set up the master-of-masters configuration in the environment.

Ultimate Software Inc – San Francisco, CA April 2018 – Sep 2019

Role: Site Reliability Engineer

Created Lambda deployments and configured them to receive events from an S3 bucket. Used AWS Lambda to execute code in response to triggers such as changes in data, shifts in system state, or user actions.

Mitigated production performance issues by taking responsibility for seeing them through to resolution, with the goal of automating to prevent recurrence. Participated in on-call/on-demand network, server, and database support and handled maintenance windows.

Understanding of EC2 storage and S3 storage, EC2 networking, and deploying an EC2 Linux instance.

Added an Elastic IP to a Linux instance, allowed network access, and connected to the Linux instance using SSH.

Problem management: performed post-incident reviews of all major incidents, determined the action items required to avoid similar issues and minimize downtime in future incidents, and handled RCA reviews.

Acted as Operations representative in Value Stream planning and prioritization sessions to ensure that the operational needs of assigned applications/platforms were addressed, and held quarterly Operational Performance Reviews with Value Stream management.

Created IAM roles for the Kubernetes cloud setup, along with Kubernetes deployments, StatefulSets, network policies, etc. Created the Kubernetes dashboard, network policies, and metrics and monitoring reports using Prometheus and Grafana dashboards.

Configured Sensu monitoring across all data centers with the Ansible configuration management tool and Grafana dashboards.

Designed and implemented a CI (Continuous Integration) system: configured Jenkins servers and Jenkins nodes, created the required scripts (Perl and Python), and created and configured VMs (Windows/Linux).

Represented the Value Stream in Change Advisory Board meetings. Participated in Program Increment Planning sessions as a liaison for Operations and Infrastructure support. Provided information regarding upcoming critical changes to the Value Stream.

Performed monthly capacity analysis of applications/platforms within the Value Stream. Created and maintained operationally focused ELK dashboards for the Value Stream.

Renewed SSL certs before expiry and validated certs authorized by a certificate authority. Addressed problem tickets across all D2P application environments.

Set up iptables rules for the new RabbitMQ cluster before setting up policies, queue mirroring, vhosts, exchanges, DLQs, and routing keys with Puppet.

Set up dev environments with Vagrant and Docker. Integrated GitHub version control with Jenkins boxes (master and slaves) to run tests such as Frisby tests, UIAeon tests, and Karma tests on Dev and QA team pull requests.

Worked on Docker and Kubernetes on cloud providers, from helping developers build and containerize their applications (CI/CD) to deploying on either public or private cloud.

Good understanding of service level terminology: the concepts of SLIs, SLOs, and SLAs, through which SREs track the availability, latency, and throughput of products.

Configured MySQL Percona master-slave replication. Migrated the databases to a new data center with minimal downtime, taking all precautions (firewalls and DNS) in the new data center.

Wrote shell scripts for nightly production backups and integrated them with TeamCity agents (compatible with Python Fabric). Configured pipelines for production deploys with TeamCity agents.

Experience customizing JIRA projects with various schemas, complex workflows, screen schemes, permission schemes, notification schemes, etc.

Configured Concourse pipelines to deploy Sensu monitoring across the production data center.

Anthem Inc – Atlanta, GA Oct 2017 – March 2018

Role: DevOps Engineer

Established, maintained, and configured secure communication using SSL certificate generation and exchange; revised and modified as necessary to ensure a secure network environment.

Hands-on experience writing Ansible playbooks, roles, and inventory files from scratch, sharing roles on Ansible Galaxy, managing roles, and creating reusable playbooks.

Involved in the migration of middleware applications from data centers. This activity involved coordination between different infrastructure teams such as the Site Services and Load Balancing teams.

Worked extensively with Ansible and maintained a fully immutable server architecture and design, which includes updating and patching servers by spinning up an exact replica of a server containing the upgrades and security patches for the current environment, and verifying that the updated packages do not break or disrupt services.

Automated system administration tasks on Linux servers using Ansible. Configured OpenLDAP server and client Ansible playbooks for centralized login on Linux VMs.

Developed Ansible playbooks to test sharding in MongoDB clusters through dynamic configuration of the Mongo cluster. Implemented a continuous delivery framework using Jenkins, Ansible, Maven, and Oracle in a Linux environment.

Experience installing, configuring, and maintaining DNS systems using BIND and Route 53 (AWS).

Supported code builds by integrating with the continuous integration tool (Jenkins) and wrote shell scripts for end-to-end build and deployment automation. Built artifacts using Maven/Ant, created pom.xml files, and pushed artifacts to Nexus and JFrog Artifactory.

Designed highly available, cost-effective, and fault-tolerant systems using multiple EC2 instances, Auto Scaling, Elastic Load Balancing, AMIs, and Glacier for QA and UAT environments as well as infrastructure servers for Git and Chef.

Created scripts for backing up and restoring GitHub repositories. Implemented a microservices architecture to convert a heavy monolithic application into smaller applications.

Worked extensively with Jenkins and TeamCity for continuous integration (CI) and end-to-end automation of all builds and deployments.

Worked with third-party application, hosting, and CDN providers to integrate data feeds into a centralized Splunk platform. Managed Splunk user accounts (create, delete, modify, etc.).

Implemented new JIRA workflows for the QA teams and worked on splitting the JIRA server's configuration. Technical competency in third-party product integration with JIRA 7.2.

Created a JIRA workflow and applied conditions, validators, and post-functions to its transitions to represent the business requirement process.

Assisted internal users of Splunk in designing and maintaining production-quality dashboards.

Designed AWS CloudFormation templates to create custom-sized VPCs, subnets, and NAT to ensure successful deployment of web applications and database templates. Expertise in architecting secure VPC solutions in AWS with network ACLs, security groups, and public and private network configurations.

Worked on AWS Auto Scaling to provide high availability of applications and EC2 instances based on application load, using CloudWatch.

Virtualized servers using Docker for test and dev environment needs, configured automation using Docker containers, and implemented several Tomcat instances using Docker Engine to run several containerized application servers.

Worked on installing Docker and creating images using a Dockerfile. Worked on Docker container snapshots, removing images, and managing Docker volumes. Experienced in building and maintaining Docker and Vagrant infrastructure in an agile environment.

Used the Kubernetes control plane and created API objects to maintain clusters in their desired state or modify them, and ran applications on them. Set the number of replicas, container images, network, and resources, typically via the CLI.

Involved in the migration and implementation of multiple applications from on-premise to cloud using AWS services such as SMS, DMS, CloudFormation, S3, Route 53, Glacier, EC2, RDS, SQS, SNS, Lambda, and VPC.

Developed CloudFormation templates to automate EC2 instances. Designed user credentials and profiles using AWS IAM.

Installed, configured, and administered Bitbucket for version control and migrated the existing GitLab server to Bitbucket on RHEL 7. Managed the entire development workflow within Bitbucket, from code to deployment.

Utilized Nagios-based open source monitoring tools to monitor Linux cluster nodes configured using Red Hat Cluster Suite.

Learn soft Technology Group – CA Sep 2015 – Sep 2017

Role: DevOps Engineer

Expertise in the principles and best practices of Software Configuration Management (SCM) in Agile, Scrum, and Waterfall methodologies.

Hands-on experience with CI/CD pipelines and a strong background in build and release management and cloud implementation, suited to the needs of an environment under a DevOps culture.

Wrote custom Puppet modules for managing the full application stack (Tomcat/httpd/MySQL/Java) and implemented GitLab for version control of Puppet modules and process documentation.

Worked on Gerrit to create users, upload Git repositories, edit project configuration through Git, and manage project access.

Deployments from GIT to Cassandra via Bamboo and JSNodes, with full auditing and user authentication and authorization provided by the LDAP.

Configured Bamboo Remote agent on Windows Platform to perform .Net Applications Builds. Performed Remote Deployments from Bamboo Remote Agent to different IIS Environments using MS Deploy.

Configured the Bamboo Artifactory plugin to upload artifacts to Artifactory after a successful build. Involved in implementing the Atlassian tool suite (Jira, Bamboo).

Implemented and set up a continuous project build and deployment delivery process using two toolchains: Subversion, Bamboo, and UrbanCode Deploy; and Subversion, Jenkins, and UrbanCode Deploy.

Resolved update, merge, and password authentication issues in Bamboo and Jira. Configured source code management tools with Bamboo and executed triggers in Git.

Worked on Pipelines and aligned with the branch structure, making it easier to work with branching workflows like feature branching or git-flow.

Gained experience with modern software engineering tools such as Git version control, Jira issue tracking, Gerrit code review, and Hudson continuous integration.

Well versed in creating tickets with Oracle and with internal partnering teams to address issues within the stipulated timelines.

Worked on NetScaler load balancers, creating and configuring new VIPs, SSL certs and bridges, context switching, virtual servers, and service groups.

Good experience with various Jira plugins such as Jira Client, the Jira importer plugin, the Jira Charting Plugin, the connector for Microsoft Project, and Jira Misc Custom Fields.

Experience writing and maintaining scripts, maintaining Linux servers and firewalls, and performing software upgrades on Juniper routers and switches.

Experience writing Jira API tools to auto-move Service Desk tickets of one issue type to a Jira project of another issue type and to extract the list of Jira users with their respective Jira groups and project roles.

Analyzed, evaluated, and documented application performance by implementing AppDynamics in production, which directly led to efficiency gains and process improvements.

Used Dynatrace to monitor server metrics and performed in-depth analysis to isolate points of failure in the application.

Impegno software solutions – Hyderabad, India Dec 2013 – Aug 2015

Role: Systems and VMware Administrator

Responsible for Active Directory, GPOs, and domain users: administered users and groups and granted appropriate permissions and privileges to access the LAN and domain environment.

Installed and configured networks, routers, and wireless access points/routers with security, TCP/IP, VPN, content filtering, access control lists on routers/switches, VLANs (port mapping, naming, etc.), and IP routing in LAN/WAN and wireless networks.

Monitored virtual infrastructure using a DRS/HA cluster, which pools (and, to some degree, load-balances) CPU and memory from all ESXi servers in the cluster. Once hosts were placed in the cluster, monitored cluster memory and CPU utilization.

Used VMware vMotion to eliminate application downtime from planned server maintenance by migrating running virtual machines between hosts.

Created templates to deploy and clone multiple virtual machines using the VMware virtual client and migrated machines between hosts with HA and DRS. Installed and configured VMware ESXi 3.x and 4.x servers and applied security patches to ESXi servers.

Participated in regular 24x7 on-call rotations and coordinated with the offshore team for night-time scheduled activities.

Installed, configured, and administered Red Hat Enterprise Linux 5.x and 6.x, and maintained Samba, NFS, HTTP, NGINX, and FTP on Linux for accessing and sharing files from the Windows environment.

Experience installing multiple Linux servers using Kickstart and custom build scripts for Red Hat Enterprise Linux and CentOS. Experience managing HP BladeSystem c7000 hardware using the iLO console.

Created fence devices and failover domains in the cluster and performed flipover/failover tests between the nodes in the clusters.

Created filesystems using Red Hat volume manager, performed regular health checks on all Linux servers, added storage to the cluster disks, and managed filesystem sizes in RHEL.

Scanned newly assigned LUNs on the servers, assigned them to the respective volume groups, and extended the filesystems using Red Hat volume manager.

Created link aggregation (LACP) with VLAN tunneling using Virtual Connect (VC), and shared uplink sets (SUS) using LACP and VLAN tagging.


