Data Analyst Engineer

Location:

Santa Clara, CA

Posted:

December 20, 2020


Resume:

Prabhakar Perumalsamy Shanmugavel

Mobile : +1-669-***-****

Mail ID : *************@******.***

https://www.linkedin.com/in/prabhakar-shanmugavel-bb4843151/

Santa Clara, CA

Professional Profile

A highly motivated Linux System Administrator responsible for the availability of servers in one of the largest server infrastructures in the world. Apart from Linux administration and engineering, I also bring analytical skills through extensive use of Splunk, which allows me to act as a data analyst on the team.

Linux - Build, monitoring, maintenance, and troubleshooting of Linux servers and their associated storage across 40 datacenters worldwide, on ~30,000 hardware units.

Storage : Day-to-day administration of NetApp SAN and FAS (E-Series and filers) and Fusion-io card storage associated with DB servers.

Python - Automating day-to-day activities and server analysis using various REST APIs.

Splunk - Strong in SPL commands, field extractions with regex, alert and report creation, lookup tables, Splunk dashboards using forms, and HTML5 and CSS usage in Splunk dashboards.

Load Balancers : Managing server load using Citrix NetScalers and DLBs.

Puppet/Ansible - Troubleshooting errors raised from Puppet runs. Writing playbooks using Ansible for configuration management.

Others : Verdad, Regex, IPMI, SSH, InfiniBand, Mellanox, flash card storage, Confluence, Git, ServiceNow, Slack, HipChat, extraction of JSON, YAML, CSV, REST API

Educational Background

Ø B.E. in Electrical and Electronics Engineering with Distinction

Certifications

Ø Red Hat Certified Engineer (2014)

Ø Red Hat Certified System Administrator (2014)

Ø Splunk Core Certified User (2019)

Work Description:

Company : Apple Inc. (Independent Contractor-TCS)

Project : iCloud and iTunes structured Engineering.

Role : System Administrator/Engineer.

Period : Oct 2011 – present.

Major Projects and Roles:

v Configuring servers with Red Hat using Kickstart, with OEL 6 and 7, across multiple datacenters on hardware from various manufacturers such as HP, ZT, Tyan, IBM, and Supermicro, throughout the year.

v Cluster management and setup, from standalone servers to multiple servers.

v Involved in all kernel and OEL upgrades.

v Installation of Oracle, Postgres, MySQL, and MongoDB on servers based on the DB team’s requirements.

v Creating and managing VIPs on these servers for load balancing using Citrix NetScaler and DLBs.

v Installation of various packages based on DB and app team requirements.

v Managing private network connections between paired servers using Ethernet and InfiniBand configuration.

v Created multiple dashboards for identifying versions of kernel, OS, and Python, hardware errors, Puppet errors, the monitoring system, and datacenter and asset inventories.

v Developed a tool that identifies system reboots without logging in to the server, along with CPU, load, memory, and disk utilization statistics at the time of reboot, in order to identify the root cause using monitoring metrics.

v Developed a tool that identifies the oldest servers for each SRE, along with their configuration and manufacturer information.

v Enabled logging for multiple scripts, using cron to run them daily, and collected the system data for analysis.

v Completely reworked the alert architecture to meet requirements at Apple through lateral solutions using Splunk and Python, avoiding approximately 40,000 alerts per year.

v Identified SEL fullness issues and developed a script to clear the logs, which in turn cut hundreds of alerts per quarter.

v Built ~1,000 servers in two days for a major requirement using Kickstart, together with a few engineers.

v Identified genuine hardware issues and altered the monitoring system by combining Python scripts and Splunk queries.

v Acting as the DRI for all monitoring tools.

v Identified devices with a single power supply across all datacenters and SREs in order to pinpoint high-risk areas.

Responsibilities:

v Troubleshooting hardware issues on hardware from vendors such as HP, ZT, Supermicro, Tyan, and Quanta.

v Building servers with Linux OS in a fast-paced environment using Kickstart methods.

v Package management via RPM and YUM, mostly administered through Puppet.

v Maintaining private network connections between critical servers through InfiniBand and Ethernet on more than 12k servers.

v Planning hardware locations in cabinets across various datacenters.

v Analysis of server failure rates by specification and manufacturer.

v Using Python effectively for analysis of drives, servers, and incidents using various REST APIs.

v Firmware upgrades on drives, BIOS, etc.

v Monitoring CPU, load, memory, mailq, ping, ulimit, hardware, Puppet, storage, and network issues and fixing them.

v Monitoring hardware and working with vendors to resolve hardware issues.

v Working on complex incidents generated from ServiceNow on a day-to-day basis along with other engineers.

v Coordinating with the database team to keep the availability of database servers at 100%.

v Proactive failure identification, verification, tracking, and notification to ensure maximum application and infrastructure uptime.

v Creating alerts, reports, and dashboards effectively using Splunk.

v Creating logs and monitoring various inventories using Splunk.

v Providing solutions to complex problems by collaborating with multiple teams.

v Coordinating with datacenter and app teams to keep server availability above 98%.

v Involved in multiple projects with various app teams to build servers within deadlines.

v Effective use of IPMI and CLIs such as hpacucli, megacli, storcli, and smartcli to identify hardware failures.

v Provided solutions to various issues in a lateral way.

v Completely reworked the complex alert architecture at Apple through lateral solutions, and continue to maintain it.

v Passing feedback to vendors on a quarterly basis.

v Identifying and building a new server immediately in critical situations to avoid outages.

v Providing on-call support throughout the year on a rotation basis.
