Data Center Operational Support

Location:
Santa Clara, CA, 95050
Posted:
November 09, 2023

Resume:

Michael Kirkley

Cell: 510-***-**** ad0zh9@r.postjobfree.com

SUMMARY

Experienced in troubleshooting on a distributed international network, with Level 1 experience in systems, network, storage, and backup administration. I have handled remote hardware troubleshooting and have extensive remote administration experience in locations from London to Tokyo. Also experienced in troubleshooting automation issues and writing automation scripts.

TECHNICAL SKILLS

Operating systems: FreeBSD, Solaris, Windows 2000, Linux (RHEL & Ubuntu), Windows, Mac OS

Software packages: Big Brother, CA Unicenter, Veritas NetBackup, MS Office, BladeLogic, Rundeck, Chef, Jenkins, Looper, Docker, Maven, OpenStack, Splunk, Tomcat, Atlassian, Bright Cluster Manager, Slurm

Networking: Foundry, Cisco

Hardware: ADIC tape library, terminal server, Brocade switches, EMC Storage, Hitachi Storage

Scripting (UNIX): Perl, KSH, bash, Ruby

Medical: HL7, Carecast

PROFESSIONAL EXPERIENCE

Excell and Intel Apr 2022 – Present

Lab Datacenter Engineer

Imaging new PCs for use with testing equipment

Troubleshooting hardware (HP, Dell, Supermicro), OS (Windows Server, Ubuntu, Red Hat/CentOS), and networking (Cisco, Ontapp) issues in a laboratory environment

Develop new tools for troubleshooting and monitoring

Installation and initial configuration of servers for testing

Diverse Lynx and Micron Ltd. Oct 2021 – Mar 2022

IT Administrator

Imaged new PCs and laptops for users

Repaired faulty computing devices

Inventory management of IT resources

General PC troubleshooting

Some Microsoft AD administration

Ultimate Staffing & Oracle June 2020 – May 2021

(as temp for Ultimate Staffing)

System and Data Center Admin

Documented server room configurations ahead of the server room move

Hardware and software troubleshooting of managed systems (Oracle Linux and Solaris)

Hardware fixes, RMA and repairs to Oracle and HP systems

Basic Atlassian administration (Jira, Confluence & Bitbucket)

Physically moved 2 server rooms between counties

Un-wired and re-wired a 300-node Slurm cluster connected via Mellanox / InfiniBand for the move

Re-wired EDR and FDR fiber for networked storage after the move

Managed tape backup and vaulting

Documented system procedures

Installed and maintained custom Mac mini servers running macOS Mojave for the development team's test cluster

Troubleshot jump boxes running Windows 7 and 10, both remotely and in person

ZT Systems & Microsoft October 2019 – February 2020

(as temp for Tek Systems)

Data Center Technician

Replaced hardware for large storage and computing clusters, including unpacking, inventory, and return prep

Inventoried on-site assets

Splunk, San Francisco CA June 2018 – August 2019

(as temp for Tek Systems)

Cloud Network Operations Engineer

Monitored cluster health of AWS instances for multiple customer sites

Troubleshot Splunk application issues and OS-level (Ubuntu) issues in AWS

Helped document new troubleshooting, high-priority escalation, and disaster recovery procedures for the CNOC

Wrote automation in Ansible and Puppet for troubleshooting issues, and updated older automation to handle new configurations and problems

Helped plan new automation tools based on Python

Walmart.com, Brisbane, CA September 2005 – August 2017

(Sept 05 – Jan 06 as temp through Ajilon Consulting)

Senior Operations Engineer, Sep 2005 – Nov 2013

Monitored system health of e-commerce site

Monitored and troubleshot backups

Administration and migration of AlarmPoint server cluster

Ran code updates on Tomcat-based web servers

Coordinated maintenance and system repairs

Troubleshot network and system issues as well as code rollout and automation. Common system issues involved runaway processes, disk-space issues, and general alarms in alert logs

Installed new Solaris-, RHEL-, and Windows-based servers

Maintained Production Control during site code release and bug fixes

Acted in advisory role to project management regarding possible site impact of upcoming projects and ways to mitigate impact

Led multi-day site upgrades that encompassed all portions of the Walmart global e-commerce site for a given country

Led multi-team troubleshooting ‘war rooms’ for site-critical issues

Helped write troubleshooting documents for use by new hires

Trained new hires and offshore NOC engineers

Remotely managed Windows jump boxes used for backend access

Automation Engineer, Nov 2013 – Aug 2017

Maintained and administered BladeLogic automation solution

Wrote new scripts in Bash and troubleshot legacy Perl scripts

Helped other teams develop Ruby scripts in our environment

Worked with other engineering teams to provide automation solutions

Kept abreast of various automation solutions and provided insight on usage and feasibility within the Walmart.com environment

Used Chef, Puppet, Jenkins, Maven, Docker, OpenStack, and Rundeck to automate and control system configuration, FIM, and application controls

Vargas and Esquivel Construction, San Francisco, CA September 2005

(Short term contract)

IT Contractor

File maintenance

Corrected backup issue

UCSF Medical Center, San Francisco CA July 2005 – August 2005

(Temp through Kelly IT Services)

IT Engineer

Monitored system status of new install of a Carecast database

Troubleshot Carecast database

Documented solutions to recurring database issues

Ross Dress for Less, Newark, CA May 2005 – June 2005

(Temp through Data Systems Search Consultants)

NOC Operations

Monitored system stability and running processes

Acted as primary helpdesk on off-shift hours

Monitored environment in IDC

Monitored and swapped tapes for nightly and weekly backup processes

Savvis Communications, Santa Clara, CA

Cable and Wireless, San Jose, CA

Storage Way Inc., Fremont, CA Dec 2000 – Sept 2004

NOC Operations / Customer Support

Storage Way was acquired by Cable & Wireless in 2002 and by Savvis Communications in 2004

Maintained networked storage and backup systems in remote data centers

Involved in the configuration of Cisco and Foundry Network equipment

Administration of Sun / Solaris Enterprise class machines

Configuration and troubleshooting of Brocade fiber switches and Power Distribution Units

Knowledge and troubleshooting of Veritas NetBackup software

Used CA Unicenter and Big Brother monitoring software; configured and used a terminal server for dial-up connectivity to the remote data center when the primary connection was down

Provided emergency troubleshooting on all remote equipment and assistance to customers' IT staff.