System Administrator High Performance

Location:
Richmond, TX
Posted:
January 30, 2024

Resume:

David Ramirez

Senior HPC System Administrator

RHCSA

ad28jn@r.postjobfree.com

Cell: +1-979-***-****

https://www.linkedin.com/in/dramirezmolina

https://info.davidramirez.net

https://www.thelinuxwiki.net

Houston, Texas

Certification

Red Hat Certified System Administrator

Languages

Bilingual Spanish/English, fluent German

Technical skills

High Performance Computing (HPC): cluster provisioning, benchmarking and administration (Bright Cluster Manager, xCAT), schedulers (Slurm, PBS/Torque, SGE/UGE), monitoring (Ganglia, Nagios, Icinga), high-performance disk subsystems (Lustre, BeeGFS), GPU/CUDA, InfiniBand, OS patching, kernel tuning, troubleshooting. NFS, SMB file sharing.

OS: Red Hat Enterprise Linux (RHEL) / Oracle / CentOS / Rocky / AlmaLinux / Fedora / Debian / Ubuntu / Windows

Software systems: Virtualization (VMware vSphere, KVM, VirtualBox), Docker, Kubernetes, Supervisor, Kickstart, NVIDIA toolset, AMD, Intel oneAPI, modules.

Shell scripting (bash), Python, Perl, awk, Linux toolset

Configuration Management (Ansible, CFEngine, SaltStack)

Version Control (Git, Subversion)

Backup solutions (BRU)

Identity Management (IPA, FreeIPA)

Web server: LAMP stack, HAProxy

Vulnerability Management: Rapid7/InsightVM

Supervision and monitoring, HPC performance metrics (XDMoD, Altair Mistral, Breeze), checkmk, Nagios

Databases: MySQL / MariaDB / Elasticsearch

Documentation (Atlassian Confluence, MediaWiki, ITGlue)

Infrastructure: Physical and VM server life-cycle (includes hardware, maintenance, commissioning), storage, networking, power.

Experience

TotalEnergies

Advanced Technical Computing – 2023

Houston, TX

Co-administrator of a multi-cluster HPC facility supporting research and development projects in Oil, Gas and Wind simulations and models. Evaluation of new applications and technologies, proof-of-concept development, benchmarks, systems life cycle, user support and documentation.

Dell Technologies

HPC and AI Innovations Laboratory

Round Rock, TX

Principal Systems Development Engineer – 2022

Development, integration, validation, design and benchmarking of HPC prototype clusters, as well as technical documentation and collateral. Physical build-up and commissioning, software builds and modules for systems and application environments. Performed comparative benchmarking among connectivity fabrics, escalation analysis.

High Performance Computing Solutions – X-ISS

Houston, TX

Systems Analyst (HPC) - 2021

Member of the technical staff in charge of supporting multiple client HPC facilities. Cluster management, troubleshooting and supervision using tools such as Bright Cluster Manager and xCAT, schedulers such as Slurm and PBS, and monitoring tools such as Nagios, Ganglia and Zabbix. Introductory setup and usage of AWS HPC provisioning environments such as SOCA. Deployed and configured system and application software such as Ansys (RSM, Fluent, Dyna, Optistruct) on HPC cluster systems across several client facilities.

U.S. Expediters – CPAP.COM

Stafford, TX

Linux Systems Administrator, 2018-2021

Member of the infrastructure staff in charge of supporting a high availability medical e-commerce platform, with extensive in-house development.

Administrative operation of a VMware platform for a 200+ VM Red Hat / CentOS / SUSE Linux farm, including VM lifecycle operations and interaction with MS Windows Server components and 400+ endpoint stations.

Supported software includes multiple critical web servers (Apache, memcached, HAProxy, SSL), database (MySQL), file (NFS), image processing and monitoring (CheckMK/Nagios) services, running in-house PHP and JavaScript applications, CMS (WordPress, MediaWiki), back-office, VoIP telephony, and warehouse workflow (handheld scanners, scales, printers, shipping) operations under HIPAA compliance.

Primary owner of an extensive structured documentation initiative using Atlassian Confluence. Co-administered Atlassian systems such as Bitbucket, Jira, and Confluence, and e-mail systems such as Zimbra.

Configuration management using SaltStack. Network monitoring and supervision using CheckMK.

Continuous vulnerability assessment and remediation (Rapid7/InsightVM), and systems patching / hardening.

Familiar with Agile, EOS, and Scrum methodologies, and with DevOps infrastructure support.

Installation, maintenance and support of Moodle LMS for call center staff, including trainer support, back-end maintenance, backup and upgrades.

Supporting role in a new technologies project for critical application migration to AWS, including Docker and Kubernetes.

Texas A&M University (TEES) - Parasol Laboratory

College Station, TX

System Administrator - Analyst, 2010-2018

Responsible for the management of the Parasol Laboratory research infrastructure. Support for internal and external research collaborators and external dissemination of research products; ensuring quality of service of the computing platform.

System administration for high-performance parallel and heterogeneous systems, including a Cray XE6m Supercomputer (NetApp Lustre Storage), GPU/CUDA HPC servers (Supermicro, IBM), Dell servers supporting KVM and VMware virtualization, Rocks HPC clusters, and 100+ Linux VMs and workstations (130+ internal and remote users), and their customization for the laboratory's research projects.

Multiple OS (RHEL 5, 6, 7, 8, CentOS, Fedora, SUSE), MySQL / MariaDB, and PHP platform, server, and application migrations, including provisioning, configuration, monitoring, configuration automation, optimization, and troubleshooting.

Identity (FreeIPA, IPA, LDAP), inventory, change and configuration management (CFEngine); hardware and software integration, web mastering (3 major LAMP websites, various intranet servers), MySQL and PostgreSQL database servers, content management server, repositories, revision control (Subversion, Git), backups (BRU), monitoring (Ganglia), security (patching and hardening), and overall system administration, ensuring compliance with the State of Texas IT regulations and audits. Extensive documentation supporting staff and end users (MediaWiki).

PVAMU ITS - Sungard Higher Education

Prairie View, TX

IT Support Specialist, 2009 (Internship)

Internship as a Master's student.

Second-level help desk for an 8,000+ student campus (account handling). MS Windows Server AD administration. Data center system deployment, server security analysis, virtualization prototyping and testing.

Education

Prairie View A&M University, Prairie View, Texas

M.Sc. Computer Information Systems, 2009

University of Los Andes, Bogota, Colombia

B.S. Electrical Engineering, 1984

Professional Development and Activities

Prairie View A&M University, Prairie View, Texas

Computer Science Department

Industry Advisory Council Committee – Member (2012-present)

University of Southern California, Los Angeles, California

Collaboratory for Advanced Computing & Simulations

Workshop on Computational Science – 2009

Implemented and deployed a computer cluster and performed parallel computing experiments (grant from DoE).

University of Saarland, Saarbrücken, Germany

Institute for Informatics and Applied Mathematics

Graduate Internship - Information Technology, 1988-89

(Sponsored by the German Ministry for Economic Cooperation and the Carl Duisberg Society / InWEnt).

Implemented internodal communications at the system level in a distributed architecture computer.

Work permit

U.S. Citizen
