Varstaff is currently searching for the following:
Specific Work Requirements:
· A minimum of 5 years’ experience working in a large HPC enterprise environment comprising thousands of servers, large storage solutions, tape and tape automation.
· Proficient in the installation, configuration and management of Linux-based operating systems, preferably RHEL, CentOS, or Rocky Linux.
· Experience with IBM’s xCAT distributed computing management software.
· Experience with installation and maintenance of computer hardware, including servers, tape drives, robotic tape libraries, GPGPUs, SSDs, and disk arrays.
· Experience with containerization.
· Knowledge of networking and datacenter technologies: switching, routing, high availability, LAN / WAN / WLAN topologies, and system configuration for Ethernet, InfiniBand, and Fibre Channel SAN.
· Experience with HPC Storage Solutions, for example configuration and operation of HPE ClusterStor systems, NetApp, Dell Isilon, and Pure Storage.
· Ability to write and troubleshoot Bourne, Bash, and C shell scripts, as well as Perl, Python, Ruby, and MRTG scripts.
· Experience with PostgreSQL and database installation and support.
· Experience with Google Cloud Platform and Azure public clouds, including the ability to provision and manage instances, build images, and write installation scripts.
· Experience with configuration tools like Ansible and Terraform.
· Experience with backup and recovery tools such as IBM Spectrum and Dell NetWorker.
· Good knowledge of Linux security, including configuration of endpoint security tools.
· Ability to evaluate HPC system environments and make recommendations for improvement in performance and manageability.
· Ability to investigate, debug and diagnose system level issues.
General Work Requirements:
· Conform to local change management philosophies, including full testing on non-production systems prior to production deployment.
· Effectively communicate all change activities to all affected parties including a clear description of the change, related service outages and possible effects on the different environments we support.
· Ensure SLB IT deployment standards are maintained, with verification through reporting systems.
· Meet KPO requirements for InTouch support processing, including full documentation of problem resolution, creation of knowledge content and best practice items.
· Show a good understanding of computer equipment, and its care and maintenance.
· Work with other internal support groups, systems, networking, programming, desktop support, computer operations, and facilities as required to complete administration functions.
· Work with a variety of vendors in technical environments and in the reporting and investigation of system problems.
· Provide a written weekly status report to the team manager and be prepared to present and discuss this with the team at a weekly status meeting.
· Be prepared to work outside normal hours, as system maintenance must often be performed outside prime time; provide 24/7 support to computer operations; and work with other remote support locations, for example Kuala Lumpur, backing follow-the-sun support.
· Participate in the support on-call schedule, in weekend power outages (normally two per year), and in emergency data center activities.
· Peer-review all major projects, as part of the normal deployment philosophy.
· Ensure compliance with all quality assurance, best practice procedures and QHSE requirements, as defined by job position.
Personal Traits:
· Self-motivated, able to work with minimum direction.
· Able to work as part of a team, either in small groups or as part of the Data Center support team as a whole, in either a lead or reporting role.
· Able to demonstrate good written, phone and face-to-face communication skills when working with a peer group, internal and external customers, and vendors.
· Adhere to industry standard systems administration techniques and procedures.
· Document standard user and operational requirements.
· Willingness to train others.
1SLBJP00001274