System Engineer

Company:
Aquent
Location:
Houston, TX
Posted:
June 29, 2025

Description:

No C2C, No Visa Sponsorship.

Must be on site.

Responsibilities: Responsible for the hybrid on-premises and cloud HPC environment supporting proprietary software and the Geosolutions, production, and development groups.

Specific Work Requirements:

· A minimum of 5 years’ experience working in a large HPC enterprise environment comprising thousands of servers, large storage solutions, tape, and tape automation.

· Proficient in the installation, configuration, and management of Linux-based operating systems, preferably RHEL, CentOS, or Rocky Linux.

· Experience with IBM’s xCAT distributed computing management software.

· Experience with installation and maintenance of computer hardware, including servers, tape drives, robotic tape libraries, GPGPUs, SSDs, and disk arrays.

· Experience with containerization.

· Knowledge of networking and datacenter technologies, including switching, routing, high availability, LAN / WAN / WLAN topologies, and system configuration for Ethernet, InfiniBand, and Fibre Channel SAN.

· Experience with HPC Storage Solutions, for example configuration and operation of HPE ClusterStor systems, NetApp, Dell Isilon, and Pure Storage.

· Ability to write and troubleshoot Bourne shell, Bash, C shell, Perl, Python, Ruby, and MRTG scripts.

· Experience with PostgreSQL and database installation and support.

· Experience with Google Cloud Platform and Azure public clouds, including the ability to provision and manage instances, build images, and write installation scripts.

· Experience with configuration tools like Ansible and Terraform.

· Experience with backup and recovery tools such as IBM Spectrum and Dell NetWorker.

· Good knowledge of Linux security, including configuration of endpoint security tools.

· Ability to evaluate HPC system environments and make recommendations for improvement in performance and manageability.

· Ability to investigate, debug and diagnose system level issues.

General Work Requirements:

· Conform to local change management philosophies, including full testing on non-production systems prior to production deployment.

· Effectively communicate all change activities to all affected parties, including a clear description of the change, related service outages, and possible effects on the different environments we support.

· Ensure IT deployment standards are maintained, with verification through reporting systems.

· Meet KPO requirements for InTouch support processing, including full documentation of problem resolution, creation of knowledge content and best practice items.

· Show a good understanding of computer equipment and its care and maintenance.

· Work with other internal support groups (systems, networking, programming, desktop support, computer operations, and facilities) as required to complete administration functions.

· Work with a variety of vendors in technical environments and in the reporting and investigation of system problems.

· Provide a written weekly status report to the team manager and be prepared to present and discuss this with the team at a weekly status meeting.

· Be prepared to work outside of normal hours, as system maintenance often must be performed outside of prime time; provide 24/7 support to computer operations; and work with other remote support locations, for example Kuala Lumpur, in support of follow-the-sun coverage.

· Participate in the support on-call schedule, in weekend power outages (normally two per year), and in emergency data center activities.

· Peer-review all major projects as part of the normal deployment philosophy.

· Ensure compliance with all quality assurance, best practice procedures, and QHSE requirements, as defined by job position.
