Principal Software Engineer - Infrastructure

Company: Engtal
Location: Rollingwood, TX, 78716
Posted: May 13, 2025

Description:

We are seeking talented Senior Software/Infrastructure Engineers to join a rapidly expanding Research & Development team focused on building and maintaining cutting-edge infrastructure. This team is responsible for architecting and supporting massive distributed compute clusters, multi-petabyte storage systems, operating environments, automation frameworks, and development tools. The hardware and software stack is designed to push the boundaries of what’s currently possible.

Our environment includes some of the world’s largest CPU and GPU compute clusters—on par with leading research institutions. We operate proprietary on-premises data centers across multiple global locations and dedicate teams to advancing performance across compute, networking, storage, and power utilization. Our technology and our people are the foundation of our success, and we continue to invest heavily in both.

In this full-time Systems Engineer role, you’ll work on projects that are critical to keeping trading and research systems running 24/7 around the globe. Your responsibilities might include optimizing and scaling our HPC environment, developing internal services to streamline research workflows, managing hardware/software installation and configuration, remote system administration, performance tuning, and automation tool development.

Key Responsibilities:

Lead technical projects aimed at improving performance, scalability, and reliability across our Linux-based infrastructure (hardware, networking, OS layers, and beyond).

Manage and fine-tune large-scale distributed compute and HPC clusters.

Develop automation to streamline infrastructure tasks and troubleshoot complex technical issues.

Diagnose and resolve hardware/software problems efficiently.

Perform OS deployments, upgrades, and performance optimization.

Write scripts to automate routine operations.

Liaise with external vendors to address hardware and software concerns.

Qualifications:

5+ years of hands-on experience in systems engineering or a related role.

Deep expertise in installing, configuring, and troubleshooting Linux systems in a production environment.

Proficient with Linux/UNIX command-line tools and utilities.

Experience with configuration management and automation tools.

Strong scripting skills in Python and Shell (or equivalent).

Detail-oriented approach to managing and debugging production systems.

Excellent communication and organizational skills, with the ability to collaborate across multiple teams.

Strong multitasking ability—comfortable juggling several projects simultaneously.

Familiarity with Debian/Ubuntu is a plus.

Experience with GPU optimization is a bonus.
