Senior On-Premises Distributed Systems Engineer
Building Specialized High-Performance Computing Infrastructure for Transcriptome Analysis

At the intersection of cutting-edge genomics and precision engineering, we are seeking a highly skilled engineer to architect, deploy, and operate our next-generation in-house distributed computing cluster, designed specifically to support large-scale transcriptome data analysis with high-throughput, low-latency precision.

Your Mission

You will build a purpose-driven high-performance computing (HPC) cluster from the ground up, designed around 5 to 10 nodes, to process terabytes of sequencing data with efficiency beyond what generic cloud services can provide.
Your infrastructure will be the computational backbone for breakthrough discoveries in transcriptomics and computational biology.

Key Responsibilities

· Architect and Deploy Specialized HPC Infrastructure: Design, build, and optimize a bare-metal distributed computing environment, including hardware selection, rack design, node setup, and network architecture.
· Implement Custom Cluster Management Solutions: Deploy and fine-tune resource managers (Slurm, PBS, SGE) for efficient job scheduling and high utilization across a small-scale, high-performance cluster.
· Design High-Performance Storage Systems: Architect and implement scalable, high-bandwidth distributed storage (such as Lustre, BeeGFS, or Ceph) optimized for transcriptome sequencing workloads.
· Optimize Network Topology for HPC Workloads: Build low-latency, high-throughput networks using high-speed Ethernet or InfiniBand to maximize node-to-node communication for parallel computing tasks.
· Develop Custom Solutions for Bioinformatics Workloads: Create tailored hardware/software pipelines for specialized needs like RNA-Seq analysis, distributed transcript quantification, or real-time expression profiling.
· Collaborate Across Disciplines: Work directly with bioinformaticians, data scientists, and machine learning researchers to align hardware architecture with algorithmic needs.

Technical Requirements

Education:
· Bachelor's degree in Computer Science, Computer Engineering, Systems Engineering, or a related technical field required
· Master's or PhD in High-Performance Computing, Distributed Systems, or a related field preferred
· Equivalent practical experience building distributed infrastructure will be considered

Core Skills:
· Strong background in bare-metal systems engineering, distributed computing, and HPC architecture
· Proven experience designing and operating on-premises clusters (5–50 nodes preferred)
· Deep understanding of parallel processing, storage system optimization, and high-speed networking

Technology Stack:
· Cluster Management: Slurm, PBS, SGE, HTCondor, Kubernetes (on bare-metal)
· Distributed Storage: Lustre, BeeGFS, Ceph, HDFS, object storage tuning
· Networking: InfiniBand, RDMA over Ethernet, 10/25/40/100G networking
· Performance Monitoring: Prometheus, Grafana, Ganglia, Nagios
· Hardware Management: IPMI, BMC, hardware health and diagnostics tools

Bonus Qualifications

· Experience with accelerators (GPUs, FPGAs) for computational biology workloads
· Familiarity with bioinformatics file formats (FASTQ, BAM, GTF) and their storage implications
· Background in scientific computing centers, genomics research labs, or national lab HPC projects
· Experience integrating on-prem clusters with cloud burst capacity (hybrid setups)

What Sets You Apart

· 3+ years of hands-on experience designing, implementing, and optimizing bare-metal HPC clusters
· Strong ownership and leadership in physical infrastructure projects
· Practical experience balancing compute-intensive and I/O-intensive transcriptome analysis workloads
· Ability to design for both current needs (5–10 node cluster) and future scalability (20+ nodes, hybrid extensions)

Example Projects You Will Lead

· Architect our next-generation 5–10 node transcriptome processing cluster capable of handling terabytes of RNA-Seq data
· Design specialized I/O and memory architectures for distributed genomics file processing
· Build resilient infrastructure balancing compute-bound and I/O-bound bioinformatics workloads
· Integrate real-time RNA expression pipelines with custom-built distributed storage solutions

Our Team Environment

You will collaborate with multidisciplinary teams of bioinformaticians, data scientists, software engineers, and AI researchers.
While they innovate on algorithms and scientific analysis, you will architect and operate the custom-built infrastructure that enables their research at scale.

We Encourage Applications From

Candidates with backgrounds in research computing centers, supercomputing facilities, genomics labs, or national HPC projects. We value hands-on experience with physical infrastructure design and implementation over purely cloud-based experience.

Join us to architect the specialized computing infrastructure that will enable the next generation of transcriptome research and breakthrough discoveries in genomics.