Persimmons AI Compiler Developer
Persimmons is building the infrastructure that will power the next decade of AI. Founded in 2023 by veteran technologists from the worlds of semiconductors, AI systems, and software innovation, we're on a mission to enable smarter devices, more sustainable data centers, and entirely new applications the world hasn't imagined yet.
We're growing fast and looking for bold thinkers, builders, and curious problem-solvers who want to push the limits of AI hardware and software. If you're ready to join a world-class team and play a critical role in making a global impact, we want to talk to you.
This role focuses on transforming large language models represented in higher-level MLIR by applying sophisticated mid- and backend compiler techniques to target Persimmons.ai's custom accelerator hardware. You will help design and optimize the Persimmons Compiler mid- and backend, integrate it with custom operations and kernels, and implement compiler passes that convert higher-level intermediate representations into runtime-oriented code and libraries. This position offers the opportunity to directly shape Persimmons.ai's innovative AI hardware and software stack through close collaboration with teams across hardware, systems, and software.
What you'll do:
Develop and enhance MLIR-based compiler pipelines targeting Persimmons' custom spatial accelerator hardware.
Design and optimize mid- and backend techniques in the Persimmons Compiler for efficient lowering, graph-to-resources mapping, and code generation.
Implement transformations to convert Python, PyTorch, and similar kernel representations to LLVM IR and runtime-ready libraries.
Architect and implement efficient support for SPMD-based distributed collective operations and lower them through specialized MLIR compiler dialects (e.g., Mesh, Shardy).
Drive advanced loop optimizations leveraging polyhedral analysis: loop tiling, fusion, interchange, skewing, and related techniques.
Apply and optimize techniques such as bufferization, padding, inlining, and integration of custom operations and kernels within the compilation workflow.
Work on register allocation and instruction scheduling for Persimmons' spatial hardware, ensuring high resource utilization, high throughput, and low latency.
Contribute to graph and tensor partitioning logic for optimal hardware-targeted execution.
Collaborate across teams to deliver performant compilation flows from high-level ML representations to low-level executable artifacts.