ML Engineer
We’re partnering with one of the fastest-growing AI companies in San Francisco, founded by world-class researchers and engineers, and backed by top-tier investors. This team is building foundational ML infrastructure designed to operate at scale, with a focus on reliability, efficiency, and long-term impact.
The work sits at the intersection of cutting-edge research and real-world deployment. The team values deep technical rigor, thoughtful iteration, and a principled approach to building systems that are robust, interpretable, and aligned with broader goals around AI safety and performance.
This is a rare opportunity to contribute to core ML architecture in a company operating at the frontier of what’s possible.
What You’ll Be Doing
End-to-end ML ownership: Lead the design, engineering, and deployment of large-scale machine learning systems, from early experimentation through to robust production rollout.
Build for scale: Architect and optimize low-latency inference infrastructure capable of supporting millions of daily interactions with strict performance and efficiency requirements.
Innovate for performance: Research and implement novel techniques to improve model quality, reduce inference costs, and increase responsiveness.
Tackle hard infra problems: Work on inference optimization, model quantization, serving architectures, and cost-efficient scaling strategies across cloud environments.
Cross-functional collaboration: Partner with engineering, product, and deployment teams to align ML initiatives with real-world use cases and ensure reliability at scale.
Push the frontier: Experiment with state-of-the-art approaches in model training, system design, and ML ops, staying ahead of what’s possible in applied AI.
Ideal Background
Machine learning depth: 3+ years of experience building and deploying machine learning systems, ideally with exposure to large-scale or latency-sensitive applications.
System-level thinking: Comfortable navigating complex tradeoffs in infrastructure, inference performance, and cost optimization.
Focused expertise: Whether in inference optimization, distributed training, or another core area, you’ve developed deep specialization and know how to go from research to real-world application.
Production-grade mindset: You understand the difference between a prototype and a production system. You've shipped models that power actual products or services.
Full-stack fluency: You’ve worked across the ML lifecycle: data pipelines, model development, deployment, and observability.
Startup-ready: You thrive in fast-paced, ambiguous environments. You take ownership, move quickly, and enjoy building things that don’t yet exist.
$160,000–$300,000 + equity & bonus