
Machine Learning Engineer (Mid - Staff levels)

Company:
AI Republic
Location:
San Francisco, CA
Posted:
June 25, 2025

Description:

Machine Learning Engineer (Mid to Staff Level) | LLMs in Production | AI x Healthcare | San Mateo (Hybrid)

We’re working with a Series A AI company based in the Bay Area that’s deploying large language models in real-world, high-impact healthcare workflows.

Location: San Mateo, CA - 4 days/week in office

Compensation: $85K–$260K base (DOE) + strong equity + full benefits

Stage: Series A | $19M raised | 7-figure ARR | Profitable with indefinite runway

What They’re Building

This startup is transforming how voice-based workflows are handled in healthcare - using production-grade Voice AI agents powered by LLMs to automate high-stakes interactions like insurance verifications, authorizations, claim status checks, and more.

This isn’t experimental - thousands of these AI-driven calls happen daily across major healthcare organizations.

The Role: Machine Learning Engineer (Open to Mid & Staff Levels)

This role sits at the intersection of software engineering, MLOps, and LLM deployment. You’ll be part of a lean, highly experienced team shipping real AI to production.

Depending on your level, your focus might range from prompt design and internal tooling to building core ML infrastructure at scale.

What You’ll Work On:

Design and iterate on prompts for classification, summarization, extraction, and task automation

Build tools for prompt testing, versioning, and performance evaluation

Optimize and fine-tune LLMs for latency, cost, and alignment with business logic

Deploy and monitor ML models in production environments (on-prem + cloud)

Maintain robust MLOps pipelines for training, evaluation, and CI/CD

Contribute to infrastructure powering high-volume, real-time inference for voice agents

Collaborate cross-functionally with product and engineering teams to translate complex healthcare workflows into ML-powered solutions

About You

We’re open to both mid-level and staff-level candidates - and will tailor the scope to match your background.

You would be a great fit if you have:

3+ years (mid) or 5+ years (staff) in ML engineering, LLMs, or AI infra

Strong Python and experience with tools like Hugging Face, LangChain, LlamaIndex, PromptLayer, or Langfuse

Hands-on experience deploying and optimizing LLMs in production

Experience with model serving (Triton, ONNX, FastAPI) and containerized infra (Docker, K8s)

Familiarity with MLOps frameworks (MLflow, Kubeflow, SageMaker, Vertex AI)

Bonus: exposure to healthcare data formats (FHIR, HL7) or vector databases (Pinecone, Weaviate)

You care deeply about model performance, prompt design, and production-grade reliability - not just research experiments.

Why This Role Matters

You’ll ship AI into real-world, high-impact workflows - not just prototypes

The engineering bar is high - all your peers are experienced and hands-on

You’ll shape new infrastructure from scratch - this is greenfield, not maintenance

You’ll help scale LLMs in one of the most valuable and complex sectors in the U.S.

If you're excited about working on production LLM systems that actually matter - with a team that cares deeply about both quality and speed - send me a message to learn more.
