AI Inference Engineer

Company:
quadric.io, Inc
Location:
Burlingame, CA
Posted:
January 14, 2026

Description:

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are designed to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive and autonomous vehicle systems. Unlike other NPUs and neural network accelerators in the industry today, which can accelerate only a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

Role:

The AI Inference Engineer at Quadric is the key bridge between AI/LLM models and Quadric's unique platforms. The engineer will (1) port AI models to the Quadric platform; (2) optimize model deployment for efficient inference; and (3) profile and benchmark model performance. This senior technical role demands deep knowledge of AI model algorithms, system architecture, and AI toolchains and frameworks.

Responsibilities:

Quantize, prune and convert models for deployment

Port models to the Quadric platform using the Quadric toolchain

Optimize inference deployment for latency and speed

Benchmark and profile model performance and accuracy

Develop tools to scale and speed up deployment

Make improvements to the SDK and runtime

Provide technical support and documentation to customers and the developer community

Requirements:

Bachelor’s or Master’s in Computer Science and/or Electrical Engineering

5+ years of experience in AI/LLM model inference and deployment frameworks/tools

Experience with model quantization (PTQ, QAT) and related tools

Experience with model accuracy metrics

Experience with model inference performance profiling

Experience with at least one of the following frameworks: ONNX Runtime, PyTorch, vLLM, Hugging Face Transformers, Neural Compressor, llama.cpp

Proficiency in C/C++ and Python

Demonstrated strength in problem solving, debugging, and communication

Benefits:

Health Care Plan (Medical, Dental & Vision)

Retirement Plan (401k, IRA)

Life Insurance (Basic, Voluntary & AD&D)

Paid Time Off (Vacation, Sick & Public Holidays)

Family Leave (Maternity, Paternity)

Short Term & Long Term Disability

Training & Development

Work From Home

Free Food & Snacks

Stock Option Plan