Tech Lead, AML Inference

Company:
ByteDance
Location:
San Jose, CA, 95111
Posted:
January 03, 2026

Description:

About the Team

The mission of our Applied Machine Learning (AML) team is to push the next-generation AI infrastructure and recommendation platform for ads ranking, search ranking, live streaming, and e-commerce.

We drive substantial impact across ByteDance's core businesses by building world-class ML platforms and systems.

We are seeking a Tech Lead, AML Inference to oversee the development and execution of ByteDance's inference infrastructure.

This role will lead and mentor a team of Machine Learning Engineers focused on inference, ensuring reliability, scalability, and performance across large-scale distributed systems.

The Inference Lead will collaborate closely with research, product, and platform teams to design and deliver cutting-edge solutions that power critical ranking and recommendation services.

Responsibilities

- Lead and mentor a team of inference-focused Machine Learning Engineers, setting technical direction and ensuring best practices.

- Drive the design and evolution of distributed inference infrastructure to support feeds, ads, search, and other core ranking models.

- Oversee the development of monitoring, observability, and management tools to ensure reliability and scalability of online inference services.

- Identify and resolve system inefficiencies, performance bottlenecks, and reliability issues, ensuring optimized end-to-end performance.

- Partner with research and product teams to translate requirements into robust and efficient inference solutions.

- Stay at the forefront of advancements in inference frameworks, ML hardware acceleration, and distributed systems, incorporating innovations where impactful.

Minimum Qualifications

- Bachelor's degree or above in Computer Science, Electrical Engineering, or related field.

- 5+ years of experience in developing and deploying large-scale, distributed systems, with at least 5 years in a leadership or technical lead role.

- Strong programming skills in languages such as C++, Python, or Go.

- Deep understanding of inference frameworks and ML system deployment (e.g., TensorFlow, PyTorch, TensorRT, JAX, MXNet).

- Proven experience optimizing performance for large-scale machine learning systems, including hardware-software co-design, GPU/RDMA acceleration, or HPC techniques.

- Excellent communication and collaboration skills; ability to work across research, engineering, and product teams.

Preferred Qualifications

- Experience leading teams working on high-throughput, low-latency ML serving systems.

- Contributions to open-source ML or systems projects.

- Familiarity with container orchestration, service mesh, or cloud-native ML infrastructure.

- Experience collaborating with and leading global, cross-functional teams across different time zones.
