Machine Learning Infrastructure Engineer

Company:
GLASS Imaging
Location:
Los Altos, CA, 94024
Posted:
May 09, 2025

Description:

About the role

Glass Imaging is looking for a Machine Learning (ML) Infrastructure Engineer to redesign and develop the backbone of our ML training and evaluation ecosystem. As an experienced professional with a track record of architecting solutions in this area, you will have the freedom to reshape our platform from the ground up, building everything from GPU allocation and data management to experiment tracking and evaluation pipelines.

You’ll work closely with our ML researchers and engineers to understand their needs, streamline their workflows, and ensure that our platform can scale with the team. You will help create automated, repeatable solutions that reduce manual overhead. While we prioritize clean, maintainable code, we operate in a fast-moving research environment where adaptability is key. This role offers plenty of opportunity to explore new ideas, refine solutions, and continuously improve our infrastructure. If you’re excited by the prospect of taking ownership of a system that will serve as the core of our ML efforts, we’d love to talk.

Right now, we’re looking for someone eager to tackle these challenges hands-on, but as our team grows, this role will have the opportunity to take on more leadership responsibility, guiding the continued development of our ML platform and helping shape the team around it.

Responsibilities

Design & build a scalable, efficient Python infrastructure for training and evaluating ML models.

Improve automation of the ML train/test infrastructure, e.g., inference tools that log, cache, and visually compare model outputs, and code-free methods for running models on new datasets.

Develop and manage systems for GPU resource allocation, dataset management, experiment tracking, and evaluation pipelines. Integrate job scheduling (e.g. SLURM).

Implement automated dataset versioning and validation.

Build tools for reporting and visualizing model metrics and performance.

Improve developer efficiency by creating tools and workflows that streamline ML model iteration and testing. Add and improve performance profiling.

Ensure scalability and reliability of the ML platform as the company grows.

Collaborate closely with ML researchers and engineers to understand their workflows and translate their needs into robust infrastructure.

Introduce best practices for code organization, version control, and reproducibility in ML experiments. Encourage modularity, reusability and portability.

Required Skills

Strong software engineering skills at a software-architect level

Experience designing and building infrastructure for ML training workflows

Familiarity with performance profiling and optimization for ML training

Excellence in Python, Linux scripting, and typical ML frameworks (e.g., PyTorch, TensorFlow).

Experience with GPU management, distributed computing, and optimizing training pipelines

Passion for turning messy, unstructured codebases into clean, scalable platforms

Ability to see the big picture across code repo structure, job orchestration, task pipelining, and on-prem MLOps for efficient resource usage

Preferred Skills

Proficiency in C++

Experience customizing or designing ML experiment tracking tools (e.g., Weights & Biases, ClearML); creating or customizing web GUIs, dashboards, or macOS apps

Knowledge of database and storage solutions for ML datasets

Experience managing on-site Linux servers and NAS arrays with large-scale datasets

Knowledge of cloud computing (e.g., AWS, GCP) and containerization (e.g., Docker, Kubernetes)

Knowledge of image restoration and image quality assessment

Location & Travel

We are primarily hiring for our SF Bay Area, CA office (primarily in-person), but may consider other arrangements in exceptional circumstances.

Compensation & Benefits

Competitive pay

Stock options

Health/Dental/Vision Insurance

401(k)

Visa Sponsorship

About GLASS Imaging

Our mission is to bring professional-level image quality to everyone by making cutting-edge image processing accessible to all devices, from smartphones and XR devices to infrastructure maintenance and security applications. We believe that AI-driven processing can extract every ounce of image quality from any camera, making it easier for everyone to capture better pictures.

But we aren’t just enhancing how images are processed; we’re revolutionizing how they’re captured, redefining the core principles of camera design and reimagining how lenses, sensors, and AI-driven processing work together. We’re fundamentally changing how cameras operate to unlock unprecedented levels of performance and image quality.

Founded by former Apple engineers behind Portrait Mode and other groundbreaking iPhone camera features, we’re a team of passionate and experienced engineers pushing the boundaries of photography. Join us in shaping the future of camera technology!

Equal Opportunity & Diversity Statement

Glass Imaging is committed to fostering a diverse and inclusive workplace. We celebrate differences and do not discriminate based on race, ethnicity, gender, sexual orientation, age, disability, veteran status, or any other protected status. We encourage individuals from all backgrounds to apply.
