Machine Learning Engineer
Location: San Francisco Bay Area - Full-time, in office
About the Company:
This is a fast-scaling AI startup dedicated to building cutting-edge foundation models and intelligent systems tailored for complex challenges in semiconductor and electronics design. Backed by top-tier venture capital firms, they work closely with some of the largest and most influential technology companies in the industry.
The team is composed of exceptional talent from academia, elite programming competitions, and high-growth startups, with a proven history of driving innovation and rapidly growing revenue.
Key Responsibilities:
Architect and build high-performance AI solutions that incorporate large language models (LLMs), retrieval systems, and agentic frameworks, specifically designed for advanced engineering processes
Oversee and refine large-scale datasets used for model training and evaluation, leveraging both synthetic data and data collected from real-world sources
Train, fine-tune, and deploy foundation models, optimizing for efficiency across latency, accuracy, and cost metrics
Create innovative retrieval and search algorithms to navigate engineering documents, technical schematics, and other specialized corpora
Develop comprehensive evaluation pipelines to assess model performance on tasks such as code generation, schematic design, and reasoning over extended contexts
Translate state-of-the-art machine learning research into robust, production-ready implementations across the entire ML lifecycle
Take ownership of critical technical initiatives, working closely with customers, engineers, and researchers to deliver measurable solutions to domain-specific problems
Required Skill Sets:
Proficient in programming with Python, C, or Rust, with an emphasis on writing scalable and maintainable code for large-scale systems
Extensive hands-on experience with PyTorch, particularly in developing and training machine learning models
Skilled in GPU programming using CUDA, including the development of custom kernels for optimized performance
Familiar with distributed training frameworks such as DeepSpeed, Accelerate, Unsloth, or Kubeflow for efficient large-scale model training
Practical experience in fine-tuning and quantizing transformer-based models for real-world applications
Capable of translating research insights and academic papers into robust, production-grade implementations
Solid background in machine learning research and development, or significant contributions to applied AI projects
Proven experience in evaluating model performance, deploying models into production, and monitoring them post-deployment
Preferred Qualifications:
Background in hardware or electronics, gained through professional experience, academic work, or personal projects
Track record of contributing to open-source projects or initiatives
Recognition through awards or publications in leading academic journals or top-tier industry conferences
Familiarity with the dynamics of high-growth, fast-paced startup environments
Benefits:
Comprehensive Health Coverage: Receive top-tier medical and dental insurance to support your well-being
Professional Growth: Engage in meaningful projects and take advantage of continuous learning and development opportunities
Unlimited Paid Time Off: Enjoy a flexible vacation policy designed to promote a healthy work-life balance
Visa Sponsorship: We welcome global talent and provide visa support to help international candidates join our team