
AI/Machine Learning Engineer

Company: Enormous Enterprise LLC
Location: Cupertino, CA
Pay: DOE
Posted: March 25, 2025

Description:

Job Title: AI/Machine Learning Engineer

Location: Cupertino, CA (local or nearby candidates only; face-to-face is a must)

Duration: Long Term Contract

Responsibilities -

· Design, implement, and maintain distributed systems to build world-class ML platforms/products at scale

· Diagnose, fix, improve, and automate complex issues across the entire stack to ensure maximum uptime and performance

· Design and extend services to improve functionality and reliability of the platform

· Monitor system performance, optimize for cost and efficiency, and resolve any issues that arise

· Build relationships with stakeholders across the organization to better understand internal customer needs and improve our product for end users

Required Skills -

· 3+ years of experience in distributed systems with deep knowledge of computer science fundamentals

· Experience with containerization and orchestration technologies, such as Docker and Kubernetes

· Experience in delivering data and machine learning infrastructure in production environments

· Experience configuring, deploying, and troubleshooting large-scale production environments

· Experience in designing, building, and maintaining scalable, highly available systems that prioritize ease of use

· Experience with alerting, monitoring, and remediation automation in a large-scale distributed environment

· Extensive programming experience in Java, Python, or Go

· Strong collaboration and communication (verbal and written) skills

· B.S., M.S., or Ph.D. in Computer Science, Computer Engineering, or equivalent practical experience

Preferred Skills -

· Understanding of the ML lifecycle and state-of-the-art ML infrastructure technologies

· Experience with GPU and other types of HPC infrastructure

· Experience with training frameworks such as PyTorch, TensorFlow, or JAX

· Deep understanding of Ray and KubeRay

· Experience with ML training/inference profiling and optimization
