Machine Learning Engineer

Company:
Koda Staff
Location:
Santa Rosa, CA, 95402
Posted:
June 07, 2025

Description:

Machine Learning Engineer – Speech & Audio AI

Location: San Francisco, CA (Hybrid)

Employment Type: Full-time

Experience Level: Mid to Senior

Are you passionate about shaping the future of voice and sound technology? Join a cutting-edge AI startup in San Francisco that’s building the next generation of speech and audio intelligence products.

We're looking for a Machine Learning Engineer who enjoys solving complex problems and working across multiple areas of AI and data-driven technology in a dynamic environment.

What You’ll Do

Design, train, and optimize ML models for speech recognition, audio classification, speaker diarization, or text-to-speech (TTS).

Collaborate with product and research teams to bring state-of-the-art models into production.

Develop scalable pipelines for model training, evaluation, and deployment.

Apply techniques like self-supervised learning, transformers, or diffusion models to real-world audio data.

Analyze and clean large-scale voice datasets (structured and unstructured).

Monitor and improve inference performance in real-time audio systems.

What We’re Looking For

2–6 years of experience in machine learning, with a focus on speech/audio.

Strong background in deep learning (PyTorch or TensorFlow).

Hands-on experience with tools and frameworks such as:

Hugging Face Transformers

torchaudio, librosa, Kaldi, ESPnet

Neural vocoders (e.g., WaveGlow, WaveNet, HiFi-GAN)

Voice conversion frameworks (e.g., RVC, DiffVC, YourTTS)

TTS engines like Coqui TTS

Self-supervised learning tools like S3PRL

Solid understanding of digital signal processing and acoustic modeling, with experience using FFmpeg, SoX, NumPy/SciPy, and Praat.

Experience deploying ML models in cloud environments (AWS, GCP, or Azure).

BS or MS in CS, EE, ML, or a related field (or equivalent industry experience).
