Machine Learning Engineer

Company:

Koda Staff

Location:

Santa Rosa, CA, 95402

Posted:

June 07, 2025

Description:

Machine Learning Engineer – Speech & Audio AI

Location: San Francisco, CA (Hybrid)

Employment Type: Full-time

Experience Level: Mid to Senior

Are you passionate about shaping the future of voice and sound technology? Join a cutting-edge AI startup in San Francisco that’s building the next generation of speech and audio intelligence products.

We're looking for a Machine Learning Engineer who enjoys solving complex problems and working across multiple areas of AI and data-driven technology in a dynamic environment.

What You’ll Do

Design, train, and optimize ML models for speech recognition, audio classification, speaker diarization, or text-to-speech (TTS).

Collaborate with product and research teams to bring state-of-the-art models into production.

Develop scalable pipelines for model training, evaluation, and deployment.

Apply techniques like self-supervised learning, transformers, or diffusion models to real-world audio data.

Analyze and clean large-scale voice datasets (structured and unstructured).

Monitor and improve inference performance in real-time audio systems.

What We’re Looking For

2–6 years of experience in machine learning, with a focus on speech/audio.

Strong background in deep learning (PyTorch or TensorFlow).

Hands-on experience with tools and frameworks such as:

Hugging Face Transformers

torchaudio, librosa, Kaldi, ESPnet

Neural vocoders (e.g., WaveGlow, WaveNet, HiFi-GAN)

Voice conversion frameworks (e.g., RVC, DiffVC, YourTTS)

TTS engines (e.g., Coqui TTS)

Self-supervised learning toolkits (e.g., S3PRL)

Solid understanding of digital signal processing and acoustic modeling, with hands-on experience using tools such as FFmpeg, SoX, NumPy/SciPy, and Praat.

Experience deploying ML models in cloud environments (AWS, GCP, or Azure).

BS or MS in CS, EE, ML, or related field (or equivalent industry experience).
