About the Role:
We're seeking a Backend Software Engineer with expertise in Large Language Model (LLM) integration and voice interface systems. You'll build and maintain scalable systems that power intelligent agents, voice assistants, and real-time AI applications. The role is cross-functional, blending traditional backend engineering with modern AI tooling and audio systems, and is ideal for a multidisciplinary technologist.
Responsibilities:
Design, develop, and optimize backend APIs in Python with FastAPI or Flask (see the first sketch after this list)
Build and deploy LLM pipelines using models from OpenAI, Anthropic, Mistral, and Meta (LLaMA)
Implement automatic speech recognition (ASR) and text-to-speech (TTS) systems, with tools such as Whisper, DeepSpeech, and Google STT for recognition and Amazon Polly and Coqui TTS for synthesis (second sketch below)
Architect real-time pipelines using WebSockets, Redis Pub/Sub, and microservices
Integrate retrieval-augmented generation (RAG) pipelines using FAISS or Pinecone with LangChain or Haystack (third sketch below)
Secure backend endpoints with JWT/OAuth2 authentication and rate limiting
Deploy on AWS or GCP using Lambda, Cloud Run, ECS, Docker, and Kubernetes
Collaborate with frontend and mobile teams to deliver multimodal interactions
Contribute to prompt tooling, memory systems, and agent orchestration
Design with cost in mind, optimizing cloud usage and minimizing overhead for model inference and streaming pipelines
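To give candidates a concrete sense of the day-to-day work, the sketches below are illustrative only; the model names, secrets, and documents in them are placeholders, not descriptions of our production systems.

First, a minimal sketch of a JWT-protected FastAPI endpoint that streams tokens from an LLM back to the caller (assumes the python-jose and openai >= 1.x packages):

```python
# Illustrative sketch only: a JWT-protected FastAPI endpoint that streams
# an LLM reply token by token. Model name and secret handling are placeholders.
import os

from fastapi import Depends, FastAPI, HTTPException
from fastapi.responses import StreamingResponse
from fastapi.security import OAuth2PasswordBearer
from jose import JWTError, jwt          # python-jose
from openai import OpenAI               # openai >= 1.x SDK
from pydantic import BaseModel

app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")  # token route not shown
llm = OpenAI()  # reads OPENAI_API_KEY from the environment
JWT_SECRET = os.environ["JWT_SECRET"]  # placeholder secret management


class ChatRequest(BaseModel):
    prompt: str


def current_user(token: str = Depends(oauth2_scheme)) -> str:
    """Reject requests that lack a valid, signed access token."""
    try:
        claims = jwt.decode(token, JWT_SECRET, algorithms=["HS256"])
        return claims["sub"]
    except (JWTError, KeyError):
        raise HTTPException(status_code=401, detail="Invalid token")


@app.post("/chat")
def chat(req: ChatRequest, user: str = Depends(current_user)):
    """Stream the model's reply instead of buffering the full response."""
    stream = llm.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": req.prompt}],
        stream=True,
    )

    def token_generator():
        for chunk in stream:
            delta = chunk.choices[0].delta.content
            if delta:
                yield delta

    return StreamingResponse(token_generator(), media_type="text/plain")
```

Streaming the reply keeps perceived latency low for chat and voice clients, which in this role usually matters more than raw throughput.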
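Second, a sketch of a voice-facing WebSocket endpoint that transcribes short audio clips with local Whisper ASR (assumes the openai-whisper package and ffmpeg are installed). It buffers whole utterances rather than doing true streaming recognition, purely to keep the example small:

```python
# Illustrative sketch only: a FastAPI WebSocket that accepts one audio clip
# per message, runs local Whisper ASR on it, and returns the transcript.
import tempfile

import whisper  # openai-whisper package; requires ffmpeg on the host
from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()
asr_model = whisper.load_model("base")  # placeholder model size


@app.websocket("/transcribe")
async def transcribe(ws: WebSocket):
    await ws.accept()
    try:
        while True:
            # One binary message per utterance (e.g. a short WAV clip).
            audio_bytes = await ws.receive_bytes()
            with tempfile.NamedTemporaryFile(suffix=".wav") as f:
                f.write(audio_bytes)
                f.flush()
                result = asr_model.transcribe(f.name)
            await ws.send_text(result["text"])
    except WebSocketDisconnect:
        pass  # client hung up; nothing to clean up in this sketch
```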
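Third, a tiny retrieval-augmented generation (RAG) loop with FAISS as the vector index and OpenAI embeddings; chunking, index persistence, and reranking are omitted, and the documents and model names are made up for illustration:

```python
# Illustrative sketch only: embed a few documents, index them with FAISS,
# retrieve the closest matches for a question, and answer from that context.
import faiss
import numpy as np
from openai import OpenAI

client = OpenAI()

docs = [
    "Our voice agents stream audio over WebSockets.",
    "Inference costs are tracked per request and per model.",
    "JWT access tokens expire after fifteen minutes.",
]


def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data], dtype="float32")


# Flat inner-product index over normalized vectors = cosine similarity.
doc_vectors = embed(docs)
faiss.normalize_L2(doc_vectors)
index = faiss.IndexFlatIP(doc_vectors.shape[1])
index.add(doc_vectors)


def answer(question: str, k: int = 2) -> str:
    query = embed([question])
    faiss.normalize_L2(query)
    _, ids = index.search(query, k)  # top-k nearest documents
    context = "\n".join(docs[i] for i in ids[0])
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": f"Answer using this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return completion.choices[0].message.content


print(answer("How long do access tokens last?"))
```

A flat index is usually enough at small scale; larger corpora would call for an approximate index or a managed store such as Pinecone.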
Core Skills & Traits:
Strong software architecture design for real-time, low-latency systems
Deep understanding of LLM behavior and inference workflows in production
Experience building multimodal (voice + text) interfaces
Passion for building robust AI systems and developer tooling
Excellent collaboration and communication across disciplines
Security-first mindset in all aspects of data and system design
Cost-aware mindset: ability to evaluate tradeoffs and reduce infrastructure/inference costs