Job Description
Key Responsibilities
Develop and integrate Python-based applications with LLMs (Amazon Bedrock, Azure OpenAI, OpenAI, DeepSeek, Anthropic, LLaMA, etc.).
Architect and implement LLM pipelines including prompt engineering, retrieval-augmented generation (RAG), fine-tuning, and evaluation.
Design scalable microservices and APIs for AI features.
Collaborate with MLOps teams to deploy and monitor AI models in production.
Ensure performance optimization, cost efficiency, and security in LLM workflows.
Guide the team on Python best practices, code reviews, and technical problem-solving.
Stay updated on emerging AI/LLM advancements and propose adoption where beneficial.
Required Skills & Experience
Technical:
Strong proficiency in Python (FastAPI, Flask).
Comfortable coding with AI-assisted development tools such as Cursor.
Solid experience with LLM integration (OpenAI API, Hugging Face Transformers, LangChain, LlamaIndex).
Understanding of RAG pipelines and vector databases (Pinecone, Weaviate, FAISS, Milvus).
Experience with prompt engineering & evaluation techniques.
Knowledge of MLOps and LLM observability tools (MLflow, Kubeflow, Langfuse) and deployment on AWS/GCP/Azure.
Sound knowledge of containerization and orchestration (Docker, Kubernetes).
Strong grasp of REST APIs and microservices architecture.
Knowledge of model fine-tuning and performance optimization.
Experience with agentic AI frameworks or the Model Context Protocol (MCP).