Location: Plano, TX or Camas, WA
Salary: $160,000 - $185,000 USD annually
Description:
The Opportunity
We are seeking an AI Engineer to design, build, and optimize production-grade generative AI solutions. In this role, you'll collaborate closely with AI leaders, ML engineers, and platform teams to deliver scalable systems powered by LLMs, GPU-accelerated frameworks, and cloud-native microservices. You'll ensure these solutions are secure, high-performing, and fully integrated within enterprise environments.
What You'll Do
Architect, develop, fine-tune, and deploy generative AI models into scalable production environments
Build and maintain APIs and microservices, primarily using FastAPI, to enable AI capabilities across the organization
Partner with AI Infrastructure teams to design robust LLM pipelines, including training workflows and RAG (retrieval-augmented generation) systems
Integrate AI solutions into enterprise applications using secure, cloud-native patterns and best practices
Ensure models meet standards for explainability, reliability, security, and regulatory compliance
Monitor and improve model performance through evaluation frameworks, observability tools, and continuous fine-tuning
What You Bring
8+ years of experience in the IT industry
2+ years of hands-on AI development experience
3+ years of professional Python programming
Strong proficiency with LLMs, embeddings, vector databases, and RAG architectures
Proven experience working with generative AI models, including multimodal systems
Practical expertise with cloud-native AI platforms (e.g., Azure AI Foundry, AWS Bedrock, OpenAI models) and AI governance frameworks
Bachelor's degree in Computer Science, AI, or related discipline, or equivalent professional experience
Preferred Qualifications
Experience with GPU-accelerated training and inference using NVIDIA technologies (e.g., NIM, NeMo)
Ability to optimize and scale AI models with NVIDIA NIM and fine-tune via NeMo services
Familiarity with agentic AI frameworks and deploying production AI agents
Experience delivering low-latency, high-throughput model deployments using tools such as vLLM and GPU-optimized inference frameworks
Background in CI/CD pipelines for ML and Generative AI, including containerization and orchestration with Docker and Kubernetes
By providing your phone number, you consent to: (1) receive automated text messages and calls from the Judge Group, Inc. and its affiliates (collectively "Judge") to such phone number regarding job opportunities, your job application, and for other related purposes. Message & data rates apply and message frequency may vary. Consistent with Judge's Privacy Policy, information obtained from your consent will not be shared with third parties for marketing/promotional purposes. Reply STOP to opt out of receiving telephone calls and text messages from Judge and HELP for help.
Contact:
This job and many more are available through The Judge Group. Please apply with us today!