EPMA is looking for an AI DevOps & Infrastructure Engineer for an in-house project.
Title: AI DevOps & Infrastructure Engineer
Location: 100% Remote
Responsibilities:
Build and maintain hybrid AI environments (AWS + on-prem DGX)
Automate infrastructure provisioning with Terraform, Helm, or CloudFormation
Manage Kubernetes clusters, namespaces, and workload isolation
Implement CI/CD pipelines (GitHub Actions, GitLab CI/CD, Argo, etc.)
Monitor system performance with Prometheus, Grafana, ELK
Secure systems with RBAC, IAM, TLS, and Vault
Support deployment of LLMs, RAG agents, and model pipelines
Must-Have Skills:
3+ years in DevOps, preferably supporting ML or AI workloads
Strong experience with Docker, Kubernetes, Terraform
Hands-on with AWS (EC2, S3, IAM, KMS, VPCs)
Experience supporting ML pipelines with Airflow, MLflow, or Kubeflow
Proficiency in Linux server administration
Understanding of networking, DNS, and VPNs
Ability to work collaboratively with AI/ML engineers
Nice-to-Haves:
Familiarity with LLM deployment tools (vLLM, DeepSpeed, TGI, etc.)
Experience with vector databases (Pinecone, Weaviate, FAISS)
Security certifications or experience with GDPR, HIPAA, or SOC 2 compliance
Past experience supporting ethical AI projects or autonomous systems
Why Join EPMA?
As a people-first company with a sharp focus on innovation and AI, EPMA empowers team members to grow professionally while making a real impact. Join a team where your organizational superpowers are valued and where your role plays a key part in driving operational excellence.