DevOps Engineer

Company:
Diamondpick
Location:
Clinton Township, OH, 43224
Posted:
May 09, 2026

Description:

Max Vendor Rate: $84

Location: Remote

Job Description:

We are seeking a highly skilled and experienced Principal Software Engineer focused on Agentic AI and DevOps. The ideal candidate will architect and deliver agentic microservices and platform capabilities, lead cloud-native DevOps at scale, and partner with organizational leaders to communicate strategy, status, and results. Deep hands-on expertise with Azure, Kubernetes, CI/CD, infrastructure as code, and LLM/agent frameworks (LangChain/LangSmith/OpenAI/LiteLLM) is essential. Experience with dataflow orchestration (Apache NiFi), enterprise integrations (ServiceNow/Snowflake/Power BI/SharePoint), and production-grade observability is highly desirable.

What You'll Do:

• Architect, build, and operate agentic AI services and microservices leveraging LangChain, LangSmith, OpenAI/Azure OpenAI, and LiteLLM; implement tool-use orchestration, evaluation, and guardrails.

• Design, build, and maintain CI/CD pipelines using Azure DevOps (ADO) YAML and GitHub Actions; enforce trunk-based workflows, quality gates, progressive delivery, and automated rollbacks.

• Stand up and manage Azure infrastructure (AKS, Service Bus, Event Hubs, Storage Accounts, Key Vault, Bastion); codify environments with Terraform; implement secure networking, secrets, and RBAC.

• Containerize and ship services with Docker/Buildah; operate Kubernetes with CNI networking and Linkerd service mesh; implement canary/blue-green strategies and autoscaling.

• Create and operate Apache NiFi dataflows; deploy and manage NiFi clusters on AKS with VM Scale Sets, enabling resilient, scalable ingestion and orchestration.

• Implement enterprise-grade observability and logging: ELK/EFK (Elasticsearch, Fluentd/Fluent Bit, Kibana), Prometheus metrics, Azure Dashboards, and KQL-based alerting.

• Engineer data and analytics integrations: Azure Databricks, PostgreSQL, Snowflake; operationalize Power BI, SharePoint, and Jupyter-based workflows.

• Build robust platform and app integrations: ServiceNow APIs, REST APIs, SMTP/IMAP/POP email automations; configure and manage NGINX/HAProxy load balancers.

• Lead incident response, root-cause analysis, and postmortems; continuously improve reliability, performance, security, and cost.

• Mentor teams, drive architectural runway, and communicate plans, trade-offs, and outcomes to stakeholders and leadership.

Key Qualifications / Experience Required:

DevOps Experience

• Expert-level hands-on DevOps across Azure and Kubernetes: CI/CD, Git workflows, infrastructure as code, automated testing, monitoring, and secure deployment.

• Proficiency with Azure DevOps (ADO) YAML pipelines and GitHub Actions; experience optimizing pipelines for cloud-native systems.

• Strong Kubernetes operations including CNI networking and service mesh (Linkerd); container build and supply chain (Docker, Buildah).

• Observability at scale using ELK/EFK, Prometheus, Fluentd/Fluent Bit, Azure Monitor dashboards and alerting (KQL).

Automation Skills

• Deep automation with PowerShell, Bash, and Python to eliminate toil across build, release, environment, and operational workflows.

• Infrastructure as Code expertise with Terraform (Azure resources: AKS, Service Bus, Event Hubs, Storage, Key Vault, Bastion).

• Proven track record reducing manual intervention, increasing repeatability, and improving MTTR through automation.

Agentic AI Experience

• Practical, production experience delivering agentic AI solutions (task orchestration, tool-use, planning, retrieval, and evaluation).

• Hands-on with LangChain, LangSmith (tracing/eval), OpenAI/Azure OpenAI, and LiteLLM integration; familiarity with prompt engineering, safety/guardrails, and LLM observability (e.g., Arize).

• Experience operationalizing AI services within DevOps pipelines and platform governance.

Technical Proficiency

• Apache NiFi expertise: authoring and governing dataflows; deploying and scaling NiFi clusters on AKS with VM Scale Sets.

• Azure services: AKS, Service Bus, Event Hubs (setup and integration), Storage Accounts (setup and integration), Key Vault, Bastion, Azure Dashboards & Kusto Query Language (KQL).

• Data/analytics: Azure Databricks, PostgreSQL, Snowflake; Power BI and SharePoint integrations; Jupyter Notebook workflows.

• Networking fundamentals: DHCP/DNS; load balancer configuration and operations (NGINX, HAProxy); Kubernetes ingress best practices.

• Messaging and email protocols: SMTP, IMAP/POP.

• Microservices and app frameworks: Python and Node.js microservices (REST APIs), Electron build and packaging.

Required Technical Skills

• Windows PowerShell; Linux/Unix administration; Bash and Python.

• Azure Cloud (architecture, security, cost, RBAC); Azure DevOps (ADO) with YAML; GitHub Actions.

• Docker and Buildah; Kubernetes (CNI), Linkerd; ELK/EFK, Prometheus, Fluentd/Fluent Bit.

• Apache NiFi flow development and clustered operations on Kubernetes (AKS) with VM Scale Sets.

• Azure Databricks; PostgreSQL; Snowflake; REST APIs; ServiceNow APIs; Power BI; SharePoint.

• Azure Service Bus, Azure Event Hubs, Storage Accounts, Key Vault, Bastion.

• Jira; Jupyter Notebook; Azure Dashboards and KQL; SMTP/IMAP/POP.

• Python and Node.js microservice architecture; Electron build.

Project Management Skills

• Plan, schedule, and coordinate multi-team deliveries and releases; manage dependencies, risks, and change.

• Drive execution across platform, app, data, and AI workstreams with clear milestones and success criteria.

• Establish SLOs/SLAs and error budgets; align roadmaps to business priorities.

Communication and Interpersonal Skills

• Communicate architectural decisions, roadmaps, and trade-offs to technical and executive audiences.

• Lead cross-functional ceremonies; produce clear runbooks, architecture docs, and dashboards.

• Foster collaboration across engineering, product, security, and operations.

Analytical and Problem-Solving Abilities

• Rapid diagnosis and resolution of complex production issues; strong RCA and remediation planning.

• Attention to detail in reliability, security, performance, and cost optimization.

Adaptability and Continuous Learning

• Track and adopt evolving best practices in cloud, containers, DevOps, and agentic AI.

• Champion continuous improvement in engineering excellence and platform governance.

Experience and Education

• Typically requires 10 to 15 or more years of experience in software engineering, DevOps/SRE, or platform engineering, with principal-level impact.

• Bachelor's degree in Computer Science, Information Technology, or related field preferred (or equivalent experience).

Secondary Skills and Experience (Desired):

Design and Development

• Define and design subsystems and interfaces; allocate responsibilities across services and platforms.

• Translate non-functional requirements (security, reliability, scalability) into concrete designs.

Technical Enablement

• Provide technical enablement for components and subsystems; drive critical design decisions and reviews.

• Establish patterns and reusable templates for CI/CD, IaC, and agentic service scaffolding.

Continuous Delivery Pipeline

• Plan, define, and implement the continuous delivery pipeline with quality gates, progressive delivery, and rollback strategies.

Architectural Runway

• Develop the architectural runway to support new features and capabilities; align with Solution and Enterprise Architects and portfolio stakeholders.

Integration

• Architect and implement integrations with external components, systems, and platforms (ServiceNow, Snowflake, Power BI, SharePoint, email systems, and enterprise identity/secrets).

Top Skills:

• Windows PowerShell; Linux/Unix administration; Bash and Python

• Azure Cloud (architecture, security, cost, RBAC); Azure DevOps (ADO) with YAML; GitHub Actions

• Docker and Buildah; Kubernetes (CNI), Linkerd; ELK/EFK, Prometheus, Fluentd/Fluent Bit
