Job Description
We are seeking an experienced MLOps Engineer to design, deploy, monitor, and maintain machine learning solutions in production across AWS, Microsoft Azure, and Snowflake environments. In this role, you will collaborate closely with data scientists, platform engineers, and cloud teams to operationalize ML models, automate pipelines, and build reliable, secure, and scalable ML/data platforms.
The ideal candidate brings strong hands-on expertise across the end-to-end ML lifecycle, cloud-native deployment, CI/CD automation, model monitoring, and production-grade data pipelines.
Key Responsibilities:
· Design and implement end-to-end ML pipelines for ingestion, feature engineering, training, validation, deployment, and monitoring.
· Deploy and manage ML models in production across AWS, Azure, and Snowflake ecosystems.
· Build batch and real-time inference pipelines using cloud-native and platform-native services.
· Automate model packaging, testing, releases, and rollback using CI/CD best practices.
· Integrate ML workflows with Amazon SageMaker, AWS Lambda, Azure Machine Learning, Azure Data Factory, and Snowflake.
· Build and maintain orchestration workflows using Airflow, Azure Data Factory, or similar tools.
· Implement experiment tracking, model registries, and model governance processes.
· Monitor model accuracy, drift, latency, throughput, pipeline performance, and infrastructure usage.
· Establish advanced deployment strategies (canary, shadow, blue-green, rollback).
· Collaborate with cross-functional teams to transition models from research to production.
· Ensure security, compliance, traceability, and access control for ML systems and data.
· Optimize platform reliability, performance, and cost across AWS, Azure, and Snowflake.
Qualifications:
· Master’s degree or higher (PhD preferred) in Computer Science, Engineering, or related discipline.
· 5+ years of relevant experience in MLOps, ML engineering, platform engineering, or DevOps.
· Strong hands-on experience with AWS, Microsoft Azure, and Snowflake.
· Proficiency in Python and SQL.
· Proven experience deploying and managing ML models in production environments.
· Experience with Amazon SageMaker and Azure Machine Learning.
· Experience building and integrating data pipelines with Snowflake.
· Strong understanding of CI/CD pipelines, infrastructure automation, and model versioning.
· Experience with containers and orchestration tools such as Docker and Kubernetes.
· Experience with Airflow, Azure Data Factory, or similar workflow orchestrators.
· Familiarity with model monitoring, logging, alerting, and observability tooling.
· Strong understanding of data engineering concepts, distributed systems, and APIs.
· Excellent troubleshooting skills and proven cross-team collaboration abilities.
Preferred Qualifications:
· Experience with Snowflake Cortex AI, Snowpark, or ML workloads within Snowflake.
· Experience with Amazon Bedrock, Azure OpenAI Service, or production deployment of LLM-based systems.
· Experience building real-time inference pipelines, serverless architectures, and event-driven systems.
· Familiarity with feature stores, vector databases, and RAG-based architectures.
· Experience with Infrastructure-as-Code (IaC) tools such as Terraform, AWS CloudFormation, or Azure Bicep.
· Understanding of compliance, governance, and security requirements in regulated industries.
· Experience with A/B testing, shadow deployments, canary releases, and model rollback strategies.