AWS ML Cloud Engineer

Company: Tek Spikes
Location: Plano, TX
Posted: May 25, 2025

Description:

Overview:

We are seeking an AWS ML Cloud Engineer to design, deploy, and optimize cloud-native machine-learning systems that power our next-generation predictive-automation platform. You will blend deep ML expertise with hands-on AWS engineering, turning data into low-latency, high-impact insights. The ideal candidate has a strong command of statistics, coding, and DevOps, and thrives on shipping secure, cost-efficient solutions at scale.

Objectives of this role:

Design and productionize cloud ML pipelines (SageMaker, Step Functions, EKS) that advance our predictive-automation roadmap

Integrate foundation models via Bedrock and Anthropic LLM APIs to unlock generative-AI capabilities

Optimize and extend existing ML libraries / frameworks for multi-region, multi-tenant workloads

Partner cross-functionally with data scientists, data engineers, architects, and security teams to deliver end-to-end value

Detect and mitigate data-distribution drift to preserve model accuracy in real-world traffic

Stay current on AWS, MLOps, and generative-AI innovations; drive continuous improvement

Responsibilities:

Transform data-science prototypes into secure, highly available AWS services; choose and tune the appropriate algorithms, container images, and instance types

Run automated ML tests/experiments; document metrics, cost, and latency outcomes

Train, retrain, and monitor models with SageMaker Pipelines, Model Registry, and CloudWatch alarms

Build and maintain optimized data pipelines (Glue, Kinesis, Athena, Iceberg) feeding online/offline inference

Collaborate with product managers to refine ML objectives and success criteria; present results to executive stakeholders

Extend or contribute to internal ML libraries, SDKs, and infrastructure-as-code modules (CDK / Terraform)

Skills and qualifications:

Primary technical skills:

AWS SDK, SageMaker, Lambda, Step Functions

Machine-learning theory and practice (supervised / deep learning)

DevOps & CI/CD (Docker, GitHub Actions, Terraform/CDK)

Cloud security (IAM, KMS, VPC, GuardDuty)

Networking fundamentals

Java, Spring Boot, JavaScript/TypeScript & API design (REST, GraphQL)

Linux administration and scripting

Bedrock & Anthropic LLM integration

Secondary / tool skills:

Advanced debugging and profiling

Hybrid-cloud management strategies

Large-scale data migration

Impeccable analytical and problem-solving ability; strong grasp of probability, statistics, and algorithms

Familiarity with modern ML frameworks (PyTorch, TensorFlow, Keras)

Solid understanding of data structures, modeling, and software architecture

Excellent time-management, organizational, and documentation skills

Growth mindset and passion for continuous learning

Preferred qualifications:

10+ years of software-development experience

3+ years in an ML-engineering or cloud-ML role (AWS focus)

Proficient in Python (core), with working knowledge of Java or R

Outstanding communication and collaboration skills; able to explain complex topics to non-technical peers

Proven record of shipping production ML systems or contributing to OSS ML projects

Bachelor’s (or higher) in Computer Science, Data Engineering, Mathematics, or a related field

AWS Certified Machine Learning – Specialty and/or AWS Certified Solutions Architect – Associate certification is a strong plus

Full-time

Hybrid remote
