AWS Cloud DevOps / Site Reliability Engineer (SRE)

Company: The Judge Group
Location: Clinton Township, OH, 43224
Posted: January 26, 2026

Description:

Job Title: AWS Cloud DevOps / Site Reliability Engineer (SRE)

Location: Columbus, OH (4 days onsite, 1 day remote)

Type: 3-month contract-to-hire

W2 contract only

Job Summary:

We are seeking a skilled and proactive AWS Cloud DevOps / Site Reliability Engineer (SRE) to join our team. This role combines software engineering and cloud operations expertise to build and maintain scalable, secure, and reliable cloud infrastructure using AWS services. The ideal candidate will have a strong background in DevOps practices, cloud infrastructure automation, CI/CD pipelines, monitoring, and incident response.

Key Responsibilities:

• Establish and maintain efficient and reliable Azure DevOps CI/CD pipelines to facilitate seamless integration between environments.

• Manage source code repositories with version control tools, following established branching strategies and release management practices.

• Implement and manage infrastructure using Terraform and Terragrunt as Infrastructure as Code (IaC).

• Integrate IaC workflows into CI/CD pipelines for seamless, automated deployments.

• Manage scalable, secure, and highly available AWS infrastructure using services such as Lambda, CloudWatch, EC2, S3, RDS, DynamoDB, API Gateway, and VPC.

• Maintain reusable Terragrunt and Terraform modules/templates for consistent infrastructure patterns.

• Build and maintain monitoring and alerting systems to ensure high availability and resilience of applications and infrastructure through automation and auto-healing mechanisms.

• Identify performance bottlenecks, optimize system resources, and implement scaling strategies to support growing demands.

• Troubleshoot infrastructure and application issues, perform root cause analysis, and drive incident response.

• Implement security best practices, conduct vulnerability assessments, manage IAM roles and policies, and address security incidents promptly.

• Continuously evaluate existing systems, tools, and processes to identify areas for improvement.

• Recommend and implement enhancements to optimize efficiency, reliability, and scalability.

• Create and maintain documentation related to infrastructure, processes, and best practices.

• Collaborate with developers to support deployment, performance, and reliability of services.

• Conduct incident response and root cause analysis for infrastructure issues.

• Optimize cloud infrastructure for performance and cost.

• Create documentation and runbooks for operational procedures and troubleshooting.

• Participate in the on-call rotation for SRE support.

Required Qualifications:

• Hands-on experience in a DevOps or SRE role.

• Hands-on expertise with CloudWatch, Splunk, and Dynatrace.

• Strong experience with AWS services (e.g., S3, RDS, Lambda, API Gateway, VPC, IAM, EventBridge, serverless functions, etc.).

• Experience with Infrastructure as Code (Terraform preferred).

• Proficiency with scripting languages (e.g., Python with Boto3, TypeScript).

• Strong knowledge of CI/CD tools and processes.

• Experience with observability and monitoring tools.

• Understanding of networking, security, and cloud cost management.

Soft Skills:

• Excellent communication and collaboration skills.

• Ability to work independently and in a team-oriented environment.

• Strong problem-solving and debugging skills.

• Passion for automation and improving system reliability.
