Site Reliability Engineer

Company: Brooksource
Location: Miramar, FL
Posted: May 09, 2024

Description:

Site Reliability Engineer (SRE)

Contract to Hire

Remote (Eastern Time Zone)

Our Fortune 15 health care client is seeking a Site Reliability Engineer (SRE) to support its full transition to the cloud. You will play a critical role in ensuring the reliability, scalability, and performance of the client's systems and applications. You will work closely with cross-functional teams to design, implement, and maintain robust infrastructure and automation solutions. Your expertise in release management and automation will be instrumental in streamlining the client's software delivery processes and improving overall operational efficiency.

Responsibilities:

Design, implement, and maintain highly available and scalable infrastructure solutions to support the client's applications and services on the Azure cloud platform.

Collaborate with software engineering teams to define and implement reliable deployment pipelines and release processes using GitHub and Azure Pipelines for CI/CD.

Develop automation scripts and tools using PowerShell and other languages to automate repetitive tasks and streamline operational workflows.

Implement monitoring, alerting, and logging solutions to proactively identify and mitigate potential issues. Help the team stabilize monitoring and improve observability.

Lead disaster recovery planning and testing efforts to ensure business continuity and minimize downtime in case of system failures or disasters.

Perform capacity planning and resource optimization to ensure optimal performance and cost-effectiveness of the client's infrastructure.

Participate in incident response and resolution, including root cause analysis and post-incident reviews.

Stay up to date with industry best practices, emerging technologies, and trends in site reliability engineering and DevOps.

Qualifications:

Bachelor's degree in Computer Science, Engineering, or a related field.

3+ years of experience in site reliability engineering, DevOps, or a similar role.

Proficiency in scripting and programming languages such as PowerShell, Python, Bash, or Go.

Hands-on experience with Azure cloud services and technologies.

Experience with GitHub and Azure Pipelines for CI/CD.

Strong understanding of containerization technologies and orchestration frameworks like Kubernetes.

Experience with infrastructure-as-code and configuration management tools such as Terraform, Ansible, or Puppet.

Familiarity with release automation tools like release-please.

Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems.

Strong communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.
