Requirement: SRE
Experience: 6+ years
Location: Pan India
Key Responsibilities
Review monitoring and alerts, and recommend enhancements towards 360-degree coverage.
Create dashboards, set up synthetic and real user monitoring, visualize large data sets with interactive custom dashboards, and set up alerts, reports, and self-remediation actions, leveraging the AIOps capabilities of APM tools.
Identify manual tasks that can be automated and suggest utilities, solutions, and plans, including CI/CD implementation and enforcement of best practices.
Review the reliability/resiliency assessment strategy and its results and observations, and recommend improvements.
Support reporting and tracking of reliability defects in the management platforms.
Key Skills
Experience in non-functional requirements management: gathering, determination, enforcement, assessment, and assurance.
Should have experience handling distributed (preferably multi-cloud) infrastructure.
Should have worked on at least 3 projects involving performance monitoring in the application/infrastructure domain and deployment.
Experience with APM tools and cloud monitoring tools.
Strong working knowledge of Git and code-review systems such as Gerrit, Bitbucket, and GitHub
Deep understanding of SRE concepts such as SLOs, SLIs, and error budgets. Knowledge of change management, Agile, ITIL concepts, SOP creation, and life cycle management is a plus.
Deep understanding of CI/CD technologies and tools, and a good understanding of AIOps.
Good-to-Have Skills
DevOps Tools Skills: Terraform/CloudFormation, Ansible, Chef, Puppet, Jenkins
APM Tools Skills: AppDynamics, Dynatrace, ELK, New Relic, eG Innovations, Splunk, BMC TrueSight
Infra Tools Skills: Micro Focus, SolarWinds
Cloud Monitoring Tools: CloudWatch, Azure Application Insights, Datadog
Scripting Skills: JavaScript, Python, PowerShell, Unix Shell
Fundamental Knowledge: Docker, Kubernetes