Skills: AWS native services such as CloudWatch, CloudTrail, AWS Config, and X-Ray
AWS Engineer
Monitoring and Alerting
Design and implement monitoring solutions using AWS native services such as CloudWatch, CloudTrail, AWS Config, and X-Ray.
Set up custom metrics, dashboards, and automated alerts to ensure system health and performance.
Continuously refine monitoring strategies to align with evolving infrastructure and application needs.
Incident Response and Troubleshooting
Lead incident response efforts for AWS infrastructure and services.
Perform root cause analysis and implement corrective actions to prevent recurrence.
Collaborate with development and operations teams to resolve issues quickly and effectively.
Maintain incident logs and post-mortem reports for transparency and learning.
Documentation and SOPs
Create and maintain runbooks, standard operating procedures (SOPs), and knowledge base articles for operational tasks and incident handling.
Ensure documentation is up to date, accessible, and aligned with compliance and audit requirements.
Contribute to onboarding materials and training guides for new team members.