Associates in Information Technology -
Computer Network Systems
ITT Technical Institute
Strong, in-depth understanding of
Incident, Problem, Change
Management Processes
Experience with environment
operating system monitoring tools,
Splunk, Dynatrace, Datadog,
SolarWinds, Grafana
Expert with Helpdesk software
ServiceNow, Footprints, Jira, Kanban,
Asana software
Excellent problem resolution
judgment and decision making
Team player, ability to lead teams
globally
Experience working with all levels of
management
Exceptional verbal and written
communication skills
Ability to work effectively in a fast-
paced, dynamic environment.
Experience working with IT systems
and applications
Ability to analyze a high volume of
technical data and work in a fast-
paced environment
Advanced experience in Incident,
Major Incident & Problem process
Ability to adapt, learn new
technologies, develop new processes,
and improve procedures
Ability to work with cross-functional
teams, and manage and
communicate with teams in different
locations
*********@*****.***
LinkedIn Profile
12592 Wildcat Cove Cir, Estero, FL 33928
Detail-oriented team player with exceptional organizational skills, a strong customer focus, and a passion for improving service resilience and customer satisfaction. Able to handle multiple projects and incidents simultaneously with a high degree of accuracy. Over 15 years of experience supporting and analyzing business solutions for Fortune 100+ companies. Observation, inspiration, and determination are my foundation for success. Introducing fresh perspectives and new techniques allows businesses to evolve and grow.
July 2021 - November 2022
Applied development best practices in daily tasks for efficiency and accuracy
Responded to technical concerns quickly and effectively devised solutions
Developed and implemented process flow improvement and standardization projects
Support and monitor Critical / Major applications and services
Focus on identifying business impact to critical incidents
Provide tactical solutions and support the strategic vision
Continuous Improvement to Incident/Problem Processes
Host Post Incident Reviews for follow-up activities leading to Problem Management
Monitor progress and perform escalations as needed
November 2022 - February 2024
Oversee the incident management process and the team members involved in resolving each incident
Provide operational metrics for all incidents and problems
North America Incident Leader
Handled high-priority incidents with exceptional poise and composure, making quick decisions in an effort to reduce overall impact
Gathered data from resolved incidents for careful review and analysis to prevent future events
Assessed incident priority based upon impact to business and escalated issues as necessary
Contributor to reduction of incidents and downtime by driving process/continuous improvement efforts
Creation of runbooks/playbooks (Knowledge Articles)
Support/monitor/lead and scribe Severity 0-2 Incidents
Monitor staging and non-production environments
Hosting After Action Review (AAR) meetings to review Post Mortem docs
Partner with support teams to implement Corrective Actions for Incidents identified by users
Provide 24x7 on-call support
Partner with Site Reliability Engineers to ensure stability of infrastructure
Partner with Observability teams post incident to determine where additional monitoring can be implemented
S K I L L S
C O N T A C T
E D U C A T I O N
P R O F I L E
W O R K   E X P E R I E N C E
Senior Critical Incident Manager Lead
Airbnb
Principal Engineer Service Management
CarMax
C E N K A K O V A
S R   C R I T I C A L   I N C I D E N T
M A N A G E R   L E A D
Support and monitor Tier 1 (Critical / Major applications)
Lead Problem Management Root Cause investigations for all Critical and Major P1/P2 Incidents
Work with various teams to improve monitoring and critical incident alert notifications
Work with application owners and monitoring teams to ensure all critical functions are monitored to prevent critical or major outages (Preventative / Proactive)
Built key metrics to support decision making and runbook development, and continued to look for improvements
Publish Root Cause analysis reports to technology and business leadership
Continue to build knowledge and thorough understanding of Service Management/ITIL framework
Continue to improve Incident and Root Cause analysis processes for critical and major incidents
Responsible for communicating critical and major incidents to internal leadership/IT teams based on scale/scope and impact
Lead and document critical and major outages until full service is restored
Engage/escalate to application/infrastructure SMEs to assess and determine the best course of action to restore full functionality
Engage various support personnel and drive critical and major incidents until service is restored
Established team priorities, maintained schedules and monitored performance
Work with application owners and monitoring teams to ensure all critical functions are monitored to prevent critical or major outages (Preventative / Proactive)
Exposure to Change Management - Host CAB meetings
Deliver concise communications based on business impact
Manage and coordinate urgent and complicated support issues
Perform deep dives on manually reported incidents to create strategic directives
March 2020 - July 2021
May 2019 - March 2020
Analyzed and assessed data on incidents and applied analytical methods to support management decision making
Available 24x7 to lead Major Incident bridge calls, providing strong leadership to restore service as quickly as possible and sending executive communications based on incident scope and impact
Enhanced and developed new Incident and Problem management processes
Work with Disaster Recovery team to ensure each critical application is well documented
Created dashboards to detect and monitor critical applications to avoid major incidents
Responsible for the Network Level 1 team, ensuring alerts are acknowledged and escalated within established SLT
Partnered with various technology teams to create and maintain “runbooks” for application/system outages
Built relationships with field support technicians to ensure that issues at each affected site are fully rectified within SLT
Provide monthly metrics and KPIs based on application availability and overall uptime and downtime across the enterprise
Continuous improvement (Best Practices) - Look for monitoring opportunities (preventative measures) from application and infrastructure perspectives
Incident/Problem Manager Lead
Arthrex
Enterprise Incident and Problem Manager Lead
AutoNation
Command Center Sr Manager, Lead
Allstate
January 2010 - April 2019
Provide overall guidance, education, and training to all Incident Managers in two locations so that established processes and SOPs are followed and customer and leadership expectations are exceeded
Manage and host Major Incidents (critical to business), driving fast incident resolution
Ability to drive restoration of services in a timely manner to reduce MTTR and MTTA
Experience with identifying unexpected behaviors and ensuring restoration of service and root cause analysis
Responsible for decisions made related to escalation and prioritization and actions taken
Ability to drive root cause investigations, document actions taken, and create and update current SOPs
Communicate and coordinate status of Major Incidents
Document lifecycle of Incident and Problem related incidents
Correlate and work with various support personnel (ITSM, Application, SME) to improve and/or correct processes, workarounds, and continuous improvements
Create a variety of reports presenting daily/weekly/monthly stats to leadership (trending analysis)
Assess and provide areas of improvement based on data assessed
Demonstrate organizational success factors by being point person for daily issues, updates, and communications for two centers (located in Belfast, Ireland; and Northbrook, IL)
Focus on strategy and continuous improvement
Providing input to problem management teams during root cause investigations
Coordinating responses between technical teams during service disruption to address and solve service failures as quickly and effectively as possible
Risk-assess major impacting incidents and take whatever actions are necessary to restore service to the business and customers
Act as liaison between incident and leadership and provide status updates when called upon
Interact and work closely with leadership, application support and technical support groups to resolve incidents
Responsible for Incident Detection, Facilitation, Communication, Change Coordination, and Problem Coordination for all applications within Command Center
Perform end-to-end transaction monitoring for all Tiers 0-9 priority one applications to identify trends/problems, make on-demand decisions, create processes, and maintain health/availability
Detect/maintain IT infrastructure/application incidents by monitoring both application and infrastructure for errors, latency, or any other issues with various monitoring tools
Maintain Target MTTR (Mean Time to Repair)
Perform best practices by using ITIL methodology framework
Support comprehensive DR testing for consumer, agency and claims facing applications
Consistently and effectively implement innovative solutions and streamline processes that impact customers and internal partners to increase speed, agility, and affordability
Track key standards and metrics
Assist in creation and communication of monthly metrics on Command Center services to leadership
Measured team performance and reported metrics to leadership team members