Ref ESCAMAN1219 CDI Paris Roubaix France
This new unit is in charge of:
Incident initial assessment, declaration and logging
Major incident priority continuous assessment
Coordination between OVH stakeholders (incl. Comex and relevant teams)
Internal communication, with team members being the legitimate single source of truth
Major incident closure and post-mortem follow-up
Support Service Operations governance
Data analytics also falls within its scope.
The Escalation Manager is energized by the challenge of working through difficult situations and finding positive outcomes.
The activity runs 24 hours a day, 7 days a week, based on an on-call process.
Composed of three people, the team can defuse the situation, isolate the core issue and address the customer's concerns.
All members of this team must:
Enjoy gathering information and understanding the facts of a situation
Have exceptional critical thinking skills and enjoy solving problems
Have the ability to understand complex situations and turn them into clear, executable improvement plans while keeping the global picture in mind
Dive deep into OVH monitoring data, processing and enriching it to produce quality-of-service reports and dashboards
Major incident management
Assume command and control of Major Incidents, the escalation of low-impact incidents to Major Incidents, and the triggering of Crisis.
Identify and support the Resolution Manager.
Ensure all Major Incident remediation efforts focus on tactical restoration of service within SLA guidelines.
Monitor ongoing outages to ensure compliance with SLAs and performance goals.
Analyze business, security and financial impact.
Facilitate conference calls, directing highly technical resources to maintain focus on restoration of service.
Accountable for all communication regarding Major Incidents, including target audience, content, context, etc. Ensure communication is consistent, clear, concise and effective.
Responsible for the ongoing development and maturity of the communications process and content.
Optimize response capabilities by providing input to the development of tools that accelerate Major Incident response throughout the organization.
Engage with customer executives and account teams when necessary.
Engage with Crisis team if required.
Gather data, produce standard or ad hoc reports, and present them to the management team, account teams and customers as required.
Ensure key performance indicators are measured for Major Incidents and report trends of process maturity levels and tool adoption.
Drive action plans to improve QoS using data analytics.
Define and measure the effectiveness of the Problem Management processes and identify opportunities for improvement.
Produce statistics and reports to demonstrate performance of the Problem Management process.
Analyze and report incident trend data to identify and eliminate root causes.
Identify and promote proactive means to prevent incidents from occurring or to restore service more quickly.
Lead and facilitate post-mortem investigations into high-impact incidents, taking ownership of root cause determination and risk mitigation and ensuring a permanent resolution is put in place.
Your assets
Stay calm and structured under pressure
Ability to adjust quickly to shifting priorities and make quick decisions with limited information.
Self-driven, proactive and organized
Business and Data Analytics driven
Knowledge of and experience with infrastructure and technologies (e.g. DCs, networking, servers) sufficient for driving troubleshooting and remediation.