Head of Resilience and Recovery

This job is brought to you by Jobs/Redefined, the UK's leading over-50s age inclusive jobs board.

Job description

The role reports to the Head of Platform Engineering & Operations within Enterprise-Wide Technology (EWT), the UK Firm's internal technology division. In total, EWT comprises c. 1,100 FTEs across both retained and outsourced teams, and across both onshore and offshore locations; budget for the fiscal year is c. £160m. EWT is accountable for delivering a range of services to the UK Firm, including gathering requirements, solution design, build and run, and the execution of complex change portfolios focused on security, data, core infrastructure and business applications.

We are looking to recruit a senior individual to improve our technology resilience and recovery capabilities across the UK Firm, including those supporting Cloud and on-prem infrastructure, applications and data. You will build and lead our Resilience & Recovery team, creating risk-based resilience and recovery strategies, plans and roadmaps for our technology; create and maintain failure scenarios to guide invocation and recovery across multiple teams; and periodically test recovery plans, ensuring that recovery targets, including Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs), can be met. This is a vital role that will help KPMG to identify, mitigate and manage significant risks in our technology estate.

Specific Responsibilities
Create risk-based resilience and recovery strategies, plans and roadmaps for all technology supporting KPMG business services, incl. Cloud and on-prem infrastructure, applications and data; create supporting policies, standards, processes and controls
Create and maintain failure scenarios to guide invocation and recovery across multiple teams, including Service and Security Operations, Cloud Engineering, other KPMG Member Firms and Global IT Services, and third party service providers as required
Develop end-to-end playbooks and periodically test recovery plans, ensuring that recovery targets, including Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs), can be met; identify lessons learned from simulated and real examples of business disruption, ensuring continual improvement
Ensure that business services are mapped to all relevant components of the tech stack and that an operating model exists for the maintenance of this data; proactively target technology weaknesses to improve resilience and recovery, e.g. by identifying and remediating end-of-life components, reducing point-to-point integrations, increasing redundancy and high availability, and removing key person dependencies
Maintain a risk register covering all types of planned and unplanned downtime arising from technology, e.g. due to mergers and acquisitions, Cloud migration, maintenance and upgrades, cyber attacks incl. ransomware, failed changes, and system failures
Provide training, advice and support to business unit CTOs, the Head of Service Operations and the Head of Cloud Engineering to develop their own resilience and recovery plans where these aren't supported by centrally maintained scenarios, e.g. the recovery of SaaS applications deployed in the business
Create and sponsor projects to enhance the resilience and recovery capability of the technology function; act as a key stakeholder in the Backup and Recovery programme to implement Backup as a Service (BaaS)
Provide input on the evaluation, selection, implementation and maintenance of technology systems and third party services, ensuring appropriate investment in strategic and operational capabilities
Create and introduce KPIs to ensure that platforms are being managed and maintained to underpin viable recovery, e.g. backups are occurring as planned, data is being replicated, and recovery scripts function correctly

The Person
A highly experienced IT professional, working as Head of Disaster Recovery or similar in current or previous roles; you need to be passionate about building a Recovery & Resilience CoE in KPMG and have a proven track record of doing this elsewhere
Highly capable people manager, adept at building and coaching a team of professionals with the right mix of grades, skills, and experience to support delivery and project growth; you will be able to motivate people, build a 'learn it all' rather than a 'know it all' culture, and foster an environment in which everyone's opinion is valued
Rich experience of working closely with other teams across the technology delivery lifecycle
Excellent communication and stakeholder management skills, with the ability to influence C-suite executives and others and to build and maintain strong relationships
Solid problem solving skills, including the ability to analyse complex data, identify core issues, and investigate, evaluate and reach appropriate conclusions
Broad experience of different types of technology change and resilience projects, including software, applications, infrastructure, middleware and end user computing
Good working knowledge of Business Continuity Standards (BS25999/ISO22301) and the Business Continuity Institute's Good Practice Guidelines
Good understanding of Microsoft M365 and Azure recovery capabilities desirable

KPMG United Kingdom
London, UK
Full-Time

Published on 11/04/2024
