Manager, Operational Resilience Engineering
Job Description
Overview
At Fanatics Betting & Gaming (FBG), a core division of Fanatics' mission to build the ultimate end-to-end digital sports platform, we're shaping the future of sports betting. As part of our team, you'll help create cutting-edge experiences that match the passion of fans worldwide.
We're looking for talented individuals to join us in driving innovation across our Sportsbook and Casino products, helping to strengthen the reliability, efficiency, and resilience of our platforms. In this role, you'll contribute to reducing the frequency and impact of Priority 1 incidents with measurable improvements in Mean Time to Mitigation (MTTM), ensuring smooth and transparent incident processes that keep engineering teams focused while stakeholders remain informed. You'll also help embed a culture of continuous improvement, turning post-incident reviews into actionable follow-ups that prevent repeat issues, while supporting proactive operations through runbooks, automation, and advanced monitoring to reduce day-to-day overhead. By tracking and reporting key operational health metrics, you'll play a key role in building data-driven resilience across the business.
Responsibilities
- Build, lead, and mentor the Operational Resilience Engineering team.
- Own the end-to-end incident management lifecycle, including response, escalation, postmortems, and continuous improvement.
- Define and standardize incident playbooks, escalation practices, and operational metrics (MTTR, error budgets, reliability KPIs).
- Champion a blameless culture that drives learning and operational maturity.
- Partner with engineering, product, and business leaders to align resilience priorities with risk, compliance, and customer impact.
- Deliver executive-level communications during major incidents with clarity and business context.
- Stay hands-on in critical incidents and guide improvements in monitoring, alerting, automation, and observability.
- Leverage and evolve key tools (AWS, Datadog, PagerDuty, Terraform, GitHub) to improve operational resilience.
Required Qualifications
- 8+ years in platform operations, site reliability, incident management, or operational resilience.
- 3+ years in leadership roles, including team management and cross-functional incident response.
- Proven track record leading high-severity incidents and communicating effectively under pressure.
- Strong technical background in cloud platforms (AWS preferred), infrastructure-as-code (Terraform), and CI/CD/containerized workloads.
- Deep familiarity with observability and incident tooling (Datadog, PagerDuty, FireHydrant, or similar).
- Strong written and verbal communication skills, with ability to distill technical detail into executive-ready updates.
- Demonstrated success driving metrics-based improvements in reliability and availability.
- Skilled in stakeholder management and influencing across engineering, product, and business leadership.
Other Qualifications / Nice to have
- Experience in high-availability, high-transaction industries (sports, gaming, entertainment, or similar).
- Familiarity with cloud-native development and modern infrastructure practices.
- Knowledge of ITIL or incident management best practices (certification not required).
- Background with regulatory or compliance-driven environments.
Ready to build the future of sports betting? If you possess some of these skills but not all of them, we still encourage you to apply! Please note that visa sponsorship is not available for this position. We are open to fully remote candidates based in the United Kingdom or Ireland, but we strongly encourage those who can join us on campus two days per week.
Responsibilities
- Design and develop integration specifications using JSON, XML, and EDI (X12 and EDIFACT maps). Strong familiarity with supply chain domain concepts is expected; experience with a retailer or wholesaler in the apparel industry is a definite plus.
- Work closely with the order capture and validation teams as they build out new solutions for B2B customer order capture and enable a pathway for migrating applications from the current, older EDI platform (Informix 4GL) to a new one (Java, ReactJS). During the transition, the role is primarily that of a subject matter expert, providing guidance on business rules.
- Manage daily EDI and integration operations, including but not limited to monitoring order flow between B2B entities, shipment advices, and invoice processing.
- As a leader of integration projects, define project scopes, deliverables, and schedules; allocate resources for these efforts; and ensure that internal and external testing efforts are well coordinated.
- Communication is key for this role. Important activities include establishing business stakeholder meetings and meeting frequently with the product managers responsible for the systems that interface with the order and shipment pipelines.
- Manage team members with engineering and non-engineering backgrounds.
Qualifications
- The candidate will report to the Director of Engineering / VP of Engineering, so an understanding of the finance concepts involved in running a small or large team is essential.
- The candidate will have software engineers and analysts reporting to them, so people management skills are also essential.
- Conflict resolution and an understanding of trade-offs when prioritizing resources are critical traits for the Engineering Manager.
- The candidate will have expert skills in SQL, software engineering principles, Agile and Kanban delivery methods, and documentation using wiki and task management tools.
- The candidate will possess fair skills in at least one programming language, software development estimation, and software architecture and its associated tools.
- Should have a minimum of 5 years' experience in the relevant field as an individual contributor and a minimum of 3 years' experience as a manager.
- Bachelor's degree in computer science engineering or an equivalent discipline preferred. An MBA is a plus.