Role Overview
We are building a new class of vertically integrated AI infrastructure that unites power, data centre environments, and GPU clusters into a single industrial-scale system. As the VP of Data Centre Operations, you will lead the operational backbone of this platform. This is a high-stakes, hands-on leadership role focused on running live infrastructure from the earliest stages of deployment. You will define the operational model from day one and ensure our high-density environments are stable, performant, and ready for enterprise-grade AI workloads.
Key Responsibilities
- Operational Leadership: Own the end-to-end operations of high-density GPU infrastructure, scaling from the first cluster to multiple global regions.
- Strategic Partnerships: Manage relationships with colocation providers to optimise live environments and ensure strict adherence to SLAs.
- Incident Management: Personally lead the response to critical incidents, ensuring rapid resolution under pressure in a 24/7 production environment.
- Process Engineering: Build and implement the operational standards, processes, and tooling required to transition from the build phase to live operations.
- Team Building: Recruit and lead a high-performance operations organisation capable of supporting hyperscale and AI-native customers.
Required Skills and Qualifications
- Extensive experience operating hyperscale or large-scale data centre environments with direct responsibility for uptime.
- Proven track record in incident management and running mission-critical production environments under intense pressure.
- Deep technical understanding of power, cooling, and the physical infrastructure required for high-density compute.
- Experience building operations teams from the ground up in early-stage or rapidly scaling environments.
Nice-to-Have Qualifications
- Direct experience with GPU clusters or AI-specific infrastructure.
- Previous leadership experience at a major cloud service provider (CSP) or high-performance computing (HPC) firm.