Open Compute Project Foundation expands its open systems for AI initiative

October 16, 2024 | 00:30


NVIDIA and Meta Contribute Modular Server and Rack Technologies Establishing New Multi-Vendor AI Cluster Supply Chain

SAN JOSE, Calif., Oct. 16, 2024 /PRNewswire/ -- Today, the Open Compute Project Foundation (OCP), the nonprofit organization bringing hyperscale innovations to all, announced an expansion of its Open Systems for AI Strategic Initiative, with approved contributions from NVIDIA, including the NVIDIA MGX-based GB200-NVL72 Platform, and in-progress contributions from Meta. The OCP launched this community effort in January 2024, with leadership provided by Intel, Microsoft, Google, Meta, NVIDIA, AMD, ARM, Ampere, Samsung, Seagate, SuperMicro, Dell and Broadcom. The objective of the OCP Community's Open Systems for AI initiative is to establish commonalities and develop open standards for AI clusters and the data center facilities that host them, advancing efficiency and sustainability and enabling the development of a multi-vendor supply chain that accelerates market adoption.

NVIDIA has contributed MGX-based GB200-NVL72 rack designs and compute and switch tray designs, while Meta is introducing its Catalina AI Rack architecture for AI clusters. The contributions by NVIDIA and Meta, along with efforts by the OCP Community, including other hyperscale operators, IT vendors and physical data center infrastructure vendors, will form the basis for developing specifications and blueprints for tackling the shared challenges of deploying AI clusters at scale. These challenges include new levels of power density, silicon for specialized computation, advanced liquid-cooling technologies, higher-bandwidth and lower-latency interconnects, and higher-performance, higher-capacity memory and storage.

"We strongly welcome the efforts of the entire OCP Community and the Meta and NVIDIA contributions at a time when AI is becoming the dominant use case driving the next wave of data center build-outs. It expands the OCP Community's collaboration to deliver large-scale high-performance computing clusters tuned for AI. The OCP, with its Open Systems for AI Strategic Initiative, will impact the entire market with a multi-vendor open AI cluster supply chain that has been vetted by hyperscale deployments and optimized by the OCP Community. This significantly reduces the risk and costs for other market segments to follow, removes the silos, and is very much aligned with OCP's mission to build collaborative communities that will streamline deployment of new hardware and reduce time-to-market for adoption at scale," said George Tchaparian, CEO at the Open Compute Project Foundation.

NVIDIA's contribution to the OCP Community builds upon existing OCP ORv3 specifications to support the ecosystem in deploying high compute density and efficient liquid cooling in the data center. NVIDIA's contributions include: (1) its reinforced rack architecture, which provides 19" EIA support with expanded front cable volume, a high-capacity 1400A bus bar, an NVIDIA NVLink cable cartridge, liquid-cooling blind mate multi-node interconnect volumetrics and mounting, and blind mate manifolds, and (2) its 1RU liquid-cooled MGX compute and switch trays, including a modular front IO bay design, compute board form factors with space for a 1RU OCP DC-SCM, liquid-cooling multi-node connector volumetrics, blind mate UQD (universal quick disconnect) float mechanisms, and narrower bus bar connectors for the switch trays.

"NVIDIA's contributions to OCP helps ensure high compute density racks and compute trays from multiple vendors are interoperable in power, cooling and mechanical interfaces, without requiring a proprietary cooling rack and tray infrastructure — and that empowers the open hardware ecosystem to accelerate innovation," said Robert Ober, Chief Platform Architect at NVIDIA.

Meta's in-progress contribution includes the Catalina AI Rack architecture, which is specifically configured to deliver a high-density AI system that supports GB200.

"As a founding member of the OCP Foundation, we are proud to have played a key role in launching the Open Systems for AI Strategic Initiative, and we remain committed to ensuring OCP projects bring forward the innovations needed to build a more inclusive and sustainable AI ecosystem," said Yee Jiun Song, VP Engineering, Meta.

The OCP Community has been actively engaged in building open large-scale high-performance computing platforms, and the AI use case is a natural extension of the Community's activities. As the AI-driven buildout progresses, trends indicate that AI-accelerated systems will also be deployed at the edge and on-premises in enterprise data centers to ensure low latency, data recency and data sovereignty. The OCP Community solves at-scale problems discovered by hyperscale data center operators and brings the resulting innovations to all AI data centers, such as modular servers, precise timekeeping, security, liquid cooling and specialized chiplet-based System in Package (SiP) designs.

"The market for advanced computing equipment to deliver on the promise of AI is moving rapidly, with vendors and hyperscale operators each developing their own custom solutions. While this rapid pace is essential to create a differentiated offering and maintain competitive advantage, it risks fracturing the supply chain into silos, raising costs, and reducing efficiencies. In addition, the potential environmental impact of such a large data center buildout must be attenuated. The timing is right, for collaborative innovations to drive efficiencies using less power, water and lower carbon footprint to impact the next generation of AI clusters that will be deployed by the hyperscale data center operators and also cascade to enterprise deployments. The contributions by NVIDIA and Meta and their continued engagement in the OCP community has the potential to benefit many market segments," said Ashish Nadkarni, Group Vice President and General Manager, Worldwide Infrastructure at IDC.

By PR Newswire

Open Compute Project Foundation
