Tech giants form AI group to counter Nvidia with new interconnect standard

May 30, 2024

On Thursday, several major tech companies, including Google, Intel, Microsoft, Meta, AMD, Hewlett Packard Enterprise, Cisco, and Broadcom, announced the formation of the Ultra Accelerator Link (UALink) Promoter Group to develop a new interconnect standard for AI accelerator chips in data centers. The group aims to create an alternative to Nvidia’s proprietary NVLink interconnect technology, which links together multiple servers that power today’s AI applications like ChatGPT.

The beating heart of AI these days lies in GPUs, which can perform massive numbers of matrix multiplications (the core operation of neural networks) in parallel. But one GPU often isn't enough for complex AI systems. NVLink can connect multiple AI accelerator chips within a server or across multiple servers. These interconnects enable faster data transfer and communication between the accelerators, allowing them to work together more efficiently on complex tasks like training large AI models.
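To get a rough sense of why interconnect speed matters, here is a minimal sketch in plain Python, using NumPy as a stand-in for GPU math. The shapes, the shard count, and the "all-gather" step are illustrative assumptions, not NVLink's or UALink's actual API: when a layer's weight matrix is too large for one accelerator, it can be split across devices, and each device's partial result must travel over the interconnect to be reassembled.

```python
# Illustrative sketch of tensor parallelism: a neural network layer is a
# matrix multiplication, and when the weight matrix is split column-wise
# across accelerators, the partial outputs must be exchanged over the
# interconnect (NVLink today; UALink in the proposed standard).
# NumPy stands in for GPU computation here; all names are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

batch, d_in, d_out = 8, 1024, 4096
x = rng.standard_normal((batch, d_in))   # layer input (activations)
w = rng.standard_normal((d_in, d_out))   # full weight matrix

num_devices = 4
# Split the weights column-wise, one shard per simulated accelerator.
shards = np.split(w, num_devices, axis=1)

# Each "device" multiplies the same input by its own weight shard.
partial_outputs = [x @ shard for shard in shards]

# The all-gather step: partial results cross the interconnect and are
# concatenated back into the full layer output. The faster this link,
# the less time accelerators spend waiting on each other.
y_parallel = np.concatenate(partial_outputs, axis=1)

# Sanity check: the sharded result matches a single-device computation.
assert np.allclose(y_parallel, x @ w)
print("sharded matmul matches:", y_parallel.shape)
```

In real training systems, this exchange happens at every layer of every forward and backward pass, which is why the link between accelerators, not just the accelerators themselves, becomes a performance bottleneck.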

This linkage is a key part of any modern AI data center system, and whoever controls the link standard can effectively dictate which hardware the tech companies will use. Along those lines, the UALink group seeks to establish an open standard that allows multiple companies to contribute and develop AI hardware advancements instead of being locked into Nvidia’s proprietary ecosystem. This approach is similar to other open standards, such as Compute Express Link (CXL)—created by Intel in 2019—which provides high-speed, high-capacity connections between CPUs and devices or memory in data centers.

It’s not the first time tech companies have aligned to counter an AI market leader. In December, IBM and Meta, along with over 50 other organizations, formed an “AI Alliance” to promote open AI models and offer an alternative to closed AI systems like those from OpenAI and Google.

Given Nvidia's dominance of the AI chip market, it is perhaps not surprising that the company has not joined the new UALink Promoter Group. Nvidia's recent massive financial success puts it in a strong position to continue forging its own path. But as major tech companies continue to invest in their own AI chip development, the need for a standardized interconnect technology becomes more pressing, particularly as a means to counter (or at least balance) Nvidia's influence.

Speeding up complex AI

UALink 1.0, the first version of the proposed standard, is designed to connect up to 1,024 GPUs within a single computing “pod,” defined as one or several server racks. The standard is based on technologies like AMD’s Infinity Architecture and is expected to improve speed and reduce data transfer latency compared to existing interconnect specifications.

The group intends to form the UALink Consortium later in 2024 to manage the ongoing development of the UALink spec. Member companies will have access to UALink 1.0 upon joining, with a higher-bandwidth version, UALink 1.1, planned for release in Q4 2024.

The first UALink products are expected to be available within the next two years, which may afford Nvidia plenty of lead time to expand its proprietary lock-in as the AI data center market grows.