Among NVIDIA’s slate of announcements tonight at Computex 2022, the company has revealed that it is preparing to launch liquid cooled versions of its high-end PCIe accelerator cards. Offered as an alternative to the traditional dual-slot air cooled cards, the liquid cooled cards come in a more compact single-slot form factor for both improved cooling and improved density. The liquid cooled A100 will be available in Q3, and a liquid cooled H100 will be available early next year.
While liquid cooling is far from new in the datacenter, it’s typically been reserved for more bespoke hardware with extreme cooling and/or density requirements, such as the upcoming generation of high-end NVIDIA H100 (SXM) servers. PCIe servers, by contrast, are all about standardization and compatibility, which for server video cards/accelerators means dual-slot cards designed for use with forced air cooling within a server chassis. This serves the market segment well, but the 300 to 350 Watt TDPs of these cards mean that they can’t get any thinner and still be effectively cooled by air – which in turn creates a 4-card limit for standard rackmount systems.
But times are changing, and liquid cooling is being deployed more widely in datacenters, both to keep up with cooling ever-hotter hardware and to improve overall datacenter energy efficiency. To that end, NVIDIA will be releasing liquid cooled versions of its A100 and H100 PCIe cards in order to give datacenter customers an easy and officially supported path to installing liquid cooled PCIe accelerators within their facilities.
The cards (pictured above) are essentially reference A100/H100 cards with the traditional dual-slot heatsink replaced by a single-slot full coverage water block. Designed to be integrated by server vendors, they use an open loop design that is meant to be used as part of a larger liquid cooling setup.
Other than the change in cooling system, the specifications of the cards remain unchanged. NVIDIA isn’t increasing the TDPs or clock speeds on these cards, so their performance should be identical to traditional air cooled cards (so long as they’re not thermally throttling, of course). Put another way, these new cards are using liquid cooling to improve energy efficiency and density, rather than performance.
The first card out of the gate will be the liquid cooled version of the 80GB A100 PCIe accelerator. That will be available to customers in Q3 of this year. Meanwhile a liquid cooled version of the H100 PCIe is also under development, and NVIDIA expects that to be available in early 2023.
In the interim, NVIDIA has been working with Equinix in order to qualify the liquid cooled A100 within their datacenters, as well as to get an idea of the real-world power savings of the new hardware. Interestingly, NVIDIA is reporting a significant reduction in overall datacenter power usage from switching to liquid cooling – a 2000-server (4000 A100 cards) setup saw its total power needs drop by 28%. According to NVIDIA, this comes from a combination of power savings across the datacenter, including everything from improved video card energy efficiency thanks to lower temperatures, to the lower energy cost of cooling water versus running large air chillers. All of which underscores why NVIDIA is promoting liquid cooled hardware as a power efficiency gain for datacenter operators who are looking to trim power usage.
And while this first generation of liquid cooled hardware is focused on efficiency, according to NVIDIA that won’t always be the case. For future generations of cards the company will also be looking at liquid cooling to improve performance at current energy levels – presumably by investing the datacenter-scale gains back into higher TDPs for the cards.
Finally, while the bulk of NVIDIA’s announcement today (as well as the case study) is focused on PCIe cards, NVIDIA is also revealing that it has been working on official, liquid cooled designs for its HGX systems as well, which are used to house the company’s more powerful SXM cards. A liquid cooled HGX A100 is already shipping, and a liquid cooled HGX H100 is slated to be released in Q4.
AnandTech