NVIDIA’s GB200 NVL2 Servers Face Thermal Challenges in 2U Form Factor, Morgan Stanley Reports

NVIDIA’s GB200 NVL2 servers face thermal issues, prompting concerns about heat management in 2U form factor. Credit: EconoTimes

Morgan Stanley reports ongoing thermal issues with NVIDIA’s 2U air-cooled GB200 NVL2 servers, citing challenges in heat dissipation for the Blackwell-based GPUs. The firm suggests a potential move to a 4U form factor to address these problems, affecting NVIDIA’s production ramp-up.

Morgan Stanley Highlights Thermal Issues in NVIDIA’s 2U GB200 NVL2 Servers, Suggesting a Shift to 4U

Morgan Stanley has been notably vocal about NVIDIA’s Blackwell-related production challenges, recently publishing a detailed note on the minor setbacks that have delayed the company’s production ramp-up for its new Blackwell GPUs. The financial giant has issued another investment note, focusing on NVIDIA’s ongoing "technical challenges" with the GB200 NVL-based servers, drawing insights from its channel checks.

To recap, during its June quarter earnings announcement, NVIDIA confirmed a minor design flaw in the Blackwell GPUs. The company reassured investors that it had fixed the issue by adjusting the photomask, a critical template used in semiconductor manufacturing. Morgan Stanley added context: the problem was identified only after packaging, so the affected units were scrapped along with the CoWoS (chip-on-wafer-on-substrate) packaging and HBM3e memory already built into them, both of which were in limited supply. This exacerbated NVIDIA’s ongoing supply challenges.

Drawing on its supply chain checks, Morgan Stanley now highlights that NVIDIA’s MGX GB200 NVL2 servers, which are built in a 2U air-cooled form factor and integrate two Grace CPUs and two Blackwell B200 GPUs on a single board, are facing thermal management issues. The GPUs connect to the main board using the SXM7 module, but the 2U form factor has proven problematic for heat dissipation. One rack unit (1U) equals 1.75 inches of rack height, so a 2U chassis is 3.5 inches tall.
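The rack-unit arithmetic behind the 2U versus 4U discussion can be sketched in a few lines of Python. This is an illustrative sketch only: the constant and function names are assumptions made for this example, and the only figure taken from the article is that 1U equals 1.75 inches.

```python
# Illustrative sketch: names here are hypothetical and do not come from
# NVIDIA or Morgan Stanley. Only the 1U = 1.75 in figure is from the article.
INCHES_PER_RACK_UNIT = 1.75
MM_PER_INCH = 25.4  # exact, by definition of the international inch


def chassis_height_inches(rack_units: int) -> float:
    """Nominal height in inches of a chassis that occupies `rack_units` U."""
    if rack_units <= 0:
        raise ValueError("rack_units must be a positive integer")
    return rack_units * INCHES_PER_RACK_UNIT


def chassis_height_mm(rack_units: int) -> float:
    """Same nominal height, converted to millimetres."""
    return chassis_height_inches(rack_units) * MM_PER_INCH


# Compare the two form factors mentioned in the note.
for units in (2, 4):
    print(f"{units}U: {chassis_height_inches(units):.2f} in "
          f"({chassis_height_mm(units):.1f} mm)")
```

A 4U chassis has twice the nominal height of a 2U one. How much that helps with cooling depends on the heatsink and airflow design, which the note does not quantify.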

Morgan Stanley elaborates on this by noting:

"All of the servers showcased at OCP were based on a 2U air-cooled form factor."

The firm further suggests that:

"Our conversations with supply chain partners indicated to us that there are still some thermal issues with the 2U form factor, so this may potentially end up being in a 4U form factor instead."

Morgan Stanley Notes AMD’s CoWoS Adjustments and Predicts Surge in Marvell’s Trainium 2 Production

Shifting focus, Morgan Stanley also touched on AMD’s adjustments to its CoWoS wafer bookings at TSMC for 2025, citing uncertainty around demand for its MI325 chip. NVIDIA has swiftly absorbed this freed-up capacity, highlighting the competition between the two companies for advanced manufacturing resources.

Moreover, Morgan Stanley forecasts that production of Trainium 2, the Amazon Web Services accelerator discussed here in connection with Marvell, will surge in 2025, with CoWoS bookings at TSMC increasing threefold compared to 2024. The firm estimates:

"If the output for Trainium 2 is 200k units in 2024, based on CoWoS capacity booking, there could be 400k units of Trainium 2 in 2025 and 500k-600k units of Inferentia 2.5 in 2025."
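The unit figures in that estimate can be checked with a short calculation. The sketch below uses only the numbers quoted above; the variable and function names are assumptions for illustration. It does not try to convert units into CoWoS wafer bookings, because the article gives no per-chip wafer figures.

```python
# Unit estimates quoted from the Morgan Stanley note, in thousands of units.
# Variable and function names are hypothetical, chosen for this example.
TRAINIUM2_2024_K = 200
TRAINIUM2_2025_K = 400
INFERENTIA25_2025_K = (500, 600)  # low and high ends of the quoted range


def growth_ratio(new: float, old: float) -> float:
    """Return new / old, i.e. how many times larger `new` is than `old`."""
    if old <= 0:
        raise ValueError("old must be positive")
    return new / old


# Year-over-year change in Trainium 2 units alone.
trainium_ratio = growth_ratio(TRAINIUM2_2025_K, TRAINIUM2_2024_K)

# Combined 2025 output of both chips, at the low and high ends of the range.
combined_2025_k = tuple(TRAINIUM2_2025_K + k for k in INFERENTIA25_2025_K)

print(f"Trainium 2 units, 2025 vs 2024: {trainium_ratio:.1f}x")
print(f"Combined 2025 units: {combined_2025_k[0]}k to {combined_2025_k[1]}k")
```

On these figures, Trainium 2 units alone double from 2024 to 2025. The threefold figure refers to CoWoS bookings, which may also cover Inferentia 2.5 and are not directly comparable to unit counts.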

This reflects the competitive landscape in the semiconductor industry, where supply chain management and technological innovations are critical to meeting growing demand.
