Morgan Stanley Reports NVIDIA GB200 NVL72 Racks Achieve 77.6% Profit Margin Compared to AMD MI355X’s -64% Margin, With Similar Total Cost of Ownership

This content does not constitute investment advice. The author holds no positions in the stocks referenced herein.

Understanding GPU Economics and AI Factory Efficiency

Morgan Stanley has released an analysis of GPU economics that highlights the efficiency of NVIDIA’s GB200 NVL72 racks for powering large-scale AI factories. The findings are relevant to anyone making investment or technology decisions about AI infrastructure.

Key Components of the NVL72 AI Racks

Each NVL72 AI rack integrates 72 NVIDIA B200 GPUs and 36 Grace CPUs, interconnected through NVLink 5, which is designed for high bandwidth and low latency. One such server rack currently costs more than $3.1 million, compared with approximately $190,000 for an H100 rack.

Despite the higher initial investment, Morgan Stanley argues that NVIDIA’s latest rack-scale solution makes more economic sense than the older-generation H100.

Profitability Insights

According to Morgan Stanley’s calculations, NVIDIA’s GB200 NVL72 systems outpace competitors in terms of profitability and revenue generation. The TPU v6e pods developed by Google hold a position just behind NVIDIA’s offerings, as illustrated in the following profitability chart for a theoretical 100MW AI factory.

Chart of 100MW AI factory profitability by server rack showing profit margin and cost comparison.

Specifically, the GB200 NVL72 AI racks can yield an impressive 77.6 percent profit margin, while Google’s TPU v6e achieves a close 74.9 percent margin.

Cost Comparisons and Market Dynamics

Pricing for Google’s TPU v6e pods is not publicly available, but rental costs for TPU pods are generally reported to be roughly 40 to 50 percent lower than those for NVL72 racks.

AMD’s Position in the Market

Morgan Stanley’s report paints a far weaker picture for AI factories built on AMD’s MI300X and MI355X platforms, which are projected to post profit margins of -28.2 percent and -64 percent, respectively.

Total Cost of Ownership Analysis

The analysis assumes that building a 100MW AI data center involves infrastructure costs of approximately $660 million, amortized over a decade. GPU costs vary widely, from $367 million to $2.273 billion, and are depreciated over four years. Operating costs are estimated by applying the power efficiency of the cooling systems to global electricity rates.
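As a rough illustration of how these capital figures translate into yearly charges, the sketch below spreads each cost evenly over its stated period. Straight-line depreciation is an assumption here, since the article does not say which schedule Morgan Stanley uses, and the function name is hypothetical.

```python
def straight_line_annual(cost_musd: float, years: int) -> float:
    """Yearly charge, in millions of USD, for a cost spread evenly over `years`.

    Assumption: straight-line depreciation with no salvage value.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    return cost_musd / years


# Figures quoted in the article (millions of USD).
infra_annual = straight_line_annual(660.0, 10)      # infrastructure, 10 years
gpu_annual_low = straight_line_annual(367.0, 4)     # cheapest GPU build-out, 4 years
gpu_annual_high = straight_line_annual(2273.0, 4)   # most expensive GPU build-out, 4 years

print(f"Infrastructure: ${infra_annual:.2f}M per year")
print(f"GPUs: ${gpu_annual_low:.2f}M to ${gpu_annual_high:.2f}M per year")
```

Under this assumption the GPU charge is larger than the infrastructure charge in every scenario, which is consistent with hardware choice dominating the comparison.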

In this context, NVIDIA’s GB200 NVL72 systems present the highest Total Cost of Ownership (TCO), calculated at $806.58 million, closely followed by the MI355X platform at $774.11 million.
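Because the two TCO figures are so close, the gap in margins suggests a large difference on the revenue side. The sketch below backs out the revenue implied by each margin. It assumes that margin is defined as (revenue - TCO) / revenue and that the TCO and margin figures cover the same period. The report as quoted here confirms neither assumption, and `implied_revenue` is a hypothetical helper.

```python
def implied_revenue(tco_musd: float, margin: float) -> float:
    """Revenue (millions of USD) implied by a TCO and a profit margin.

    Assumption: margin = (revenue - tco) / revenue,
    which rearranges to revenue = tco / (1 - margin).
    """
    if margin >= 1.0:
        raise ValueError("margin must be below 100 percent")
    return tco_musd / (1.0 - margin)


# TCO and margin figures quoted in the article.
gb200_tco, gb200_margin = 806.58, 0.776
mi355x_tco, mi355x_margin = 774.11, -0.64

gb200_revenue = implied_revenue(gb200_tco, gb200_margin)
mi355x_revenue = implied_revenue(mi355x_tco, mi355x_margin)
tco_gap = gb200_tco / mi355x_tco - 1.0  # relative TCO difference

print(f"TCO gap: {tco_gap:.1%}")
print(f"Implied revenue, GB200 NVL72: ${gb200_revenue:,.0f}M")
print(f"Implied revenue, MI355X: ${mi355x_revenue:,.0f}M")
```

On these assumptions, a TCO difference of only a few percent sits alongside implied revenue that differs several times over, which is why the two platforms land at opposite ends of the profitability chart.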
