Elon Musk Reveals Plans To Make World’s “Most Powerful” 100,000 NVIDIA GPU AI Cluster

Ramish Zafar

This is not investment advice. The author has no position in any of the stocks mentioned. Wccftech.com has a disclosure and ethics policy.

With the global AI industry doing all it can to gain access to precious NVIDIA chips and outpace Microsoft-backed OpenAI in the race to develop the most advanced artificial intelligence model, Elon Musk has shared fresh details about his plans to build a GPU cluster to train xAI's Grok AI model. According to Musk, xAI has decided to rely only on itself to build "the most powerful training cluster in the world" after parting ways with Oracle in order to speed up progress on AI development. Oracle has provided 24,000 NVIDIA Hopper GPUs to xAI for training the Grok 2 AI model, which Musk says will be ready for release in August.

Elon Musk's xAI Will Build 100,000 NVIDIA Hopper GPU System Itself To Catch Up With Other AI Companies

Musk shared the latest details of xAI's 100,000 GPU cluster in response to a media report outlining that talks between the AI firm and Oracle to expand their existing agreement had ended. Under the existing deal, xAI was using 24,000 of NVIDIA's H100 GPUs to train the Grok 2 AI model, and by the looks of it, the firm was interested in expanding the partnership to cover Musk's 100,000 GPU system. Oracle, according to the media report, is also working to supply Microsoft with a cluster of 100,000 NVIDIA Blackwell GB200 chips, which are the latest AI processors on the market.


Musk shared that xAI is building its 100,000 GPU AI system internally to achieve the "fastest time to completion." He believes this is necessary to "catch up" with other AI companies, as, according to him, "being faster than any other AI company" is very important for xAI's "fundamental competitiveness."

Today's details follow Musk's statements from early last month, which revealed xAI's plans to build a multi-billion-dollar system with NVIDIA's Blackwell chips. He had outlined that the system would use roughly 300,000 B200 GPUs. NVIDIA CEO Jensen Huang has pegged Blackwell pricing at roughly $30,000 to $40,000 per GPU, so even at the lower end of that range, 300,000 B200s alone would cost around $9 billion.

Musk believes that by building the H100 system itself instead of working with Oracle, xAI can achieve that "fastest time to completion." The system will start training this month and will be the "most powerful training cluster in the world by a wide margin," believes the executive. Before models like Grok or ChatGPT are ready to respond to queries, they are trained on existing data sets, which allow them to mathematically predict the response to a user's question based on what they have already learned.
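To make that idea concrete, the sketch below (our illustration, not xAI's method) shows the next-word-prediction principle in miniature: a toy Python "model" counts which word follows which in a tiny training text, then generates a reply by repeatedly picking the statistically most likely next word. Real models like Grok learn billions of parameters rather than raw counts, but the underlying objective, predicting what comes next based on training data, is the same.

```python
from collections import Counter, defaultdict

# Toy "training corpus" -- real models train on trillions of tokens.
corpus = "the model learns patterns so the model can predict the next word".split()

# "Training": count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict(word):
    """Return the most frequent next word seen during training, if any."""
    counts = follows.get(word)
    return counts.most_common(1)[0][0] if counts else None

# "Inference": start from a prompt word and greedily extend the reply.
word, reply = "the", ["the"]
for _ in range(4):
    word = predict(word)
    if word is None:
        break
    reply.append(word)
print(" ".join(reply))  # prints: the model learns patterns so
```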

As key players in the AI industry spend 2024 upgrading and launching new models, xAI has been relatively quiet on this front. Now, Musk has shared that Grok 2 will be available next month, as most of the model's development is complete and xAI is making last-minute adjustments and bug fixes.

AI chips are a hot commodity and have propelled NVIDIA to become the third most valuable company in the world in less than a year. Facebook's parent company Meta has shared its plans to accumulate 350,000 of these chips by the end of 2024 to power its AI platform. Meta, Google parent Alphabet, Microsoft-backed OpenAI, and Amazon-backed Anthropic are among the world's leading AI software companies.
