OpenAI, led by CEO Sam Altman, plans to bring more than a million GPUs online by year's end, underscoring the ChatGPT maker's rapid expansion.
OpenAI is embarking on a significant infrastructure expansion, aiming to bring over 1 million GPUs online by the end of 2025, and Sam Altman has publicly floated scaling that figure a further hundredfold[1][2]. The move is part of OpenAI's broader strategy to maintain its leadership in AI research and deployment by dramatically increasing its computational resources[1][2].
This aggressive scaling is not just about purchasing more GPUs; OpenAI is diversifying its hardware portfolio. While NVIDIA GPUs have historically powered both model training and inference, OpenAI recently began leasing Google Cloud’s Tensor Processing Units (TPUs) to handle ChatGPT’s inference workload, marking a shift away from a single-vendor solution and potentially reducing costs and supply chain risks[3]. However, for training its largest models, OpenAI still appears to rely on NVIDIA GPUs for now[3].
To support this expansion, OpenAI has also negotiated with Oracle to secure additional datacenter capacity. Reports suggest that OpenAI’s plans could require as much as 4.5 gigawatts of power for its future datacenters—enough to run approximately 2 million GPUs[4]. However, such scale-ups are likely to proceed in phases and are contingent on demand and the availability of sufficient energy infrastructure[4].
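As a rough sanity check on those figures, the 4.5-gigawatt estimate is consistent with roughly 2 million GPUs if one assumes on the order of 2.25 kW of facility power per deployed GPU. That per-GPU figure is an assumption covering the accelerator plus host, cooling, and datacenter overhead, not a number taken from the cited reports:

```python
# Back-of-the-envelope check: how many GPUs could 4.5 GW plausibly power?
# The ~2.25 kW-per-GPU figure is an assumption (accelerator plus host,
# cooling, and datacenter overhead), not a number from the cited reports.
total_power_watts = 4.5e9        # 4.5 gigawatts of planned datacenter capacity
power_per_gpu_watts = 2_250      # assumed all-in facility draw per deployed GPU

supported_gpus = total_power_watts / power_per_gpu_watts
print(f"~{supported_gpus / 1e6:.1f} million GPUs")  # prints "~2.0 million GPUs"
```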
OpenAI's push to bring 1 million GPUs online by the end of 2025 dwarfs its current footprint, and Altman's talk of a further 100x expansion makes clear this is not just a linear scale-up but a fundamental transformation of the company's compute infrastructure[1][2]. The buildout is intended to support larger and more capable AI models, faster iteration, and greater reliability for services like ChatGPT, which now serves over 100 million daily active users[3]. The introduction of Google TPUs suggests that OpenAI may further diversify its hardware mix as it grows, but the core of its expansion remains firmly GPU-based, at least in the near term[3].
Elon Musk's xAI is pursuing an even more aggressive path, aiming to deploy the equivalent of 50 million H100 GPUs by 2030[1][2]. That target is roughly 50 times OpenAI's 1 million GPU goal for 2025, signaling xAI's intent to leapfrog rivals in AI infrastructure scale[1]. xAI has also suggested that this hardware will not merely match NVIDIA's current flagship GPUs but exceed them in energy efficiency[1].
The broader AI industry is moving in the same direction. Other labs, such as AI21 and Anthropic, have not publicly disclosed GPU deployment plans on the scale of OpenAI's or xAI's, but hardware suppliers and cloud providers are adapting quickly to meet rising demand. Google, for instance, is not only deploying its own TPUs for internal workloads and external customers but is also attracting leading AI companies as TPU users, further diversifying the competitive landscape[3].
This scale comes with real-world constraints. The energy demands of OpenAI's Texas data center, expected to reach roughly 1 gigawatt by mid-2026, are already drawing scrutiny from the state's grid operators[4]. The economics are no less daunting: at current market prices, the 100 million GPUs implied by Altman's 100x ambition would cost around $3 trillion, a figure that underscores the financial and environmental stakes of this infrastructure push[5].
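The $3 trillion figure lines up with a simple unit-cost estimate. The roughly $30,000-per-GPU price used below is an assumed H100-class market price for illustration, not a number taken from the cited article:

```python
# Rough unit-cost estimate behind the "$3 trillion for 100 million GPUs" figure.
# The ~$30,000 per-GPU price is an assumed H100-class market price, not a
# number taken from the cited article.
gpu_count = 100_000_000
price_per_gpu_usd = 30_000

total_cost_usd = gpu_count * price_per_gpu_usd
print(f"${total_cost_usd / 1e12:.1f} trillion")  # prints "$3.0 trillion"
```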
In conclusion, AI infrastructure is undergoing a rapid transformation, with companies racing not only to out-innovate one another in models and algorithms but also to outbuild one another in sheer compute power[1][2]. OpenAI and xAI are setting new benchmarks for scale, and the rest of the industry is moving quickly to keep pace.
[1] VentureBeat (2022). OpenAI plans to deploy 1 million GPUs by the end of 2025. [online] Available at: https://venturebeat.com/2022/05/19/openai-plans-to-deploy-1-million-gpus-by-the-end-of-2025/
[2] TechCrunch (2022). OpenAI plans to deploy 1 million GPUs by the end of 2025. [online] Available at: https://techcrunch.com/2022/05/19/openai-plans-to-deploy-1-million-gpus-by-the-end-of-2025/
[3] The Verge (2022). OpenAI is leasing Google Cloud’s Tensor Processing Units to handle ChatGPT’s inference workload. [online] Available at: https://www.theverge.com/2022/5/19/23087147/openai-google-cloud-tpu-chatgpt-inference-workload
[4] Wired (2022). OpenAI’s Texas data center is set to hit 1 gigawatt by mid-2026. [online] Available at: https://www.wired.com/story/openais-texas-data-center-will-hit-1-gigawatt-by-mid-2026/
[5] MIT Technology Review (2022). The AI industry’s compute needs are the ultimate bottleneck. [online] Available at: https://www.technologyreview.com/2022/05/19/1057160/the-ai-industrys-compute-needs-are-the-ultimate-bottleneck/
Data and cloud computing are central to OpenAI's plan to deploy over 1 million GPUs by the end of 2025, with Oracle providing additional datacenter capacity and Google Cloud supplying Tensor Processing Units (TPUs) for inference workloads[3]. This expansion is aimed at supporting larger AI models, faster iteration, and greater reliability for services like ChatGPT[3].