March 13, 2024
Future-Proofing Frontier AI Regulation
Projecting Future Compute for Frontier AI Models
Executive Summary
Policymakers should prepare for a world of significantly more powerful AI systems over the next decade. These developments could occur without fundamental breakthroughs in AI science simply by scaling up today’s techniques to train larger models on more data and computation.
The amount of computation (compute) used to train frontier AI models could increase significantly in the next decade. By the late 2020s or early 2030s, the amount of compute used to train frontier AI models could be approximately 1,000 times that used to train GPT-4. Accounting for algorithmic progress, the amount of effective compute could be approximately one million times that used to train GPT-4. There is some uncertainty about when these thresholds could be reached, but this level of growth appears possible within anticipated cost and hardware constraints.
Improvements of this magnitude are possible without government intervention, entirely funded by private corporations on the scale of large tech companies today. Nor do they require fundamental breakthroughs in chip manufacturing or design. Increased spending beyond the limits of private companies today or fundamentally new computing paradigms could lead to even greater compute growth.
Rising costs to train frontier AI models may drive an oligopoly at the frontier of research, but capabilities are likely to proliferate rapidly. At present, algorithmic progress and hardware improvements quickly decrease the cost to train previously state-of-the-art models. Within five years at current trends, the cost to train a model at any given level of capability decreases roughly by a factor of 1,000, or to around 0.1 percent of the original cost, making training vastly cheaper and increasing accessibility.
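The thousandfold figure can be roughly decomposed into algorithmic and hardware contributions. The sketch below is illustrative only: it assumes the eight-to-nine-month algorithmic doubling time cited later in this summary (taking 8.5 months as a midpoint) and treats whatever remains of the factor of 1,000 as an implied hardware contribution. The implied hardware figure is a residual from this arithmetic, not an estimate taken from the report.

```python
import math

HORIZON_MONTHS = 60.0          # "within five years"
TOTAL_COST_REDUCTION = 1_000.0 # factor stated in the summary
ALGO_DOUBLING_MONTHS = 8.5     # assumption: midpoint of "eight to nine months"

# Cost reduction attributable to algorithmic efficiency alone.
algo_factor = 2.0 ** (HORIZON_MONTHS / ALGO_DOUBLING_MONTHS)

# Residual that hardware price-performance would need to supply (implied, not sourced).
implied_hardware_factor = TOTAL_COST_REDUCTION / algo_factor

# Doubling time that would produce that residual over the same horizon.
implied_hardware_doubling_months = HORIZON_MONTHS / math.log2(implied_hardware_factor)

print(f"algorithmic factor over 5 years: {algo_factor:.0f}x")
print(f"implied hardware factor: {implied_hardware_factor:.1f}x")
print(f"implied hardware doubling time: {implied_hardware_doubling_months:.1f} months")
```

Under these assumptions, most of the cost decline comes from algorithmic progress, with hardware supplying a single-digit multiple on top.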
The U.S. government has placed export controls on advanced AI chips destined for China, and denying actors access to hardware improvements creates a growing gap in relative capability over time. Actors denied access to hardware improvements will be quickly priced out of keeping pace with frontier research. By 2027, using older, export-compliant chips could result in a roughly tenfold cost penalty for training, if export controls remain at the current technology threshold and are maximally effective.
However, proliferation of any given level of AI capabilities will be delayed only a few additional years. At present, the cost of training models at any given level of AI capabilities declines rapidly due to algorithmic progress alone. If algorithmic improvements continue to be widely available, hardware-restricted actors will be able to train models with capabilities equivalent to once-frontier models only two to three years behind the frontier.
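One rough way to see where a lag of two to three years could come from is to ask how long algorithmic progress takes to offset a fixed hardware cost penalty. The sketch below is a simplification, not the report's model: it holds the penalty constant at the tenfold figure cited above for 2027, assumes the restricted actor's budget is unchanged, and uses the eight-to-nine-month algorithmic doubling time cited later in this summary.

```python
import math

def lag_years(cost_penalty: float, algo_doubling_months: float) -> float:
    """Years of algorithmic progress needed to offset a fixed cost penalty.

    Each algorithmic doubling halves the compute (and so the cost) needed
    for a given capability, so a penalty of P takes log2(P) doublings.
    """
    return math.log2(cost_penalty) * algo_doubling_months / 12.0

COST_PENALTY = 10.0  # tenfold penalty from the summary, held fixed (assumption)

for doubling in (8.0, 8.5, 9.0):
    print(f"doubling every {doubling} months -> lag of {lag_years(COST_PENALTY, doubling):.1f} years")
```

In practice the penalty would grow as unrestricted hardware keeps improving, so this fixed-penalty calculation should be read as a lower-bound illustration.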
Access to compute and algorithmic improvements both play a significant role in driving progress at AI’s frontier and affecting how rapidly capabilities proliferate and to whom. At present, the amount of compute used to train large AI models is doubling every seven months, due to a combination of hardware improvements and increased spending on compute. Algorithmic efficiency—the ability to achieve the same level of performance with less compute—is doubling roughly every eight to nine months for large language models. Improved performance comes from both increased compute and algorithmic improvements. If compute growth slows in the 2030s due to rising costs and/or diminishing hardware performance gains, future progress in frontier models could depend heavily on algorithmic improvements. At present, fast improvements in algorithmic efficiency enable rapid proliferation of capabilities as the amount of compute needed to train models at any given level of performance quickly declines. Recently, some leading AI labs have begun withholding information about their most advanced models. If algorithmic improvements slow or become less widely available, that could slow progress at AI’s frontier and cause capabilities to proliferate more slowly.
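The doubling times above can be turned into rough timelines for the growth factors quoted earlier in this summary. The sketch below is a naive extrapolation that assumes both trends stay constant, which the report itself does not expect to hold indefinitely. The 8.5-month algorithmic doubling time is an assumed midpoint of the stated range, and the function names are illustrative.

```python
import math

def months_to_reach(factor: float, doubling_months: float) -> float:
    """Months of steady exponential growth needed to multiply by `factor`."""
    return math.log2(factor) * doubling_months

def combined_doubling_months(*doubling_months: float) -> float:
    """Doubling time when independent exponential trends multiply together.

    Growth rates (doublings per month) add, so doubling times combine
    like resistors in parallel.
    """
    return 1.0 / sum(1.0 / d for d in doubling_months)

COMPUTE_DOUBLING_MONTHS = 7.0  # stated for large models
ALGO_DOUBLING_MONTHS = 8.5     # assumption: midpoint of "eight to nine months"

years_to_1000x = months_to_reach(1_000, COMPUTE_DOUBLING_MONTHS) / 12.0
effective_doubling = combined_doubling_months(COMPUTE_DOUBLING_MONTHS, ALGO_DOUBLING_MONTHS)
years_to_million_x = months_to_reach(1_000_000, effective_doubling) / 12.0

print(f"1,000x compute: about {years_to_1000x:.1f} years")
print(f"effective compute doubling time: about {effective_doubling:.1f} months")
print(f"1,000,000x effective compute: about {years_to_million_x:.1f} years")
```

At constant rates, both thresholds fall roughly six years out, which is in the same range as the summary's estimate of the late 2020s or early 2030s.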
While there is significant uncertainty in how the future of AI develops, current trends point to a future of vastly more powerful AI systems than today’s state of the art. The most advanced systems at AI’s frontier will be limited initially to a small number of actors but may rapidly proliferate. Policymakers should begin to put in place today a regulatory framework to prepare for this future. Building an anticipatory regulatory framework is essential because of the disconnect in speeds between AI progress and the policymaking process, the difficulty in predicting the capabilities of new AI systems for specific tasks, and the speed with which AI models proliferate today, absent regulation. Waiting to regulate frontier AI systems until concrete harms materialize will almost certainly result in regulation being too late.
The amount of compute used to train models is likely to be a fruitful avenue for regulation if current trends continue. Massive amounts of compute are the cost of entry to train frontier AI models. Compute is likely to increase in importance over the next 10 to 15 years as an essential input to training the most capable AI systems. However, restrictions on access to compute are likely to slow, but not halt, proliferation of capabilities, given the ability of algorithmic advances to enable training AI systems with equivalent performance on less compute over time. Regulations on compute will be more effective if paired with regulations on models themselves, such as export controls on certain trained models.
Introduction
Policymakers and industry leaders are paying increasing attention to the regulation of highly capable general-purpose AI models, sometimes called “frontier” models. Examples of current frontier AI models include GPT-4 (OpenAI), Claude 3 (Anthropic), and Gemini Ultra (Google). Companies are already training larger, more capable next-generation models using ever-larger amounts of data and computing hardware.
The computation used to train frontier AI systems is growing at an unsustainable rate. The amount of computation, or compute, used to train state-of-the-art machine learning models increased ten billionfold from 2010 to 2022 and is doubling every six months.[1] For the largest models, the amount of compute used for training is doubling approximately every seven months. This rapid increase in compute exceeds the pace of hardware improvements and is in part driven by increased spending on training. Costs for training the largest models are doubling roughly every 10 months.[2] Training current frontier models costs on the order of tens of millions of dollars just for the final training run. The full cost of training frontier models today, accounting for earlier training runs and experiments, could be around $100 million.[3] As training costs continue to rise, they could reach hundreds of millions of dollars or even billions of dollars.
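To illustrate how quickly a 10-month doubling time compounds, the sketch below extrapolates from the roughly $100 million full-cost figure above. It assumes the doubling time stays fixed, which is a simplification rather than a forecast, and the function name is illustrative.

```python
BASE_COST_USD = 100e6        # approximate full training cost cited above
COST_DOUBLING_MONTHS = 10.0  # stated doubling time for the largest models

def projected_cost(months_ahead: float,
                   base_cost: float = BASE_COST_USD,
                   doubling_months: float = COST_DOUBLING_MONTHS) -> float:
    """Training cost after `months_ahead` months of steady doubling."""
    return base_cost * 2.0 ** (months_ahead / doubling_months)

# Naive year-by-year extrapolation over five years.
for year in range(6):
    cost = projected_cost(12 * year)
    print(f"year +{year}: ${cost / 1e9:.2f} billion")
```

On this trajectory, costs pass $1 billion between the second and third year and reach several billion dollars by the fifth.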
In the near term, growth in large-scale training runs at AI’s frontier is likely to continue. Leading AI labs already are reportedly training next-generation models or raising funds to do so.[4] Nvidia is shipping hundreds of thousands of new chips, which will enable more powerful future training runs. In the long run, however, cost and possibly hardware limitations are likely to constrain future compute growth.[5] The current exponential pace of compute growth cannot continue indefinitely. How long it continues, at what pace, and how much compute grows before leveling off has important implications for the future of AI progress. The role of cost and access to hardware as barriers to entry for training highly capable AI systems also has policy implications, such as for export controls and some regulatory proposals.
Research Questions
This paper aims to answer several questions about how trends in cost and compute could affect the future of AI:
- Cost and compute projections: If current trends were to continue, how would the amount of compute used to train frontier AI models and the cost of training rise over time? Accounting for algorithmic progress, how would the amount of effective compute increase over time?
- Limits on cost growth: How much could compute increase before reaching the spending limits of private companies, and when would that occur? If the rate of cost growth slows as costs rise, how might that affect the amount of compute used for training frontier models?
- Limits on hardware improvements: How might limits on continued hardware improvements affect future compute growth?
- Proliferation: How might improvements in hardware and algorithmic efficiency affect the availability of AI capabilities over time?
- Costs for hardware-restricted actors: How might constraints on hardware availability (for example, due to export controls) affect cost and compute growth for actors denied access to continued improvements in AI hardware?
- Compute regulatory threshold: How might improvements in hardware and algorithmic efficiency impact the effectiveness of training compute as a regulatory threshold for frontier models over time?
The answers to these questions have important bearing on policy-relevant decisions today, such as the anticipated effect of export controls or other proposed regulations that would limit access to compute-intensive AI models. On the one hand, trends in rising costs are consolidating access to frontier AI models among a handful of leading AI labs. On the other hand, countervailing trends in hardware improvements and algorithmic efficiency are lowering barriers to capabilities, enabling proliferation. Some regulatory and policy interventions may be more or less feasible or desirable depending on how compute and cost change over time and the consequences for access to frontier AI models and the proliferation of capabilities. This paper aims to answer these questions with the goal of informing policymakers’ understanding of possible scenarios for future AI development.
Approach
Using current trends as a baseline, this paper projects cost and compute growth under various scenarios. The paper projects compute growth due to increased spending and hardware improvements. Additionally, it accounts for algorithmic improvements by projecting effective compute over time. The paper then estimates when training costs are projected to reach current limits for large corporations and move into the realm of what have historically been government-level expenditures. Additional scenarios explore how limits in hardware improvements may affect the availability of future compute. Since the cost to train a model with any given level of capabilities will decrease over time due to improvements in hardware and algorithmic efficiency, the paper also estimates how costs will decline over time, making capabilities accessible to a wider array of actors and enabling greater proliferation. The paper then estimates how training costs change for actors that are denied access to continued improvements in AI hardware, for example through U.S. government export controls on advanced AI chips destined for China. Finally, this paper estimates how future improvements in hardware and algorithmic efficiency may increase the accessibility of compute and capabilities relative to the U.S. government’s notification threshold established in the October 2023 executive order. The paper concludes by assessing the policy implications of these projections.
1. Jaime Sevilla et al., Compute Trends Across Three Eras of Machine Learning, arXiv.org, March 9, 2022, https://arxiv.org/abs/2202.05924.
2. This estimate comes from a similar methodology to Cottier (2023), updated with the most recent estimates for compute growth and hardware price-performance. For more on this estimate, see this paper’s section, Current Best Estimates and Assumptions: Cost Growth. Ben Cottier, “Trends in the dollar training cost of machine learning systems,” Epoch, January 31, 2023, https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems.
3. OpenAI’s CEO Sam Altman stated that the company spent $100M to train GPT-4, although he did not give further details, and this may include earlier experiments in addition to the final training run. Will Knight, “OpenAI’s CEO Says the Age of Giant AI Models Is Already Over,” Wired, April 17, 2023, https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/.
4. For example, OpenAI’s then-CEO Sam Altman stated in November 2023 that the company was working on GPT-5. Madhumita Murgia, “OpenAI CEO Sam Altman wants to build AI ‘superintelligence,’” Ars Technica, November 14, 2023, https://arstechnica.com/ai/2023/11/openai-ceo-sam-altman-wants-to-build-ai-superintelligence/. See also David Tayar (@davidtayar5), “Remainder of note,” X (formerly Twitter), February 20, 2023, https://twitter.com/davidtayar5/status/1627690520456691712; Kyle Wiggers et al., “Anthropic’s $5B, 4-year plan to take on OpenAI,” TechCrunch, April 6, 2023, https://techcrunch.com/2023/04/06/anthropics-5b-4-year-plan-to-take-on-openai/.
5. Nvidia reportedly is expected to ship 550,000 state-of-the-art H100 graphics processing units (GPUs) in 2023. Madhumita Murgia et al., “Saudi Arabia and UAE race to buy Nvidia chips to power AI ambitions,” Financial Times, August 14, 2023, https://www.ft.com/content/c93d2a76-16f3-4585-af61-86667c5090ba.