Why demand for AI compute keeps rising

Three forces push AI compute demand up at the same time: models keep growing, more people use AI, and AI agents multiply the number of requests. Here is how that adds up.

Key takeaways

  • Training compute for frontier models has grown roughly 4 to 5 times per year since 2010, according to Epoch AI.
  • Adoption is mainstream, so inference demand rises with every new user and feature.
  • AI agents make many calls per task, multiplying inference demand further.
  • Demand has grown quickly, but it is not guaranteed to continue at any particular pace.

Force one: models keep getting larger

The most visible driver of compute demand is the relentless scaling of models. Bigger models, trained on more data with more parameters, tend to be more capable, so developers keep scaling them up. Epoch AI finds that the compute used to train frontier AI models has grown roughly 4 to 5 times per year since 2010. That is a staggering pace. At four to five times per year, growth compounds to roughly 250 to 600 times over four years, so a model trained today can require hundreds of times more compute than one trained only a few years earlier.
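
The compounding behind that claim can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the annual growth factor stays constant at 4 or 5 (the range Epoch AI reports), and the function name `compute_multiple` is a hypothetical helper, not something from the source.

```python
def compute_multiple(annual_growth: float, years: int) -> float:
    """Return how many times larger training compute is after `years`
    of constant growth at `annual_growth` times per year."""
    return annual_growth ** years


if __name__ == "__main__":
    # Assumption: growth holds steady at the low (4x) and high (5x) ends.
    for years in (2, 3, 4):
        low = compute_multiple(4.0, years)
        high = compute_multiple(5.0, years)
        print(f"{years} years: {low:,.0f}x to {high:,.0f}x")
```

Under these assumptions the multiple reaches the hundreds after three to four years, which is the sense in which "hundreds of times" applies.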

Each new generation therefore demands dramatically more hardware than the last. NVIDIA introduced its Blackwell platform specifically for trillion-parameter scale training and inference, a reminder that the hardware roadmap is being pulled forward by the size of the models it has to serve. The GPU sits at the center of this as the core compute engine for both training and running modern AI.

That alone would strain supply. But training is only the opening chapter of the story. The larger and more persistent source of demand comes after a model is built.

Force two: more people use AI every day

Every time someone uses an AI feature, that is an inference request consuming compute. Training a model happens once, but inference happens every single time the model is used, by every user, for as long as the model stays in service. As adoption widens, inference becomes the dominant and continuously growing load on infrastructure.

The Stanford AI Index found that generative AI was in use by roughly 53 percent of the population within three years, faster than the internet or the personal computer before it. The IEA's work on energy and AI noted that major model providers reported a threefold rise in active users and a fivefold rise in revenue over a single year. Adoption is not a future projection. It is already mainstream, and it is still climbing.

Unlike a one-time training run, inference repeats endlessly, so it scales directly with how many people use AI and how often. As AI gets embedded into search, productivity tools, customer service, and software development, the number of inference requests grows with every feature shipped.
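
A rough way to see how a repeating load overtakes a one-time cost is to compute the break-even point. The sketch below is a minimal illustration, not a measurement: every figure in it is hypothetical and unit-free, and the function name is invented for this example. It only assumes that daily inference compute equals requests per day times compute per request.

```python
def days_until_inference_exceeds_training(
    training_cost: float,
    daily_requests: float,
    cost_per_request: float,
) -> float:
    """Days of serving after which cumulative inference compute equals
    the one-time training compute (both in the same arbitrary units)."""
    daily_inference = daily_requests * cost_per_request
    if daily_inference <= 0:
        raise ValueError("daily inference compute must be positive")
    return training_cost / daily_inference


if __name__ == "__main__":
    # Hypothetical figures chosen only to show the shape of the math.
    training_cost = 1_000_000.0  # one-time training compute
    cost_per_request = 0.001     # compute per inference request
    for daily_requests in (1_000_000, 10_000_000, 100_000_000):
        days = days_until_inference_exceeds_training(
            training_cost, daily_requests, cost_per_request
        )
        print(f"{daily_requests:>11,} requests/day: break-even in {days:,.0f} days")
```

With these made-up inputs, each tenfold rise in daily requests cuts the break-even time tenfold, which is why widening adoption shifts the balance toward inference.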

Force three: AI agents multiply the requests

The newest source of demand changes the math again. Older AI systems answered a question once and stopped. Agents do not. An agent breaks a task into many steps, calls a model repeatedly to plan, act, check its work, and revise, and may invoke tools along the way. A single user request can quietly become dozens or even hundreds of inference calls behind the scenes.

This matters because it decouples compute demand from the number of human users. Even if the number of people stayed flat, giving each of them agents that work autonomously would multiply the underlying compute load. The Stanford AI Index notes that AI agents still fail roughly one in three attempts, which means many tasks require retries, adding still more calls.
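
The effect of retries on call volume can be sketched with a simple model. This is an assumption-laden illustration: it treats each attempt as failing independently with a fixed probability and being retried until it succeeds, which the source does not state, and the number of calls per attempt is a hypothetical input. The function names are invented for this example.

```python
def expected_attempts(failure_rate: float) -> float:
    """Expected number of attempts until the first success, assuming
    independent attempts with a fixed failure probability."""
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    return 1.0 / (1.0 - failure_rate)


def expected_calls_per_request(calls_per_attempt: float, failure_rate: float) -> float:
    """Expected model calls behind one user request, retries included."""
    return calls_per_attempt * expected_attempts(failure_rate)


if __name__ == "__main__":
    failure_rate = 1 / 3  # "roughly one in three attempts", per the article
    for calls_per_attempt in (10, 50, 100):  # hypothetical agent sizes
        total = expected_calls_per_request(calls_per_attempt, failure_rate)
        print(f"{calls_per_attempt} calls per attempt: about {total:.0f} calls per request")
```

Under this model a one-in-three failure rate means about 1.5 attempts per task on average, so retries alone add roughly half again to an agent's call count.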

Multiply more capable models by more users by more calls per task, and you get demand that compounds from several directions at once. It is worth being careful here, though. Demand is never guaranteed to continue at any particular pace, and outcomes depend on market conditions that no one controls.

The numbers

What the demand data shows

4 to 5x

Annual growth in training compute for frontier AI models since 2010, according to Epoch AI.

Source: Epoch AI, May 2024

~53%

Share of the population using generative AI within three years, faster than the internet or PC, per Stanford HAI.

Source: Stanford Institute for Human-Centered AI (HAI), April 2026

~50%

Surge in AI-focused data centre electricity use in 2025, according to the IEA.

Source: International Energy Agency (IEA), 2025

Behind every request is hardware doing work

Each user prompt and agent step maps to real computation running on physical GPUs somewhere.

It is easy to forget that a chatbot reply or an agent's action is not free or abstract. Every token generated corresponds to arithmetic performed on physical hardware in a real building, drawing real power.

That is why rising usage translates so directly into rising demand for GPUs, energy, and the people who keep that infrastructure running.

Two kinds of demand

Why training and inference both keep climbing

Training is bursty but huge

Building a frontier model is a one-time, concentrated effort that can draw more than 100 MW for a single run, and run sizes keep growing each generation.

Inference is constant

Serving a model happens every time it is used. As adoption widens, inference becomes a steady, always-on load that dwarfs any single training run over time.

Agents stack on top

Agentic systems turn one request into many calls, so inference demand grows even faster than the raw number of users.

Efficiency does not cap it

Algorithmic efficiency improves roughly threefold per year, per Epoch AI, but gains are spent on bigger, more ambitious workloads rather than shrinking total demand.
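
One way to picture why efficiency does not cap demand is to put the two quoted rates side by side. The sketch below assumes both rates hold constant and combine multiplicatively; "effective compute" is an illustrative label for physical compute times the efficiency gain, and the function name is hypothetical.

```python
def yearly_outlook(compute_growth: float, efficiency_gain: float, years: int):
    """Return (year, physical_compute, fixed_workload_cost, effective_compute)
    rows, all relative to a baseline of 1.0 at year 0."""
    rows = []
    for year in range(1, years + 1):
        physical = compute_growth ** year               # compute actually used
        fixed_workload = 1.0 / efficiency_gain ** year  # cost of an unchanged task
        effective = physical * efficiency_gain ** year  # what that compute achieves
        rows.append((year, physical, fixed_workload, effective))
    return rows


if __name__ == "__main__":
    # Assumption: 4x per year compute growth (low end) and 3x per year efficiency.
    for year, physical, fixed, effective in yearly_outlook(4.0, 3.0, 3):
        print(
            f"year {year}: physical {physical:,.0f}x, "
            f"fixed task {fixed:.3f}x, effective {effective:,.0f}x"
        )
```

In this toy model an unchanged task gets cheaper every year, yet physical compute use still climbs, because the savings go into larger workloads.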

What rising demand means for hardware

Rising demand from several directions at once is the reason the hardware that serves AI compute stays valuable and scarce. When more capability, more users, and more calls per task all push in the same direction, the machines that do the work remain in demand. One way to hold a position in that hardware is managed ownership, where you own a physical NVIDIA machine and a professional team operates it inside a data center. That is the model Golden Core Mining is built around.

Demand trends, however, do not guarantee any outcome. No one can be certain that demand will keep growing at today's pace, and owning hardware carries real costs and responsibilities regardless of how the market moves. Operational benefits are not guaranteed and depend on utilization, uptime, demand, costs, hardware performance, and market conditions. The useful takeaway is understanding why the demand exists, so any decision you make is a deliberate one.

Sources

References and data

  1. Training compute of frontier AI models grows by 4 to 5x per year. Epoch AI. May 2024.
  2. The 2026 AI Index Report. Stanford Institute for Human-Centered AI (HAI). April 2026.
  3. Key Questions on Energy and AI. International Energy Agency (IEA). 2025.

FAQ

Questions about AI compute demand

Why does demand for AI compute keep rising?

Models keep getting larger, more people use AI every day, and AI agents make many calls per task. All three forces push compute demand higher at once. Epoch AI finds training compute has grown roughly 4 to 5 times per year since 2010.

What is the difference between training and inference demand?

Training builds a model once and can be enormous, with single runs drawing more than 100 MW. Inference is running the model, which happens every time it is used. As adoption grows, inference becomes a constant, always-on load that adds up far beyond any single training run.

How do AI agents increase compute demand?

An agent breaks a task into many steps and calls a model repeatedly to plan, act, and revise. A single user request can become dozens of inference calls. Since agents still fail about one in three attempts per Stanford HAI, retries add even more calls.

Do efficiency gains reduce total compute demand?

Not in practice. Epoch AI estimates algorithmic efficiency improves roughly threefold per year, but those gains tend to be spent on larger and more ambitious workloads rather than on shrinking total compute use. Efficiency raises what is possible rather than lowering demand.

Will demand keep growing at this pace?

No one knows. Demand has grown quickly and many forces point upward, but it is never guaranteed to continue at any particular pace. Outcomes depend on market conditions, so any decision should account for that uncertainty.

Legal disclaimer. Golden Core Mining is an AI infrastructure ownership and management company organized under United States law. Not investment advice. Not a broker, financial adviser, or securities provider. Golden Core Mining does not guarantee any operational benefit, utilization, or resale value. See the full risk disclosure.