Why is this comparison even important?
Generative AI shifts search from “find me pages” to “write me an answer.” For readers this means faster insight; for businesses, better processes; and for the environment, a new question: how much energy does such an answer actually consume compared to a classic Google search? Understanding this relationship helps us make informed decisions about when AI really adds value and when it is better to stick with search or simpler models.
How energy is consumed in a classic search
When you search, the browser sends a short request to data centers, where it is matched against pre-built indexes and caches. Much of the work is done in advance (indexing the web), so the inference part of an individual query is short and energy-efficient. Typically we are talking about a fraction of a watt-hour (about ~0.3 Wh per query), with small variance: most queries are similarly “lightweight.”
Why generative AI can be more expensive
An LLM does not return links; it generates text. This requires:
- More computation per token (each new word is the result of matrix multiplications and attention over the context).
- More context (longer prompts → more tokens → more compute cycles).
- Multimodality (image, sound, video) and especially “reasoning” tasks, where the model makes inferences in multiple steps.
- Less result caching (answers are unique), so there are far fewer duplicate hits than in search.
 
Together, this means that consumption can climb quickly: sometimes it stays comparable to search, sometimes it is an order of magnitude higher. A rough back-of-envelope model is sketched below.
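
To make the token arithmetic concrete, here is a minimal back-of-envelope sketch in Python. The per-token energy figures are illustrative assumptions, not measured constants; real values depend heavily on model size, hardware, and batching.

```python
# Back-of-envelope estimate of energy per LLM response.
# ASSUMPTION: the Wh-per-token values are illustrative placeholders,
# not measured constants; real numbers vary with model, hardware, batching.

WH_PER_OUTPUT_TOKEN = {
    "small-optimized-model": 0.001,   # assumed ~1 mWh per token
    "large-reasoning-model": 0.010,   # assumed ~10 mWh per token
}

def estimate_wh(model: str, output_tokens: int, overhead_wh: float = 0.05) -> float:
    """Rough energy (Wh) for one response: per-token cost plus fixed overhead."""
    return overhead_wh + WH_PER_OUTPUT_TOKEN[model] * output_tokens

# A short answer from a small model lands near a classic search (~0.3 Wh)...
print(estimate_wh("small-optimized-model", output_tokens=200))   # ~0.25 Wh
# ...while a long multi-step answer from a large model does not.
print(estimate_wh("large-reasoning-model", output_tokens=1500))  # ~15 Wh
```
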
Fair comparison across scenarios
Below are indicative ranges for energy consumption per response/query. The figures are approximate (they do not necessarily include all infrastructure overhead), but they show the differences between tasks.
| Scenario | Typical consumption (Wh per response) | Note | 
|---|---|---|
| Classic search (Google) | ~0.3 | Short, standardized process via indexes/caches. |
| AI: short text prompt | ~0.24–0.3 | Today's optimized systems are comparable to search (Google Cloud). |
| AI: “reasoning”/longer answer | ~5 and up | Token count and multi-step reasoning drive consumption up (IEA). |
| AI: image → generation/analysis | ~1–2 | Images are significantly more expensive than text (IEA). |
| AI: short video → generation | ~100+ | Example: ~115 Wh for ~6 seconds of video (IEA). |
What to remember: on short text tasks, AI can rank alongside search; on complex and multimodal tasks it easily jumps to 10× or more. For scale, one ~115 Wh video clip costs roughly as much energy as ~380 classic searches (115 / 0.3 ≈ 380).
What has the greatest impact on consumption in practice?
- Length of the prompt and response: more tokens = more computation.
- Model selection: smaller/optimized models are significantly more economical than huge “reasoning” models.
- Usage patterns: batching multiple questions, reusing context (cache), and avoiding unnecessary image/video generation (see the sketch after this list).
- Infrastructure: newer accelerators, better orchestration, and low-carbon electricity (where/when available) reduce the footprint.
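
To illustrate the batching point above, a minimal sketch: instead of N separate requests that each re-send the same instructions, the questions travel in one request. Here `call_model` is a hypothetical stand-in for whatever client you actually use.

```python
# Minimal sketch of batching: one request instead of N.
# ASSUMPTION: call_model() is a hypothetical stand-in for your LLM API.

SYSTEM_INSTRUCTIONS = "Answer each question concisely, one line per answer."

def call_model(prompt: str) -> str:
    """Hypothetical LLM call; replace with your provider's client."""
    raise NotImplementedError

def answer_separately(questions: list[str]) -> list[str]:
    # N calls: the shared instructions are sent and processed N times.
    return [call_model(f"{SYSTEM_INSTRUCTIONS}\n\nQ: {q}") for q in questions]

def answer_batched(questions: list[str]) -> str:
    # 1 call: the shared instructions are processed once.
    numbered = "\n".join(f"{i + 1}. {q}" for i, q in enumerate(questions))
    return call_model(f"{SYSTEM_INSTRUCTIONS}\n\nQuestions:\n{numbered}")
```
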
 
How to reduce your footprint
For users
- Ask short, to-the-point questions and request concise answers.
- Generate images/videos only when they bring added value.
 
For developers/products
- Default to smaller models; enable larger and “reasoning” models conditionally (feature flags).
- Cache partial results and batch requests.
- Introduce length limits (max tokens), trim context, and use retrieval instead of blindly loading large instructions.
- Measure consumption in production (telemetry) and tune parameters (temperature, max tokens). A sketch combining these techniques follows this list.
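
A minimal sketch that combines these points, assuming a hypothetical `call_model(model, prompt, max_tokens)` client; the model names and the 300-token cap are illustrative, not recommendations.

```python
import hashlib

# ASSUMPTION: call_model() and the model names are hypothetical placeholders;
# wire in your real client, models, and limits.
SMALL_MODEL = "small-optimized-model"
LARGE_MODEL = "large-reasoning-model"
_cache: dict[str, str] = {}

def call_model(model: str, prompt: str, max_tokens: int) -> str:
    """Hypothetical LLM call; replace with your provider's client."""
    raise NotImplementedError

def answer(prompt: str, needs_reasoning: bool = False) -> str:
    # Caching: identical prompts are answered once.
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key in _cache:
        return _cache[key]
    # Routing: default to the small model; escalate only behind a flag.
    model = LARGE_MODEL if needs_reasoning else SMALL_MODEL
    # Length limit: cap output tokens so answers cannot balloon.
    result = call_model(model, prompt, max_tokens=300)
    # Telemetry: record what was spent so parameters can be tuned later.
    print(f"model={model} prompt_chars={len(prompt)}")
    _cache[key] = result
    return result
```
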
 
For businesses/IT
- Route tasks to times/regions with more low-carbon energy.
- Consider quantization/distillation of internal models and rules for economical use (e.g. “no-image-by-default”); a quantization sketch follows this list.
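
For the quantization point, a minimal sketch using PyTorch's built-in dynamic quantization on a toy model. Production LLMs usually need dedicated tooling, but the principle is the same: int8 weights mean less compute and memory traffic per call.

```python
import torch
import torch.nn as nn

# Toy stand-in for an internal model; real LLMs need dedicated tooling,
# but the principle (int8 weights -> cheaper inference) is the same.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 128))

# Dynamic quantization: Linear weights are stored as int8 and activations
# are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface, cheaper inference
```
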
 
Frequently asked questions and myths
Is AI always 10× more energy-hungry than Google?
No. With short text prompts it can today be comparable to search; the difference explodes with complexity (reasoning, images, video).
Do several smaller calls consume less than one long one?
Not necessarily. If each call involves a large model and re-sends a long context, one well-designed longer call can be more efficient. For example, five calls that each re-send a 2,000-token context process 10,000 context tokens in total, versus 2,000 for a single combined call.
Do “green” power sources solve everything?
They help, but efficiency at the source (model, call, architecture) remains key. Consuming less is always better than offsetting afterwards.
Conclusion
AI is neither an inherent “energy hog” nor “free magic”. The task context is what stretches the spectrum from consumption comparable to search (short text) to an order of magnitude higher (reasoning, images, video). If you want good results with a small footprint, optimize the model, the call, and the pipeline, and consider whether a generative answer is really the best choice for the task at hand.
Sources (for key figures):
- Google Cloud: Measuring the environmental impact of AI inference (median ~0.24 Wh per text prompt).
- IEA: Energy and AI (comparative ranges for text/image/video, e.g. ~115 Wh for a short video).
 
