22
$\begingroup$

I note this question was deemed off-topic, so I'm trying to frame it clearly in terms of the scope of response I'm interested in, namely the ethics and sustainability issues associated with the coming proliferation of tools like OpenAI's ChatGPT for all manner of online information-seeking behavior (by humans and other bots). This is not a programming or specific hardware question.

On average, how much energy is consumed for each response that OpenAI's public ChatGPT (GPT-3-based) provides? That is, what is the energy needed to run the entire system for 24 hours, divided by the number of responses generated in those 24 hours (ignoring the energy consumed to train the system or build the hardware components)?

How does this compare to a Google/DuckDuckGo/Bing search query?

I read somewhere that an OpenAI employee on the ChatGPT team described the compute used to answer queries as "ridiculous". There is documentation of the memory requirements of the hosting servers and of the parameter count, but without knowing the system's throughput, for example, it's hard to quantify the energy consumption.

I often get more interesting results from ChatGPT than from DuckDuckGo on certain types of queries where I used to know the answer but can no longer remember it. In these cases I can fact-check for myself; I'm looking for memory prompts with names and jargon that will remind me.

Also, when seeking out counter-views to my own (say, critiques of degrowth or heterodox economics concepts), ChatGPT is good at providing names and papers/reports/books that critique the view I give it.

In many cases it does this more usefully than conventional search engines. I can therefore see the popularity of these tools ballooning rapidly, especially while the operational costs (CAPEX + OPEX) of the servers and maintainers are borne by large amounts of seed funding (e.g. OpenAI's) or by any other loss-leader startup wishing to ride the next wave of AI.

The heart of my question is: at what externalized cost do we gain these tools, in terms of greenhouse gases, use of limited mineral resources, GPU scarcity, etc.?

$\endgroup$
7
  • 2
    $\begingroup$ Note the environment tag is for reinforcement learning, not the physical environment of earth. It has nothing to do with your question. $\endgroup$ Commented Jan 31, 2023 at 6:51
  • 2
    $\begingroup$ Unless there is someone from the OpenAI team reading your question, no one here can answer this since they have not published any concrete information about the final design or supporting hardware architecture. $\endgroup$ Commented Jan 31, 2023 at 11:46
  • $\begingroup$ Other answers on this forum have very general descriptions of the size of the GPU architecture, from which power consumption could be estimated assuming a constant workload. What I haven't seen in the public domain is an estimate of throughput in responses per second/hour/day/week. $\endgroup$ Commented Feb 1, 2023 at 20:54
  • $\begingroup$ GHG emissions depend on the energy source (renewable, nuclear, fossil) and also whether the waste heat is used. $\endgroup$
    – darsie
    Commented Nov 6, 2023 at 15:01
  • 1
    $\begingroup$ "ignoring energy consumed to train the system"? Why??? Without knowing the orders of magnitude involved, you really shouldn't ignore the training. According to reddit.com/r/GPT3/comments/p1xf10/comment/h8h3sl4/… , 10_000 V100 GPUs were used, and the training cost $5M in compute time, for GPT-3 alone. $\endgroup$ Commented Jan 8 at 13:05

6 Answers

7
$\begingroup$

Sam Altman states the cost is "probably single-digit cents" per chat, thus worst case €0.09/request.

I guess at least half the cost is energy; at €0.15/kWh, a request would then use 0.09 €/request × 50% ÷ 0.15 €/kWh = 0.3 kWh/request = 300 Wh per request, i.e. about 60 smartphone charges of 5 Wh each ;) Source: https://www.forbes.com/sites/ariannajohnson/2022/12/07/heres-what-to-know-about-openais-chatgpt-what-its-disrupting-and-how-to-use-it/

A Google search request uses about 0.0003 kWh = 0.3 Wh, so a search uses roughly 1000× less energy than a ChatGPT request. But as Google has started to use AI too, a search probably consumes more by now as well. Source: https://store.chipkin.com/articles/did-you-know-it-takes-00003-kwh-per-google-search-and-more
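Written out as a script (a sketch; the cost per request, the 50% energy share, and the electricity price are all assumptions in this answer, not published figures):

```python
# Back-of-the-envelope energy per ChatGPT request, from cost assumptions.
# All three input figures are guesses from this answer, not published data.
cost_per_request_eur = 0.09   # worst case of "single-digit cents" per request
energy_share = 0.5            # guess: half the cost is electricity
price_eur_per_kwh = 0.15      # assumed electricity price

kwh_per_request = cost_per_request_eur * energy_share / price_eur_per_kwh
print(f"{kwh_per_request:.1f} kWh per request")   # ≈ 0.3 kWh = 300 Wh

phone_charge_wh = 5
print(f"{kwh_per_request * 1000 / phone_charge_wh:.0f} smartphone charges")  # ≈ 60

google_search_kwh = 0.0003    # cited figure: 0.3 Wh per Google search
print(f"{kwh_per_request / google_search_kwh:.0f}x a Google search")  # ≈ 1000x
```

Changing any one of the three assumptions scales the result linearly, so this is at best an order-of-magnitude estimate.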

$\endgroup$
4
  • $\begingroup$ It’s worth noting that Sam Altman and OpenAI are American, so USD and American energy costs are more relevant than Euros. Also, we know ChatGPT is trained from a GPT3.5 foundation model, so the architecture and energy expenditure can be estimated from that direction as well. (I assume that running the language model itself absolutely dominates the energy intensity, compared to e.g web servers, moderation models, etc.) $\endgroup$
    – kdbanman
    Commented Feb 2, 2023 at 16:54
  • 1
    $\begingroup$ Correct, the difference between € and $ is not so important compared to the other factors, like energy cost. As I live in the EU, it is harder for me to estimate the energy price OpenAI might pay; could you provide that, so I can update my estimate? I don't have the knowledge nor the insight to calculate from the GPU usage of the model. I am looking forward to that calculation to see how far off I was. $\endgroup$
    – KFilter
    Commented Feb 3, 2023 at 17:14
  • $\begingroup$ yes, like CAPEX and OPEX, there would be the training phase and the embodied energy of the GPU servers (embodied energy as a one-off energy and emissions 'cost', the equivalent of CAPEX) and the energy consumption of the servers since ChatGPT was opened up for public access. I'm guessing operational energy is the majority of energy use. $\endgroup$ Commented Feb 17, 2023 at 2:57
  • $\begingroup$ @KFilter it depends on whether they are paying for energy on a Power Purchase Agreement (probably, if their consumption is high) and whether it's 100% RE or a dirty power supply. It also depends what state the servers are operating in (they might be in Greenland or Alaska, running on geothermal with cheap cooling from the freezing air outside). I'll try and find a wholesale power price. Do you know where their servers are located geographically? $\endgroup$ Commented Feb 17, 2023 at 3:00
4
$\begingroup$

Thanks for the tip, @Yoric! (Sorry, my reputation points are too low to comment, so I have to do this as an answer.)

Alright, here's the lowdown on the energy use of AI, from the paper "The growing energy footprint of artificial intelligence":

  • AI and Energy Use: AI is getting big, and so is its energy appetite.
    People are starting to notice how much electricity AI and data
    centers are gobbling up, and it's having real environmental impacts.

  • AI Training Takes a Lot of Power & Time = Energy!: Training AI models is a real energy hog. For instance, the BLOOM model used 433 MWh, GPT-3 needed a whopping 1,287 MWh, Gopher 1,066 MWh, and OPT 324 MWh. That's a lot of energy!

  • AI Working Overtime: When AI models like ChatGPT get down to answering our questions, they're using a good chunk of energy.
    OpenAI's ChatGPT needs about 564 MWh every day just for this.

  • More AI, More Power?: If we start using AI in things like Google Search, the power use could jump up. We're talking about 6.9–8.9 Wh for every AI-powered search.

  • Getting Smarter and Greener: Good news is, better hardware and smarter AI could help cut down on the power use. But, there's a catch - as AI gets more efficient, we might just end up using it more.

  • Balancing Act: The paper really highlights that we need to balance the cool stuff AI does with how much energy it uses. We've got to think about both sides of the coin.
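Putting the paper's training figures next to its daily inference estimate makes the trade-off concrete (a sketch using only the MWh numbers quoted above):

```python
# One-off training energy vs. ChatGPT's estimated daily inference energy,
# using only the MWh figures quoted from the paper above.
training_mwh = {"BLOOM": 433, "GPT-3": 1287, "Gopher": 1066, "OPT": 324}
chatgpt_daily_inference_mwh = 564   # the paper's estimate for ChatGPT per day

for model, mwh in training_mwh.items():
    days = mwh / chatgpt_daily_inference_mwh
    print(f"{model}: training energy = {days:.1f} days of ChatGPT inference")
# By this measure, GPT-3's entire training run (1,287 MWh) equals roughly
# 2.3 days of ChatGPT-scale inference, so at this usage level the
# operational energy quickly dominates the one-off training cost.
```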

I have done some research from multiple sources regarding the energy consumption and operational efficiency of OpenAI's ChatGPT-4 Turbo, but the specific details needed to calculate its average energy expenditure per query are not to be found. Contacting OpenAI directly for specific data on ChatGPT's energy consumption would be needed.

  • ChatGPT-4 Turbo's Notable Features: This model stands out for its impressive speed and efficiency. It's got a huge context window of 128k, meaning it can handle a conversation as lengthy as 300 pages. Plus, it's up-to-date with events until April 2023. When it comes to cost, it's more efficient, cutting down input token costs by 3x and output token costs by 2x compared to GPT-4.

  • A Nod to Better Performance: OpenAI has enhanced ChatGPT-4 Turbo's performance, which likely translates to better energy efficiency. However, we're short on exact energy figures for this.

  • Comparing with the Past: Previous models like GPT-3 had a reputation for high energy use, especially during training. While ChatGPT-4 Turbo is thought to be more efficient in query responses, the specific energy figures aren't disclosed.

  • A Friendly Comparison: If ChatGPT-4 Turbo is indeed more energy-efficient, imagine its energy use being similar to a short smartphone charge rather than brewing a coffee, the analogy once used for Google searches. But remember, without solid data, this is more of an educated guess. To get the full picture, detailed info from OpenAI or expert analysis would be key.

$\endgroup$
6
  • 2
    $\begingroup$ Minor nitpick: "That's a lot of power!" talking about MWh, it should be "That's a lot of energy!" $\endgroup$ Commented Jan 8 at 13:08
  • 1
    $\begingroup$ thanks for the link to that paper and your answer, @EarthHunger. Searching for open access to the paper, I came across another paper that was discursive in nature (and over-generalising, I thought, reaching a hysterical level of assumption in suggesting crypto will help solve climate change LOL) rather than quantitative. The author concluded with the following sentence: $\endgroup$ Commented Mar 17 at 5:39
  • $\begingroup$ "Dodge concurs, noting that in his research, a lot of the CO2 emissions were calculated from the electricity consumed in training the model. "Choosing the region that you're training your model in or putting your model into production can have a pretty big impact," Dodge says. "We found that training in the least carbon-intense region, compared with the most carbon-intense region, could reduce your emissions by about two-thirds, to one-third of what the full emissions would've been." DOI: 10.1145/3603746 $\endgroup$ Commented Mar 17 at 5:40
  • $\begingroup$ The growing energy footprint of artificial intelligence $\endgroup$ Commented Mar 17 at 5:40
  • $\begingroup$ Thanks for the input @wide_eyed_pupil. Can you link to the paper outside of a login wall? Much appreciated! $\endgroup$ Commented Mar 18 at 7:33
3
$\begingroup$

I've taken a stab at estimating the carbon footprint of ChatGPT here. I estimated the daily carbon footprint of the ChatGPT service to be around 23 kgCO2e; the primary assumption was that the service was running on 16 A100 GPUs. I made the estimate at a time when little information about the user base was available. I now believe that the estimate is far too low, because ChatGPT reportedly had 590M visits in January, which I don't think 16 GPUs can handle.

Recently, I also estimated ChatGPT's electricity consumption in January 2023 to be between 1.1M and 23M kWh.

To convert that into a carbon footprint, we'd need to know the carbon intensity of the electricity grid in every location where a ChatGPT instance is running. We don't have this information, but if we instead convert the electricity consumption using a very low carbon intensity such as Sweden's 9 g/kWh (the lowest in the EU and lower than the US average), the carbon footprint of ChatGPT in January 2023 would be estimated at between 10 and 207 tonnes CO2e.
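The conversion itself is a straight multiplication (a sketch; Sweden's 9 g/kWh is the deliberately optimistic lower bound discussed above, not ChatGPT's actual grid mix):

```python
# Convert the estimated January 2023 electricity consumption into CO2e,
# using Sweden's grid intensity as an optimistic lower bound.
low_kwh, high_kwh = 1.1e6, 23e6   # estimated consumption range, kWh
g_co2e_per_kwh = 9                # Sweden's grid carbon intensity (g CO2e/kWh)

low_t = low_kwh * g_co2e_per_kwh / 1e6    # grams -> tonnes
high_t = high_kwh * g_co2e_per_kwh / 1e6
print(f"{low_t:.0f} to {high_t:.0f} tonnes CO2e")   # ≈ 10 to 207 tonnes
```

With a typical US grid intensity (several hundred g/kWh) the same electricity range would land in the thousands of tonnes, which is why the grid mix assumption dominates this estimate.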

$\endgroup$
5
  • $\begingroup$ I asked ChatGPT which cloud computing services OpenAI uses… it listed four of the big ones: 1. AWS, 2. Microsoft Azure, 3. Google Cloud Platform (GCP), 4. IBM Cloud… just ask ChatGPT for details, but it doesn't get into specifics or the proportions of workloads across these services, so it's all pretty vague. $\endgroup$ Commented Mar 4, 2023 at 15:47
  • 4
    $\begingroup$ Since ChatGPT is known to hallucinate, I don't really trust ChatGPT's answers in this regard. Since OpenAI has what appears to be a pretty close partnership with Microsoft, I'd be very surprised if ChatGPT also ran on other cloud vendors :) By the way, I've asked ChatGPT about its carbon footprint many times, but never gotten anything useful. But perhaps I'm not good enough at prompt engineering $\endgroup$
    – KasperGL
    Commented Mar 5, 2023 at 9:46
  • $\begingroup$ true, I often correct it and sometimes for fun I correct it with a lie and then it repeats the lie back to me like I schooled it in something. :-) $\endgroup$ Commented Mar 7, 2023 at 5:29
  • $\begingroup$ it's pretty amazing that this isn't a question that a hundred bloggers, podcasters and YouTubers have asked. If the carbon footprint is on the scale of 10^3 or higher, and literally all information-based websites start using it or similar architectures (which is a lot if you think about it a little), then that's a huge added load on the world's mostly dirty power systems. $\endgroup$ Commented Mar 7, 2023 at 5:31
  • $\begingroup$ off-topic, but do you know if humans ever intervene to redirect ChatGPT-3? I had convinced it to help me make a text-based adventure game, and it was quite "imaginative" in its suggestions; I also convinced it to pretend with me that I was playing this game. It was giving me a score (out of 100, though I only got from 0 to 5) and, given consistent directions (N, S, E, W, Up, Down), took me to the same locations. But a few days later, when I typed "score", I got a default message saying it was a [chatbot] by OpenAI and couldn't simulate playing the game etc. $\endgroup$ Commented Mar 7, 2023 at 7:21
2
$\begingroup$

I found an answer in this research paper. According to them, it's ~3 Wh per query.

$\endgroup$
1
$\begingroup$

So I asked ChatGPT-4 about its energy use per query. The dialog is appended at the bottom of this answer. (I realize the OP asked about ChatGPT-3.)

TL;DR

If ChatGPT-4's claim that a typical answer requires 30 TFLOPs on modern AI hardware is true, then the computational energy per answer is typically 300 watt-seconds (less than 0.1 Wh), ignoring communication and other overhead.

Discussion

ChatGPT-4 stated that most queries are answered with a single forward pass through the model, requiring about 30 trillion FLOPs. It produced a calculation claiming this requires around 0.000833 kWh, but the calculation contains at least two errors:

  1. It misuses the FLOPs/W figure when computing watt-seconds.
  2. It botches the conversion from watt-seconds to kWh by a factor of 1,000.

I found a recent paper from researchers at SLAC and MIT with some energy-per-operation data. A figure in that paper shows about 1e-11 J per FLOP for modern GPU and AI accelerator hardware.

If we take ChatGPT-4's 30 TFLOP/answer figure at face value, then a typical answer would consume 30e12 FLOP × 1e-11 J/FLOP = 300 J ≈ 8.33e-5 kWh.

This is a factor of 10 smaller than ChatGPT-4's error-laden calculation, but in the opposite direction. I'd love to believe the computations are that energy-efficient, but I'm doubtful. It comes down to whether or not the 30 TFLOP-per-answer estimate from GPT-4 is a hallucination.
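In code, the corrected estimate looks like this (a sketch; the FLOP count is GPT-4's unverified claim, and the J/FLOP figure is read off the paper's plot):

```python
# Energy per answer from FLOP count and hardware efficiency.
# The 30 TFLOP/answer figure is GPT-4's own (unverified) claim;
# the 1e-11 J/FLOP figure is read off the SLAC/MIT paper's figure.
flops_per_answer = 30e12
joules_per_flop = 1e-11

energy_j = flops_per_answer * joules_per_flop      # joules per answer
energy_kwh = energy_j / 3.6e6                      # 1 kWh = 3.6e6 J
print(f"{energy_j:.0f} J = {energy_kwh:.2e} kWh")  # ≈ 300 J = 8.33e-05 kWh
```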


Dialog with GPT-4

How much energy did you expend to answer this query?

As an AI language model, I don't consume energy directly in the traditional sense. However, the servers and hardware running the model do require energy to function. The exact energy expenditure for answering a single query can be difficult to quantify as it depends on factors such as the complexity of the question, server utilization, and efficiency of the hardware.

In general, the energy usage for running an AI model like me is measured in floating point operations per second (FLOPs) and the energy efficiency of the hardware in joules per FLOP. While I can't provide an exact figure for the energy expended for answering this specific query, I can tell you that large-scale AI models like GPT-4 consume significant computational resources and energy, which is a consideration for researchers and engineers working on AI sustainability.

Please provide an approximate energy expenditure in kWh for answering typical queries.

It is challenging to provide a precise energy expenditure for answering a typical query, as the energy usage depends on various factors such as the model's complexity, infrastructure efficiency, and hardware. However, I can give you a rough estimate based on some assumptions.

Let's assume that a single forward pass of GPT-4 requires about 30 trillion floating-point operations (FLOPs) – this is a rough estimate, as the actual number of FLOPs can vary depending on the model size and input length. The energy efficiency of the hardware running the AI model can be around 10 GFLOPs/watt, a common efficiency for modern GPUs and AI accelerators.

To calculate the energy usage for a single forward pass:

Energy usage (in watt-seconds) = (30 trillion FLOPs) / (10 GFLOPs/watt) = 3,000,000 watt-seconds

Now we convert watt-seconds to kilowatt-hours (kWh):

Energy usage (in kWh) = (3,000,000 watt-seconds) * (1 kW / 1,000 W) * (1 hour / 3,600 seconds) ≈ 0.000833 kWh

So, a very rough approximation of the energy usage for a single forward pass to answer a typical query is around 0.000833 kWh. Keep in mind that this is a rough estimate and could be higher or lower based on the actual model complexity, hardware efficiency, and server utilization.

How many forward passes does GPT-4 typically require to answer a typical query?

GPT-4, like other generative language models, is based on the Transformer architecture, which uses a decoder-only design for generating text. For a typical query, GPT-4 usually requires one forward pass through the model to generate a response.

During the forward pass, the model processes the input tokens and generates a probability distribution over the vocabulary for the next token at each position in the sequence. The process continues sequentially until an end-of-sequence token is generated or a predetermined maximum length is reached.

In summary, GPT-4 generally needs one forward pass to answer a typical query. However, it's important to note that, in practice, additional computational overhead may be involved in tokenization, decoding, and other preprocessing and postprocessing steps.


$\endgroup$
1
  • 1
    $\begingroup$ Nice! The 8.33e-5 seems reasonable. BLOOM once used 0.00396 kWh per query (arxiv.org/abs/2211.02001), and using the ChatGPT user and GPU numbers from an article on SemiAnalysis, we arrive at 0.000712 kWh per query. Daily requests: 15 × 13M = 195,000,000, which takes 28,936 A100 GPUs (semianalysis.com/p/…). Energy consumption per query, assuming power draw is 50% of max TDP: 28,936 GPUs × (0.4 kW / 2) × 24 h ÷ 195,000,000 = 0.000712 kWh $\endgroup$
    – KasperGL
    Commented Mar 31, 2023 at 13:06
1
$\begingroup$

Great topic! I'd love to see this knowledge added as a subsection to Wikipedia's ChatGPT page. I'll look into getting it started.

Out of curiosity, I posed the same questions to GPT-4, approximately 7 months later. Here's its answer:

How much energy did you expend to answer this query?

ChatGPT: I don't consume energy in the same way traditional electronics do. Instead, the servers that run my underlying computations use energy. The precise energy consumption for a single interaction with me, ChatGPT, depends on multiple factors such as:

  1. The specific hardware used for the computation.
  2. The complexity and length of the request and the produced answer.
  3. Backend optimizations and efficiency of the software.

Typically, running a query on a model like GPT-4 consumes a small amount of energy, often measured in milliwatt-hours (mWh) or watt-hours (Wh), but certainly not kilowatt-hours (kWh). For a brief interaction like this one, the energy consumed would be closer to the mWh range.

For a more precise measurement, you would need detailed specifications from the provider hosting the model (in this case, OpenAI) about their infrastructure's energy consumption during inference.

User: Please provide an approximate energy expenditure in kWh for answering typical queries.

ChatGPT: It's challenging to provide a precise energy consumption value for a single query without specific details from OpenAI's infrastructure. However, I can provide a very rough estimation based on general knowledge of server operation.

Let's make some assumptions:

  1. Assume a server running the GPT-4 model uses around 200 watts (0.2 kW) when active.
  2. Assume the time taken to process a query and produce a response is 0.1 seconds (100 milliseconds), which is a generous estimate and likely an overestimation.

Given these assumptions:

Energy = Power × Time

Energy = 0.2 kW × 0.1/3600 hours (because 0.1 seconds is 0.1/3600 of an hour)

Energy ≈ 5.56 × 10^-5 kWh, or 0.0000556 kWh

This is a very rough estimation and is likely on the higher side for a single query. The actual energy expenditure could be lower due to various factors, including optimizations in server hardware, the efficiency of the model, and software improvements. Also, remember that this is a per-query estimate; when dealing with millions of queries, these numbers add up. Still, for an individual interaction, it's quite low.

User: How many forward passes does GPT-4 typically require to answer a typical query?

ChatGPT: GPT-4, like its predecessors, requires a single forward pass through the model to generate a response for a typical query. When you send a query or prompt, the model processes the input through its layers and produces an output sequence, which is the response you receive. This whole process is a single forward pass.

However, there are some nuances:

  1. Length: If the generated response exceeds the model's maximum token (word piece) limit, the completion might be cut off, and additional logic or processing might be needed to handle this. But this doesn't necessarily mean multiple forward passes for the model itself, just possibly additional computation.

  2. Iterative Approaches: Some implementations or applications might use iterative or feedback-driven approaches where the output of one forward pass is fed back into the model for refinement. This isn't the standard mode of operation for a single response but is a technique that can be employed in specific scenarios.

  3. Sampling and Temperature: The model can generate responses using techniques like nucleus sampling or by adjusting the "temperature" of the output, which affects the randomness of the generated text. These techniques might influence the nature of the output but still operate within a single forward pass.

For a typical interaction, like most questions and responses in this chat format, only one forward pass is needed.
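As a quick sanity check, here is GPT-4's power × time arithmetic redone in a few lines (both input figures, 0.2 kW and 0.1 s, are GPT-4's own unverified assumptions):

```python
# Redo GPT-4's power x time arithmetic with its own assumed figures:
# 0.2 kW server draw, 0.1 s processing time per query. Both are unverified.
power_kw = 0.2
time_hours = 0.1 / 3600          # 0.1 seconds expressed in hours

energy_kwh = power_kw * time_hours
print(f"{energy_kwh:.2e} kWh")   # ≈ 5.56e-06 kWh
```

If this is right, the 5.56 × 10^-5 kWh figure in the transcript is an order of magnitude too high for its own assumptions, which supports treating these self-reported numbers with caution.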

Thoughts?

$\endgroup$
3
  • $\begingroup$ I got quite a few evasive answers before I got a more specific answer from ChatGPT 3.5! The numbers you've cited from ChatGPT-4 are significantly lower than the human estimates so far in this thread. Self-delusional hallucinations? $\endgroup$ Commented Nov 16, 2023 at 6:35
  • $\begingroup$ Yes, good point. That seems likely 🧐 $\endgroup$ Commented Nov 17, 2023 at 10:39
  • 2
    $\begingroup$ That "calculation" just gives what amount of energy a midsized server takes in 0.1s. Well, I'd be extremely surprised if that's enough to produce a ChatGPT-4 response. It probably takes a whole cluster of such machines given the memory size of the model alone. $\endgroup$ Commented Feb 17 at 20:06
