
How can GPT-4 solve complex calculus and other math problems? I believe these problems require analytical reasoning and the ability to compute with numbers. Does it still use an LLM for this, or does it add something on top of one?

Here is the link to the official results published by OpenAI

  • Probably still an LLM. Examples? Commented Mar 23, 2023 at 0:03
  • But you cannot solve mathematics with a next-word-prediction technique. If that were the case, all English majors could be mathematicians. Commented Mar 23, 2023 at 0:25
  • Sure you can, as long as the prediction model is complex enough. It's also possible they hooked it up to a math machine, but I still don't see any examples? Commented Mar 23, 2023 at 0:26
  • @desert_ranger Your argument is basically "reductio ad absurdum". – Dr. Snoopy Commented Mar 23, 2023 at 0:30
  • This is basically the only technical information available: arxiv.org/abs/2303.08774 – Dr. Snoopy Commented Mar 23, 2023 at 1:05

5 Answers


Large Language Models actually can do math. It's an "emergent" property, i.e. it appears only at larger scales. Understanding complex English language does require some analytical ability, which can carry over to math tasks like calculus and even arithmetic. Numbers can be represented as words, so it's definitely not unthinkable that an LLM could learn to add and subtract if it were to see enough examples.
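
To make the "numbers as words" point concrete, here is a minimal sketch (my own illustration, not something from the answer) using OpenAI's open-source tiktoken tokenizer; treating cl100k_base as the relevant encoding is an assumption. It only shows that an arithmetic statement becomes an ordinary token sequence, which is the form a next-token predictor actually sees.

```python
# Sketch: how an arithmetic statement looks to a next-token predictor.
# Assumption: the cl100k_base encoding is representative of what recent
# OpenAI chat models use; the specific numbers are arbitrary.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "12345 + 67890 = 80235"
token_ids = enc.encode(text)
pieces = [enc.decode([t]) for t in token_ids]

print(token_ids)  # a short list of integer token ids
print(pieces)     # the string piece each id maps to, e.g. ['123', '45', ' +', ...]
```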

The graph below, from the 2022 paper "Emergent Abilities of Large Language Models", shows that these properties "spontaneously" emerge as models get larger. We're interested in panel (A) here. Up to about 10^22 training FLOPs, the models studied (the largest models available at the time) have basically no arithmetic ability, but scaling the models further rapidly improves their capabilities. We don't know the internals of GPT-4, but it should be larger than these models, so it was expected that it would be better at arithmetic.

It also goes the other way around: Numeracy enhances the Literacy of Language Models.

[Figure: emergent abilities of large language models appearing as model scale increases]

  • Language models model the probability P(w_k | w_{k-i}, ..., w_{k-1}) for words w_k, that is, the probability of a word w_k appearing after the context w_{k-i}, ..., w_{k-1}. Furthermore, models have been trained to do math computations directly. If the training dataset contains enough examples of math tasks, the model may be able to reproduce these examples, and with enough parameters and enough samples even find some rules. However, it's not very reliable (for now); it's always based on statistics in the end. Commented Apr 13, 2023 at 3:10
  • Do we have any reason to believe that they're still using a basic LLM? A mixed architecture seems like such low-hanging fruit. – Nat Commented Apr 13, 2023 at 5:21
  • Saying that they can do math makes no sense when they can't even count. Doing math is not just choosing an answer based on what is most likely. You need to do it with probability 1 or 0; there's nothing in between. Everything in between is not math, and that's what all these models do. So these models do everything except math. – nbro Commented Apr 17, 2023 at 11:46
  • Damn... Commented Jul 27, 2023 at 22:38

ChatGPT now uses Wolfram Alpha to deal with math as well as other factual information.

https://writings.stephenwolfram.com/2023/03/chatgpt-gets-its-wolfram-superpowers/
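
For a sense of what delegating math to Wolfram|Alpha looks like outside of ChatGPT, here is a rough sketch against the Wolfram|Alpha Short Answers API. The endpoint follows Wolfram's public documentation, but the app ID is a placeholder and the query is only an example of what a plugin might forward.

```python
# Sketch: let an external math engine do the computation instead of the LLM.
# Assumptions: you have obtained your own Wolfram|Alpha app ID; the query
# string is an arbitrary example.
import requests

WOLFRAM_APPID = "YOUR_APPID_HERE"  # placeholder, not a real key

def ask_wolfram(query: str) -> str:
    """Send a natural-language math query to the Short Answers API and return the plain-text result."""
    resp = requests.get(
        "https://api.wolframalpha.com/v1/result",
        params={"appid": WOLFRAM_APPID, "i": query},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.text

# Example usage:
# print(ask_wolfram("integrate x^2 * sin(x) dx"))
```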

  • Wolfram is offered as a plugin and not used by default to solve mathematical equations with ChatGPT. – Brian Commented Jul 25, 2023 at 22:50

As far as we know, GPT-4's core capabilities are still based mainly on a Large Language Model (LLM).

If so, then the apparent ability to reason is a somewhat surprising emergent phenomenon of a well-trained sequence-prediction engine that has been trained on large amounts of data and has the capacity to build highly complex rules that approximate a "next symbol oracle".

Again, assuming this assertion is correct, the maths and logic capabilities of ChatGPT divide into a few different possibilities (these are not formal classifications, just my amateur analysis):

  • Rote learning of symbol associations. This is likely to occur with commonly-occurring items in the training data. Special values of trigonometry functions for example.
  • Things that look like logical reasoning, but are simply well-formed sentences that are on-topic. This is something we can easily be fooled by. When ChatGPT gives an explanation for a thing, it may not have any representation of it beyond being in an "explainy" state, and generating text that fits.
  • Approximate rules and processes. The LLM is a complex neural network, and can in principle learn arbitrary internal functions in order to predict sequence continuation. It learns these statistically, and there will be limitations - for example, it is unlikely that it could learn to produce cryptographic hashes given only examples. But it may really learn to add two numbers across a wide range of numerical values given thousands of examples.
  • Logic processes embedded in the form of output text. I have seen many examples where an LLM gets a correct answer when it is allowed to "think things through" by showing the working out, whilst forcing a direct answer will be wrong (see the sketch after this list).
  • Accurate rules and processes. Some rules in maths are very language-like and could be learned very well by an LLM. That could include some mathematical symbol manipulations such as variable substitution.
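
As a concrete illustration of the "showing the working out" point above, here is a hedged sketch using the OpenAI Python SDK. The model name gpt-4 and the prompt wording are assumptions made for the example, and this says nothing about how GPT-4 is built internally; it only contrasts a forced direct answer with a step-by-step one.

```python
# Sketch: compare a forced direct answer with a "show your working" answer.
# Assumptions: OPENAI_API_KEY is set in the environment and "gpt-4" is an
# available model name; the arithmetic question is arbitrary.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = "What is 487 * 362?"

def ask(system_hint: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system_hint},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

# Forcing a bare answer: the model must emit the result in one shot.
direct = ask("Answer with the final number only.")

# Letting it show its working: intermediate steps become part of the generated
# text, so later tokens can condition on them.
stepwise = ask("Work through the problem step by step, then state the final number.")

print(direct)
print(stepwise)
```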

I expect that all the above are occurring in some mix.

For example, you could conjecture that there is a reasonable chance that GPT can internally count accurately up to some moderate number, and re-use that ability in different contexts to predict numerical symbols and associated words (e.g. "one" also has the representation "1"). It may also contain more than one such semi-accurate counter, used in different contexts.

The sheer quantity of training material - more than any single person could consume in their lifetime - plus the learning capacity of the neural network means that there are probably a lot of simple, subject-dependent rote rules. However, in some areas the model will have a "deeper understanding", in that it will have learned reasonably complex manipulations and used them to predict a sequence of symbols as accurately as possible while using as little of its learning capacity as possible (because it is asked to predict text in a huge range of contexts, it benefits from compressing its rules).

GPT has not learned primarily by reasoning from first principles, though. Its inner representations and logical units of work are likely to be quite alien to humans, and may freely combine grammar, sentiment, mathematical building blocks and other symbolic context in ways that could seem very odd if they could even be explained. This heady mix, which occurs in most neural networks during training, is one reason why it is unlikely that OpenAI has wired in separate logic modules for highly structured processing such as math symbols or calculations. Providing such modules is possible, but detecting when to use them and how to wire them into the network are both hard problems.


OpenAI's CEO explicitly mentioned last month in the GPT-4 announcement video that GPT-4 isn't hooked up to a calculator.

One can, however, install plugins on top of ChatGPT, which may connect it to other resources such as Wolfram|Alpha, as mentioned in Jaume Oliver Lafont's answer.


There is a folk story about J.W. Gibbs that goes something like:

Being a famous scientist, Gibbs was a member of a number of scientific bodies. He was bored by them and never took the podium, except for one time: the discussion was about redirecting some teaching effort from mathematics towards foreign languages. That one time, Gibbs decided to give a speech. He said: "Mathematics is a language."

I don't know if this story is true or not, but I share the attitude.

  • Finally an answer which makes sense. At least GPTs understood it. Don't know when humans will. Commented Jul 27, 2023 at 22:39
