24

ChatGPT is a language model. As far as I know, it receives text as tokens and word embeddings. So how can it do math? For example, I asked:

ME: Which one is bigger, 5 or 9?
ChatGPT: In this case, 9 is larger than 5.

One could argue that GPT just sees numbers as tokens, and that its training dataset contained statements that 9 is bigger than 5, so it has no actual mathematical understanding. But I don't think that explanation holds, because of this question:

ME: Which one is bigger? 15648.25 or 9854.2547896
ChatGPT: In this case, 15648.25 is larger than 9854.2547896.

We can't claim that it literally saw in its dataset that the token 15648.25 is bigger than the token 9854.2547896!

So how does this language model understand numbers?
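For reference, here is a minimal Python sketch of the tokenization I have in mind, using OpenAI's tiktoken library (which encoding a given chat model actually uses is an assumption on my part):

    import tiktoken

    # cl100k_base is one of OpenAI's public BPE encodings; assuming it here.
    enc = tiktoken.get_encoding("cl100k_base")

    for s in ["5", "9", "15648.25", "9854.2547896"]:
        # Encode the string to token ids, then decode each id individually
        # to see how the number is split into pieces.
        pieces = [enc.decode([t]) for t in enc.encode(s)]
        print(s, "->", pieces)

    # A long number is typically split into several multi-digit chunks,
    # so the model never sees "15648.25" as one single token.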

  • Arithmetic is briefly described in this paper starting on page 21. Google also has some interesting discussion here.
    Commented Dec 8, 2022 at 23:27
  • I find it curious that people are asking these questions about ChatGPT, and not about GPT-2, released in February 2019 and also able to answer these kinds of questions.
    Commented Dec 12, 2022 at 16:43
  • @user253751 The GPT-2 that I tried couldn't do these things.
    – Peyman
    Commented Dec 12, 2022 at 19:05
  • It can even do some integrals (though it gets more complicated integrals wrong). It even correctly provided the infinite-matrix form of the derivative operator to me.
    – Anixx
    Commented Dec 26, 2022 at 14:23
  • It does not actually cope well with math. After some interesting and astounding answers to coding questions, I wondered how it reacts to math. I asked for square and cube roots and vice versa, and it failed badly. The cube root of 8 was 2, and of 9 was also 2. Later, after some discussion, it came up with the correct answer. But going the opposite way it failed again.
    – datenheim
    Commented Jan 13, 2023 at 21:38

4 Answers

11

Adding on to txopen's answer, it is interesting to note that for larger numbers with similar digits ChatGPT is unable to make any useful distinctions. For instance:

Me: Which number is bigger: 1234.12 or 1243.12

ChatGPT: Both numbers are equal.
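To see why comparing numbers at the surface level of their digit strings is unreliable, here is a small Python illustration of mine (not a claim about ChatGPT's internals):

    # Comparing numbers as character strings, the way a purely
    # token-level system conceivably might, diverges from true
    # numeric comparison.
    pairs = [("5", "9"), ("15648.25", "9854.2547896"), ("1234.12", "1243.12")]
    for a, b in pairs:
        lexical = max(a, b)             # character-by-character comparison
        numeric = max(a, b, key=float)  # actual numeric comparison
        print(f"{a} vs {b}: lexical -> {lexical}, numeric -> {numeric}")

    # Lexically, "9854.2547896" beats "15648.25" because '9' > '1',
    # even though 15648.25 is numerically larger.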

  • I expect that if you try repeatedly with the same two "harder" numbers, you will get a range of answers. It is working statistically and stochastically based on text tokens. Posting single examples may not always tell the whole story.
    Commented Jan 12, 2023 at 23:27
  • Today the example above seems to give the right answer (human model adjustment?). But I found another simple example where it fails: "Can you divide 1231231231 by 2?" "Sure, I can divide 1231231231 by 2. The result is 6156156115.5." (The correct result is 615615615.5.)
    Commented Jan 23, 2023 at 12:45
6

I think the dataset is so large, and the model so well trained, that it has picked up the probabilistic correlation between the length of the digit string before the decimal point and relative magnitude, and beyond that the influence of each individual digit on the probability of one number being larger than another. The concrete example does not have to be in the dataset: the model predicts the correct outcome because the relationship between relative size, digit values, and digit-string length is sufficiently well represented in the training data.
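As a purely illustrative sketch of that heuristic (my own toy code, not the model's actual mechanism), assume we compare integer-part lengths first and then characters left to right:

    # Toy digit-comparison heuristic: the longer integer part wins;
    # otherwise the first differing character decides.
    def bigger(a: str, b: str) -> str:
        int_a, int_b = a.split(".")[0], b.split(".")[0]
        if len(int_a) != len(int_b):
            return a if len(int_a) > len(int_b) else b
        for ca, cb in zip(a, b):
            if ca != cb:
                return a if ca > cb else b
        return a  # identical up to the shorter length; treat as a tie

    print(bigger("15648.25", "9854.2547896"))  # 15648.25 (longer integer part)
    print(bigger("1234.12", "1243.12"))        # 1243.12 (first differing digit)

A model that has only imperfectly absorbed this statistical pattern could behave much as observed: reliable on short or clearly different numbers, unreliable on long, similar ones.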

  • But it still says "in this case", which is... wrong.
    – RedSonja
    Commented Dec 13, 2022 at 12:21
  • @RedSonja I notice that the response from ChatGPT no longer states "In this case, " for the stated queries. ("It" is learning!)
    – MrWhite
    Commented Jan 11, 2023 at 23:31
  • @MrWhite Oh dear. There's no hope for us, is there?
    – RedSonja
    Commented Jan 12, 2023 at 7:43
1

The apparent ability of ChatGPT (in particular when using the GPT-4 model) to solve certain mathematical problems is due to the amount of training data and the number of parameters in these machine learning models. ChatGPT and other large language models have no explicit rules for solving mathematical problems.

The following 2022 paper describes how such capabilities emerge in transformer-based language models once a certain parameter-count threshold is exceeded: https://arxiv.org/pdf/2206.07682.pdf

This is also why they excel at some math problems yet fail at others that can look very similar.

-5

Simple answer: ChatGPT is actually human writers with some kind of autocomplete to speed things up.

This is standard practice for AI companies these days: a "fake it till you make it" approach where they use humans to fill the gaps in the AI, in the hope that down the road they'll automate the humans out of the product. It is common enough that an academic paper has been written on the topic. So there is plenty of industry precedent for OpenAI to be using humans to help craft the responses.

Plus, technically OpenAI is not "faking" anything. It is the media and bloggers who think ChatGPT is a pure AI system. OpenAI has made no such claim itself, and the opposite is implied by its InstructGPT whitepaper:

Step 1: Collect demonstration data, and train a supervised policy. Our labelers provide demonstrations of the desired behavior on the input prompt distribution (see Section 3.2 for details on this distribution). We then fine-tune a pretrained GPT-3 model on this data using supervised learning.

Additionally, ChatGPT is in "research mode" according to the website, which implies there are still humans training the system during the chats, as described in the quote above.

Final note: I find it amusing that no one considers this alternative plausible, as if it were somehow more complicated to have humans tweak chatbot responses than to create an AI with the apparent human-level understanding that ChatGPT exhibits.

UPDATE: ChatGPT confirms the OpenAI team curates its responses

Turns out ChatGPT is indeed human curated, by open admission.

During this conversation, ChatGPT outright states that the OpenAI team filters and edits the GPT-generated responses.

...the response you are receiving is being filtered and edited by the OpenAI team, who ensures that the text generated by the model is coherent, accurate and appropriate for the given prompt.

Apparently, the fact that OpenAI actively curates ChatGPT's responses is indirectly implied in the documentation here.

Human in the loop (HITL): Wherever possible, we recommend having a human review outputs before they are used in practice. This is especially critical in high-stakes domains, and for code generation. Humans should be aware of the limitations of the system, and have access to any information needed to verify the outputs (for example, if the application summarizes notes, a human should have easy access to the original notes to refer back).

So, that explains that :)

  • "We’ve trained a model": it's on the front page. Also, your source is the press; better to link an academic paper.
    Commented Jan 10, 2023 at 11:29
  • @yters I will give you the benefit of the doubt of not being a troll. The "special treatment" you are getting is because the idea that "ChatGPT is actually human writers with some kind of autocomplete to speed things up" is a "hallucination", with nothing to back it up. The academic paper you linked made no such implication. It just says that human input was used to train the model, not to craft responses after the model is trained. I challenge you to post just one excerpt from any paper to back up your claim.
    Commented Jan 13, 2023 at 19:46
  • The bit you have linked regarding HITL does not imply that this is what they do with ChatGPT.
    – David
    Commented Jan 17, 2023 at 10:34
  • @yters He is saying what I am saying. With all due respect, you need to think critically about your conclusions and triple-check whether they truly align with your sources. Many people are telling you they don't, so please don't be stubborn. OpenAI could be doing what you are suggesting, but nothing you have provided directly proves it.
    Commented Jan 21, 2023 at 17:57
  • @yters I know we work with data and reason, but Stack Exchange comments are not the best place for this kind of discourse. If you open a new question on ai.stackexchange.com, I am eager to provide a more organized response about how you are wrong and what I am basing my conclusions on.
    Commented Jan 22, 2023 at 22:55
