
The title says it all. Can they be used in a way that is not for generation of text based on a prompt, but instead for word suggestions based on the context of previously written text?

3 Answers


Pretty much by definition, yes. The core of any current LLM is exactly what you're asking for: a text predictor. It's simply a matter of programming the surrounding interface so that it stops generating after a single word instead of running until it sees an "end-of-text" token.

You'll probably get better results if you use an LLM that hasn't been fine-tuned for any particular task, but any LLM should be capable of being used this way.
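
For example, here is a minimal sketch of that idea using the Hugging Face transformers library and the small gpt2 base checkpoint (both choices are just assumptions for illustration): instead of letting the model generate until an end-of-text token, we read its probability distribution for the next position and surface the top few candidates as suggestions. Keep in mind that a token is often a subword, so a complete word may occasionally take more than one step.

# Minimal next-word suggestion sketch; gpt2 and top-k suggestions are
# illustrative assumptions, not the only way to do this.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def suggest_next_tokens(text, k=3):
    """Return the k most likely next tokens given the text written so far."""
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits       # shape: (1, seq_len, vocab_size)
    next_token_logits = logits[0, -1]          # distribution for the next position
    top = torch.topk(next_token_logits, k)
    return [tokenizer.decode(i).strip() for i in top.indices]

print(suggest_next_tokens("Once upon a time in a land far"))
# e.g. ['away', ',', 'from'] -- actual suggestions depend on the model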


Can they be used in a way that is not for generation of text based on a prompt, but instead for word suggestions based on the context of previously written text?

Note that it's the same: the prompt is the context of previously written text.


Yes. And interestingly enough, the GPT models already work exactly like that. Think of the typing suggestions on your phone's keyboard: it suggests the next potential token based on the previous few tokens. Now think of a system that can suggest the next token based on the previous thousands of tokens. GPTs are such a system. The way they do this is called autoregression, and these models are autoregressive models. Put simply, an autoregressive model cascades its output back to its input to generate the next output: the generated text is fed back into the network to produce the next token. For example, if you write:

Once upon a time

The model will take this as input and generate one of the possible next tokens, such as in. This token is appended to the existing input, which is fed to the network again to generate the following token. So, in the second step the input is:

Once upon a time in

And the output can be a. Appending this again, we get the next input:

Once upon a time in a

If you continue like this, you get a text completion engine that outputs coherent text until it reaches the maximum token length. So the process looks something like this (the output of each step is shown inside [], and the remaining tokens are the input):

Your Input:  Once upon a time

Generation
-----------------------------------------------------------------
Input: Once upon a time [in]
Input: Once upon a time in [a]
Input: Once upon a time in a [land]
Input: Once upon a time in a land [far]
Input: Once upon a time in a land far [away]
Input: Once upon a time in a land far away [there]
Input: Once upon a time in a land far away there [existed]
Input: Once upon a time in a land far away there existed [a]
Input: Once upon a time in a land far away there existed a [kingdom]
...

Each output is sent back to the model as input to get the next token. Anyway, you get the idea.
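
If it helps, that loop can also be written out directly. Below is a minimal sketch with transformers and gpt2 (again just illustrative choices) using greedy decoding; in practice you would normally call model.generate(), which implements this loop for you.

# Autoregressive generation written out by hand: the token produced at each
# step is appended to the input for the next step. Greedy decoding for clarity.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer("Once upon a time", return_tensors="pt").input_ids
for _ in range(10):                                  # generate 10 more tokens
    with torch.no_grad():
        logits = model(input_ids).logits
    next_id = logits[0, -1].argmax()                 # most likely next token
    input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)  # output -> input
    print(tokenizer.decode(input_ids[0]))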

As for your use case: yes, it is possible. Let's look at some technical aspects. If you want to use a GPT as a pure text completion engine, without explicit prompt engineering, you can use a base model. These models are essentially text completion engines: they will complete the text regardless, though not always in the way most appropriate for your domain. If you are okay with using prompts, then use a chat model instead of a base model. These models are already good at following instructions, so you can give them an idea of the domain and get more reliable and relevant completions.
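
As a rough sketch of the chat-model route, you could pass the domain as a system instruction and let the model continue the user's text. The checkpoint and the "legal contracts" domain below are placeholder assumptions; any instruction-tuned model with a chat template would work the same way.

# Sketch: chat model + domain hint for constrained text completion.
# TinyLlama-1.1B-Chat and the legal-contract domain are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

messages = [
    {"role": "system",
     "content": "You autocomplete legal contracts. Continue the user's text "
                "with a few words and output nothing else."},
    {"role": "user", "content": "The party of the first part shall"},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                          return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=8, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))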

Note that these models are trained on internet-scale data, so they can complete a text in an astronomically large number of different ways, and not all of them will fit your use case. If the completion quality is still not acceptable even after prompting a chat model, you may need to fine-tune it so that the completions stay within the constraints of your domain.
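
If you do end up needing that, one basic recipe is to continue causal language modeling on a corpus of your own domain text. Below is a minimal sketch with the transformers Trainer; the file name, checkpoint, and hyperparameters are all placeholder assumptions, not a prescription.

# Sketch: fine-tune a base model on domain text so completions stay on-domain.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token       # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# domain_corpus.txt: plain text, one passage of domain writing per line (placeholder).
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-domain", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()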