From the course: GPT-4: The New GPT Release and What You Need to Know

Comparing GPT-4 to GPT-3 and GPT-3.5

- [Announcer] OpenAI released GPT-3 in June of 2020, to much fanfare. Then different versions of GPT-3.5 were released in 2022, including ChatGPT, which is based on GPT-3.5. And, finally, in March of 2023, GPT-4 was released. So how are these models different? And what are they capable of? First, let's make sure we understand the naming convention used by OpenAI. Davinci refers to the original GPT-3 model, and it can generate text. The GPT-3.5 models come in a couple of different flavors. So you have text-davinci-002 and 003, code-davinci-002, and gpt-3.5-turbo. The text-davinci models are good for generating text, working with complex text, and determining cause-and-effect relationships. Now, although text-davinci-002 and text-davinci-003 are both GPT-3.5 models, text-davinci-003 is a more recent version and performs better than text-davinci-002. Now, if you're more interested in generating programming code, such as Python, then code-davinci-002 is a good choice. And, finally, gpt-3.5-turbo is the most recent of the GPT-3.5 models, and it performs the best of the lot. You can use it for chat, text generation, and programming code generation. So where does GPT-4 fit in? GPT-4 is the biggest and the best of the GPT models available from OpenAI. You can use it for chat, for working with complex text, and for programming code generation. The GPT-4 model that takes in images as input isn't yet available, so we'll only look at the one taking text as input for now. It comes in two variants, gpt-4 and gpt-4-32k. Now, while GPT-3.5 can also generate text, it can't handle reasoning tasks as complex as those GPT-4 can. We've also seen that, in general, GPT-4 performs much better on several high-school-level and professional exams, compared to GPT-3.5. Let's take a look at the difference between gpt-4 and gpt-4-32k. A prompt is the text you input into the model, and it's made up of a number of tokens. 
The completion is the text output from the model, which will also be made up of a number of tokens. And the sum of the tokens of the prompt and the completion is known as the context window. gpt-4 has a context length of around 8,000 tokens. And gpt-4-32k has a context length of around 32,000 tokens. This means that with a context length of 32,000, you could provide almost 50 pages of text as input to the model and get it summarized. Alternatively, you could get it to generate more text, if you wanted to create a short story, and it would stay on topic and in context longer than models that have a shorter context length. For comparison, the original GPT-3 has a context length of around 2,000 tokens, and GPT-3.5 has around 4,000 tokens. Let's take a quick look at pricing. The older GPT-3.5 models, so that's text-davinci-002 and 003, and code-davinci-002, cost 2 cents per 1,000 tokens, which is around 750 words. This token count includes your prompt. So if your prompt is 10 tokens, and 90 tokens are generated, then you'll be charged for 100 tokens. Now, what's interesting is that gpt-3.5-turbo is the best-performing, and most recent, of the GPT-3.5 models, and yet it is 10 times cheaper than the other GPT-3.5 models. GPT-4 is the best-performing, and most expensive, of the lot. So gpt-4, with the context length of 8,000 tokens, costs 3 cents per 1,000 prompt tokens, and 6 cents per 1,000 completion tokens. And, not surprisingly, the GPT-4 model with the 32,000 context length is the most expensive of the lot. Now, you're probably wondering if the model architectures are different for GPT-3, GPT-3.5, and GPT-4. Well, text-davinci-002 and 003 and code-davinci-002 seem to have a similar model architecture to the original GPT-3 model, with 175 billion parameters. Because of the significantly cheaper pricing and the faster response time from the model, also known as lower model latency, it looks like gpt-3.5-turbo uses a smaller model. 
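The context-window and pricing arithmetic described above can be sketched in a few lines of Python. The token counts and per-1,000-token rates are the ones quoted in this video; the function names are just illustrative:

```python
def fits_in_context(prompt_tokens, completion_tokens, context_length):
    """The prompt and completion together must fit in the model's context window."""
    return prompt_tokens + completion_tokens <= context_length

def davinci_cost(prompt_tokens, completion_tokens, rate_per_1k=0.02):
    """Older GPT-3.5 models bill prompt and completion tokens at one flat rate."""
    return (prompt_tokens + completion_tokens) / 1000 * rate_per_1k

def gpt4_cost(prompt_tokens, completion_tokens,
              prompt_rate_per_1k=0.03, completion_rate_per_1k=0.06):
    """gpt-4 (8k context) bills prompt and completion tokens at different rates."""
    return (prompt_tokens / 1000 * prompt_rate_per_1k
            + completion_tokens / 1000 * completion_rate_per_1k)

# A 7,000-token prompt plus a 2,000-token completion overflows gpt-4's
# ~8,000-token window, but fits easily in gpt-4-32k's ~32,000-token window.
print(fits_in_context(7000, 2000, 8000))   # False
print(fits_in_context(7000, 2000, 32000))  # True

# The billing example from the video: a 10-token prompt with 90 generated
# tokens is charged as 100 tokens on the older GPT-3.5 models.
print(round(davinci_cost(10, 90), 4))  # 0.002 dollars
print(round(gpt4_cost(10, 90), 4))     # 0.0057 dollars
```

Note how gpt-4's split pricing means that completion-heavy workloads (long generations from short prompts) cost proportionally more than prompt-heavy ones.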
Now, if you're interested in GPT-3's architecture, check out my Transformers For NLP Using Large Language Models course that's also available here on LinkedIn. And at the time of this recording, there isn't any public information on the GPT-4 model architecture or the model size. Alright, so GPT-4 is the best, and the most expensive, of the GPT models. It can do complex reasoning tasks, you can use it for chat, and it can generate programming code. It's also better at following instructions, compared to GPT-3 and GPT-3.5. And, finally, it offers a context length of either 8,000 or 32,000 tokens, meaning it can stay on topic and in context for longer.
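To make the chat usage concrete, here is a minimal sketch of assembling a request for the openai Python package's ChatCompletion API as it existed around the time of this recording. The helper build_chat_request is illustrative, not part of any library, and actually sending the request assumes the package is installed and an API key is configured:

```python
def build_chat_request(user_text, model="gpt-4", max_tokens=500):
    """Assemble keyword arguments for a chat completion request.

    model can be "gpt-4" or "gpt-4-32k"; max_tokens caps the completion,
    and prompt plus completion together must fit the model's context length.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "max_tokens": max_tokens,
    }

request = build_chat_request("Write a short story about a robot learning to paint.")
print(request["model"])  # gpt-4

# With the openai package installed and an API key configured,
# you would send the request like this:
#   import openai
#   response = openai.ChatCompletion.create(**request)
#   print(response["choices"][0]["message"]["content"])
```

Swapping model="gpt-4-32k" into the same request is all it takes to trade the higher price for the longer context window.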