
ChatGPT Writes Decent Computer Code, When It Sticks to the Basics

When asked to solve 728 coding problems, GPT-3.5 mostly makes the grade, but things get stickier when it's presented with data added to the LeetCode testing platform after 2021.

July 8, 2024
The welcome screen for the ChatGPT app (Credit: SOPA Images)

One of AI's bigger selling points is its ability to write computer code, and a recent study that investigated just how good ChatGPT is at the task finds that it earns at least a passing grade.

The study, published in the June issue of IEEE Transactions on Software Engineering, ran GPT-3.5 through 728 coding problems from the LeetCode testing platform in five programming languages: C, C++, Java, JavaScript, and Python.
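To give a rough idea of what that kind of test looks like in practice, here is a minimal sketch of how a researcher might feed one LeetCode-style problem to a GPT-3.5 model through OpenAI's Python SDK and collect the generated code. This is an illustration of the general approach, not the study's actual test harness; the prompt wording is made up for the example.

```python
# Illustrative sketch only, not the study's actual evaluation pipeline.
# Assumes the official OpenAI Python SDK (pip install openai) and an
# OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

# A LeetCode-style prompt; the wording here is a hypothetical example.
prompt = (
    "Solve the following problem in Python.\n"
    "Given an array of integers nums and an integer target, return the "
    "indices of the two numbers that add up to target.\n"
    "Return only the code."
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the GPT-3.5 family of models tested in the study
    messages=[{"role": "user", "content": prompt}],
)

# The returned solution would then be submitted to LeetCode's own test cases
# to check whether it is functionally correct.
print(response.choices[0].message.content)
```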

On problems that existed in LeetCode before 2021, ChatGPT solved easy ones 89% of the time, medium-difficulty problems 71% of the time, and hard problems 40% of the time.
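For context, a typical pre-2021 "easy" problem is the classic Two Sum exercise, which has been on LeetCode for years. The snippet below shows the kind of standard hash-map solution a model usually produces for it; this particular code is our illustration, not output quoted from the study.

```python
# Two Sum, a long-standing LeetCode "easy" problem:
# return the indices of the two numbers in nums that add up to target.
def two_sum(nums: list[int], target: int) -> list[int]:
    seen = {}  # maps a value already visited to its index
    for i, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return [seen[complement], i]
        seen[value] = i
    return []  # no valid pair found

# Example: prints [0, 1], because 2 + 7 == 9
print(two_sum([2, 7, 11, 15], 9))
```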

However, on problems added to LeetCode after 2021, the easy, medium, and hard success rates dropped to 52%, 40%, and 0.66%, respectively. ChatGPT was initially trained on data up to 2021, and that knowledge base did not expand until late 2023.

"When it comes to the algorithm problems after 2021, ChatGPT's ability to generate functionally correct code is affected. It sometimes fails to understand the meaning of questions, even for easy level problems," says Yutian Tang, a lecturer at the University of Glasgow who was involved in the study. "A reasonable hypothesis for why ChatGPT can do better with algorithm problems before 2021 is that these problems are frequently seen in the training dataset."

The researchers also note that ChatGPT is better at fixing human errors than its own, and that 50% of the time it generated code with a smaller runtime and memory overhead than human-written solutions. ChatGPT-generated code also contained a fair number of errors, though "many of these were easily fixable," says IEEE Spectrum. "Generated code in C was the most complex, followed by C++ and Python, which has a similar complexity to the human-written code."


About Joe Hindy

Contributor

Hello, my name is Joe and I am a tech blogger. My first real experience with tech came at the tender age of 6, when I started playing Final Fantasy IV (II on the SNES) on the family's living room console. As a teenager, I cobbled together my first PC build from parts salvaged from several ancient PCs, and I got seriously into the hobby in my 20s. I served in the US Army as a broadcast journalist. Afterward, I worked as a news writer for XDA-Developers before spending 11 years as an Editor, and eventually Senior Editor, at Android Authority. I specialize in gaming, mobile tech, and PC hardware, but I enjoy pretty much anything that has electricity running through it.
