
ChatGPT Writes Decent Computer Code, When It Sticks to the Basics

When asked to solve 728 coding problems, GPT-3.5 mostly makes the grade, but things get stickier when it's presented with data added to the LeetCode testing platform after 2021.

July 8, 2024
The welcome screen for the ChatGPT app (Credit: SOPA Images)

One of AI's bigger selling points is its ability to write computer code, and a recent study that investigated just how good ChatGPT is at the task finds that it earns at least a passing grade.

The study, published in the June issue of IEEE Transactions on Software Engineering, ran GPT-3.5 through 728 coding problems from the LeetCode testing platform in five programming languages: C, C++, Java, JavaScript, and Python.
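To give a rough idea of what that kind of test looks like in practice, here is a minimal sketch of how a researcher might feed one LeetCode-style problem to a GPT-3.5 model through OpenAI's Python SDK and collect the generated code. This is an illustration of the general approach, not the study's actual test harness; the prompt wording is made up for the example.

```python
# Illustrative sketch only, not the study's actual evaluation pipeline.
# Assumes the official OpenAI Python SDK (pip install openai) and an
# OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

# A LeetCode-style prompt; the wording here is a hypothetical example.
prompt = (
    "Solve the following problem in Python.\n"
    "Given an array of integers nums and an integer target, return the "
    "indices of the two numbers that add up to target.\n"
    "Return only the code."
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the GPT-3.5 family of models tested in the study
    messages=[{"role": "user", "content": prompt}],
)

# The returned solution would then be submitted to LeetCode's own test cases
# to check whether it is functionally correct.
print(response.choices[0].message.content)
```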

On problems that existed in LeetCode before 2021, ChatGPT solved easy ones 89% of the time, medium-difficulty problems 71% of the time, and hard problems 40% of the time.
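For context, a typical pre-2021 "easy" problem is the classic Two Sum exercise, which has been on LeetCode for years. The snippet below shows the kind of standard hash-map solution a model usually produces for it; this particular code is our illustration, not output quoted from the study.

```python
# Two Sum, a long-standing LeetCode "easy" problem:
# return the indices of the two numbers in nums that add up to target.
def two_sum(nums: list[int], target: int) -> list[int]:
    seen = {}  # maps a value already visited to its index
    for i, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return [seen[complement], i]
        seen[value] = i
    return []  # no valid pair found

# Example: prints [0, 1], because 2 + 7 == 9
print(two_sum([2, 7, 11, 15], 9))
```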

However, on problems added to LeetCode after 2021, the easy, medium, and hard success rates dropped to 52%, 40%, and 0.66%, respectively. ChatGPT was initially trained on data up to 2021, and that knowledge base did not expand until late 2023.

"When it comes to the algorithm problems after 2021, ChatGPT's ability to generate functionally correct code is affected. It sometimes fails to understand the meaning of questions, even for easy level problems," says Yutian Tang, a lecturer at the University of Glasgow who was involved in the study. "A reasonable hypothesis for why ChatGPT can do better with algorithm problems before 2021 is that these problems are frequently seen in the training dataset."

The researchers also note that ChatGPT is better at fixing human errors than its own, and that 50% of the time it generated code with a smaller runtime and memory overhead than human-written solutions. ChatGPT-generated code also contained a fair number of errors, though "many of these were easily fixable," says IEEE Spectrum. "Generated code in C was the most complex, followed by C++ and Python, which has a similar complexity to the human-written code."


About Joe Hindy

Contributor

Hello, my name is Joe and I am a tech blogger. My first real experience with tech came at the tender age of 6, when I started playing Final Fantasy IV (II on the SNES) on the family's living room console. As a teenager, I cobbled together my first PC build from parts salvaged from several ancient PCs, and I got seriously into the hobby in my 20s. I served in the US Army as a broadcast journalist. Afterward, I worked as a news writer for XDA-Developers before spending 11 years as an Editor, and eventually Senior Editor, at Android Authority. I specialize in gaming, mobile tech, and PC hardware, but I enjoy pretty much anything that has electricity running through it.
