I tested GPT-5's coding skills, and it was so bad that I'm sticking with GPT-4o (for now)

August 10, 2025

In my latest coding benchmark, GPT-5 stumbled badly, delivering broken plugins, flawed scripts, and confidence-laden wrong answers that could derail projects without careful human oversight. Here’s what to know before you use it.
OpenAI’s new GPT-5 flagship failed half of my programming tests.
Previous OpenAI releases have had just about perfect results.
Now that OpenAI has enabled fallbacks to other LLMs, there are options.
So GPT-5 happened. It’s out. It’s released. It’s the talk of the virtual town. And it’s got some problems. I’m not gonna bury the lede. GPT-5 has failed half of my programming tests. That’s the worst that OpenAI’s flagship LLM has ever done on my carefully designed tests.
Before I get into the details, let’s take a moment to discuss one other little feature that’s also a bit wonky. Check out the new Edit button at the top of the code dumps GPT-5 generates.
Clicking the Edit button takes you into a nice little code editor. Here, I replaced the Author field, right in ChatGPT’s results.
That seemed nice, but it ultimately proved futile. When I closed the editor, it asked me if I wanted to save. I did. Then this unhelpful message showed up.
I never did get back to my original session. I had to submit my original prompt again, and let GPT-5 do its work a second time.
But wait. There’s more. Let’s dig into my test results…

1. Writing a WordPress plugin
This was my very first test of coding prowess for any AI. It’s what gave me that first “the world is about to change” feeling, and it was done using GPT-3.5.
Subsequent tests, using the same prompt but with different AI models, generated mixed results. Some AIs did great, some didn’t. Some AIs, like those from Microsoft and Google, improved over time.
ChatGPT’s model has been the gold standard for this test since the very beginning. That makes the results of GPT-5 all that much more curious.
So, look, the actual coding with GPT-5 was partially successful. GPT-5 generated a single block of code, which I pasted into a file and was able to run. It provided the requisite UI.
When I pasted in the test names, it dynamically updated the line count, although it described it as “Line to randomize” instead of “Lines to randomize.”
But then, when I clicked Randomize, it didn’t. Instead, it redirected me to tools.php. What?? ChatGPT has never had a problem with this test, whether GPT-3.5, GPT-4, or GPT-4o. You mean to tell me that OpenAI’s much-anticipated GPT-5 is failing right out of the gate? Ouch.
I then gave GPT-5 this prompt.
When I click randomize, I’m taken to http://testsite.local/wp-admin/tools.php. I do not get a list of randomized results.
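For readers who want a concrete picture of what the Randomize button was supposed to do, here is a minimal sketch of that core behavior in Python rather than WordPress PHP. It is not the plugin GPT-5 produced, and it does not reproduce the redirect bug. The function names, the choice to skip blank lines, and the optional seed are all assumptions made for illustration.

```python
import random
from typing import List, Optional


def count_lines(text: str) -> int:
    """Count the lines that would be randomized.

    Assumption: blank lines are ignored. The article only says the
    plugin showed a live line count, not how it treated empty lines.
    """
    return sum(1 for line in text.splitlines() if line.strip())


def randomize_lines(text: str, seed: Optional[int] = None) -> List[str]:
    """Return the non-blank lines of `text` in random order.

    `seed` is a hypothetical parameter added here so the result can be
    reproduced in a test; a real plugin would not need it.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    rng = random.Random(seed)
    rng.shuffle(lines)  # in-place shuffle of the copied list
    return lines


if __name__ == "__main__":
    names = "Alice\nBob\nCarol\nDave\n"
    print(count_lines(names), "lines to randomize")
    for name in randomize_lines(names):
        print(name)
```

The expected result of a click is simply the same set of names in a different order, displayed on the same page. Landing on tools.php with no list means that step never reached the user.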