Home United States USA — IT Watching Anthropic’s Claude 3.5 Sonnet code a video game blew my mind

Watching Anthropic’s Claude 3.5 Sonnet code a video game blew my mind

June 20, 2024

335

I just watched the new Claude 3.5 Sonnet code a simple platformer video game, and my mind is completely blown.
There’s so much progress in artificial intelligence right now that it feels like, with every new model, some new feature or capability has gone from seemingly impossible to completely possible.
That’s what this latest release feels like. Today, in a blog post, Anthropic announced Claude 3.5 Sonnet, its latest large language model (LLM) that the company says “raises the industry bar for intelligence, outperforming competitor models and Claude 3 Opus on a wide range of evaluations, with the speed and cost of our mid-tier model, Claude 3 Sonnet.”
The company says that Claude 3.5 Sonnet is faster, cheaper, and smarter than its predecessors, a combination that Anthropic says is “ideal for complex tasks such as context-sensitive customer support and orchestrating multi-step workflows.”
In an internal agentic coding evaluation, Claude 3.5 Sonnet solved 64% of problems, outperforming Claude 3 Opus which solved 38%. Our evaluation tests the model’s ability to fix a bug or add functionality to an open source codebase, given a natural language description of the desired improvement. When instructed and provided with the relevant tools, Claude 3.5 Sonnet can independently write, edit, and execute code with sophisticated reasoning and troubleshooting capabilities. It handles code translations with ease, making it particularly effective for updating legacy applications and migrating codebases.