As AI models increasingly ace conventional tests, researchers are looking for new benchmarking methods. Google is betting on games.
As artificial intelligence evolves, it’s becoming increasingly difficult to accurately measure the performance of individual models.
To that end, Google on Tuesday unveiled the Game Arena, an open-source platform in which AI models compete in a variety of strategic games to provide "a verifiable and dynamic measure of their capabilities," as the company wrote in a blog post.
The new Game Arena is hosted on Kaggle, another Google-owned platform, where machine learning researchers can share datasets and compete with one another on various challenges.
This comes as researchers have been working on new kinds of tests to measure the capabilities of AI models as the field inches closer to artificial general intelligence, or AGI, a still-theoretical system that, as it's commonly defined, can match the human brain in any cognitive task.

Serious play
Google's new Game Arena initiative aims to push the capabilities of existing AI models while providing a clear, bounded framework for analyzing their performance.
"Games provide a clear, unambiguous signal of success," Google wrote in its blog post. "Their structured nature and measurable outcomes make them the perfect testbed for evaluating models and agents. They force models to demonstrate many skills, including strategic reasoning, long-term planning and dynamic adaptation against an intelligent opponent, providing a robust signal of their general problem-solving intelligence."