Microsoft reveals Windows Agent Arena to benchmark generative AI agents

September 14, 2024

Microsoft has revealed Windows Agent Arena, a new framework designed to benchmark generative AI agents.
The use of generative AI and large language models to automate and simplify tasks for people who work with PCs continues to grow. However, there’s also a need to measure how well AI agents actually accomplish those tasks. This week, Microsoft Research announced that it has developed a benchmark specifically to test AI agents on Windows PCs.
The benchmark, as revealed on Microsoft’s GitHub page, is called Windows Agent Arena. The framework is designed to test how well and how quickly AI agents can interact with the Windows applications that people typically use. The apps tested with AI agents in Windows Agent Arena included web browsers like Microsoft Edge and Google Chrome, OS functions like File Explorer and Settings, coding apps like Visual Studio Code, simple preinstalled Windows apps like Notepad, Clock, and Paint, and even video playback in VLC Player.