Cooperative Multi-Agent Reinforcement Learning and QMIX at NeurIPS 2021

December 28, 2021

This post introduces Cooperative MARL and goes through innovations by S. Whiteson's lab, starting with QMIX (2019) and continuing with their contributions to NeurIPS 2021.
Authors: Gema Parreño, David Suarez (Apiumhub), with thanks to: Alberto Hernandez (BBVA Innovation Labs).

Reading this article assumes some familiarity with the fundamentals of Reinforcement Learning (RL).

A multi-agent system describes multiple distributed entities, so-called "agents", that take decisions autonomously and interact within a shared environment (Weiss 1999). MARL (Multi-Agent Reinforcement Learning) can be understood as a field related to RL in which a system of agents interacts within an environment to achieve a goal. Each of these agents, or learnable units, aims to learn a policy that maximizes its long-term reward: it discovers a strategy alongside the other entities in a common environment and adapts its policy in response to their behavioral changes.

The properties of a MARL system are key to modeling it. Depending on these properties, research branches into areas with their own particularities.

Table 1. This taxonomic schema (Weiss 1999) helps situate the kind of MARL explored in this post.

When we branch from MARL into Cooperative MARL, we reformulate the problem as a system of agents that cooperate within an environment to achieve a common goal. From the environment perspective, this raises several challenges, whose importance depends on the type of behavior and environment.
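To make the cooperative setting concrete, here is a minimal sketch of two agents that share a single team reward and each learn their own policy. It uses plain independent Q-learning on a toy coordination game, not QMIX or any method from the Whiteson lab; the game, the reward function, and all names (`team_reward`, `greedy`, `train`) are assumptions made up for illustration.

```python
import random

N_AGENTS = 2
N_ACTIONS = 2


def team_reward(joint_action):
    """Shared reward: 1.0 when all agents pick the same action, else 0.0."""
    return 1.0 if len(set(joint_action)) == 1 else 0.0


def greedy(q_row):
    """Index of the highest Q-value (ties resolve to the lowest index)."""
    return max(range(len(q_row)), key=lambda a: q_row[a])


def train(steps=2000, epsilon=0.1, lr=0.1, seed=0):
    """Independent epsilon-greedy Q-learning on a stateless cooperative game."""
    rng = random.Random(seed)
    # One Q-table per agent; the toy game has no state, so one row each suffices.
    q = [[0.0] * N_ACTIONS for _ in range(N_AGENTS)]
    for _ in range(steps):
        # Each agent acts autonomously, using only its own Q-values.
        joint_action = [
            rng.randrange(N_ACTIONS) if rng.random() < epsilon else greedy(q[i])
            for i in range(N_AGENTS)
        ]
        # The environment returns one reward that every agent receives.
        r = team_reward(joint_action)
        # Each agent updates its estimate for the action it took.
        for i, a in enumerate(joint_action):
            q[i][a] += lr * (r - q[i][a])
    return q


q_tables = train()
greedy_joint = [greedy(row) for row in q_tables]
print(greedy_joint, team_reward(greedy_joint))
```

Because the reward is shared, each agent's value estimate depends on what the other agent is currently doing, which is the adaptation to the behavioral changes of others described above. Real cooperative MARL problems add state, partial observability, and many more agents, so this toy loop only shows the shape of the interaction.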