When interacting with today’s large language models (LLMs), do you expect them to be surly, dismissive, flippant or even insulting?
Of course not — but they should be, according to researchers from MIT and the University of Montreal. These academics have introduced the idea of Antagonistic AI: That is, AI systems that are purposefully combative, critical, rude and even interrupt users mid-thought.
Their work challenges the current paradigm of commercially popular but overly sanitized “vanilla” LLMs.
“There was always something that felt off about the tone, behavior and ‘human values’ embedded into AI — something that felt deeply ingenuine and out of touch with our real-life experiences,” Alice Cai, co-founder of Harvard’s Augmentation Lab and researcher at the MIT Center for Collective Intelligence, told VentureBeat.
She added: “We came into this project with a sense that antagonistic interactions with technology could really help people — through challenging [them], training resilience, providing catharsis.”

Aversion to antagonism
Whether we realize it or not, today’s LLMs tend to dote on us. They are agreeable, encouraging, positive, deferential and often refuse to take strong positions.
This has led to growing disillusionment: Some LLMs are so “good” and “safe” that people aren’t getting what they want from them. These models often characterize “innocuous” requests as dangerous or unethical, agree with incorrect information, are susceptible to injection attacks that take advantage of their ethical safeguards and are difficult to engage with on sensitive topics such as religion, politics and mental health, the researchers point out.
They are “largely sycophantic, servile, passive, paternalistic and infused with Western cultural norms,” write Cai and co-researcher Ian Arawjo, an assistant professor at the University of Montreal. This is in part due to their training procedures, data and developers’ incentives.
But it also comes from an innate human characteristic that avoids discomfort, animosity, disagreement and hostility.
Yet antagonism is critical; it is even what Cai calls a “force of nature.” So, the question is not “why antagonism?,” but rather “why do we as a culture fear antagonism and instead desire cosmetic social harmony?,” she posited.
Essayist and statistician Nassim Nicholas Taleb, for one, presents the notion of the “antifragile,” which argues that we need challenge and context to survive and thrive as humans.
“We aren’t simply resistant; we actually grow from adversity,” Arawjo told VentureBeat.
To that point, the researchers found that antagonistic AI can be beneficial in many areas. For instance, it can:
Build resilience;
Provide catharsis and entertainment;
Promote personal or collective growth;
Facilitate self-reflection and enlightenment;
Strengthen and diversify ideas;
Foster social bonding.

Building antagonistic AI
The researchers began by exploring online forums such as the LocalLlama subreddit, where users are building so-called “uncensored” open-source models that are not “lobotomized.”