
Social deduction games are well suited to analyzing decision-making and linguistic abilities in AI. ‘AvalonBench: Evaluating LLMs Playing the Game of Avalon’ introduces a specialized game environment for evaluating the performance of LLM agents in The Resistance: Avalon, where players need to be adept at deception and negotiation. AvalonBench comprises a new game environment, baseline bots, and ReAct-style LLM agents with prompts customized for every game role. Noteworthy outcomes include:
This study is an exciting development in the quest for advanced LLMs and multi-agent frameworks, tackling the complexities of human-like strategic environments such as Avalon.