"Diplomacy is a strategic board game set on a map of Europe in 1901 - a time when tensions between the continent's most powerful countries were simmering in the lead-up to World War I."
"I quite like the idea of using games to evaluate LLMs against each other, instead of fixed evals. Everyone knows the usual benchmarks are a bore."
"Noam Brown, a research scientist at OpenAI, suggested the 75-year-old geopolitical strategy game, Diplomacy. 'I would love to see all the leading bots play a game of Diplomacy together.'"
"Alex Duffy published a post titled, 'We Made Top AI Models Compete in a Game of Diplomacy. Here's Who Won.'"
In a recent conversation among leading AI experts, including Andrej Karpathy and Elon Musk, the board game Diplomacy was proposed as a way to evaluate large language models. AI researcher Alex Duffy responded by creating a version called 'AI Diplomacy,' in which various AI models, including OpenAI's o3 and Anthropic's Claude, competed against one another. The game, built on strategic alliances and negotiation, proved to be a compelling environment for assessing the models' intelligence and interaction capabilities, and it highlighted their differing approaches.
 Read at Business Insider