Today's AI models have a poor grasp of world history
Briefly

A recent report from Complexity Science Hub reveals that leading AI models, including OpenAI's GPT-4, Meta's Llama, and Google's Gemini, demonstrate significant inaccuracies when answering historical questions. Collectively, the models achieved a mere 46% accuracy in their responses. For example, GPT-4 incorrectly affirmed that Ancient Egypt had a standing army, illustrating the models' tendency to extrapolate from more frequently encountered data. Researcher Maria del Rio-Chanona highlighted how AI memory prioritizes overrepresented information, which can lead to misconceptions about less common historical facts.
In an experiment, OpenAI's GPT-4, Meta's Llama, and Google's Gemini were asked to answer yes or no to historical questions - and only 46% of the answers were correct.
GPT-4, for example, answered 'yes' to the question of whether Ancient Egypt had a standing army, likely because the model extrapolated from data about other empires, such as Persia.
Researcher Maria del Rio-Chanona suggested that AI tends to remember prevalent information, stating, 'If you are told A and B 100 times and C one time, you might just remember A and B.'
Read at Computerworld