Spanish researchers discover the trick AI uses to get such good grades: 'It's true kryptonite for the models'
Briefly

Elon Musk's xAI has launched Grok 3, billed as the world's smartest AI, amid intense competition in the chatbot market. Musk's claims of superiority for Grok are met with skepticism, since many such claims are seen as promotional. Julio Gonzalo from Spain's Open University emphasizes that benchmark figures may be misleading because companies could easily manipulate them. The researchers tested the answers of top AI models and found that the models often rely heavily on data already available online. The results suggest that AI chatbots may simply regurgitate known information rather than demonstrate true understanding or reasoning skills.
Many of these claims are pure marketing. AI chatbots are an extremely competitive field today, and claiming to be the best attracts a lot of investment.
If there is a lot of competitive pressure, there is too much attention on the benchmarks and it would be easy for companies to manipulate them, so we cannot trust the numbers they report.
The developers know that the probability that the models have already seen the answer to an exam available online is very high, explains Gonzalo.
The basic objective was to find out if the models read and responded like any other student or, instead, only looked for the answer in the huge body of data that has been used for their training.
Read at english.elpais.com