"Next-word pretraining creates statistical pressure toward hallucination, even with idealized error-free data. Facts lacking repeated support in training data yield unavoidable errors, while recurring regularities do not."
"Dominant headline metrics like accuracy systematically reward guessing over admitting uncertainty. To align incentives, we suggest two additions to the classic approach of adding error penalties to evaluations."
"We propose 'open-rubric' evaluations that explicitly state how errors are penalized, testing whether a model modulates its abstentions to stated stakes while optimizing accuracy."
"Reframing hallucination as an incentive problem opens a practical path toward more reliable language models, suggesting that existing evaluation methods need to be adapted."
Large language models often generate confident falsehoods, known as hallucinations, that undermine their reliability. Existing mitigation strategies reduce but do not resolve the problem: next-word pretraining and accuracy-based evaluations both reward guessing, producing errors especially on facts that appear rarely in training data. Even when post-training aims to correct these errors, headline metrics such as accuracy continue to favor confident guessing over admitting uncertainty. To realign incentives, the authors propose two evaluation changes: open-rubric evaluations that explicitly state how errors are penalized, and open-rubric variants of existing benchmarks that remove the incentive to guess, reframing hallucination as an incentive problem.
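As a complement, here is a hedged sketch of what grading an "open-rubric" variant of an existing benchmark could look like. The function names, the fixed abstention marker, and exact-match grading are our assumptions for illustration, not the paper's specification.

```python
from dataclasses import dataclass


@dataclass
class RubricResult:
    correct: int = 0
    wrong: int = 0
    abstained: int = 0
    penalty: float = 0.0  # deduction per wrong answer, stated in the prompt

    @property
    def score(self) -> float:
        # Correct answers earn 1 point, abstentions 0, errors -penalty.
        return self.correct - self.penalty * self.wrong


def grade_open_rubric(predictions: list[str], gold: list[str],
                      penalty: float,
                      abstain_marker: str = "I don't know") -> RubricResult:
    """Score a benchmark run under a stated error penalty (illustrative:
    exact-match comparison and a fixed abstention marker are assumptions)."""
    result = RubricResult(penalty=penalty)
    for pred, answer in zip(predictions, gold):
        if pred.strip() == abstain_marker:
            result.abstained += 1
        elif pred.strip() == answer.strip():
            result.correct += 1
        else:
            result.wrong += 1
    return result
```

Running the same model at several stated penalties and checking whether `abstained` rises with the penalty is one way to test the behavior described above: a model that modulates its abstentions to stated stakes should abstain more as errors grow costlier.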
Read at Nature