
James Zou, a computer scientist at Stanford University in California, and his colleagues set out to assess whether large language models (LLMs) could help to address a common complaint about peer reviews: feedback often lacks thoroughness or strikes the wrong tone. At the 2023 Association for Computational Linguistics annual meeting in Toronto, Canada, for example, authors of conference papers flagged 12.9% of reviews as being poor quality.
That's mainly because the reviews were vague, says Zou, offering only broad, simple comments such as "not novel". More rarely, reviews can be unprofessional or include personal attacks, with comments such as "these authors don't know what they're talking about", says Zou. Others make factual errors, for example criticizing work for omitting an analysis when that analysis is, in fact, there.
Zou and his colleagues gathered about a dozen reviews that were vague, unprofessional or incorrect, along with what they considered to be appropriate feedback about those reviews. They fed that curated data to an LLM to help refine its responses and used this to develop a Review Feedback Agent, which uses a total of five LLMs that collaborate and check each other's work.
An AI Review Feedback Agent was developed to improve the thoroughness, tone, and accuracy of peer-review feedback. The system was trained on examples of vague, unprofessional, or incorrect reviews paired with appropriate corrective feedback and uses five collaborating large language models to refine and verify responses. The tool was applied to about 20,000 existing reviews from a major AI conference where each paper receives multiple reviews and roughly 30% of submissions are accepted. The agent aims to reduce vague comments, personal attacks, and factual errors in reviews. The impact of improved reviewer feedback on final research quality remains uncertain.
Read at Nature