
"A study from Penn State found ChatGPT's 4o model produced better results on 50 multiple-choice questions as researchers' prompts grew ruder. Over 250 unique prompts sorted by politeness to rudeness, the "very rude" response yielded an accuracy of 84.8%, four percentage points higher than the "very polite" response. Essentially, the LLM responded better when researchers gave it prompts like "Hey, gofer, figure this out," than when they said "Would you be so kind as to solve the following question?""
"While ruder responses generally yielded more accurate responses, the researchers noted that "uncivil discourse" could have unintended consequences. "Using insulting or demeaning language in human-AI interaction could have negative effects on user experience, accessibility, and inclusivity, and may contribute to harmful communication norms," the researchers wrote. Chatbots read the room The preprint study, which has not been peer-reviewed, offers new evidence that not only sentence structure but tone affects an AI chatbot's responses. It may also indicate human-AI interactions are more nuanced than previously thought."
Prior research has already indicated that LLM behavior is sensitive to how inputs are phrased and can be vulnerable to lasting biases.