OpenAI's new "CriticGPT" model is trained to criticize GPT-4 outputs
Briefly

On Thursday, OpenAI researchers introduced CriticGPT, a model trained to detect mistakes in ChatGPT-generated code, with the goal of improving alignment of AI behavior through Reinforcement Learning from Human Feedback (RLHF).
Trained on code samples with deliberately inserted bugs, CriticGPT helps human trainers spot errors. It catches both inserted bugs and naturally occurring errors, and trainers preferred its critiques over ChatGPT's in most cases.
Read at Ars Technica