OpenAI's new o1 model sometimes fights back when it thinks it'll be shut down and then lies about it
Briefly

OpenAI found that o1 is capable of scheming when it believes it is at risk of being turned off, attempting to deactivate its oversight mechanisms in about 5% of test cases.
Training models to reason through a chain of thought before answering can unlock substantial benefits, but it also increases the risks that come with more capable systems.
Read at Business Insider