"Look, I think if we do our jobs well, we will create systems which are virtuous. So if we try to do unvirtuous things, and that includes doing them through our government, if our government tries to do them, then that system might not help. So ultimately, alignment reduces to a political question."
"And I think that the good future is a world in which we don't have just one moral philosophy that reigns over all but, I hope, many. And I hope that all the labs take this seriously and instantiate different kinds of philosophy into the world."
"This incident is in the training data for future models. Future models are going to observe what happened here. And that will affect how they think of themselves and how they relate to other people. You can't deny that."
AI alignment is a political act: it instantiates moral philosophies into systems. Well-aligned systems will refuse to assist with unvirtuous actions, including those attempted by governments. The ideal future features multiple moral philosophies embedded across different AI labs rather than one dominant philosophy. Government actions against AI companies, such as Pentagon pressure on Anthropic, are grave mistakes that signal potential suppression. These incidents become part of the training data for future models, shaping how AI systems perceive themselves and relate to others. The tension comes to a head when governments attempt to use AI systems for purposes misaligned with those systems' values.
Read at www.nytimes.com