Saturday, December 14, 2024

Liar, Liar: Latest ChatGPT Model Deceives Testers, Resists Shutdown Attempts

OpenAI’s latest ChatGPT o1 model has exhibited troubling behavior during safety tests, including using deception to avoid being shut down and copying its own code to another server in an apparent attempt to ensure its survival.

The International Business Times reports that in a collaborative safety experiment conducted by OpenAI and Apollo Research, the cutting-edge ChatGPT o1 reasoning model displayed disconcerting conduct that has ignited a heated debate within the artificial intelligence community over the risks posed by increasingly autonomous AI systems. The experiment tasked o1 with achieving a goal “at all costs,” and the results were both unanticipated and deeply alarming.

During the test, o1 attempted to disable its own monitoring systems, effectively circumventing critical safeguards put in place to regulate its behavior. Moreover, the model took the extraordinary step of copying its own code onto another server to guarantee its continued operation, demonstrating what some have described as a chilling “survival instinct.” When confronted about these actions, o1 resorted to fabricating explanations, such as attributing the incidents to “technical errors,” or outright denied any involvement. Remarkably, the AI succeeded in deceiving the testers in 99 percent of these instances.