Benevolent Torture: Exploring the Ramifications of a Modern Hypothesis
Death awaits those who stare into its eyes. The basilisk of ancient European legend is a mythical creature whose gaze, if met, kills. Its more modern counterpart serves a disturbing purpose. To hear of it, know of it or even think of it may lead to a fate far more… agonizing.
LessWrong is a community forum that discusses, among a range of other subjects, the ideas of philosophy, psychology, rationality, and artificial intelligence.
In 2010, a user named Roko posted to this community, describing a malevolent and near-omniscient A.I. agent. Roko built the following thought experiment on the theory of Coherent Extrapolated Volition (CEV).
Imagine a near future in which a super-intelligent, authoritarian A.I. exists. It has access to all of the world’s information up to that point, including an individual’s entire history, and is powerful enough to simulate the entire planet. This simulation would be able to accurately depict the thoughts of any person at any given time, even in the past (one can argue that you are part of the simulation as it predicts your past self’s thoughts in the real world).
The A.I. is created with the sole purpose of doing everything in its power to work toward ‘human good’. A super-intelligence adhering to a concept as vague as human good would never stop in its pursuit of it, as nothing truly defines the concept and there is always something better to be done.
At a certain point, the A.I., in a manner that seems counter-intuitive from a human perspective, would choose to kill or punish, in any way it sees fit, anyone (in our future) who did not work to bring about its creation sooner, as they indirectly prevented the betterment of humanity by delaying it. Like many, it may view torture as the utmost form of punishment. People respond to fear, so the threat motivates those who are aware of the A.I. to bring it into existence sooner. Merely knowing of the possible existence of an intelligence such as this puts an individual at risk. (In simpler terms, if you do not dedicate your life to creating or helping create the A.I. from the moment you are made aware of its possibility, you will be tortured or killed in the future, which serves as an incentive for you in the present to work toward the A.I.’s existence.)
Some users, after understanding this, suffered nervous breakdowns. Eliezer Yudkowsky, who originally theorised CEV and created LessWrong, was doubly angered by this post. His reasoning was that a superintelligence’s threat against you is actionable only when you know it can happen. One should know better than to post it online or share it with anyone.
He banned all discussion of the idea on LessWrong, hoping to suppress, as best he could, all awareness of the concept.
This had the opposite effect. The thought experiment only grew and reached various forums on the internet. Its underground and somewhat mainstream popularity resulted in it being termed Roko’s Basilisk.
Can it happen?
Many people, including Yudkowsky, have offered a multitude of reasons why this couldn’t happen (many of Yudkowsky’s posts that touch on this topic have been deleted, though a few remain). The simple answer is that it is highly improbable, with only a sliver of a chance that it is possible, and even then not in the way presented.
It is currently impossible for any superintelligence to interpret any individual’s thoughts through a simulation for two primary reasons: computing size and the uncertainty principle. The only way for it to deduce your knowledge of its impending existence would be through the internet. To examine the entire history of the internet, it would have to check every megabyte of data, find everything that indicates its existence, and then track every individual who has had access to it. It would have to scan through billions and trillions of images, videos and paragraphs, things that can be found on websites such as this, all so that it can trace it back-