A growing number of people believe that reducing the risk of human extinction is the single most cost-effective undertaking for those who want to do good in the world. I fear that if humans survive, the future will likely not be as rosy as is often presumed. I enumerate a few suggestive (not exhaustive) bad scenarios that might result both from human-inspired AIs and from "unfriendly" AIs that might outcompete human values. I conclude that rather than trying to increase the odds of Earth-originating AI at all (which could have negative expected value), we might do better to improve the odds that such AI is of the type we want. In particular, for some of us, this may mean the best thing we can do is shape the values of society in such a way that an AI that develops a few decades or centuries from now will show more concern for the suffering of non-human animals and artificial sentients whose feelings are usually ignored.
(See "Rebuttal by Carl Shulman" at the bottom of this post.)
Introduction, written 6 Dec 2012:
Advocates for reducing extinction risk sometimes assume -- and perhaps even take for granted -- that if humanity doesn't go extinct (due to nanotech, biological warfare, or paperclipping), then human values will control the future. No, actually, conditional on humans surviving, the most likely scenario is that we will be outcompeted by Darwinian forces beyond our control. These forces might not just turn the galaxy into nonsentient paperclips; they might also run sentient simulations, employ suffering subroutines, engage in warfare, and perform other dastardly deeds as defined and described below. Of course, humans might do these things as well, but at least with humans, people make the presumption that human values will be humane, even though this may not be the case when it comes to human attitudes toward wild animals or non-human-like minds.
So when we reduce asteroid or nanotech risk, the dominant effect we're having is to increase the chance that Darwinian-forces-beyond-our-control take over the galaxy. Then there's some smaller probability that actual human values (the good, the bad, and the ugly) will triumph. I wish more people gung-ho about reducing extinction risk realized this.
Now, there is a segment of extinction-risk folks who believe that what I said above is not a concern, because sufficiently advanced superintelligences will discover the moral truth and hence do the right things. There are two problems with this. First, Occam's razor militates against the existence of a moral truth (whatever that's supposed to mean). Second, even if such moral truth existed, why should a superintelligence care about it? There are plenty of brilliant people on Earth today who eat meat. They know perfectly well the suffering that it causes, but their motivational systems aren't sufficiently engaged by the harm they're doing to farm animals. The same can be true for superintelligences. Indeed, arbitrary intelligences in mind-space needn't have even the slightest inklings of empathy for the suffering that sentients experience.
In conclusion: Let's think more carefully about what we're doing when we reduce extinction risk, and let's worry more about these possibilities. Rather than increasing the odds that some superintelligence comes from Earth, let's increase the odds that, if there is a superintelligence, it doesn't do things we would abhor.
The scenarios, written 13 Dec 2011
Robert Wiblin has asked for descriptions of some example future scenarios that involve lots of suffering. Below I sketch a few possibilities. I don't claim these occupy the bulk of probability mass, but they can serve to jump-start the imagination. What else would you add to the list?
Spread of wild-animal life. Humans colonize other planets, spreading animal life via terraforming. Some humans use their resources to seed life throughout the galaxy. And scientists explore creating infinitely many new universes in a lab. Since I would guess that most sentient organisms never become superintelligent, these new universes will contain vast numbers of planets full of Darwinian agony.
Sentient simulations. Given astronomical computing power, post-humans run ancestor simulations (including torture chambers, death camps, and psychological illnesses endured by billions of people). Moreover, scientists run even larger numbers of simulations of organisms-that-might-have-been, exploring the space of minds. They simulate trillions upon trillions of reinforcement learners, like the RL mouse, except that these learners are sufficiently self-aware as to feel the terror of being eaten by the cat.
Suffering subroutines. This one is from Carl Shulman. It could be that certain algorithms (say, simple reinforcement learners) are very useful in performing complex machine-learning computations that need to be run at massive scale by advanced AI. These subroutines might become sufficiently similar to the pain programs in our own brains that they actually suffer. But profit and power take precedence over pity, so these subroutines are used widely throughout the AI's Matrioshka brains. (Carl adds that this situation "could be averted in noncompetitive scenarios out of humane motivation.")
Savage ideologies. This one makes me shudder. Imagine people with an in-group/out-group sentiment so strong that they delight to see their enemies endure horrible suffering. Examples abound in religion: Think of the mindsets of the writers who first described hell in the afterlife.
- "Multitudes who sleep in the dust of the earth will awake: some to everlasting life, others to shame and everlasting contempt." (Daniel 12:2)
- "and judgment is passed upon you. From now on you will not be able to ascend into heaven unto all eternity, but you shall remain inside the earth imprisoned, all the days of eternity." (Enoch 14:4-5)
- "And the smoke of their torment goes up forever and ever; they have no rest day and night, those who worship the beast and his image, and whoever receives the mark of his name." (Revelation 14:11)
- "There is neither limit nor termination of these torments. There, the intelligent fire burns the limbs and restores them. It feeds on them and nourishes them. ... However, no one except a profane man hesitates to believe that those who do not know God are deservedly tormented." (Marcus Minucius Felix, c. 200)
- "those who seek gain in evil, and are girt round by their sins,- they are companions of the Fire: Therein shall they abide (For ever)." (Quran 2:81)
Torture as warfare. Post-humans might incline toward torture of enemies out of ideological happenstance, but they might also do so as a means of warfare. In the future, knowledge of the mechanisms of suffering will allow for simulation of torture experiences that are worse than anything we can imagine. Factions could employ such methods as a means of hostage-taking or threat-making, potentially leading to an arms race of torture technology.
Experiences worse than hell, written 18 Dec 2012
It's decently likely that if humans survive, then somewhere in the future, someone will create hells with pain worse than burning alive. Post-human knowledge of neuroscience could and likely will be used to create agony greater than being cast into a lake of fire. I was going to link to a video of torture in hell, but I decided not to. It can be hard for the words to sink in without watching one, though.
I don't necessarily support walking away from Omelas, but it seems like those who do should also walk away from reducing extinction risk. The expected number of people who will be in hell if humans survive is orders of magnitude more than a single child, and the torment would be far worse. Whatever else we may expect about the future, this term in the calculations will never go away.
Ways forward, written 5 Dec 2012
If indeed the most likely outcome of human survival is to create forces with values alien to ours, having the potential to cause astronomical amounts of suffering, then it may actually be a bad thing to reduce extinction risk. At the very least, reducing extinction risk is less likely to be an optimal use of our resources. What should we do instead?
One option, as suggested by Bostrom's "The Future of Human Evolution," is to work on creating a global singleton to rein in Darwinian competition. Obviously this would be a worldwide undertaking requiring enormous effort, but perhaps there would be high leverage in doing preliminary research, raising interest in the topic, and kicking off the movement.
Doing so would make it more likely that humans, rather than minds alien to humans, control the future. But would this be an improvement? It's hard to say. While unfriendly superintelligences would be unlikely to show remorse when running suffering simulations for instrumental purposes, it's also possible that humans would run more total suffering simulations. The only reasons for unfriendly AIs to simulate nature, say, are to learn about science and maybe to explore the space of minds that have evolved in the universe. In contrast, humans might simulate nature for aesthetic reasons, as ancestor simulations, etc., in addition to the scientific and game-theoretic reasons that unfriendly AIs would have. In general, humans are more likely to simulate minds similar to their own, which means more total suffering. Simulating paperclips doesn't hurt anyone, but simulating cavemen (and their prey) does.
So it's not totally obvious that increasing human control over the future is a good thing either, though the topic deserves further study. The way forward that I currently prefer (subject to change upon learning more) is to work on improving the values of human civilization, so that if human-shaped AI does control the future, it will act just a little bit more humanely. This means there's value in promoting sympathy for the suffering of others and reducing sadistic tendencies. There's also value in reducing status-quo bias and promoting total hedonistic utilitarianism. Two specific cases of value shifts that I think have high leverage are (1) spreading concern for wild-animal suffering and (2) ensuring that future humans give due concern to suffering subroutines and other artificial sentients that might not normally arouse moral sympathy because they don't look or act like humans. Item (2) is antispeciesism in its broadest application. Right now I'm working with friends to create a charity focused on item (1). In a few years, it's possible I'll also focus on item (2), or perhaps another high-leverage idea that comes along.
In his original paper on existential risk, Bostrom includes not just risks of literal human extinction but also risks that would "permanently and drastically curtail" the good that could come from Earth-originating life. Thus, my goal is also to reduce existential risk, but not by reducing extinction risk -- instead by working to make it so that if human values do control the galaxy, there will be fewer wild animals, subroutines, and other simulated minds enduring experiences that would make us shiver with fear were we to undergo them.
Rebuttal by Carl Shulman, written 8 Dec 2012:
Carl wrote a thorough response to this piece in a later comment.
Brian's response, written 8 Dec 2012:
Brian wrote a reply to Carl. It included the following conclusion paragraphs.
Most of Carl's points don't affect the way negative utilitarians or negative-leaning utilitarians view the issue. I'm personally a negative-leaning utilitarian, which means I have a high exchange rate between pain and pleasure. It would take thousands of years of happy life to convince me to agree to 1 minute of burning at the stake. But the future will not be this asymmetric. Even if the expected amount of pleasure in the future exceeds the expected amount of suffering, the two quantities will be pretty close, probably within a few orders of magnitude of each other. I'm not suggesting the actual amounts of pleasure and suffering will necessarily be within a few orders of magnitude but that, given what we know now, the expected values probably are. It could easily be the case that there's way more suffering than pleasure in the future.
If you don't mind burning at the stake as much as I do, then your outlook on the future will be somewhat more sanguine on account of Carl's comments. But even if the future is net positive in expectation for these kinds of utilitarians (and I'm not sure that it is, but my probability has increased in light of Carl's reply), it may still be better to work on shaping the future rather than increasing the likelihood that there is a future. Targeted interventions to change society in ways that will lead to better policies and values could be more cost-effective than increasing the odds of a future-of-some-sort that might be good but might be bad.
As for negative-leaning utilitarians, our only option is to shape the future, so that's what I'm going to continue doing.
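As a rough illustration of the exchange-rate point in the reply above, here is a minimal sketch of how the same expected future can look positive or negative depending on how heavily suffering is weighted. All figures are assumptions chosen for illustration, not estimates: "thousands of years" is taken at its lower bound of 1,000 years per minute of agony, "a few orders of magnitude" is taken as a factor of 1,000 in favor of pleasure, and the function names are hypothetical.

```python
# Illustrative only: every number below is an assumption, not an estimate.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def exchange_rate(happy_years: float, agony_minutes: float) -> float:
    """Minutes of happy life demanded to offset one minute of agony."""
    return happy_years * MINUTES_PER_YEAR / agony_minutes


def weighted_net_value(pleasure: float, suffering: float, rate: float) -> float:
    """Expected pleasure minus expected suffering, with suffering scaled by `rate`."""
    return pleasure - rate * suffering


# Assumed lower bound of "thousands of years" per 1 minute of burning.
rate = exchange_rate(happy_years=1000, agony_minutes=1)

# Hypothetical future in which expected pleasure is 1,000 times expected suffering.
pleasure, suffering = 1000.0, 1.0

net_symmetric = weighted_net_value(pleasure, suffering, rate=1.0)
net_negative_leaning = weighted_net_value(pleasure, suffering, rate=rate)

print(f"exchange rate: {rate:,.0f} happy minutes per minute of agony")
print(f"symmetric weighting: {net_symmetric:,.0f}")
print(f"negative-leaning weighting: {net_negative_leaning:,.0f}")
```

Under these assumed inputs, the symmetric weighting comes out positive while the negative-leaning weighting comes out strongly negative, which is why a pleasure-to-suffering ratio of only a few orders of magnitude does not change the negative-leaning conclusion.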
