A growing number of people believe that reducing the risk of human extinction is the single most cost-effective undertaking for those who want to do good in the world. I fear that if humans survive, the future will likely not be as sanguine as is often presumed. I enumerate a few suggestive (not exhaustive) bad scenarios that might result both from human-inspired AIs and from "unfriendly" AIs that might outcompete human values. I conclude that rather than trying to increase the odds that Earth-originating AI arises at all (which could have negative expected value), we might do better to improve the odds that such AI is of the type we want. In particular, for some of us, the best thing we can do may be to shape the values of society in such a way that an AI which develops a few decades/centuries from now will show more concern for the suffering of non-human animals and artificial sentients whose feelings are usually ignored.
(See "Rebuttal by Carl Shulman" at the bottom of this post.)
Introduction, written 6 Dec 2012:
Advocates for reducing extinction risk sometimes assume -- and perhaps even take for granted -- that if humanity doesn't go extinct (due to nanotech, biological warfare, or paperclipping), then human values will control the future. No, actually, conditional on humans surviving, the most likely scenario is that we will be outcompeted by Darwinian forces beyond our control. These forces might not just turn the galaxy into nonsentient paperclips; they might also run sentient simulations, employ suffering subroutines, engage in warfare, and perform other dastardly deeds as defined and described below. Of course, humans might do these things as well, but at least with humans, there's a presumption that the outcome will be humane -- a presumption that may fail when it comes to human attitudes toward wild animals or non-human-like minds.
So when we reduce asteroid or nanotech risk, the dominant effect we're having is to increase the chance that Darwinian-forces-beyond-our-control take over the galaxy. Then there's some smaller probability that actual human values (the good, the bad, and the ugly) will triumph. I wish more people gung-ho about reducing extinction risk realized this.
Now, there is a segment of extinction-risk folks who believe that what I said above is not a concern, because sufficiently advanced superintelligences will discover the moral truth and hence do the right things. There are two problems with this. First, Occam's razor militates against the existence of a moral truth (whatever that's supposed to mean). Second, even if such moral truth existed, why should a superintelligence care about it? There are plenty of brilliant people on Earth today who eat meat. They know perfectly well the suffering that it causes, but their motivational systems aren't sufficiently engaged by the harm they're doing to farm animals. The same can be true for superintelligences. Indeed, arbitrary intelligences in mind-space needn't have even the slightest inklings of empathy for the suffering that sentients experience.
In conclusion: Let's think more carefully about what we're doing when we reduce extinction risk, and let's worry more about these possibilities. Rather than increasing the odds that some superintelligence comes from Earth, let's increase the odds that, if there is a superintelligence, it doesn't do things we would abhor.
The scenarios, written 13 Dec 2011
Robert Wiblin has asked for descriptions of some example future scenarios that involve lots of suffering. Below I sketch a few possibilities. I don't claim these occupy the bulk of probability mass, but they can serve to jump-start the imagination. What else would you add to the list?
Spread of wild-animal life. Humans colonize other planets, spreading animal life via terraforming. Some humans use their resources to seed life throughout the galaxy. Since I would guess that most sentient organisms never become superintelligent, the result would be vast numbers of planets full of Darwinian agony.
Sentient simulations. Given astronomical computing power, post-humans run ancestor simulations (including torture chambers, death camps, and psychological illnesses endured by billions of people). Moreover, scientists run even larger numbers of simulations of organisms-that-might-have-been, exploring the space of minds. They simulate trillions upon trillions of reinforcement learners, like the RL mouse, except that these learners are sufficiently self-aware as to feel the terror of being eaten by the cat.
Suffering subroutines. This one is from Carl Shulman. It could be that certain algorithms (say, simple reinforcement learners) are very useful in performing complex machine-learning computations that need to be run at massive scale by advanced AI. These subroutines might become sufficiently similar to the pain programs in our own brains that they actually suffer. But profit and power take precedence over pity, so these subroutines are used widely throughout the AI's Matrioshka brains. (Carl adds that this situation "could be averted in noncompetitive scenarios out of humane motivation.")
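To make "simple reinforcement learner" concrete, here is a minimal tabular Q-learning sketch. The two-state environment and its reward numbers are invented for illustration; the point is only that an agent like this adjusts a table of values in response to scalar reward and punishment signals:

```python
import random

# Hypothetical 2-state, 2-action environment: action 1 yields a
# "punishment" (negative reward), action 0 a small positive reward.
def step(state, action):
    reward = -1.0 if action == 1 else 0.1
    next_state = (state + action) % 2
    return next_state, reward

alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = [[0.0, 0.0], [0.0, 0.0]]  # Q[state][action]

random.seed(0)
state = 0
for _ in range(1000):
    # epsilon-greedy action selection
    if random.random() < epsilon:
        action = random.randrange(2)
    else:
        action = 0 if Q[state][0] >= Q[state][1] else 1
    next_state, reward = step(state, action)
    # Standard Q-learning update toward the reward signal
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state

# The learner comes to avoid the punished action in both states.
assert Q[0][0] > Q[0][1] and Q[1][0] > Q[1][1]
```

Whether computations of roughly this shape, scaled up enormously, could matter morally is exactly the open question the scenario raises.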
Ways forward, written 5 Dec 2012
If indeed the most likely outcome of human survival is to create forces with values alien to ours, having the potential to cause astronomical amounts of suffering, then it may actually be a bad thing to reduce extinction risk. At the very least, reducing extinction risk is less likely to be an optimal use of our resources. What should we do instead?
One option, as suggested by Bostrom's "The Future of Human Evolution," is to work on creating a global singleton to rein in Darwinian competition. Obviously this would be a worldwide undertaking requiring enormous effort, but perhaps there would be high leverage in doing preliminary research, raising interest in the topic, and kicking off the movement.
Doing so would make it more likely that humans, rather than minds alien to humans, control the future. But would this be an improvement? It's hard to say. While unfriendly superintelligences would be unlikely to show remorse when running suffering simulations for instrumental purposes, it's also possible that humans would run more total suffering simulations. The only reasons for unfriendly AIs to simulate nature, say, are to learn about science and maybe to explore the space of minds that have evolved in the universe. In contrast, humans might simulate nature for aesthetic reasons, as ancestor simulations, etc. in addition to the scientific and game-theoretic reasons that unfriendly AIs would have. In general, humans are more likely to simulate minds similar to their own, which means more total suffering. Simulating paperclips doesn't hurt anyone, but simulating cavemen (and cavemen prey) does.
So it's not totally obvious that increasing human control over the future is a good thing either, though the topic deserves further study. The way forward that I currently prefer (subject to change upon learning more) is to work on improving the values of human civilization, so that if human-shaped AI does control the future, it will act just a little bit more humanely. This means there's value in promoting sympathy for the suffering of others and reducing sadistic tendencies. There's also value in reducing status-quo bias and promoting total hedonistic utilitarianism. Two specific cases of value shifts that I think have high leverage are (1) spreading concern for wild-animal suffering and (2) ensuring that future humans give due concern to suffering subroutines and other artificial sentients that might not normally arouse moral sympathy because they don't look or act like humans. (2) is antispeciesism at its broadest application. Right now I'm working with friends to create a charity focused on item (1). In a few years, it's possible I'll also focus on item (2), or perhaps another high-leverage idea that comes along.
In his original paper on existential risk, Bostrom includes risks not just about literal human extinction, but also risks that would "permanently and drastically curtail" the good that could come from Earth-originating life. Thus, my goal is also to reduce existential risk, but not by reducing extinction risk -- instead by working to make it so that if human values do control the galaxy, there will be fewer wild animals, subroutines, and other simulated minds enduring experiences that would make us shiver with fear were we to undergo them.
Rebuttal by Carl Shulman, written 8 Dec 2012:
Carl wrote a thorough response to this piece in a later comment.
Brian's response, written 8 Dec 2012:
Brian wrote a reply to Carl. It included the following conclusion paragraphs.
Most of Carl's points don't affect the way negative utilitarians or negative-leaning utilitarians view the issue. I'm personally a negative-leaning utilitarian, which means I have a high exchange rate between pain and pleasure. It would take thousands of years of happy life to convince me to agree to 1 minute of burning at the stake. But the balance of pleasure and suffering in the future will not be nearly this asymmetric. Even if the expected amount of pleasure in the future exceeds the expected amount of suffering, the two quantities will be pretty close, probably within a few orders of magnitude of each other. I'm not suggesting the actual amounts of pleasure and suffering will necessarily be within a few orders of magnitude but that, given what we know now, the expected values probably are. It could easily be the case that there's way more suffering than pleasure in the future.
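For concreteness, the exchange rate implied by that example can be computed, taking "thousands of years" as 2,000 years purely as an assumed placeholder:

```python
# Illustrative arithmetic: how many minutes of happy life per minute
# of burning at the stake does the stated trade imply?
happy_years = 2000                    # "thousands of years" (assumed value)
minutes_per_year = 365.25 * 24 * 60   # ~525,960
exchange_rate = happy_years * minutes_per_year  # per 1 minute of agony
print(f"{exchange_rate:.2e}")  # prints 1.05e+09
```

An exchange rate of roughly nine orders of magnitude, set against an expected pleasure/suffering ratio within a few orders of magnitude, is what drives the negative-leaning conclusion.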
If you don't mind burning at the stake as much as I do, then your prospects for the future will be somewhat more sanguine on account of Carl's comments. But even if the future is net positive in expectation for these kinds of utilitarians (and I'm not sure that it is, but my probability has increased in light of Carl's reply), it may still be better to work on shaping the future rather than increasing the likelihood that there is a future. Targeted interventions to change society in ways that will lead to better policies and values could be more cost-effective than increasing the odds of a future-of-some-sort that might be good but might be bad.
As for negative-leaning utilitarians, our only option is to shape the future, so that's what I'm going to continue doing.
Why a post-human civilization is likely to cause net suffering, written 24 Mar 2013:
If I had to make an estimate now, I would give ~75% probability that space colonization will cause more suffering than it reduces. A friend asked me to explain the components, so here goes.
Consider how space colonization could plausibly reduce suffering. For most of those mechanisms, it seems at least as likely that they will increase suffering. The following sections parallel those above.
Spread of wild-animal life
David Pearce coined the phrase "cosmic rescue missions" to refer to the possibility of sending probes to other planets to alleviate the wild extraterrestrial (ET) suffering they contain. This is a nice idea, but there are a few problems.
- We haven't found any ETs yet, so it's not obvious there are vast numbers of them waiting to be saved from Darwinian misery.
- The specific kind of conscious suffering known to Earth-bound animal life may be rare. Most likely ETs would be bacteria, plants, etc., and even if they're intelligent, they might be intelligent in the way robots are without having emotions of the sort that we care about.
- Space travel is slow and difficult.
- It's unclear whether humanity would support such missions. Environmentalists would ask us to leave ET habitats alone. Others wouldn't want to spend the resources to do this unless they planned to mine resources from those planets in a colonization wave.
- On the other hand, we could spread life to many planets (e.g., Mars via terraforming, other Earth-like planets via directed panspermia). The number of planets that can support life may be appreciably bigger than the number that already have it. (See the discussion of f_l in the Drake equation.)
- We already know that Earth-bound life is sentient; for ETs, we don't.
- Spreading biological life is slow and difficult like rescuing it, but dispersing small life-producing capsules is easier than dispatching Hedonistic Imperative probes or berserker probes.
- Fortunately, humans might not support spread of life that much, though some do. For terraforming, there are obvious survival pressures to do it in the near term, but probably directed panspermia is a bigger problem in the long term, and that seems more of a hobbyist enterprise.
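The f_l point in the list above can be illustrated with a toy calculation. All numbers below are invented placeholders, not estimates from the text: if only a fraction f_l of habitable planets develop life on their own, life-spreaders have roughly (1 - f_l)/f_l times as many target planets as rescuers do.

```python
# Toy Drake-style comparison (all numbers are illustrative assumptions)
habitable_planets = 1e9   # planets in the galaxy that could support life
f_l = 0.01                # assumed fraction that develop life on their own

planets_with_life = habitable_planets * f_l              # targets for rescue
planets_to_seed = habitable_planets - planets_with_life  # targets for seeding

# With a small f_l, seeding targets outnumber rescue targets ~1/f_l to 1.
ratio = planets_to_seed / planets_with_life
assert abs(ratio - 99.0) < 1e-6  # (1 - 0.01) / 0.01
```

On these assumed numbers there are ~99 seedable planets for every rescuable one, which is why a small f_l makes spreading life loom larger than rescuing it.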
Sentient simulations
It may be that biological suffering is a drop in the bucket compared with digital suffering. Maybe there are ETs running sims of nature for science / amusement, or of minds in general for psychological, evolutionary, etc. reasons. Maybe we could trade with them to make sure they don't cause unnecessary suffering to their sims. If empathy is an accident of human evolution, then humans are more likely to be empathetic than a random ET civilization, so it's possible that there would be room for improvement through this type of trade.
Of course, post-humans themselves might run the same kinds of sims. What's worse: The sims that post-humans run would be much more likely to be sentient than those run by random ETs because post-humans would have a tendency to simulate things closer to themselves in mind-space. They might run ancestor sims for fun, nature sims for aesthetic appreciation, lab sims for science experiments, pet sims for pets. Sadists might run tortured sims. In paperclip-maximizer world, sadists may run sims of paperclips getting destroyed, but that's not a concern to me.
Finally, we don't know if there even are aliens out there to trade with on suffering reduction. We do, however, know that post-humans would likely run such sims if they colonize space.
Suffering subroutines
A similar comparison applies here: humans are likely more empathetic than the average ET civilization, but they're also more likely to run these kinds of computations in the first place. The increased likelihood of humans running suffering subroutines may be smaller than for sentient simulations, because suffering subroutines would be accidental rather than deliberate. Still, the point remains that we don't know whether there are ETs to trade with.
What about paperclippers?
Above I was largely assuming a human-oriented civilization with values that we recognize. But what if, as seems mildly likely, human colonization accidentally takes the form of a paperclip maximizer? Wouldn't that be a good thing because it would eliminate wild ET suffering as the paperclipper spread throughout the galaxy, without causing any additional suffering?
Maybe, but if the paperclip maximizer is actually generally intelligent, then it won't stop at tiling the solar system with paperclips. It will have the basic AI drives and will want to do science, learn about other minds via simulations, engage in conflict, possibly run suffering subroutines, etc. It's not obvious whether a paperclipper is better or worse than a "friendly AI."
Evidential/timeless decision theory
We've seen that the main way in which human space colonization could plausibly reduce more suffering than it creates would be if it allowed us to prevent ETs from doing things we don't like. However, if you're an evidential or timeless decision theorist, an additional mechanism by which we might affect ETs' choices is through our own choices. If our minds work in similar enough ways to ETs', then if we choose not to colonize, that makes it more likely / timelessly causes them also not to colonize, which means that they won't cause astronomical suffering either. (See, for instance, pp. 14-15 of Paul Almond's article on evidential decision theory.)
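A back-of-the-envelope sketch of this evidential argument, with made-up numbers for the count of correlated civilizations and the strength of the correlation:

```python
# Toy evidential-decision-theory calculation; every number here is an
# illustrative assumption, not an estimate from the text.
n_correlated = 100               # civilizations whose choices correlate with ours
p_refrain_if_we_refrain = 0.5    # their P(refrain) conditional on us refraining
p_refrain_if_we_colonize = 0.25  # their P(refrain) conditional on us colonizing
suffering_per_colonization = 1.0 # arbitrary units

# Expected number of colonizations under each of our choices (ours included)
ev_if_colonize = 1 + n_correlated * (1 - p_refrain_if_we_colonize)
ev_if_refrain = 0 + n_correlated * (1 - p_refrain_if_we_refrain)

# Evidentially, refraining "prevents" not just our own colonization but
# a share of the correlated civilizations' colonizations too.
difference = (ev_if_colonize - ev_if_refrain) * suffering_per_colonization
assert difference == 26.0  # 1 of ours + 100 * (0.5 - 0.25) correlated
```

On these assumed numbers, refraining evidentially "prevents" 26 expected colonizations rather than just our own 1; that multiplier is what the argument appeals to.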
It's also true that if we would have done net good by policing rogue ETs, then our mind-kin might have also done net good in that way, in which case failing to colonize would be unfortunate. But while many ETs may be similar to us in failing to colonize space, fewer would probably be similar to us to the level of detail of colonizing space and carrying a big stick with respect to galactic suffering. So it seems plausible that the evidential/timeless considerations asymmetrically multiply the possible badness of colonization more than its possible goodness.
It seems pretty likely to me that suffering in the future will be dominated by something totally unexpected. This could be a new discovery in physics, neuroscience, or even philosophy more generally. Some make the argument that because we know so very little now, it's better for humans to stick around for the option value: If they later realize it's bad to spread, they can stop, but if they realize spreading is good, they can proceed and reduce suffering in some novel way that we haven't anticipated.
Of course, the problem with the "option value" argument is that it assumes future humans do the right thing, when in fact, based on examples of speculations we can imagine now, it seems future humans would probably do the wrong thing most of the time. For instance, faced with a new discovery of obscene amounts of computing power somewhere, most humans would use it to run oodles more minds, some nontrivial fraction of which might suffer terribly. In general, most sources of immense power are double-edged swords that can create more happiness and more suffering, and the typical human impulse to promote life/consciousness rather than to remove them suggests that negative and negative-leaning utilitarians are on the losing side.
Why not wait a little longer just to be sure that a superintelligent post-human civilization is net bad in expected value? Certainly we should research the question in greater depth, but we also can't delay acting upon what we know now, because within a few decades, our actions might come too late. Tempering enthusiasm for a technological future needs to come soon or else potentially never.