Friendly AI and utilitarianism

For ethics in the real world - bioethics, law, effective altruist outreach, etc.
Vladimir Nesov
Posts: 1
Joined: Fri Aug 19, 2011 6:07 pm

Re: Friendly AI and utilitarianism

Postby Vladimir Nesov » Fri Aug 19, 2011 6:13 pm

Alexander Kruel wrote:I have read some of Luke's (lukeprog) posts over at LessWrong but didn't know what I was supposed to get out of them. I think those people are naive to think that they can mitigate risks from AI by defining a mathematically binding definition of "friendliness". If that is at all possible, then it is orders of magnitude more difficult than creating an artificial general intelligence. In other words, I'd focus on fail-safe mechanisms. They might not work, but that's better than wasting all this time on an impossible goal.


I believe that it's harder to make "fail-safe mechanisms" that work than to create a FAI. And if they don't work, it's not actually better.

Alexander Kruel
Posts: 16
Joined: Tue Aug 16, 2011 12:08 pm
Location: Germany

Re: Friendly AI and utilitarianism

Postby Alexander Kruel » Fri Aug 19, 2011 6:47 pm

Vladimir Nesov wrote:I believe that it's harder to make "fail-safe mechanisms" that work than to create a FAI. And if they don't work, it's not actually better.


What I meant is that fail-safe mechanisms might help to prevent a full-scale extinction scenario, or help us to employ an AGI to develop friendly AI -- not that they might work anywhere near as well as the friendly AI approach. But if the friendly AI approach is, as I suspect, very unlikely to be worked out before someone stumbles upon AGI, then some sort of fail-safe mechanism is better than nothing. Therefore, if you are not reasonably sure that success in solving friendly AI is possible, you should think really hard about focusing on fail-safe mechanisms instead.

rehoot
Posts: 160
Joined: Wed Dec 15, 2010 7:32 pm

Things that AI won't solve?

Postby rehoot » Fri Aug 19, 2011 8:09 pm

I watched a video on estimating the benefits of AI: Anna Salamon of the Singularity Institute. I'm wondering about one key issue: overpopulation. That leads to the question of what AI can't solve ethically.

Let's assume that good AI exists and that evil AI is subjugated. We then cure all disease, find ways to manufacture things very efficiently, build fuel-efficient transportation, and so on. Will overpopulation lead to the entire surface of Earth being stacked a mile high with people in skyscrapers, with the bottom 100 yards filled with sewage -- no parks, no open spaces for most people? I'd prefer my current conditions to that.

I guess the optimistic response would be that AI will help to find ways to use psychology to convince people to control themselves voluntarily -- or, stated another way, AI will be used by its controllers as a propaganda machine or mind-control machine (assuming that irate humans don't kill each other over natural resources before we get to that point). Another optimistic response would be that AI could be used to genetically engineer smarter people who understand overpopulation. Does that then create a new species and potential combat between old humans and überman? This can lead in 1,000 different directions.

Similar issues exist with the inability of humans to understand the antecedents of prejudice. We learn in school about various genocides throughout history, but we don't cultivate the knowledge and skills that would be needed to overcome our tendency to dislike out-groups when economic conditions are bad. We don't teach ethics to kids who join the army at age 18, so they just kill as they are instructed. It seems to me that AI can't solve this without processes that are at high risk of abuse -- and the solutions to some of these problems are already on the table but can't be implemented (at least in the U.S.) for political reasons and our existing biases.

It seems that the "good AI" leads to the "evil AI" unless the population problem (and other problems inherent in the minds of individuals) can be solved in an ethical manner. Granted, in the short term after the AI explosion, wealthy people will have more creature comforts, although past technological advances clearly indicate that such technology will have less impact on the poorest people.

Gedusa
Posts: 110
Joined: Thu Sep 23, 2010 8:50 pm
Location: UK

Re: Friendly AI and utilitarianism

Postby Gedusa » Fri Aug 19, 2011 8:30 pm

FAI that was actually smart would probably develop space travel pretty rapidly, so I don't expect this to be a problem in the near term. And in the longer term... well, I have lots of ideas, most of which mainly postpone the problem. A few would deal with it properly, like we could just pass laws limiting reproduction, assuming we lived in a singleton that could enforce them. Mainly though I'd just say: We'll cross that bridge when we come to it.
The galaxy's pretty big. :D
World domination is such an ugly phrase. I prefer to call it world optimization

Brian Tomasik
Posts: 1107
Joined: Tue Oct 28, 2008 3:10 am
Location: USA

Re: Friendly AI and utilitarianism

Postby Brian Tomasik » Mon Aug 22, 2011 1:56 pm

Whew -- long thread! Let me reply to lukeprog and return to the remaining posts later on.

lukeprog wrote:I'm less clear on whether you think mind substrates and programming matter. My first guess is that you're focused on conscious subjective experience, and you have some guesses about which types of substrates and programming could manifest conscious subjective experience. But if those guesses turned out to be wrong (say, conscious subjective experience could be implemented by a lookup table), then what you'd care about is conscious subjective experience. Is that about right?

Well, I think what constitutes "conscious subjective experience" isn't a fundamental fact about the world but is determined by our own feelings on the subject. My current intuitions don't give ethical weight to lookup tables. I could have my intuitions changed if you showed me the similarity of lookup tables to minds that I do care about. But it's also possible I would retain my current intuitions. Such intuitions can vary from person to person.

Thanks for the "austere metaethics" and "empathic metaethics" definitions. Those terms help to manipulate these concepts with greater dexterity.

lukeprog wrote:One way to proceed is by analyzing the implications for Friendly AI given a framework of total hedonistic non-pinprick negative utilitarianism, while keeping in mind that total hedonistic non-pinprick negative utilitarianism may not capture even what you mean (non-stipulatively) by 'morally good'. Sound good?

Yes, that sounds good! The one caveat I would add is that I'm not always concerned about figuring out what my future self would want. There are lots of possible changes to my brain over time that would change my intuitions in ways that I wouldn't like, e.g., if I became apathetic to the suffering of others as I became old and curmudgeonly. However, changes in intuitions caused by learning more (e.g., studying the mechanisms of suffering in animals and whether they extend to insects) are almost always welcome.

Brian Tomasik
Posts: 1107
Joined: Tue Oct 28, 2008 3:10 am
Location: USA

Re: Friendly AI and utilitarianism

Postby Brian Tomasik » Mon Aug 22, 2011 2:55 pm

Jason Kilwala wrote:Can you say more about your sense that a giant lookup table almost certainly doesn't pass?

I'm not totally sure either, but my sense is that it has to do with the same reason "a frozen body, a static digital copy of a mind or the last copy of a book" doesn't have moral significance. What matters is the dynamic process of computation, rather than the end result.

Mike Radivis wrote:In general, I find the focus on human-friendliness misguided. I see anthropocentrism and speciesism as problems that need to be overcome. It would be much preferable to strive for general sentience-friendliness.

Agree!

Mike Radivis wrote:So, I'm interested in the most effective methods of improving my well-being. Is generating virtual warm fuzzies really good for that purpose? What would you suggest?

It seems as though anti-depressants (in one form or another) have helped a number of my utilitarian friends.

Hedonic Treader wrote:
Further, why wouldn't a lookup table or waterfall feature an affective valence of mental states?

This depends on whether we think they are sufficiently similar to the processes that give affective valence to our own mental states.

Yes.

Hedonic Treader wrote:I'm quite happy to focus completely on the latter. Except I would focus more on the liking instead of the wanting.

+1

lukeprog
Posts: 9
Joined: Mon Jan 12, 2009 5:41 am

Re: Friendly AI and utilitarianism

Postby lukeprog » Tue Aug 23, 2011 2:36 am

Hi Alan,

Once again I'll reply only to your latest post, as that's all I have time for.

Thanks for your clarification about conscious subjective experience.

On the subject of changing values, you wrote:

Alan Dawrst wrote:I'm not always concerned about figuring out what my future self would want. There are lots of possible changes to my brain over time that would change my intuitions in ways that I wouldn't like, e.g., if I became apathetic to the suffering of others as I became old and curmudgeonly. However, changes in intuitions caused by learning more (e.g., studying the mechanisms of suffering in animals and whether they extend to insects) are almost always welcome.


Right. Allow me to expand on this; I think we agree.

In simplistic terms, we might say that we have many desires, and those include desires about our desires (e.g. the pedophile might desire sex with children, but also desire that he NOT desire sex with children) and also desires about the processes by which our desires are generated (e.g. I feel like I would rather not have my desires generated by brain mechanisms tweaked by an advanced alien race using humans for scientific experiments).

Within that last category, you probably feel like you want your desires and values to be better informed and more 'rational' in the VNM sense. But you wouldn't want your desires and values to be influenced by, say, an onset of manic-depressive disorder.

A quick aside: Folk psychology, the theory that includes terms like 'desires' and 'beliefs', may turn out not to be that useful as human cognitive neuroscience progresses, but we can use those terms as metaphors for now. For the most recent view on how human 'desire' works, see my article A Crash Course in the Neuroscience of Human Motivation.

Moving on, it seems you're happy with this approach:

One way to proceed is by analyzing the implications for Friendly AI given a framework of total hedonistic non-pinprick negative utilitarianism, while keeping in mind that total hedonistic non-pinprick negative utilitarianism may not capture even what you mean (non-stipulatively) by 'morally good'.


The difficulty with this approach is that we don't know yet what a Friendly AI will do. We don't know what 'Friendliness' is, because we haven't yet solved metaethics and normative ethics and cognitive neuroscience.

Perhaps to get a handle on the issue we can narrow things down. Suppose we run with the Singularity Institute's current proposal for Friendly AI, 'Coherent Extrapolated Volition'. A six-year-old version of the proposal is here. Also, for simplicity, suppose we narrow our discussion to scenarios in which a single machine superintelligence, a machine singleton, emerges from the technological singularity.

I suspect your concern comes from the proposal's continued insistence on talking about preserving and extrapolating human values, which seems downright speciesist. What if a machine singleton motivated by extrapolated human values causes immense suffering for non-human animals, or even their extinction? What if a machine singleton motivated by extrapolated human values encounters alien civilizations on distant planets and causes them immense suffering or extinction because their values weren't considered for the AI's utility function?

But there is a broader way to construe your original worry. You wrote:

A main reason why I’m less enthusiastic about SIAI is that the organization’s primary focus is on reducing existential risk, but I really don’t know if existential risk is net good or net bad. As I said in one Felicifia discussion: “my current stance is to punt on the question of existential risk and instead to support activities that, if humans do survive, will encourage our descendants to reduce rather than multiply suffering in their light cone.”


One might interpret your statements as saying something like this:

"Because I'm a negative utilitarian, I'm not sure if [human] existential risk is net good or net bad. If humanity is wiped out, this might reduce suffering on Earth overall. Humans cause lots of suffering for themselves and other animals. Moreover, it's possible that existence itself almost inevitably brings net suffering (see David Benatar). So maybe it's better if we make sure that either (1) humans go extinct, or (2) if they don't go extinct, they are seriously concerned with reducing the suffering of non-human animals."

Is that a fair interpretation?

And, are you hoping to pursue something more like the former concern, or the latter ('broader') concern?

Arepo
Site Admin
Posts: 1097
Joined: Sun Oct 05, 2008 10:49 am

Re: Friendly AI and utilitarianism

Postby Arepo » Tue Aug 23, 2011 6:22 pm

LukeProg wrote:The difficulty with this approach is that we don't know yet what a Friendly AI will do. We don't know what 'Friendliness' is, because we haven't yet solved metaethics and normative ethics and cognitive neuroscience.


On cognitive neuroscience that's clear, but on metaethics and normative ethics it isn't. Given that much of what modern ethicists propose is very similar to what Kant and Bentham were saying a couple of centuries ago, and that some of it still basically maps onto Aristotle and Epicurus, it seems quite plausible that ethics has been solved. If so, the primary challenge for whoever has it right is persuading other people to accept its veracity rather than to find out more about it themselves, and the primary challenge for everyone else is to recognise what causes the heavy bias away from the correct view in most people and adjust for it in themselves.

This relates to my own problem with SIAI/the LW community, in that they seem to present a paradox which, if it were coherent, would be basically unsolvable. Starting from views something like this -

1) FAI should have values which are commensurate with a perfectly ethical person’s views.
2) Ethics is solvable. (where ‘solving’ equals something like ‘becoming able to derive the views of a perfectly ethical person’)
3) Ethics is unsolved.

- the LW community seems to derive the view, roughly, that its goal is to solve ethics in time/well enough that we can derive ultimate values, programme them into the first AI, and still have time for tea. Therefore it puts a lot of effort into the first step.

My first problem with this is that there’s no real evidence for 3. The argument for it seems to be that lots of people still disagree on it, but that’s hardly telling. We know we haven’t solved, e.g., neuroscience because we can’t build a human brain, but just as there’s no test for the success of non-natural philosophy (its perennial bane?), there’s also no test for its failure. This means that, if ethics were solved, you’d expect to see a world much like today’s, in which various intelligent humans still squabble over it and continue to burn resources looking for a solution that resonates with everyone. And if it isn’t solved, there’s no particular reason to suspect that anyone will be able to tell the difference when it is.

If I’m right, then, whether or not ethics is solved, the instructions we programme into an AI are unlikely to ever be universally agreed on, even among LW’s best and brightest. So how will we know who to trust with the controls?

My second problem is that a lot of the LW commentary seems to miss the point of ethics in the sense relevant to FAI, which gives us motivation and perhaps reason to make one choice rather than another from moment to moment, perhaps differently to the motivation we would have had (and therefore the choice we would have made) if we hadn’t considered it - it is not a study of how people react to situations that someone else designates morally relevant.

The former is what we need to solve to make FAI; the latter is, possible instrumental value aside, completely irrelevant. I recently saw a LW post, which I can’t find now, claiming that giving an AI the instruction to maximise happiness is naive because humans don’t maximise happiness. This seems like a really basic category error - perhaps there are reasons not to set that as your instruction, but that post gave me no reason to suspect their existence.

The last problem is that LW/SIAI seem to be working from the semi-suppressed premise that FAI should be benign towards humans. As Mike pointed out above, this is obviously speciesist, but it’s also at odds with the three premises above - if there is an ethic which is correct and unknown, we clearly can’t conclude that it contains a prescription for human preservation.

From what I’ve seen of LW posters’ comments, it’s really this suppressed premise that generates the hostility towards a hedonistic utilitarian AI (UAI) - no-one really thinks that a maximally efficient happiness generator would much resemble a human (more to the point, it's hard to imagine a maximum-anything generator that would resemble a human), so we suspect that a UAI would quickly go about using our matter and energy to create some sort of utilitronium shockwave without paying much attention to what happened to us in the process.

Even to me, death by utilitronium shockwave is a scary thought, which I intuitively recoil from - but I can’t think of any reason not to support it, and I’ve never seen a LWer even try to criticise it directly.

This leads me to the view that the LW mission is fundamentally selfish, more about self-preservation than ethics. But again, we allegedly don’t know what ethics is and have little reason to think it includes self-preservation (whatever that even means in a universe with no privileged viewpoints). So now we’re talking about creating a near-perfectly logical entity that has (or is very likely to have) a basic contradiction in its programmed instructions. I can’t see any reason to support such a cause, or to have much hope for its success if I did…
"These were my only good shoes."
"You ought to have put on an old pair, if you wished to go a-diving," said Professor Graham, who had not studied moral philosophy in vain.

Brian Tomasik
Posts: 1107
Joined: Tue Oct 28, 2008 3:10 am
Location: USA

Re: Friendly AI and utilitarianism

Postby Brian Tomasik » Sat Sep 10, 2011 4:54 pm

Hi Luke. A long-delayed reply below.

lukeprog wrote:For the most recent view on how human 'desire' works, see my article A Crash Course in the Neuroscience of Human Motivation.

Totally awesome. I can hardly think of a more fascinating topic than dopamine, TD learning, and motivational algorithms.

lukeprog wrote:I suspect your concern comes from the proposal's continued insistence on talking about preserving and extrapolating human values, which seems downright speciesist.

Partly, yes. However, even animals might not share all of my values -- e.g., leaning toward negative utilitarianism. I would selfishly prefer extrapolation of just my volition! That said, I think counting the suffering of wild animals in proportion to their numbers would go a long way toward allaying my concern.

lukeprog wrote:What if a machine singleton motivated by extrapolated human values causes immense suffering for non-human animals, or even their extinction? What if a machine singleton motivated by extrapolated human values encounters alien civilizations on distant planets and causes them immense suffering or extinction because their values weren't considered for the AI's utility function?

Extinction is fine, perhaps desirable. ;) Creation of vast numbers of new wild animals is a problem.

lukeprog wrote:One might interpret your statements as saying something like this:

"Because I'm a negative utilitarian, I'm not sure if [human] existential risk is net good or net bad. If humanity is wiped out, this might reduce suffering on Earth overall.

I'm not too concerned about what happens on earth. Indeed, because humans destroy animal habitats, it may be best for humans to survive from the perspective of earth-bound wild animals.

What troubles me are these possibilities:
  • Terraforming of other planets, spreading wild suffering thereto.
  • Directed panspermia.
  • Creating lab universes with infinite amounts of new suffering.
  • Running sentient simulations of nature, whether for sentimental reasons or for scientific research.
  • "Bad Singularity" scenarios: Non-friendly powers take over. They run vast numbers of computations that I would consider to be suffering, e.g., conscious reinforcement learning algorithms, or ancestor simulations, or (worst of all) torture as part of computational warfare.
Now, clearly SIAI aims to reduce the risk of the last item in that list. However, Bad Singularities are more likely if humans survive than if they don't. If humans went extinct tomorrow, the probability of Bad Singularities arising from earth would be 0. (The counterargument is that a human FAI might be able to curb the effects of Bad Singularities elsewhere, just as it might be able to prevent wild extraterrestrial suffering in other galaxies.)

lukeprog wrote:And, are you hoping to pursue something more like the former concern, or the latter ('broader') concern?

By "former concern," do you mean "counting animals in CEV," and by "latter concern," not working to reduce extinction risk but instead spreading concern for wild animals so that human survival won't be so bad? At present, my aim is the latter. The former would be welcome as well, of course. That said, the probability I assign to CEV actually determining the future of humanity is, umm, <0.1%, so it's not clear that working to influence CEV is the best strategy. (Or maybe it is, from the perspective of returns per unit of effort. I'm open to persuasion. :))

Brian Tomasik
Posts: 1107
Joined: Tue Oct 28, 2008 3:10 am
Location: USA

Re: Friendly AI and utilitarianism

Postby Brian Tomasik » Sat Sep 10, 2011 5:14 pm

Arepo wrote:Even to me, death by utilitronium shockwave is a scary thought, which I intuitively recoil from - but I can’t think of any reason not to support it

I think it's a wonderful thought. IMO, there would be no better outcome for the universe. :)

lukeprog
Posts: 9
Joined: Mon Jan 12, 2009 5:41 am

Re: Friendly AI and utilitarianism

Postby lukeprog » Sat Sep 10, 2011 8:40 pm

Alan,

Thanks again for your reply. I rather like our leisurely discussion pace.

By "former concern," do you mean "counting animals in CEV," and by "latter concern," not working to reduce extinction risk but instead spreading concern for wild animals so that human survival won't be so bad? At present, my aim is the latter.


Okay, good. That might be a more tractable subject matter anyway.

Earlier, I tried to guess at your views by paraphrasing them like this:

"Because I'm a negative utilitarian, I'm not sure if [human] existential risk is net good or net bad. If humanity is wiped out, this might reduce suffering... overall. Humans cause lots of suffering for themselves and other animals. Moreover, it's possible that existence itself almost inevitably brings net suffering (see David Benatar). So maybe it's better if we make sure that either (1) humans go extinct, or (2) if they don't go extinct, they are seriously concerned with reducing the suffering of non-human animals."


In your last post, you commented only on the first sentence of this characterization. Do the other sentences approximate your line of thought?

Also, I'm curious as to what you think is driving your intuitions toward negative rather than positive utilitarianism. Perhaps you have an essay on that, or you can explain your reasons briefly to me now?

Finally, I'm curious about why you adopt value hedonism. In primates at least, 'pleasure' is a very particular operation performed by the brain when certain hedonic hotspots in the brain 'paint' neuronal events with a 'hedonic gloss'. Pain is a similarly particular process. Why are these things the source of value in the universe, the things that matter?

(For some sources on this, see my articles The Neuroscience of Pleasure and Not for the Sake of Pleasure Alone, and especially see Aldridge & Berridge's 2010 article Neural coding of pleasure. We know less about the neuroscience of pain, but for example see the parts on pain in this 2010 review by neuroscientist Bud Craig.)

I've asked you difficult questions, and I don't expect you to have knock-down arguments in favor of your intuitions. I only hope to understand better where you're coming from.

Brian Tomasik
Posts: 1107
Joined: Tue Oct 28, 2008 3:10 am
Location: USA

Re: Friendly AI and utilitarianism

Postby Brian Tomasik » Mon Sep 19, 2011 4:26 am

lukeprog wrote:In your last post, you commented only on the first sentence of this characterization. Do the other sentences approximate your line of thought?

Because I'm a negative utilitarian,

I'm not necessarily a negative utilitarian, but I play one on TV.

No, seriously, I lean toward NU and might pick it if you pinned me down with thought experiments. When I'm not a negative utilitarian, I at least place immense weight on suffering compared with happiness.

I'm not sure if [human] existential risk is net good or net bad. If humanity is wiped out, this might reduce suffering... overall.

Yes.

Humans cause lots of suffering for themselves and other animals.

That's not the primary reason, and in fact, human existence may be net beneficial for animals on Earth if humans prevent wild-animal lives through habitat destruction. But things like factory-farm suffering and human torture are clearly terrible.

Moreover, it's possible that existence itself almost inevitably brings net suffering (see David Benatar).

I don't take Benatar's position except to the extent that it aligns with near-NU.

So maybe it's better if we make sure that either (1) humans go extinct, or (2) if they don't go extinct, they are seriously concerned with reducing the suffering of non-human animals."

Yes. I'm too sheepish to do (1), so I focus on (2). And (2) is a good thing whether or not human survival is net beneficial or net harmful.

lukeprog wrote:Also, I'm curious as to what you think is driving your intuitions toward negative rather than positive utilitarianism. Perhaps you have an essay on that, or you can explain your reasons briefly to me now?

Hmm, I don't have a persuasive intuition pump. The reason is that when I consider, for myself, whether I would agree to be brutally tortured for 70 years in exchange for arbitrarily large amounts of happiness, I wouldn't take the offer. Beyond that, you can reduce the explanation to my neural mechanics of decision-making and past experiences.

lukeprog wrote:Finally, I'm curious about why you adopt value hedonism. In primates at least, 'pleasure' is a very particular operation performed by the brain when certain hedonic hotspots in the brain 'paint' neuronal events with a 'hedonic gloss'. Pain is a similarly particular process. Why are these things the source of value in the universe, the things that matter?

Same reason as above: When I consider myself, the only things that I (selfishly) care about are my own glossy pleasure paint and avoidance of glossy suffering paint. The only other thing that I care about is painting the ventral pallida of other organisms with this gloss as well (or preventing the painting of suffering).

To quote Bertrand Russell (see bottom of utilitarian.net), it always "appeared to me obvious" that positively/negatively valenced subjective experiences are all that matter. Do you feel otherwise?

The harder question is figuring out which kinds of neural processes we decide to classify as conscious subjective experience. But I at least know that the neural process you described does count.

lukeprog wrote:(For some sources on this, see [...].

Great references -- thanks!

Hedonic Treader
Posts: 328
Joined: Sun Apr 17, 2011 11:06 am

Re: Friendly AI and utilitarianism

Postby Hedonic Treader » Mon Sep 19, 2011 6:22 pm

Alan Dawrst wrote:The reason is that when I consider, for myself, whether I would agree to be brutally tortured for 70 years in exchange for arbitrarily large amounts of happiness, I wouldn't take the offer.

Would you accept 7 seconds of torture for a very large amount of happiness? Would you agree that multiplying both the torture and the happiness by a factor x should lead to identical decisions?
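
(Spelling out the arithmetic behind this pair of questions -- a gloss on the intuition pump, not Hedonic Treader's own wording: 70 years is roughly $2.2 \times 10^9$ seconds, i.e., about $3 \times 10^8$ blocks of 7 seconds. If accepting a trade is scale-invariant,

\[
(T, H) \text{ acceptable} \;\Longleftrightarrow\; (xT,\, xH) \text{ acceptable for all } x > 0,
\]

then accepting 7 seconds of torture for happiness $H$ and taking $x \approx 3 \times 10^8$ commits one to accepting 70 years of torture for $3 \times 10^8 \cdot H$ of happiness -- the very trade Alan rejected above.)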
"The abolishment of pain in surgery is a chimera. It is absurd to go on seeking it... Knife and pain are two words in surgery that must forever be associated in the consciousness of the patient."

- Dr. Alfred Velpeau (1839), French surgeon

lukeprog
Posts: 9
Joined: Mon Jan 12, 2009 5:41 am

Re: Friendly AI and utilitarianism

Postby lukeprog » Mon Sep 19, 2011 10:00 pm

Alan,

To locate the actual source of our apparent disagreement, let me ask a couple more questions:

1. What if only YOUR values were extrapolated? Would you still prefer to invest marginal effort in spreading a concern for animal suffering instead of in programming Friendly AI built from Alan's CEV?

2. Do you think "things like factory farming and human torture" would plausibly be condoned by the CEV of the human species?

Luke

Brian Tomasik
Posts: 1107
Joined: Tue Oct 28, 2008 3:10 am
Location: USA

Re: Friendly AI and utilitarianism

Postby Brian Tomasik » Sat Sep 24, 2011 11:51 am

Hedonic Treader wrote:Would you accept 7 seconds of torture for a very large amount of happiness?

Not sure.

Hedonic Treader wrote:Would you agree that multiplying both the torture and the happiness by a factor x should lead to identical decisions?

Not sure.

:)

Brian Tomasik
Posts: 1107
Joined: Tue Oct 28, 2008 3:10 am
Location: USA

Re: Friendly AI and utilitarianism

Postby Brian Tomasik » Sat Sep 24, 2011 12:06 pm

lukeprog wrote:1. What if only YOUR values were extrapolated? Would you still prefer to invest marginal effort in spreading a concern for animal suffering instead of in programming Friendly AI built from Alan's CEV?

It depends on the form of extrapolation -- exactly what sorts of brain changes and built-in biases were involved -- but under reasonable extrapolation scenarios (e.g., reading more, seeing more of the world, experiencing more types of emotions, learning about insect psychology, etc.), I would almost certainly prefer working toward CEV. Ostensibly much higher expected returns.

lukeprog wrote:2. Do you think "things like factory farming and human torture" would plausibly be condoned by the CEV of the human species?

Unlikely. However:

1. Things like "preserving wildlife," "spreading life into space," "creating new universes," or "running ancestor simulations" plausibly could be condoned. And while it's thankfully improbable given current trajectories of the demographics interested in FAI, if CEV were applied to the wrong people (Christian/Muslim fundamentalists), then "torturing people forever" could be priority #1 for the AI. :evil:

2. Furthermore, there's a gap between "reducing risk of extinction / UFAI generally" and "promoting CEV." I find it highly implausible that CEV will be what actually determines the future of humanity. It's quite possible that forces with very different values from humans will take over. And even if the forces remain human-like, things could go badly (e.g., conflict leading to torture, or running massive numbers of simulations that count as suffering for machine-learning or scientific purposes). So reducing extinction risk has the dominant expected effect of contributing to these outcomes.

Which do I prefer, CEV or paperclipping? Probably paperclipping, because it seems not to entail the risks in #1. Agents far from humans in mind-space seem to have less tendency to simulate human-like minds, which means less likelihood of suffering. However, I could be persuaded otherwise on this point, e.g., if you argued that paperclippers were likely to simulate lots of suffering for instrumental reasons.

lukeprog
Posts: 9
Joined: Mon Jan 12, 2009 5:41 am

Re: Friendly AI and utilitarianism

Postby lukeprog » Thu Oct 06, 2011 2:20 am

Alan,

It seems that you're not so worried if *your* CEV determines the future of the (local) universe, but you're worried about what happens if the CEV of religious fundamentalists determines (in part) the future of the universe. That makes sense; we don't yet know how to aggregate values or extrapolate them or why you would choose one method for either of those steps over another.

However, the *point* of CEV as a plan for FAI design is that whatever is "bad" or parochial or ignorant or biased gets "washed out" in the extrapolation process. If that doesn't happen, then it's probably not CEV.

That said, these are fuzzy matters. I suspect, however, that if it turned out that CEV as I've described it above isn't possible, then Singularity Institute researchers (and many others) would change course toward a more promising solution for the future of the universe.

Brian Tomasik
Posts: 1107
Joined: Tue Oct 28, 2008 3:10 am
Location: USA

Re: Friendly AI and utilitarianism

Postby Brian Tomasik » Sun Oct 09, 2011 3:47 am

Hi Luke,

lukeprog wrote:but you're worried about what happens if the CEV of religious fundamentalists determines (in part) the future of the universe.

Not just religious fundamentalists (that's an extreme and easy case). I'm also worried about "deep ecologists" and people who (a) value preservation of nature, or (b) want to spread life into space, or (c) want to create new universes, or (d) are willing to take the risk of massive amounts of suffering for a potential happiness gain. (a) - (d) are not uncommon among intellectual elites, including some folks at SIAI.

Now, all of the above is about ideals, which are perilous enough. But things get worse when we consider what is actually likely to happen if humans advance to a galactic civilization. Most likely, our ideals will be swept away by economic and political forces; the dreams of three little people "don't amount to a hill of beans in this crazy world." For every percent we reduce the chance of human extinction, we increase the probability of these bad outcomes by some amount.
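
(One minimal way to formalize that last sentence, as a sketch -- where "bad" stands for the outcomes listed above, all of which presuppose human survival:

\[
P(\text{bad}) = P(\text{survival}) \cdot P(\text{bad} \mid \text{survival}),
\]

so, holding the conditional term fixed, each percentage point of extinction risk removed raises $P(\text{bad})$ by $0.01 \cdot P(\text{bad} \mid \text{survival})$.)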

lukeprog wrote:However, the *point* of CEV as a plan for FAI design is that whatever is "bad" or parochial or ignorant or biased gets "washed out" in the extrapolation process. If that doesn't happen, then it's probably not CEV.

Well, who says what's "bad" or "parochial"? :) Lots of people don't think utilitronium has value, but I do. Lots of people want to spread life throughout the universe and want to create new universes and want to simulate life in ways that could become sentient.

If anything, I would expect my values to lose out to these, rather than vice versa. But I care about my own negative-leaning intuitions much more than I care about the elegance of applying CEV to people beyond myself. (Yes, I know that sounds selfish, but oh well. :? )

lukeprog wrote:That said, these are fuzzy matters. I suspect, however, that if it turned out that CEV like I've described above isn't possible, then Singularity Institute researchers (and many others) would change course toward a more promising solution for the future of the universe.

Yep, that's encouraging.

By the way, I'm often puzzled by SIAI's focus on CEV in general. Do they really think it has a chance of being implemented? Only if they design an AGI in the basement does it seem possible. Otherwise, these decisions will be muddied by power politics, as most things are.

Maybe CEV can be a playground for thinking about general AGI goal systems. But talking as though it has much chance of coming to fruition, even if AGI does come about, seems odd to me.

Recumbent
Posts: 17
Joined: Sat Dec 26, 2009 8:17 pm

Re: Friendly AI and utilitarianism

Postby Recumbent » Sat Oct 15, 2011 1:35 am

And if you can carry a qualitative argument that the probability is under, say, 1%, then that means AI is probably the wrong use of marginal resources...


When did Yudkowsky write this? Please see my post at:
viewtopic.php?f=25&t=484&p=4026#p4026

Even with a 1% risk, AI is the right use of marginal resources by about 20 orders of magnitude.
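
(The structure behind a claim like this is an expected-value comparison. As a back-of-envelope sketch with illustrative placeholder numbers -- not necessarily the figures in the linked post: take $10^{32}$ potential future lives at stake in a Bostrom-style astronomical-waste estimate, and $10^{10}$ lives at stake for a typical terrestrial cause; then

\[
\frac{\text{EV}_{\text{AI}}}{\text{EV}_{\text{alt}}} \approx \frac{0.01 \times 10^{32}}{10^{10}} = 10^{20},
\]

i.e., 20 orders of magnitude, with the conclusion driven almost entirely by the assumed size of the accessible future.)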

Brian Tomasik
Posts: 1107
Joined: Tue Oct 28, 2008 3:10 am
Location: USA

Re: Friendly AI and utilitarianism

Postby Brian Tomasik » Sun Oct 16, 2011 4:03 am

Recumbent wrote:When did Yudkowsky write this?

I hadn't seen it before, but I followed the first link in that comment. Here's the full text:
And I don’t think the odds of us being wiped out by badly done AI are small. I think they’re easily larger than 10%. And if you can carry a qualitative argument that the probability is under, say, 1%, then that means AI is probably the wrong use of marginal resources – not because global warming is more important, of course, but because other ignored existential risks like nanotech would be more important. I am not trying to play burden-of-proof tennis. If the chances are under 1%, that’s low enough, we’ll drop the AI business from consideration until everything more realistic has been handled. We could try to carry the argument otherwise, but I do quite agree that it would be a nitwit thing to do in real life, like trying to shut down the Large Hadron Collider.

What I was trying to convey there is that the utility interval for fate of the galaxy is overwhelmingly more important than the fate of 15% of the Earth’s biological species, and that realistically we just shouldn’t be talking about the environmental stuff, there’s no possible way we should be talking about the environmental stuff, there’s enough people talking about it already and we’ve got much bigger fish going unfried.

