An Imminent Threat from Artificial Intelligence

I'm sure that by now you've heard a wealth of op-eds and talking heads heralding the rise of artificial intelligence, and along with that you've probably heard a cascade of risks we're going to face: risks of AI becoming some nightmarish immortal dictator we'll never escape from, philosophical questions surrounding AGI and superintelligence, the end of work as a product of automation, and fully autonomous weaponry. The media has tended to focus on these mid- to long-term threats while, in my view, neglecting an extremely imminent risk, one that may already be unfolding. The risk I'm concerned with is one affecting our well-being, our mental health, and our beliefs about the world.

In this talk, what I want to do is get us all on common ground concerning what we mean when we say "artificial intelligence," where we're interacting with artificial intelligence today, and the kinds of risks that arise from that interaction.

So, to begin with: what is artificial intelligence? This contemporary wave of progress in the field has really been fueled by a subfield of a subfield called deep learning, built on neural networks. A neural network is just a program consisting of a set of neurons which are wired together. At one end of the network we feed in inputs; at the other end we receive outputs. On the input side we can feed in any sort of data. For instance, we can feed in pictures and ask the question: is there a baby in this image or not? In the example on the screen, you can see that the network is making a mistake: it's claiming there's no baby in an image which clearly contains a baby.

That brings us to the second point of using neural networks: teaching them, training them. What we do is simply let the neural network know "you're making a mistake," and then we step back and let the neural network reconfigure itself in some way. In the example on the screen, you'll see it removing connections between the neurons, and as a product of this change the correction occurs: the neural network is now expressing the correct function.
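That correction loop can be sketched in a few lines. This is a toy illustration only: the data, network size, and learning rate are all invented, and real image classifiers are far larger, but the two steps (specify an objective, then let the network adjust its own wiring) are the same.

```python
# Toy training loop: specify an objective, then let gradient descent
# reconfigure the weights. All numbers here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: 2 features per example, binary label (e.g. baby / no baby).
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])

# One hidden layer of 8 neurons, randomly wired to start.
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)                   # hidden activations
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # sigmoid output in (0, 1)
    return h, p

# Step 1: specify the objective ("stop making mistakes on these labels").
# Step 2: step back and let the network reconfigure its own weights.
lr = 0.5
for _ in range(2000):
    h, p = forward(X)
    g_out = (p.ravel() - y)[:, None] / len(X)  # gradient of cross-entropy loss
    g_h = (g_out @ W2.T) * (1.0 - h ** 2)      # backprop through tanh
    W2 -= lr * (h.T @ g_out); b2 -= lr * g_out.sum(0)
    W1 -= lr * (X.T @ g_h);   b1 -= lr * g_h.sum(0)

_, p = forward(X)
print((p.ravel() > 0.5).astype(int))           # predictions after training
```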

So there's this two-step procedure: you specify some objective, that is, you ask the network to do something, and then you step back and let the neural network come to its own solution, its own strategy for solving the problem you posed.

Okay, so that's a bit about how neural networks work. Where are you interacting with them today? There are broadly two categories you can lump your interactions into. The first is companies that want you to buy some product. The second is companies that collaborate with the first category to keep you on platform and serve you ads.

I'm going to focus on the second of these categories: the social media and content delivery platforms. This is where we spend an immense amount of our time; we spend hours of our day interacting with neural networks.

So how do neural networks play into these platforms? On something like YouTube, the primary method of navigating the site is the sidebar containing recommendations for the next video you should watch; this is completely determined by a neural network. On a platform like Twitter, your home feed, the aggregation point of all the accounts you follow, is being filtered and re-ranked according to neural networks. Similarly on Instagram, the posts you're presented with and the order in which they arrive are decided by a neural network.
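In all three cases the pattern is the same, and can be sketched as follows. The `predicted_engagement` score stands in for a neural network's output; the real models and features at these platforms are proprietary, so every name here is illustrative.

```python
# Hypothetical sketch of a ranked feed: every candidate post gets a model
# score, and the feed you see is just the candidates sorted by that score.
from dataclasses import dataclass

@dataclass
class Post:
    author: str
    text: str
    predicted_engagement: float  # assume a neural network produced this

def rank_feed(candidates: list[Post], limit: int = 3) -> list[Post]:
    """Chronology is discarded; the model's score decides the order."""
    ranked = sorted(candidates, key=lambda p: p.predicted_engagement,
                    reverse=True)
    return ranked[:limit]

feed = rank_feed([
    Post("alice", "vacation photos", 0.12),
    Post("bob", "outrage thread", 0.87),  # provocative content often scores high
    Post("carol", "news article", 0.45),
])
print([p.author for p in feed])  # ['bob', 'carol', 'alice']
```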

Okay, so this is where we're interacting with these models. What's the risk? Well, we know that the content you consume has a massive impact on your well-being: it affects your beliefs about the world and your mental health. There's a famous, now infamous, study conducted by researchers at Facebook and Cornell which sought to address the extent to which emotional contagion exists within digital social networks.

Emotional contagion is the idea that when we're presented with other humans expressing an emotion, we ourselves begin to experience that emotion. Interestingly, you don't necessarily have the ability to recognize that the emotion is external to you; you believe that it's authentically your own.

What these researchers did, without informing users, was manipulate the news feeds of seven hundred thousand accounts, biasing them toward more positive or more negative content, and then record how those manipulated users' own posts changed in sentiment. What they found was an overwhelming positive correlation with the manipulation: if I serve you a bunch of negative content, your own expressions, your own communications, will become more negative, and similarly for positive content.

I mentioned that this study was infamous, and it's infamous in the sense that there are serious questions about the ethics of how the study was conducted, as well as questions about the conclusions Facebook chose to draw from the results. But the work stands as an extremely compelling piece of evidence for the hypothesis that simply by manipulating the content that people consume, you can have an immense impact on the well-being, the mental states, and the beliefs of your downstream users.

Okay, so I hope the pieces are starting to fall into place for how this current setup, this current framework we exist in, carries risk: the content we consume has a massive impact on our well-being, and yet we've outsourced that role to AI, to neural networks.

So let's take a step back and think about how we train neural networks.

Again: we specify some objective, we ask the network to do something, and then we let the algorithm arrive at its own solution. Well, what's the objective we're specifying for these content-serving networks? They broadly fall into two categories. The first is engagement: the probability that you'll interact with a piece of content I serve you; maybe you'll like it, comment on it, or reshare it to your friends. The second category is time.

How can I serve you content in order to keep you on platform, which gives me, the company, more opportunities to serve you ads, which means more opportunities to make money? I want to talk about this second category, time. This seems like an immensely risky choice of objective to optimize: humans have a natural pathology of addiction, and we're explicitly optimizing for them to spend time on one thing.

And remember that neural networks are going to take whatever strategy they come up with that's most effective at achieving their goals. If our goal is time, we really need to ask: what if addicting, enraging, or depressing our users is the most effective strategy to keep them on site, engaged and engrossed with the content?

I think that's a very plausible hypothesis. So the next question we can ask is: what sort of frameworks exist at the moment to keep track of this risk, or even to mitigate it? Unfortunately, very few. But there's an extremely simple and familiar framework we can rely on to defend against these risks.

To begin with, consider the total user pool of your service; on Facebook we're talking about more than two billion users. Now separate out a small pool and call it your baseline pool. For the users in this baseline pool, there's going to be no interaction with your model: their feeds won't be ranked by a neural network, just by very simple, straightforward algorithms like sequential serving. The next thing we do is track metrics of well-being on both pools, continuously and in real time, and compare them. When we see these metrics begin to diverge, when some gap appears, we can take active measures to try to close that gap. These active measures might look something like reducing the impact a model has on users' experience of the site.

This framework is incredibly general; it could be applied to pretty much any content-serving platform or social media that exists today.
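The framework just described can be sketched end to end. Everything here is hypothetical (the metric, the threshold, the notion of a model "weight"); it only shows the shape of the comparison.

```python
# Sketch of the holdout framework: keep a small baseline pool served by a
# trivial rule, track a well-being metric on both pools, and dial the model
# back when the pools diverge. All names and thresholds are invented.
from statistics import mean

def wellbeing_gap(model_pool: list[float], baseline_pool: list[float]) -> float:
    """Per-user well-being scores (higher = better), compared on aggregate."""
    return mean(baseline_pool) - mean(model_pool)

def adjust_model_impact(current_weight: float, gap: float,
                        threshold: float = 0.05, step: float = 0.1) -> float:
    """If the model-facing pool is doing worse than baseline by more than
    `threshold`, reduce how much the model shapes the feed."""
    if gap > threshold:
        return max(0.0, current_weight - step)
    return current_weight

# Model-facing users report slightly lower well-being than the holdout:
gap = wellbeing_gap(model_pool=[0.61, 0.58, 0.64],
                    baseline_pool=[0.70, 0.69, 0.71])
weight = adjust_model_impact(current_weight=1.0, gap=gap)
print(round(gap, 3), weight)
```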

But there are still two core issues, or questions, that I have with this setup. The first is: well, what's a metric of well-being? Choosing metrics is an incredibly difficult thing, and I really don't think computer scientists should be trusted with coming up with the fundamental metric of well-being; we're the ones who came up with "time" in the first place. Where I think this will come from is a cross-disciplinary collaboration between computer scientists, statisticians, ethicists, philosophers, and social scientists. I think this issue is approachable, and we can make concrete steps toward solving it.

So that's the first issue I see with this framework. The second question is one I don't have an answer to, and I'm going to leave you with it today. When I was talking about taking some active measure to close that gap between the metrics, I was talking about population-level dynamics.

We're tracking these metrics on aggregate, among all the model-facing users and among all the baseline users, and when those two pools shift in some way, we're taking an action across all users to try to correct for it. But that's not the only option.

We could track metrics on a per-user basis. We could watch each individual user, see how their metrics begin to shift from some baseline, or perhaps from their historical average, and then take a per-user action to push them in a different direction.
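A sketch of that per-user variant, with invented metric names and thresholds: each user's current well-being score is compared against their own historical average, and only users who have shifted sharply are flagged for some corrective action.

```python
# Per-user tracking: flag users whose current well-being score has fallen
# well below their own historical average. Threshold is a placeholder.
from statistics import mean

def users_needing_intervention(history: dict[str, list[float]],
                               current: dict[str, float],
                               drop_threshold: float = 0.2) -> list[str]:
    """Return users whose metric dropped more than `drop_threshold`
    below their historical average."""
    flagged = []
    for user, past_scores in history.items():
        if mean(past_scores) - current[user] > drop_threshold:
            flagged.append(user)
    return flagged

history = {"u1": [0.70, 0.75, 0.72], "u2": [0.60, 0.58, 0.62]}
current = {"u1": 0.71, "u2": 0.30}   # u2 has dropped sharply
print(users_needing_intervention(history, current))  # ['u2']
```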

I see massive potential for good in this framework. Imagine a suicidal user: if our metrics can pick up on entry into a depressive episode, we could take steps to try to mitigate risk. Consider that suicidal individuals are about 0.1 percent of the population. If our intervention is only one percent effective, then at the scale of Facebook, billions of users, we're talking about tens of thousands of lives impacted.
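That back-of-the-envelope calculation is worth making explicit; all three numbers are the talk's rough figures, not measured values.

```python
# Rough scale estimate from the talk, not measured data.
users = 2_000_000_000      # "billions of users" at Facebook scale
at_risk_rate = 0.001       # ~0.1% of the population
intervention_rate = 0.01   # an intervention that is only 1% effective
print(int(users * at_risk_rate * intervention_rate))  # 20000
```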

So, massive potential for good. But consider what we have to accept if we build frameworks for this: we're allowing companies to track our mental state, and then actively encouraging them to manipulate it. It's a kind of Huxleyan nightmare.

So I see both the potential for massive good and for massive risk. I don't know where the balance falls, but I think that points to another major problem: there's no public discussion surrounding these issues. These content-serving systems have existed for well over a decade now, and yet there's no regulation, no discussion of the types of checks and balances we'd expect to find in a technology of this maturity.

If there's anything I want you to take away from this talk, it's a call to action: speak to your family, your friends, your teachers, your politicians about these issues. Build an opinion through discourse. Your opinions are sorely needed, and we, the ones implementing this technology, desperately want to hear them. Thank you so much for your time.

[Applause]
