Can a robot pass a university entrance exam? Noriko Arai

Today, I’m going to talk about AI and us.

AI researchers have always said

that we humans do not need to worry,

because only menial jobs
will be taken over by machines.

Is that really true?

They have also said
that AI will create new jobs,

so those who lose their jobs
will find a new one.

Of course.

But the real question is:

How many of those
who may lose their jobs to AI

will be able to land a new one,

especially when AI is smart enough
to learn better than most of us?

Let me ask you a question:

How many of you think

that AI will pass the entrance examination
of a top university by 2020?

Oh, so many. OK.

So some of you may say, “Of course, yes!”

Now singularity is the issue.

And some others may say, “Maybe,

because AI already won
against a top Go player.”

And others may say, “No, never. Uh-uh.”

That means we do not know
the answer yet, right?

So that was the reason why
I started Todai Robot Project,

making an AI which passes
the entrance examination

of the University of Tokyo,

the top university in Japan.

This is our Todai Robot.

And, of course, the brain of the robot
is working in the remote server.

It is now writing a 600-word essay

on maritime trade in the 17th century.

How does that sound?

Why did I take the entrance exam
as its benchmark?

Because I thought we had to study
the performance of AI

in comparison to humans,

especially on the skills and expertise

which are believed
to be acquired only by humans

and only through education.

To enter Todai, the University of Tokyo,

you have to pass
two different types of exams.

The first one is
a national standardized test

in multiple-choice style.

You have to take seven subjects

and achieve a high score –

I would say like an 85 percent
or more accuracy rate –

to be allowed to take
the second stage written test

prepared by Todai.

So let me first explain
how modern AI works,

taking the “Jeopardy!” challenge
as an example.

Here is a typical “Jeopardy!” question:

“Mozart’s last symphony
shares its name with this planet.”

Interestingly, a “Jeopardy!”
question always asks,

always ends with “this” something:

“this” planet, “this” country,

“this” rock musician, and so on.

In other words, “Jeopardy!” doesn’t ask
many different types of questions,

but a single type,

which we call “factoid questions.”

By the way, do you know the answer?

If you do not know the answer
and if you want to know the answer,

what would you do?

You Google, right? Of course.

Why not?

But you have to pick appropriate keywords

like “Mozart,” “last”
and “symphony” to search.

The machine basically does the same.

Then this Wikipedia page
will be ranked top.

Then the machine reads the page.

No, uh-uh.

Unfortunately, none of the modern AIs,

including Watson, Siri and Todai Robot,

is able to read.

But they are very good
at searching and optimizing.

It will recognize

that the keywords “Mozart,”
“last” and “symphony”

are appearing heavily around here.

So if it can find a word which is a planet

and which is co-occurring
with these keywords,

that must be the answer.

This is how Watson finds
the answer “Jupiter,” in this case.
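
The keyword co-occurrence idea described here can be sketched in a few lines. This is only a toy illustration, not Watson's actual pipeline: the passage, the list of planet candidates, the window size and the function names are all assumptions made up for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_candidates(text, keywords, candidates, window=5):
    """Score each candidate by how many keyword tokens appear within
    `window` tokens of each of its occurrences (hypothetical scoring rule)."""
    tokens = tokenize(text)
    keyset = {k.lower() for k in keywords}
    by_token = {c.lower(): c for c in candidates}
    scores = {c: 0 for c in candidates}
    for i, tok in enumerate(tokens):
        if tok in by_token:
            nearby = tokens[max(0, i - window): i + window + 1]
            scores[by_token[tok]] += sum(1 for t in nearby if t in keyset)
    return scores

# A made-up passage standing in for the top-ranked search result.
PASSAGE = (
    "Symphony No. 41 was the last symphony Mozart composed. "
    "The last symphony by Mozart is nicknamed the Jupiter symphony. "
    "Holst, much later, wrote movements called Mars and Venus."
)

# Candidate answers of the requested type ("this planet"); assumed given.
scores = rank_candidates(PASSAGE, ["Mozart", "last", "symphony"],
                         ["Jupiter", "Mars", "Venus"])
answer = max(scores, key=scores.get)
print(answer, scores)
```

Nothing in the sketch parses or understands a sentence. It only counts which word of the right type sits closest to the keywords, which is the point being made.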

Our Todai Robot works similarly,
but a bit smarter

in answering history yes-no questions,

like, “‘Charlemagne repelled the Magyars.’
Is this sentence true or false?”

Our robot starts producing
a factoid question,

like: “Charlemagne repelled
[this person type]” by itself.

Then, “Avars” but not
“Magyars” is ranked top.

This sentence is likely to be false.
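
A minimal sketch of that true-or-false check, assuming the statement has already been split into its keywords and the claimed entity. The three corpus sentences, the candidate list and the function names are hypothetical; the real Todai Robot system is not described at this level of detail in the talk.

```python
import re

def token_set(sentence):
    """Return the set of lowercase word tokens in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))

def judge(keywords, claimed, candidates, corpus):
    """Treat the statement as a factoid question: rank candidates by how
    often they share a sentence with the keywords, then compare the
    top-ranked candidate with the one the statement claims."""
    keyset = {k.lower() for k in keywords}
    scores = {}
    for cand in candidates:
        scores[cand] = sum(
            len(keyset & toks)
            for toks in map(token_set, corpus)
            if cand.lower() in toks
        )
    top = max(scores, key=scores.get)
    return top == claimed, top, scores

# Toy corpus written for this example, standing in for textbooks and Wikipedia.
CORPUS = [
    "Charlemagne repelled the Avars in a series of campaigns.",
    "The Avars were defeated after Charlemagne attacked their stronghold.",
    "Otto I repelled the Magyars at the Battle of Lechfeld.",
]

# Statement under test: "Charlemagne repelled the Magyars."
is_true, top, scores = judge(["Charlemagne", "repelled"], "Magyars",
                             ["Avars", "Magyars"], CORPUS)
print(is_true, top, scores)
```

The verdict is a statistical guess. If the corpus happened to mention the Magyars next to Charlemagne more often, the same code would call the sentence true.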

Our robot does not read,
does not understand,

but it is statistically
correct in many cases.

For the second stage written test,

it is required to write
a 600-word essay like this one:

[Discuss the rise and fall
of the maritime trade

in East and Southeast Asia
in the 17th century …]

and as I have shown earlier,

our robot took the sentences
from the textbooks and Wikipedia,

combined them together,

and optimized it to produce an essay

without understanding a thing.

(Laughter)

But surprisingly, it wrote a better essay

than most of the students.

(Laughter)

How about mathematics?

A fully automatic math-solving machine

has been a dream

since the birth of the word
“artificial intelligence,”

but it has stayed at the level
of arithmetic for a long, long time.

Last year, we finally succeeded
in developing a system

which solved pre-university-level
problems from end to end,

like this one.

This is the original problem
written in Japanese,

and we had to teach it
2,000 mathematical axioms

and 8,000 Japanese words

to make it accept the problems
written in natural language.

And it is now translating
the original problems

into machine-readable formulas.

Weird, but it is now ready
to solve it, I think.

Go and solve it.

Yes! It is now executing
symbolic computation.

Even more weird,

but probably this is the most
fun part for the machine.

(Laughter)

Now it outputs a perfect answer,

though its proof is impossible to read,
even for mathematicians.

Anyway, last year our robot
was among the top one percent

in the second stage written
exam in mathematics.

(Applause)

Thank you.

So, did it enter Todai?

No, not as I expected.

Why?

Because it doesn’t understand any meaning.

Let me show you a typical error
it made in the English test.

[Nate: We’re almost at the bookstore.
Just a few more minutes.

Sunil: Wait. ______ .
Nate: Thank you! That always happens …]

Two people are talking.

For us, who can understand
the situation –

[1. “We walked for a long time.”
2. “We’re almost there.”

3. “Your shoes look expensive.”
4. “Your shoelace is untied.”]

it is obvious number four
is the correct answer, right?

But Todai Robot chose number two,

even after learning 15 billion
English sentences

using deep learning technologies.

OK, so now you might
understand what I said:

modern AIs do not read,

do not understand.

They only disguise as if they do.

This is the distribution graph

of half a million students
who took the same exam as Todai Robot.

Now our Todai Robot
is among the top 20 percent,

and it was capable to pass

more than 60 percent
of the universities in Japan –

but not Todai.

But see how it is beyond the volume zone

of to-be white-collar workers.

You might think I was delighted.

After all, my robot was surpassing
students everywhere.

Instead, I was alarmed.

How on earth could this unintelligent
machine outperform students –

our children?

Right?

I decided to investigate
what was going on in the human world.

I took hundreds of sentences
from high school textbooks

and made easy multiple-choice quizzes,

and asked thousands
of high school students to answer.

Here is an example:

[Buddhism spread to … ,
Christianity to … and Oceania,

and Islam to …]

Of course, the original problems
are written in Japanese,

their mother tongue.

[ ______ has spread to Oceania.

  1. Hinduism 2. Christianity
  3. Islam 4. Buddhism ]

Obviously, Christianity
is the answer, isn’t it?

It’s written!

And Todai Robot chose
the correct answer, too.

But one-third of junior
high school students

failed to answer this question.

Do you think it is only the case in Japan?

I do not think so,

because Japan is always ranked
among the top in OECD PISA tests,

measuring 15-year-old
students’ performance in mathematics,

science and reading

every three years.

We have been believing

that everybody can learn

and learn well,

as long as we provide
good learning materials

free on the web

so that they can access
through the internet.

But such wonderful materials
may benefit only those who can read well,

and the percentage
of those who can read well

may be much less than we expected.

How we humans will coexist with AI

is something we have
to think about carefully,

based on solid evidence.

At the same time,
we have to think in a hurry

because time is running out.

Thank you.

(Applause)

Chris Anderson: Noriko, thank you.

Noriko Arai: Thank you.

CA: In your talk, you so beautifully
give us a sense of how AIs think,

what they can do amazingly

and what they can’t do.

But – do I read you right,

that you think we really need
quite an urgent revolution in education

to help kids do the things
that humans can do better than AIs?

NA: Yes, yes, yes.

Because we humans
can understand the meaning.

That is something
which is very, very lacking in AI.

But most of the students
just pack the knowledge

without understanding
the meaning of the knowledge,

so that is not knowledge,
that is just memorizing,

and AI can do the same thing.

So we have to think about
a new type of education.

CA: A shift from knowledge,
rote knowledge, to meaning.

NA: Mm-hmm.

CA: Well, there’s a challenge
for the educators. Thank you so much.

NA: Thank you very much. Thank you.

(Applause)
