Your words may predict your future mental health | Mariano Sigman

We have historical records that allow us
to know how the ancient Greeks dressed,

how they lived,

how they fought …

but how did they think?

One natural idea is that the deepest
aspects of human thought –

our ability to imagine,

to be conscious,

to dream –

have always been the same.

Another possibility

is that the social transformations
that have shaped our culture

may have also changed
the structural columns of human thought.

We may all have different
opinions about this.

Actually, it’s a long-standing
philosophical debate.

But is this question
even amenable to science?

Here I’d like to propose

that in the same way we can reconstruct
how the ancient Greek cities looked

just based on a few bricks,

the writings of a culture
are the archaeological records,

the fossils, of human thought.

And in fact,

doing some form of psychological analysis

of some of the most ancient
books of human culture,

Julian Jaynes came up in the ’70s
with a very wild and radical hypothesis:

that only 3,000 years ago,

humans were what today
we would call schizophrenics.

And he made this claim

based on the fact that the first
humans described in these books

behaved consistently,

in different traditions
and in different places of the world,

as if they were hearing and obeying voices

that they perceived
as coming from the gods,

or from the muses …

what today we would call hallucinations.

And only then, as time went on,

they began to recognize
that they were the creators,

the owners of these inner voices.

And with this, they gained introspection:

the ability to think
about their own thoughts.

So Jaynes’s theory is that consciousness,

at least in the way we perceive it today,

where we feel that we are the pilots
of our own existence –

is a quite recent cultural development.

And this theory is quite spectacular,

but it has an obvious problem

which is that it’s built on just a few
and very specific examples.

So the question is whether the theory

that introspection built up in human
history only about 3,000 years ago

can be examined in a quantitative
and objective manner.

And the problem of how
to go about this is quite obvious.

It’s not like Plato woke up one day
and then he wrote,

“Hello, I’m Plato,

and as of today, I have
a fully introspective consciousness.”

(Laughter)

And this actually tells us
what the essence of the problem is.

We need to find the emergence
of a concept that’s never said.

The word introspection
does not appear a single time

in the books we want to analyze.

So our way to solve this
is to build the space of words.

This is a huge space
that contains all words

in such a way that the distance
between any two of them

is indicative of how
closely related they are.

So for instance,

you want the words “dog” and “cat”
to be very close together,

but the words “grapefruit” and “logarithm”
to be very far away.

And this has to be true
for any two words within the space.

And there are different ways
that we can construct the space of words.

One is just asking the experts,

a bit like we do with dictionaries.

Another possibility

is following the simple assumption
that when two words are related,

they tend to appear in the same sentences,

in the same paragraphs,

in the same documents,

more often than would be expected
just by pure chance.

And this simple hypothesis,

this simple method,

with some computational tricks

that have to do with the fact

that this is a very complex
and high-dimensional space,

turns out to be quite effective.
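
A minimal sketch of the co-occurrence idea, not the actual system used in this work: it assumes a toy corpus, counts which words share a sentence, and keeps only the pairs that co-occur more often than chance would predict (positive pointwise mutual information). The names `build_word_space`, `cosine` and `toy_sentences` are hypothetical, and the computational tricks for handling high-dimensional spaces are not reproduced here.

```python
import math
from collections import Counter
from itertools import combinations


def build_word_space(sentences):
    """Map each word to a sparse vector of positive PMI scores."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = set(sentence.lower().split())
        word_counts.update(words)
        # Count each unordered pair of distinct words once per sentence.
        for a, b in combinations(sorted(words), 2):
            pair_counts[(a, b)] += 1

    n = len(sentences)
    space = {word: {} for word in word_counts}
    for (a, b), joint in pair_counts.items():
        # PMI > 0: the pair shares sentences more often than chance predicts.
        pmi = math.log(joint * n / (word_counts[a] * word_counts[b]))
        if pmi > 0:
            space[a][b] = pmi
            space[b][a] = pmi
    return space


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


# Toy corpus, invented purely for illustration.
toy_sentences = [
    "the dog chased the ball",
    "the cat chased the ball",
    "the dog ate food",
    "the cat ate food",
    "the logarithm of a number",
    "compute the logarithm of a sum",
]
space = build_word_space(toy_sentences)
print(cosine(space["dog"], space["cat"]))
print(cosine(space["dog"], space["logarithm"]))
```

In this toy corpus "dog" and "cat" never share a sentence, yet they end up close together because they keep the same company, while "logarithm" shares no neighbors with either.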

And just to give you a flavor
of how well this works,

this is the result we get when
we analyze this for some familiar words.

And you can see first

that words automatically organize
into semantic neighborhoods.

So you get the fruits, the body parts,

the computer parts,
the scientific terms and so on.

The algorithm also identifies
that we organize concepts in a hierarchy.

So for instance,

you can see that the scientific terms
break down into two subcategories

of the astronomic and the physics terms.

And then there are very fine things.

For instance, the word astronomy,

which seems a bit bizarre where it is,

is actually exactly where it should be,

between what it is,

an actual science,

and what it describes,

the astronomical terms.

And we could go on and on with this.

Actually, if you stare
at this for a while,

and you just build random trajectories,

you will see that it actually feels
a bit like doing poetry.

And this is because, in a way,

walking in this space
is like walking in the mind.

And the last thing

is that this algorithm also captures
our intuitions

about which words should lie
in the neighborhood of introspection.

So for instance,

words such as “self,” “guilt,”
“reason,” “emotion,”

are very close to “introspection,”

but other words,

such as “red,” “football,”
“candle,” “banana,”

are just very far away.

And so once we’ve built the space,

the question of the history
of introspection,

or of the history of any concept

which before could seem abstract
and somehow vague,

becomes concrete –

becomes amenable to quantitative science.

All that we have to do is take the books,

we digitize them,

and we take this stream
of words as a trajectory

and project them into the space,

and then we ask whether this trajectory
spends significant time

circling close to the concept
of introspection.

And with this,

we could analyze
the history of introspection

in the ancient Greek tradition,

for which we have the best
available written record.

So what we did is we took all the books –

we just ordered them by time –

for each book we take the words

and we project them to the space,

and then we ask for each word
how close it is to introspection,

and we just average that.

And then we ask whether,
as time goes on and on,

these books get closer,
and closer and closer

to the concept of introspection.
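
The procedure just described can be sketched as follows, assuming a word space is already available. The two-dimensional vectors in `TOY_SPACE`, the two short "books" and the function `closeness_to_concept` are all invented for illustration; a real word space would have many more dimensions and words.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closeness_to_concept(text, space, concept):
    """Average similarity between each known word of a text and a concept."""
    target = space[concept]
    scores = [
        cosine(space[word], target)
        for word in text.lower().split()
        if word in space  # words missing from the space are skipped
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical word vectors, chosen by hand for this example.
TOY_SPACE = {
    "introspection": (1.0, 0.0),
    "self": (0.9, 0.1),
    "guilt": (0.8, 0.2),
    "spear": (0.1, 0.9),
    "ship": (0.0, 1.0),
}

# Hypothetical "books", already ordered by time.
books = [
    ("earlier text", "ship spear ship"),
    ("later text", "self guilt spear"),
]
scores = [closeness_to_concept(text, TOY_SPACE, "introspection") for _, text in books]
for (title, _), score in zip(books, scores):
    print(title, round(score, 3))
```

Note that the word "introspection" never has to appear in a text for the text to score high; it is enough that its words sit near that concept in the space.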

And this is exactly what happens
in the ancient Greek tradition.

So you can see that for the oldest books
in the Homeric tradition,

there is a small increase with books
getting closer to introspection.

But about four centuries before Christ,

this starts ramping up very rapidly
to an almost five-fold increase

of books getting closer,
and closer and closer

to the concept of introspection.

And one of the nice things about this

is that now we can ask

whether this is also true
in a different, independent tradition.

So we just ran this same analysis
on the Judeo-Christian tradition,

and we got virtually the same pattern.

Again, you see a small increase
for the oldest books in the Old Testament,

and then it increases much more rapidly

in the new books of the New Testament.

And then we get the peak of introspection

in “The Confessions of Saint Augustine,”

about four centuries after Christ.

And this was very important,

because Saint Augustine
had been recognized by scholars,

philologists, historians,

as one of the founders of introspection.

Actually, some believe him to be
the father of modern psychology.

So our algorithm,

which has the virtue
of being quantitative,

of being objective,

and of course of being extremely fast –

it just runs in a fraction of a second –

can capture some of the most
important conclusions

of this long tradition of investigation.

And this is in a way
one of the beauties of science,

which is that now this idea
can be translated

and generalized to a whole lot
of different domains.

So in the same way that we asked
about the past of human consciousness,

maybe the most challenging question
we can pose to ourselves

is whether this can tell us something
about the future of our own consciousness.

To put it more precisely,

whether the words we say today

can tell us something
of where our minds will be in a few days,

in a few months

or a few years from now.

And in the same way many of us
are now wearing sensors

that detect our heart rate,

our respiration,

our genes,

in the hopes that this may
help us prevent diseases,

we can ask whether monitoring
and analyzing the words we speak,

we tweet, we email, we write,

can tell us ahead of time whether
something may go wrong with our minds.

And with Guillermo Cecchi,

who has been my brother in this adventure,

we took on this task.

And we did so by analyzing
the recorded speech of 34 young people

who were at a high risk
of developing schizophrenia.

And so what we did is,
we measured speech at day one,

and then we asked whether the properties
of the speech could predict,

within a window of almost three years,

the future development of psychosis.

But despite our hopes,

we got failure after failure.

There was just not enough
information in semantics

to predict the future
organization of the mind.

It was good enough

to distinguish between a group
of schizophrenics and a control group,

a bit like we had done
for the ancient texts,

but not to predict the future
onset of psychosis.

But then we realized

that maybe the most important thing
was not so much what they were saying,

but how they were saying it.

More specifically,

it was not in which semantic
neighborhoods the words were,

but how far and fast they jumped

from one semantic neighborhood
to the other one.

And so we came up with this measure,

which we termed semantic coherence,

which essentially measures the persistence
of speech within one semantic topic,

within one semantic category.
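
One simple way to illustrate such a measure, as an assumption rather than the exact definition used in the study, is to follow the speech through the word space and record how similar each word is to the one before it. The function `semantic_coherence` and the vectors in `TOY_SPACE` below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_coherence(words, space):
    """Return (mean, minimum) similarity between consecutive known words.

    High values mean the speech stays within one semantic neighborhood;
    low values mean it jumps far from one step to the next.
    """
    vectors = [space[word] for word in words if word in space]
    steps = [cosine(a, b) for a, b in zip(vectors, vectors[1:])]
    if not steps:
        return None  # fewer than two known words: coherence is undefined
    return sum(steps) / len(steps), min(steps)


# Hypothetical word vectors, chosen by hand for this example.
TOY_SPACE = {
    "dog": (1.0, 0.1),
    "cat": (0.9, 0.2),
    "logarithm": (0.0, 1.0),
}

on_topic = ["dog", "cat", "dog", "cat"]
jumpy = ["dog", "logarithm", "cat", "logarithm"]
print(semantic_coherence(on_topic, TOY_SPACE))
print(semantic_coherence(jumpy, TOY_SPACE))
```

The sketch compares single words for brevity; comparing consecutive sentences or phrases, each summarized by the average of its word vectors, follows the same pattern.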

And it turned out
that for this group of 34 people,

the algorithm based on semantic
coherence could predict,

with 100 percent accuracy,

who would develop psychosis and who would not.

And this was something
that could not be achieved –

not even close –

with all the other
existing clinical measures.

And I remember vividly,
while I was working on this,

I was sitting at my computer

and I saw a bunch of tweets by Polo –

Polo had been my first student
back in Buenos Aires,

and at the time
he was living in New York.

And there was something in these tweets –

I could not tell exactly what
because nothing was said explicitly –

but I got this strong hunch,

this strong intuition,
that something was going wrong.

So I picked up the phone,
and I called Polo,

and in fact he was not feeling well.

And this simple fact,

that, reading between the lines,

I could sense,
through words, his feelings,

was a simple, but very
effective way to help.

What I'm telling you today

is that we’re getting
close to understanding

how we can convert this intuition
that we all have,

that we all share,

into an algorithm.

And in doing so,

we may be seeing in the future
a very different form of mental health,

based on objective, quantitative
and automated analysis

of the words we write,

of the words we say.

Gracias.

(Applause)
