How do our brains process speech?
Gareth Gaskell

The average 20-year-old knows
between 27,000 and 52,000 different words.

By age 60, that number averages
between 35,000 and 56,000.

Spoken out loud, most of these words
last less than a second.

So with every word, the brain
has a quick decision to make:

which of those thousands of options
matches the signal?

About 98% of the time, the brain chooses
the correct word.

But how?

Speech comprehension is different
from reading comprehension,

but it’s similar to sign language
comprehension—

though spoken word recognition
has been studied more than sign language recognition.

The key to our ability
to understand speech

is the brain’s role
as a parallel processor,

meaning that it can do multiple
different things at the same time.

Most theories assume
that each word we know

is represented by a separate processing
unit that has just one job:

to assess the likelihood of incoming
speech matching that particular word.

In the context of the brain,
the processing unit that represents a word

is likely a pattern of firing activity
across a group of neurons

in the brain’s cortex.

When we hear the beginning of a word,

several thousand such units
may become active,

because with just the beginning
of a word,

there are many possible matches.

Then, as the word goes on,
more and more units register

that some vital piece of information
is missing and lose activity.

Possibly well before the end of the word,

just one firing pattern remains active,
corresponding to one word.

This is called the “recognition point.”
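
To make this concrete, here is a minimal Python sketch of that narrowing process. It is an illustration only: the six-word lexicon is invented, letters stand in for phonemes, and real processing units carry graded activation rather than a simple match or no match.

```python
# Toy lexicon; letters stand in for phonemes (an assumption made
# purely for illustration).
LEXICON = ["cap", "captain", "capital", "cat", "candle", "dog"]

def active_cohort(heard_so_far, lexicon):
    """Return every word still consistent with the input so far."""
    return [w for w in lexicon if w.startswith(heard_so_far)]

word = "captain"
for i in range(1, len(word) + 1):
    prefix = word[:i]
    cohort = active_cohort(prefix, LEXICON)
    print(f"heard {prefix!r}: {len(cohort)} candidates {cohort}")
    if len(cohort) == 1:
        # Only one firing pattern left: the "recognition point,"
        # reached here at "capt," before the word is even finished.
        print(f"recognition point reached at {prefix!r}")
        break
```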

In the process of homing in on one word,

the active units suppress
the activity of others,

saving vital milliseconds.

Most people can comprehend
up to about 8 syllables per second.

Yet, the goal is not only
to recognize the word,

but also to access its stored meaning.

The brain accesses many possible meanings
at the same time,

before the word has been fully identified.

We know this from studies which show
that even upon hearing a word fragment—

like “cap”—

listeners will start to register
multiple possible meanings,

like captain or capital,
before the full word emerges.

This suggests that every time
we hear a word

there’s a brief explosion of meanings
in our minds,

and by the recognition point the brain
has settled on one interpretation.
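
The same toy setup can illustrate this early access to meaning. The glosses below are invented for the sketch; the point is only that every candidate consistent with the fragment "cap" surfaces its stored meaning before the word is complete.

```python
# Invented glosses for the same toy cohort; hearing the fragment
# "cap" briefly makes every matching meaning available at once.
MEANINGS = {
    "cap":     "a soft, flat hat",
    "captain": "the person in charge of a ship",
    "capital": "the city where a government sits",
    "cat":     "a small domestic feline",
}

fragment = "cap"
for word, gloss in MEANINGS.items():
    if word.startswith(fragment):
        print(f"{word}: {gloss}")  # cap, captain, capital all fire
```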

The recognition process moves
more rapidly

in a sentence that gives us context
than in a random string of words.

Context also helps guide us towards
the intended meaning of words

with multiple interpretations,
like “bat,” or “crane,”

or in cases of homophones
like “no” or “know.”

For multilingual people, the language
they are listening to is another cue,

used to eliminate potential words
that don’t match the language context.
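
One way to picture that cue, under the same toy assumptions, is a language tag on each lexical entry. The tags, entries, and function name below are all invented for illustration.

```python
# Toy bilingual lexicon: each entry carries a language tag, and
# candidates that clash with the conversation's language are
# filtered out of the cohort before they can compete.
LEXICON = [("cap", "en"), ("captain", "en"),
           ("capitán", "es"), ("capital", "en")]

def cohort_in_language(fragment, language, lexicon):
    return [word for word, lang in lexicon
            if lang == language and word.startswith(fragment)]

print(cohort_in_language("cap", "en", LEXICON))
# ['cap', 'captain', 'capital'] -- 'capitán' never competes
```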

So, what about adding completely
new words to this system?

Even as adults, we may come across
a new word every few days.

But if every word is represented
as a fine-tuned pattern of activity

distributed over many neurons,

how do we prevent new words
from overwriting old ones?

We think that to avoid this problem,

new words are initially stored in a part
of the brain called the hippocampus,

well away from the main store
of words in the cortex,

so they don’t share neurons
with other words.

Then, over multiple nights of sleep,

the new words gradually transfer
over and interweave with old ones.

Researchers think this gradual
acquisition process

helps avoid disrupting existing words.

So in the daytime,

unconscious activity generates explosions
of meaning as we chat away.

At night, we rest, but our brains
are busy integrating new knowledge

into the word network.

When we wake up, this process ensures
that we’re ready

for the ever-changing world of language.
