Laura Schulz: The surprisingly logical minds of babies

Mark Twain summed up
what I take to be

one of the fundamental problems
of cognitive science

with a single witticism.

He said, “There’s something
fascinating about science.

One gets such wholesale
returns of conjecture

out of such a trifling
investment in fact.”

(Laughter)

Twain meant it as a joke,
of course, but he’s right:

There’s something
fascinating about science.

From a few bones, we infer
the existence of dinosaurs.

From spectral lines,
the composition of nebulae.

From fruit flies,

the mechanisms of heredity,

and from reconstructed images
of blood flowing through the brain,

or in my case, from the behavior
of very young children,

we try to say something about
the fundamental mechanisms

of human cognition.

In particular, in my lab in the Department
of Brain and Cognitive Sciences at MIT,

I have spent the past decade
trying to understand the mystery

of how children learn so much
from so little so quickly.

Because, it turns out that
the fascinating thing about science

is also a fascinating
thing about children,

which, to put a gentler
spin on Mark Twain,

is precisely their ability
to draw rich, abstract inferences

rapidly and accurately
from sparse, noisy data.

I’m going to give you
just two examples today.

One is about a problem of generalization,

and the other is about a problem
of causal reasoning.

And although I’m going to talk
about work in my lab,

this work is inspired by
and indebted to a field.

I’m grateful to mentors, colleagues,
and collaborators around the world.

Let me start with the problem
of generalization.

Generalizing from small samples of data
is the bread and butter of science.

We poll a tiny fraction of the electorate

and we predict the outcome
of national elections.

We see how a handful of patients
responds to treatment in a clinical trial,

and we bring drugs to a national market.

But this only works if our sample
is randomly drawn from the population.

If our sample is cherry-picked
in some way –

say, we poll only urban voters,

or say, in our clinical trials
for treatments for heart disease,

we include only men –

the results may not generalize
to the broader population.

So scientists care whether evidence
is randomly sampled or not,

but what does that have to do with babies?

Well, babies have to generalize
from small samples of data all the time.

They see a few rubber ducks
and learn that they float,

or a few balls and learn that they bounce.

And they develop expectations
about ducks and balls

that they’re going to extend
to rubber ducks and balls

for the rest of their lives.

And the kinds of generalizations
babies have to make about ducks and balls

they have to make about almost everything:

shoes and ships and sealing wax
and cabbages and kings.

So do babies care whether
the tiny bit of evidence they see

is plausibly representative
of a larger population?

Let’s find out.

I’m going to show you two movies,

one from each of two conditions
of an experiment,

and because you’re going to see
just two movies,

you’re going to see just two babies,

and any two babies differ from each other
in innumerable ways.

But these babies, of course,
here stand in for groups of babies,

and the differences you’re going to see

represent average group differences
in babies' behavior across conditions.

In each movie, you’re going to see
a baby doing maybe

just exactly what you might
expect a baby to do,

and we can hardly make babies
more magical than they already are.

But to my mind the magical thing,

and what I want you to pay attention to,

is the contrast between
these two conditions,

because the only thing
that differs between these two movies

is the statistical evidence
the babies are going to observe.

We’re going to show babies
a box of blue and yellow balls,

and my then-graduate student,
now colleague at Stanford, Hyowon Gweon,

is going to pull three blue balls
in a row out of this box,

and when she pulls those balls out,
she’s going to squeeze them,

and the balls are going to squeak.

And if you’re a baby,
that’s like a TED Talk.

It doesn’t get better than that.

(Laughter)

But the important point is it’s really
easy to pull three blue balls in a row

out of a box of mostly blue balls.

You could do that with your eyes closed.

It’s plausibly a random sample
from this population.

And if you can reach into a box at random
and pull out things that squeak,

then maybe everything in the box squeaks.

So maybe babies should expect
those yellow balls to squeak as well.

Now, those yellow balls
have funny sticks on the end,

so babies could do other things
with them if they wanted to.

They could pound them or whack them.

But let’s see what the baby does.

(Video) Hyowon Gweon: See this?
(Ball squeaks)

Did you see that?
(Ball squeaks)

Cool.

See this one?

(Ball squeaks)

Wow.

Laura Schulz: Told you. (Laughs)

(Video) HG: See this one?
(Ball squeaks)

Hey Clara, this one’s for you.
You can go ahead and play.

(Laughter)

LS: I don’t even have to talk, right?

All right, it’s nice that babies
will generalize properties

of blue balls to yellow balls,

and it’s impressive that babies
can learn from imitating us,

but we’ve known those things about babies
for a very long time.

The really interesting question

is what happens when we show babies
exactly the same thing,

and we can ensure it’s exactly the same
because we have a secret compartment

and we actually pull the balls from there,

but this time, all we change
is the apparent population

from which that evidence was drawn.

This time, we’re going to show babies
three blue balls

pulled out of a box
of mostly yellow balls,

and guess what?

You [probably won’t] randomly draw
three blue balls in a row

out of a box of mostly yellow balls.

That is not plausibly
randomly sampled evidence.

That evidence suggests that maybe Hyowon
was deliberately sampling the blue balls.

Maybe there’s something special
about the blue balls.

Maybe only the blue balls squeak.
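
As a worked illustration of this reasoning (not part of the talk), the sketch below compares how likely three blue balls in a row are under random sampling from each box. The talk says only "mostly blue" and "mostly yellow", so the 3/4 and 1/4 proportions are assumptions chosen for illustration, and the draws are treated as independent for simplicity.

```python
from fractions import Fraction


def prob_all_blue(blue_fraction: Fraction, draws: int) -> Fraction:
    """Chance that every one of `draws` independent random draws is blue."""
    return blue_fraction ** draws


# Assumed box compositions (illustrative; the talk gives no exact numbers).
MOSTLY_BLUE = Fraction(3, 4)
MOSTLY_YELLOW = Fraction(1, 4)

p_blue_box = prob_all_blue(MOSTLY_BLUE, 3)      # 27/64
p_yellow_box = prob_all_blue(MOSTLY_YELLOW, 3)  # 1/64

print("three blues, mostly blue box:  ", p_blue_box)
print("three blues, mostly yellow box:", p_yellow_box)
print("ratio:", p_blue_box / p_yellow_box)      # 27
```

Under these assumed proportions, the same three blue balls are 27 times more probable as a random sample from the mostly blue box than from the mostly yellow one.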

Let’s see what the baby does.

(Video) HG: See this?
(Ball squeaks)

See this toy?
(Ball squeaks)

Oh, that was cool. See?
(Ball squeaks)

Now this one’s for you to play.
You can go ahead and play.

(Fussing)
(Laughter)

LS: So you just saw
two 15-month-old babies

do entirely different things

based only on the probability
of the sample they observed.

Let me show you the experimental results.

On the vertical axis, you’ll see
the percentage of babies

who squeezed the ball in each condition,

and as you’ll see, babies are much
more likely to generalize the evidence

when it’s plausibly representative
of the population

than when the evidence
is clearly cherry-picked.

And this leads to a fun prediction:

Suppose you pulled just one blue ball
out of the mostly yellow box.

You [probably won’t] pull three blue balls
in a row at random out of a yellow box,

but you could randomly sample
just one blue ball.

That’s not an improbable sample.

And if you could reach into
a box at random

and pull out something that squeaks,
maybe everything in the box squeaks.
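
One way to make this prediction concrete (a sketch, not the analysis used in the study) is a two-hypothesis Bayes calculation: either the experimenter samples at random, or she deliberately picks blue balls. The 1/4 blue proportion, the equal prior, and the assumption that a deliberate sampler always draws blue are all hypothetical choices made for illustration.

```python
from fractions import Fraction


def posterior_random(blue_fraction, draws, prior_random=Fraction(1, 2)):
    """Posterior probability of random sampling after `draws` blue balls in a row.

    Assumes a deliberate sampler always picks blue (likelihood 1) and that
    random draws are independent.
    """
    like_random = blue_fraction ** draws
    like_deliberate = Fraction(1)
    numerator = prior_random * like_random
    return numerator / (numerator + (1 - prior_random) * like_deliberate)


MOSTLY_YELLOW = Fraction(1, 4)  # assumed share of blue balls in the box

print("one blue ball:   ", posterior_random(MOSTLY_YELLOW, 1))  # 1/5
print("three blue balls:", posterior_random(MOSTLY_YELLOW, 3))  # 1/65
```

With these assumptions, random sampling stays far more plausible after one blue ball than after three, which is the direction of the prediction described here.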

So even though babies are going to see
much less evidence for squeaking,

and have many fewer actions to imitate

in this one-ball condition than in
the condition you just saw,

we predicted that babies themselves
would squeeze more,

and that’s exactly what we found.

So 15-month-old babies,
in this respect, like scientists,

care whether evidence
is randomly sampled or not,

and they use this to develop
expectations about the world:

what squeaks and what doesn’t,

what to explore and what to ignore.

Let me show you another example now,

this time about a problem
of causal reasoning.

And it starts with a problem
of confounded evidence

that all of us have,

which is that we are part of the world.

And this might not seem like a problem
to you, but like most problems,

it’s only a problem when things go wrong.

Take this baby, for instance.

Things are going wrong for him.

He would like to make
this toy go, and he can’t.

I’ll show you a few-second clip.

And there’s two possibilities, broadly:

Maybe he’s doing something wrong,

or maybe there’s something
wrong with the toy.

So in this next experiment,

we’re going to give babies
just a tiny bit of statistical data

supporting one hypothesis over the other,

and we’re going to see if babies
can use that to make different decisions

about what to do.

Here’s the setup.

Hyowon is going to try to make
the toy go and succeed.

I am then going to try twice
and fail both times,

and then Hyowon is going
to try again and succeed,

and this roughly sums up my relationship
to my graduate students

in technology across the board.

But the important point here is
it provides a little bit of evidence

that the problem isn’t with the toy,
it’s with the person.

Some people can make this toy go,

and some can’t.

Now, when the baby gets the toy,
he’s going to have a choice.

His mom is right there,

so he can go ahead and hand off the toy
and change the person,

but there’s also going to be
another toy at the end of that cloth,

and he can pull the cloth towards him
and change the toy.

So let’s see what the baby does.

(Video) HG: Two, three. Go!
(Music)

LS: One, two, three, go!

Arthur, I’m going to try again.
One, two, three, go!

HG: Arthur, let me try again, okay?

One, two, three, go!
(Music)

Look at that. Remember these toys?

See these toys? Yeah, I’m going
to put this one over here,

and I’m going to give this one to you.

You can go ahead and play.

LS: Okay, Laura, but of course,
babies love their mommies.

Of course babies give toys
to their mommies

when they can’t make them work.

So again, the really important question
is what happens when we change

the statistical data ever so slightly.

This time, babies are going to see the toy
work and fail in exactly the same order,

but we’re changing
the distribution of evidence.

This time, Hyowon is going to succeed
once and fail once, and so am I.

And this suggests it doesn’t matter
who tries this toy, the toy is broken.

It doesn’t work all the time.
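
The contrast between the two conditions can be sketched as a simple tally (an illustration only, not the study's analysis). Each condition has the same outcomes in the same order: work, fail, fail, work. Only who produced each outcome changes. The function names and the "gap" statistic are hypothetical, introduced for this example.

```python
from fractions import Fraction


def success_rates(trials):
    """Map each person to their fraction of successful attempts."""
    totals = {}
    for person, worked in trials:
        wins, tries = totals.get(person, (0, 0))
        totals[person] = (wins + int(worked), tries + 1)
    return {p: Fraction(wins, tries) for p, (wins, tries) in totals.items()}


def gap_between_people(trials):
    """Difference between the best and worst per-person success rate."""
    rates = success_rates(trials).values()
    return max(rates) - min(rates)


# Same outcome order in both conditions; only the person differs.
person_condition = [("Hyowon", True), ("Laura", False),
                    ("Laura", False), ("Hyowon", True)]
toy_condition = [("Hyowon", True), ("Hyowon", False),
                 ("Laura", False), ("Laura", True)]

print("gap when the person matters:", gap_between_people(person_condition))  # 1
print("gap when the toy is flaky:  ", gap_between_people(toy_condition))     # 0
```

A large gap points to the person as the cause, which favors handing the toy to someone else. A gap of zero points to the toy, which favors reaching for a different toy.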

Again, the baby’s going to have a choice.

Her mom is right next to her,
so she can change the person,

and there’s going to be another toy
at the end of the cloth.

Let’s watch what she does.

(Video) HG: Two, three, go!
(Music)

Let me try one more time.
One, two, three, go!

Hmm.

LS: Let me try, Clara.

One, two, three, go!

Hmm, let me try again.

One, two, three, go!
(Music)

HG: I’m going
to put this one over here,

and I’m going to give this one to you.

You can go ahead and play.

(Applause)

LS: Let me show you
the experimental results.

On the vertical axis,
you’ll see the distribution

of children’s choices in each condition,

and you’ll see that the distribution
of the choices children make

depends on the evidence they observe.

So in the second year of life,

babies can use a tiny bit
of statistical data

to decide between two
fundamentally different strategies

for acting in the world:

asking for help and exploring.

I’ve just shown you
two laboratory experiments

out of literally hundreds in the field
that make similar points,

because the really critical point

is that children’s ability
to make rich inferences from sparse data

underlies all the species-specific
cultural learning that we do.

Children learn about new tools
from just a few examples.

They learn new causal relationships
from just a few examples.

They even learn new words,
in this case in American Sign Language.

I want to close with just two points.

If you’ve been following my world,
the field of brain and cognitive sciences,

for the past few years,

three big ideas will have come
to your attention.

The first is that this is
the era of the brain.

And indeed, there have been
staggering discoveries in neuroscience:

localizing functionally specialized
regions of cortex,

turning mouse brains transparent,

activating neurons with light.

A second big idea

is that this is the era of big data
and machine learning,

and machine learning promises
to revolutionize our understanding

of everything from social networks
to epidemiology.

And maybe, as it tackles problems
of scene understanding

and natural language processing,

to tell us something
about human cognition.

And the final big idea you’ll have heard

is that maybe it’s a good idea we’re going
to know so much about brains

and have so much access to big data,

because left to our own devices,

humans are fallible, we take shortcuts,

we err, we make mistakes,

we’re biased, and in innumerable ways,

we get the world wrong.

I think these are all important stories,

and they have a lot to tell us
about what it means to be human,

but I want you to note that today
I told you a very different story.

It’s a story about minds and not brains,

and in particular, it’s a story
about the kinds of computations

that uniquely human minds can perform,

which involve rich, structured knowledge
and the ability to learn

from small amounts of data,
the evidence of just a few examples.

And fundamentally, it’s a story
about how starting as very small children

and continuing out all the way
to the greatest accomplishments

of our culture,

we get the world right.

Folks, human minds do not only learn
from small amounts of data.

Human minds think
of altogether new ideas.

Human minds generate
research and discovery,

and human minds generate
art and literature and poetry and theater,

and human minds take care of other humans:

our old, our young, our sick.

We even heal them.

In the years to come, we’re going
to see technological innovations

beyond anything I can even envision,

but we are very unlikely

to see anything even approximating
the computational power of a human child

in my lifetime or in yours.

If we invest in these most powerful
learners and their development,

in babies and children

and mothers and fathers

and caregivers and teachers

the ways we invest in our other
most powerful and elegant forms

of technology, engineering and design,

we will not just be dreaming
of a better future,

we will be planning for one.

Thank you very much.

(Applause)

Chris Anderson: Laura, thank you.
I do actually have a question for you.

First of all, the research is insane.

I mean, who would design
an experiment like that? (Laughter)

I’ve seen that a couple of times,

and I still don’t honestly believe
that that can truly be happening,

but other people have done
similar experiments; it checks out.

The babies really are that genius.

LS: You know, they look really impressive
in our experiments,

but think about what they
look like in real life, right?

It starts out as a baby.

Eighteen months later,
it’s talking to you,

and babies' first words aren’t just
things like balls and ducks,

they’re things like “all gone,”
which refer to disappearance,

or “uh-oh,” which refer
to unintentional actions.

It has to be that powerful.

It has to be much more powerful
than anything I showed you.

They’re figuring out the entire world.

A four-year-old can talk to you
about almost anything.

(Applause)

CA: And if I understand you right,
the other key point you’re making is,

we’ve been through these years
where there’s all this talk

of how quirky and buggy our minds are,

that behavioral economics
and the whole theories behind that

that we’re not rational agents.

You’re really saying that the bigger
story is how extraordinary,

and there really is genius there
that is underappreciated.

LS: One of my favorite
quotes in psychology

comes from the social
psychologist Solomon Asch,

and he said the fundamental task
of psychology is to remove

the veil of self-evidence from things.

There are orders of magnitude
more decisions you make every day

that get the world right.

You know about objects
and their properties.

You know them when they’re occluded.
You know them in the dark.

You can walk through rooms.

You can figure out what other people
are thinking. You can talk to them.

You can navigate space.
You know about numbers.

You know causal relationships.
You know about moral reasoning.

You do this effortlessly,
so we don’t see it,

but that is how we get the world right,
and it’s a remarkable

and very difficult-to-understand
accomplishment.

CA: I suspect there are people
in the audience who have

this view of accelerating
technological power

who might dispute your statement
that never in our lifetimes

will a computer do what
a three-year-old child can do,

but what’s clear is that in any scenario,

our machines have so much to learn
from our toddlers.

LS: I think so. You’ll have some
machine learning folks up here.

I mean, you should never bet
against babies or chimpanzees

or technology as a matter of practice,

but it’s not just
a difference in quantity,

it’s a difference in kind.

We have incredibly powerful computers,

and they do do amazingly
sophisticated things,

often with very big amounts of data.

Human minds do, I think,
something quite different,

and I think it’s the structured,
hierarchical nature of human knowledge

that remains a real challenge.

CA: Laura Schulz, wonderful
food for thought. Thank you so much.

LS: Thank you.
(Applause)