The birth of a word Deb Roy
[Music]
[Music]
imagine if you could record your life
everything you said everything you did
available in a perfect memory store at
your fingertips so you could go back and
find memorable moments and relive them
or sift through traces of time and
discover patterns in your own life that
previously had gone undiscovered well
that’s exactly the journey that my
family began five and a half years ago
this is my wife and collaborator Rupal
and on this day at this moment we walked
into the house with our first child
our beautiful baby boy and we walked
into a house with a very special home
video recording system this moment and
thousands of other moments special for
us were captured in our home because in
every room in the house if you looked up
you’d see a camera and a microphone and
if you looked down you’d get this bird’s-eye
view of the room here’s our living room
the baby bedroom kitchen dining room and
the rest of the house and all of these
fed into a disk array that was designed
for continuous capture so here we are
flying through a day in our home as we
move from sunlit morning through
incandescent evening and finally lights
out for the day over the course of three
years we’ve recorded eight to ten hours
a day amassing roughly a quarter million
hours of multitrack audio and video so
you’re looking at a piece of what is by
far the largest home video collection
ever made
and what this data represents for our
family at a personal level the
impact has already been immense and
we’re still learning its value countless
moments of unsolicited natural moments
not posed moments are captured there and
we’re starting to learn how to discover
them and find them but there’s also a
scientific reason that drove this
project which was to use this kind of
natural longitudinal data to understand
the process of how a child learns
language that child being my son and so
with many privacy provisions put in
place to protect everyone who’s recorded
in the data we made elements of the data
available to my trusted research team at
MIT so we could start teasing apart
patterns in this massive data set trying
to understand the influence of social
environments on language acquisition so
we’re looking here at one of the first
things we started to do this is my wife
and I cooking breakfast in the kitchen
and as we move through space and through
time a very everyday pattern of life in
the kitchen in order to convert this
opaque 90 thousand hours of video into
something we can start to see we use
motion analysis to pull out as we move
through space and through time what we
call space-time worms and this has
become a part of our toolkit for being
able to look and see where the
activities are in the data and with it
trace the patterns of in particular
where my son moved throughout the home
so we could focus our transcription
efforts all the speech environment
around my son all the words that he
heard from myself my wife our nanny and
over time the words he began to produce
so with that technology and that data
and the ability to with machine
assistance transcribe speech we’ve now
transcribed well over seven million
words of our home transcripts and with
that let me take you now for a first
tour into the data so you’ve all I’m
sure seen
time-lapse videos where a flower will
blossom as you accelerate time I’d like
you to now experience the blossoming of
a speech form my son soon after his
first birthday would say gaga to mean
water and over the course of the next half
year he slowly learned to approximate
the proper adult form water so we’re
going to cruise through half a year in
about 40 seconds
no video here so you can focus on the
sound the acoustics of a new kind of
trajectory
[Music]
so he didn’t just learn water over the
course of the 24 months the first two
years that we really focused on this is
a map of every word he learned in
chronological order and because we have
full transcripts we’ve identified each
of the 503 words that he learned to
produce by his second birthday he was an
early talker
and so we started to analyze why why
were certain words born before others
this is one of the first results that
came out of our study a little over a
year ago that really surprised us the
way to interpret this apparently simple
graph is on the vertical is an
indication of how complex caregiver
utterances are based on the length of
utterances and the horizontal axis is time
and all of the data we aligned based on
the following idea every time my son
would learn a word we would trace back
and look at all of the language he heard
that contained that word and we would plot
the relative length of the utterances
and what we found was this curious
phenomenon that caregiver speech would
systematically dip to a minimum making
language as simple as possible and then
slowly ascend back up in complexity and
the amazing thing was that that
bounce that dip lined up almost
precisely with when each word was born
word after word systematically so it
appears that all three primary
caregivers myself my wife and our nanny
were systematically and I would think
subconsciously restructuring our
language to meet him at the moment of
the birth of a word and bring him gently
into more complex language and the
implications of this there are many but
one I just want to point out is that
there must be amazing feedback loops
of course my son is learning
from his linguistic environment but the
environment is learning from him that
environment people are in these tight
feedback loops and creating a kind of
scaffolding that has not been noticed
until now but that’s looking at the
speech context what about the visual
context we’re now looking at think of
this as a dollhouse cutaway of
our house
we’ve taken those circular fisheye lens
cameras and we’ve done some optical
correction and then we can bring it into
three-dimensional life so welcome to
my home this is a moment one moment
captured across multiple cameras the
reason we did this is to create the
ultimate memory machine where you can go
back and interactively fly around and
then breathe video life into this system
what I’m going to do is give you an
accelerated view of 30 minutes again of
just life in the living room that’s me
and my son on the floor and there’s
video analytics that are tracking our
movements my son is leaving red ink I’m
leaving green ink we’re now on the couch
looking out through the window at cars
passing by and finally my son playing in
a walking toy by himself
now we freeze the action 30 minutes we
turn time into the vertical axis and we
open up for a view of these interaction
traces we’ve just left behind and we see
these amazing structures these little
knots of two colors of thread we call
social hotspots the spiral thread we
call a solo hotspot and we think that
these affect the way language is learned
what we’d like to do is start
understanding the interaction between
these patterns and the language that my
son is exposed to to see if we can
predict how the structure of when words
are heard affects when they’re learned
so in other words the relationship
between words and what they’re about in
the world so here’s how we’re
approaching this in this video again my
son is being traced out he’s leaving red
ink behind and there’s our nanny by the
door
she offers water and off go the two
worms over to the kitchen to get water
and what we’ve done is used the word
water to tag that moment that bit of
activity and now we take the power of
data and take every time my son ever
heard the word water and the context he
saw it in and we use it to penetrate
through the video and find every
activity trace that co-occurred with an
instance of water and what this data
leaves in its wake is a landscape we
call these wordscapes this is the
wordscape for the word water and you can see
most of the action is in the kitchen
that’s where those big peaks are over to
the left and just for contrast we can do
this with any word we can take the word
bye as in goodbye
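
The wordscape idea described here can be sketched in a few lines of Python. This is only an illustrative sketch, not the lab's actual pipeline: the function name `wordscape`, the 5-second co-occurrence window, the 1-metre grid cells, and the toy data are all assumptions made for the example. It counts, for each floor cell, how many tracked positions fall close in time to an utterance containing the word, so that tall counts play the role of the peaks in the landscape.

```python
from collections import Counter

def wordscape(utterances, track, word, window=5.0, cell=1.0):
    """Count, per floor-grid cell, the tracked positions that fall within
    `window` seconds of any utterance containing `word`.

    utterances: list of (time_seconds, transcript_text)
    track:      list of (time_seconds, x_metres, y_metres)
    All parameter names and defaults are hypothetical.
    """
    # Times at which the word was heard.
    times = [t for t, text in utterances if word in text.lower().split()]
    grid = Counter()
    for t, x, y in track:
        # Keep a position only if it co-occurs with some use of the word.
        if any(abs(t - u) <= window for u in times):
            grid[(int(x // cell), int(y // cell))] += 1
    return grid

# Hypothetical toy data, invented for illustration.
utterances = [(10.0, "do you want water"), (60.0, "bye bye"), (120.0, "more water")]
track = [(9.0, 0.5, 0.5), (12.0, 0.7, 0.2), (61.0, 4.2, 3.1), (121.0, 0.1, 0.9)]

water = wordscape(utterances, track, "water")
bye = wordscape(utterances, track, "bye")
print(water.most_common(1))  # the cell with the tallest peak for "water"
print(bye.most_common(1))    # a different cell for "bye"
```

With this toy data the two words peak in different cells, which is the kind of contrast between landscapes the talk describes.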
and we’re now zoomed in over the entrance
to the house and we look and we find as
you’d expect a contrast in the landscape
where the word bye occurs much more in a
structured way so we’re using these
structures to start predicting the order
of language acquisition and that’s your
ongoing work now in my lab which we’re
peering into now at MIT this is at the
Media Lab this has become my favorite
way of videographing just about any
space three of the key people in this
project Philip DeCamp Rony Kubat and
Brandon Roy are pictured here Philip has
been a close collaborator on all the
visualizations you’re seeing and Michael
Fleischman was another PhD student in my
lab who worked with me on this home
video analysis and he made the following
observation that just the way that we’re
analyzing how language connects to
events which provide common ground for
language that same idea we can take out
of your home Deb and we can apply it to
the world of public media and so our
effort took an unexpected turn
think of mass media as providing common
ground and you have the recipe for
taking this idea to a whole new place
we’ve started analyzing television
content using the same principles
analyzing event structure of a TV signal
episodes of shows commercials all of the
components that make up the event
structure we’re now with satellite
dishes pulling in and analyzing a good
part of all the TV being watched in the
United States and you don’t have to now
go and instrument living rooms with
microphones to get people’s
conversations you just tune in to
publicly available social media feeds so
we’re pulling in about 3 billion
comments a month and then the magic
happens you have the event structure the
common ground that the words are about
coming out of the television feeds
you’ve got the conversations that are
about those topics and through
semantic analysis and this is actually
real data you’re looking at from our
data processing each yellow line is
showing a link being made between a
comment in the wild and a piece of event
structure coming out of the television
signal and the same idea now can be
built up and we get this wordscape
except now words are not assembled in my
living room instead the context the
common ground the activities are the
content on television that’s driving the
conversations and so what we’re seeing
here these skyscrapers now are
commentary that are linked to content on
television same concept but looking at
communication dynamics in a different
very different sphere so fundamentally
rather than for example measuring
content based on how many people are
watching this gives us the basic data
for looking at engagement properties of
content and just like we can look at
feedback cycles and dynamics in you know
in a family we can now open up the same
concepts and look at much larger groups
of people this is a subset of data from
our database just 50 thousand out of
several million and the social graph
that connects them through publicly
available sources and if you put them on
one plane a second plane is where the
content lives so we have the programs
and the sporting events and the
commercials and all of the link
structures that tie them together make up a
content graph and then the important
third dimension each of the links
that you’re seeing rendered here is an
actual connection made between something
someone said and a piece of content and
there are again now tens of millions of
these links that give us the connective
tissue of social graphs and how they
relate to content and we can now start
to probe the structure in interesting
ways so if we for example trace the path
of one piece of content that drives
someone to comment on it and then we
follow where that comment goes and look at
the entire social graph that becomes
activated and then trace back to see the
relationship between that social graph
and content very interesting structure
becomes visible we call this a
co-viewing clique a virtual living room
if you will and there are fascinating
dynamics at play it’s not one-way a
piece of content an event causes someone
to talk they talk to other people that
drives tune-in behavior back into mass
media and you have these cycles that
drive the overall behavior another
example very different another actual
person in our database and we’re finding
at least hundreds if not thousands of
these we’ve given this person a name
this is a pro-amateur or pro-am media
critic who has this high fan-out rate a
lot of people are following this person
very influential and they have a
propensity to talk about what’s on TV so
this person is a key link in connecting
mass media and social media together one
last example from this data sometimes
it’s actually the piece of content that
is special so if we go and look at this
piece of content President Obama’s State
of the Union address from just a few
weeks ago and look at what we find in
the same data set at the same scale the
engagement properties of this piece of
content are truly remarkable a nation
exploding in conversation in real time
in response to what’s on the
broadcast and of course through all of
these lines are flowing unstructured
language we can x-ray and get a
real-time pulse of a nation real-time
sense
of the social reactions in the different
circuits in the social graph being
activated by content so to summarize the
idea is this as our world becomes
increasingly instrumented and we have
the capabilities to collect and connect
the dots between what people are saying
and the context they’re saying it in what’s
emerging is an ability to see new social
structures and dynamics that have
previously not been seen it’s like
building a microscope or telescope and
revealing new structures about our own
behavior around communication and I
think the implications here are profound
whether it’s for science
for commerce for government or perhaps
most of all for us as individuals and so
just to return to my son when I was
preparing this talk he was looking over
my shoulder and I showed him the clips I
was gonna show to you today and I asked
him for permission granted
and and then I went on to reflect isn’t
it amazing
this entire database all these
recordings I’m gonna hand off to you and
to your sister who arrived two years
later and you guys are gonna be able to
go back and re-experience moments that
you could never with your biological
memory possibly remember the way you can
now and he was quiet for a moment I
thought what am I thinking he’s
five years old he’s not gonna understand
this and just as I was having that
thought he looked up at me and said so
that when I grow up I can show this to
my kids and I thought wow this
is powerful stuff so I want to leave you
with one last memorable moment from our
family this is the first time our
son took more than two steps at once
captured on film and I really want you
to focus on something as I take you
through it’s a cluttered environment it’s
natural life my mother’s in the kitchen
cooking and of all places in the hallway
I realize he’s about to do it about to
take more than two steps and so you hear
me encouraging him realizing what’s
happening and then the magic happens
listen very carefully about three steps
in he realizes
something magic is happening and the
most amazing feedback loop of all kicks
in and he takes a breath in and he
whispers wow and instinctively I
echo back the same
so let’s fly back in time to that
memorable moment nice walking
[Music]
[Applause]
when I think about succeeding in
business I think there’s a couple of
considerations first and foremost is
teamwork today’s problems are just too
complicated to be solved as an
individual and the opportunity to work
with others collaboratively where you
can build on each other’s ideas I think
is particularly important I think a
second area that’s quite important is
the understanding of different
disciplines it is critical that one be
an expert in your major but it is
absolutely essential that you have the
ability to understand
the input from other
disciplines and be able to incorporate
that to make effective decisions and
then last but not least we are in a
global business community and so the
opportunity to understand cultures
different cultures around the world to
be able to incorporate some of the
learning from those cultures and to
incorporate that into your business
decisions is essential to success I have
three degrees from Cornell so I’d say
just about everything that has prepared
me came from Cornell and I am indebted
to the University for that experience
but as I think particularly about my
business school career I think the
exposure to a variety of disciplines the
wealth of resources across the
university
were exceptionally helpful