A Rosetta Stone for the Indus script Rajesh Rao
[Music]
[Applause]
I’d like to begin with a thought
experiment imagine that it’s 4,000 years
into the future civilization as we know
it has ceased to exist no books no
electronic devices no Facebook or
Twitter all knowledge of the English
language and the English alphabet has
been lost
now imagine archaeologists digging
through the rubble of one of our cities
what might they find well perhaps some
rectangular pieces of plastic with
strange symbols on them perhaps some
circular pieces of metal maybe some
cylindrical containers with some symbols
on them and perhaps one archaeologist
becomes an instant celebrity when she
discovers buried in the hills somewhere
in North America massive versions of these
same symbols now let’s ask ourselves
what could such artifacts say about us
to people 4,000 years into the future
now this is no hypothetical question in
fact this is exactly the kind of
question we’re faced with when we try to
understand the Indus Valley Civilisation
which existed 4000 years ago now the
Indus civilization was roughly
contemporaneous with the much better
known Egyptian and the Mesopotamian
civilizations but it was actually much larger
than either of these two civilizations
it occupied an area of approximately
1 million square kilometers covering
what is now Pakistan northwestern India
and parts of Afghanistan and Iran now
given that it was such a vast
civilization you might expect to find
really powerful rulers kings and huge
monuments glorifying these powerful
kings in fact what archaeologists have
found is none of that they found small
objects such as these so here’s an
example of one of these objects well
obviously this is a replica but who is
this person a king a god a priest or
perhaps an ordinary person like you or
me we don’t know
but the Indus people also left behind
artifacts with writing on them well no
not pieces of plastic but stone seals
copper tablets pottery and surprisingly
one large signboard which was found
buried near the gate of a city now we
don’t know if it says Hollywood or even
Bollywood for that matter in fact we
don’t even know what any of these
objects say and that’s because the Indus
script is undeciphered we don’t know
what any of these symbols mean
now the symbols are most commonly found
on seals so you see up there one such
object it’s the square object with the
unicorn-like animal on it now that’s a
magnificent piece of art so how big do
you think that is perhaps that big or
maybe that big well let me show you
so here’s a replica of one such seal
it’s only about one inch by one inch in
size pretty tiny so what were these used
for we know that these were used for
stamping clay tags that were attached to
bundles of goods that were sent from one
place to the other so you know those
packing slips you get on your FedEx
boxes so these were used to make those
kinds of packing slips now you might
wonder what these objects contain in
terms of their text so perhaps they’re
the name of the sender or some
information about the goods that are
being sent from one place to the other
we don’t know we need to decipher the
script to answer that question now
deciphering the script is not just an
intellectual puzzle it’s actually become
a question that’s become deeply
intertwined with the politics and the
cultural history of South Asia in fact
the script has become a battleground of
sorts between three different groups of
people so first there’s a group of
people who are very passionate in their
belief that the Indus script does not
represent a language at all these people
believe that these symbols are very
similar to the kind of symbols you find
on traffic signs or the emblems you find
on shields now there’s a second group of
people who believe that the Indus script
represents an Indo-European language so
if you look at a map of India today
you’ll see that most of the languages
spoken in North India belong to the
Indo-European language family so some
people believe that the Indus script
represents an ancient Indo-European
language such as Sanskrit now there’s a
last group of people who believe that the
Indus people were the ancestors of people
living in South India today now these
people believe that the Indus script
represents an ancient form of the
Dravidian language family which is the
language family spoken in much of South
India today and the proponents of this
theory point to that small pocket of
Dravidian-speaking people in the North
actually near Afghanistan and they say
that perhaps sometime in the past
Dravidian languages were spoken all over
India and that this suggests that the
Indus civilization is perhaps also
Dravidian now which of these hypotheses
can be true we don’t know but perhaps if
you decipher the script you would be
able to answer this question but
deciphering the script is a very
challenging task first there’s no
Rosetta Stone I don’t mean the software
I mean an ancient artifact that contains
the same text both in a known script and
in an unknown script so we don’t have such
an artifact for the Indus script and
furthermore we don’t even know what
language they spoke and to make matters
even worse most of the texts that we
have are extremely short so as I showed
you they’re usually found on these seals
that are very very tiny and so given
these formidable obstacles one might
wonder and worry whether one will ever
be able to decipher the Indus script so
in the rest of my talk I’d like to tell
you about how I learned to stop worrying
and love the challenge posed by the
Indus script now I’ve always been
fascinated by the Indus script ever
since I read about it in a middle school
textbook and why was I fascinated
well it’s the last major undeciphered
script of the ancient world now my
career path led me to become a
computational neuroscientist so in my
day job I create computer models of the
brain to try to understand how the brain
makes predictions how the brain makes
decisions how the brain learns and so on
but in 2007 my paths crossed again with
the Indus script that’s when I was in
India and I had the wonderful
opportunity to meet with some Indian
scientists who were using computer
models to try to analyze the script and
so it was then that I realized there was an
opportunity for me to collaborate with
these scientists and so I jumped at the
opportunity and I’d like to describe
some of the results that we’ve found or
better yet let’s all collectively
decipher are you ready
the first thing that you need to do when
you have an undeciphered
script is try to figure out the direction
of writing so here are two texts that
contain some symbols on them so can you
tell me if the direction of writing is
right to left or left to right I’ll give
you a couple of seconds okay right to
left how many okay
left to right oh it’s almost 50/50 okay
so the answer is if you look at the
left-hand side of the two texts you’ll
notice that there’s a cramping of signs
and it seems like four thousand years
ago when the scribe was writing from
right to left
they ran out of space and so they had to
cram the signs and one of the signs is
also below the text on the top and
so this suggests the direction of
writing is probably from right to left
and so that’s one of the first things we
know that directionality is a very key
aspect of linguistic scripts and the
Indus script now has this particular
property what other properties of
language does it show so languages
contain patterns right so if I give you
the letter Q and I ask you to predict
the next letter what do you think that
would be most of you said U which is
right now if I ask you to predict one
more letter what do you think that would
be now there are several possibilities
it could be E it could be I it could be A but
certainly not B C or D right now the
Indus script also exhibits similar kinds
of patterns so there are a lot of texts that
start with this diamond-shaped symbol
and this in turn tends to be followed by
this quotation-marks-like symbol and
this is very similar to the Q and U
example this symbol can in turn be
followed by these fish-like symbols and
some other signs but never by these
other signs at the bottom and
furthermore there’s some signs that
really prefer the end of texts such as
this jar-shaped sign and this sign in
fact happens to be also the most
frequently occurring sign in the script
now given such patterns here was our
idea right so the idea was to use a
computer to learn these patterns and so
we give the computer the existing
texts and the computer learns a
statistical model of which symbols tend
to occur together and which symbols tend
to follow each other
now given the computer model we can test
the model by essentially quizzing it so
we could deliberately erase some symbols
and we can ask it to predict the missing
symbols so here are some examples
so you may regard this as perhaps the
most ancient game of the Wheel of
Fortune so what we found was that the
computer was successful in 75% of the
cases in predicting the correct symbol
now in the rest of the cases typically
the second best guess or the third
best guess was the right answer
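
The fill-in-the-blank quiz described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the talk does not say which statistical model was used, so a simple bigram model (which sign follows which) stands in for it here. The sign names and the toy texts are invented for the example and are not real Indus inscriptions, and `train_bigrams` and `rank_missing` are hypothetical helper names.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each text is a sequence of made-up sign names.
# These are NOT real Indus texts; they only illustrate the technique.
TEXTS = [
    ["diamond", "quote", "fish", "jar"],
    ["diamond", "quote", "fish", "fish", "jar"],
    ["diamond", "quote", "stroke", "jar"],
    ["stroke", "fish", "jar"],
    ["diamond", "quote", "fish", "jar"],
]

START, END = "<s>", "</s>"  # markers for the beginning and end of a text


def train_bigrams(texts):
    """Count how often each sign is immediately followed by each other sign."""
    follows = defaultdict(Counter)
    for text in texts:
        padded = [START] + text + [END]
        for prev, nxt in zip(padded, padded[1:]):
            follows[prev][nxt] += 1
    return follows


def rank_missing(follows, prev, nxt, vocab):
    """Rank candidate signs for an erased slot between `prev` and `nxt`.

    Score = count(prev -> cand) * count(cand -> nxt), with add-one smoothing
    so that unseen pairs are unlikely rather than impossible.
    """
    scores = {c: (follows[prev][c] + 1) * (follows[c][nxt] + 1) for c in vocab}
    # Highest score first; ties are broken alphabetically for repeatability.
    return sorted(vocab, key=lambda c: (-scores[c], c))


follows = train_bigrams(TEXTS)
vocab = sorted({sign for text in TEXTS for sign in text})

# Erase the sign between "diamond" and "fish" and ask the model to guess it.
ranking = rank_missing(follows, "diamond", "fish", vocab)
print("best guesses:", ranking[:3])
```

Returning a ranked list rather than a single answer mirrors the point made above: when the top guess is wrong, the second or third guess is often the right one.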
now there’s also a practical use for this
particular procedure there’s a lot of
these texts that are damaged so here’s
an example of one such text and we can
use the computer model now to try to
complete this text and make a best guess
prediction so here’s an example of a
symbol that was predicted and this could
be really useful as we try to decipher
the script by generating more data that
we can analyze now here’s one other
thing you could do with the computer
model
so imagine a monkey sitting at a
keyboard okay you might get a random
jumble of letters that looks like this
now such a random jumble of letters is
said to have a very high entropy this
is a physics and information theory term
but just imagine it’s a very random
jumble of letters now how many of you have
ever spilled coffee on a keyboard you
might have encountered the stuck key
problem so basically the same symbol
being repeated over and over again now
this kind of a sequence is said to have
a very low entropy because there’s no
variation at all
now language on the other hand has an
intermediate level of entropy it’s
neither too rigid nor is it too random
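
The three cases above (monkey at a keyboard, stuck key, ordinary language) can be compared numerically. The sketch below is a hedged illustration, not the analysis from the talk: it assumes that "entropy" is measured as the conditional entropy of the next symbol given the current one, and the function name `conditional_entropy` and the sample strings are chosen only for this example. Estimates from short samples like the English sentence here are biased low, so only the ordering of the three values is meaningful.

```python
import math
import random
from collections import Counter


def conditional_entropy(seq):
    """Entropy in bits of the next symbol given the current symbol.

    H = -sum over pairs (a, b) of p(a, b) * log2 p(b | a),
    estimated from the adjacent pairs that occur in `seq`.
    """
    pairs = Counter(zip(seq, seq[1:]))   # counts of (current, next)
    firsts = Counter(seq[:-1])           # counts of current symbols
    total = sum(pairs.values())
    h = 0.0
    for (a, b), n in pairs.items():
        h -= (n / total) * math.log2(n / firsts[a])
    return h


random.seed(0)  # fixed seed so repeated runs give the same random jumble
letters = "abcdefghijklmnopqrstuvwxyz"

monkey = "".join(random.choice(letters) for _ in range(5000))  # random jumble
stuck_key = "a" * 5000                                         # no variation
english = ("now language on the other hand has an intermediate level of "
           "entropy it is neither too rigid nor is it too random")

h_monkey = conditional_entropy(monkey)
h_stuck = conditional_entropy(stuck_key)
h_english = conditional_entropy(english)

for name, h in [("monkey", h_monkey), ("english", h_english), ("stuck key", h_stuck)]:
    print(f"{name:10s} {h:.2f} bits")
```

The stuck-key sequence comes out at zero because the next symbol is always known, the random jumble comes out near the maximum for a 26-letter alphabet, and the English sample falls in between.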
what about the Indus script so here’s a
graph that plots the entropies of a whole
bunch of sequences so at the very top you
find the uniformly random sequence which
is a random jumble of letters and
interestingly we also find a DNA sequence
from the human genome and instrumental
music and both of these are very very
flexible which is why you find
them in the very high range now at the
lower end of the scale you find a rigid
sequence a sequence of all A’s and you
also find a computer program in this
case in the language Fortran which
obeys really strict rules linguistic
scripts occupy the middle range now what
about the Indus script so we found that
the Indus script actually falls within the
range of the linguistic scripts now when
this result was first published it was
highly controversial there were people
who raised a hue and cry and these
people were the ones who believed that
the Indus script does not represent
language
I started to get some hate mail my
students said that I should really
seriously consider getting some
protection now who would have thought
that deciphering could be a dangerous
profession now what does this result
really show it shows that the Indus
script shares an important property of
language so as the old saying goes if it
looks like a linguistic script and it
acts like a linguistic script then
perhaps we may have a linguistic script
on our hands so what other evidence
is there that the script could actually
encode language well linguistic scripts
can actually encode multiple languages
so for example here’s the same sentence
written in English and the same
sentence written in Dutch using
the same letters of the alphabet now if
you don’t know Dutch and you only know
English and I give you some words in
Dutch you’ll tell me that these words
contain some very unusual patterns some
things are not right and you say these
words are probably not English words now
the same thing also happens in the case
of the Indus script so the computer
found several texts two of them are
shown here that have very unusual
patterns so for example in the first text
there’s a doubling of this jar-shaped
sign now this sign is the most
frequently occurring sign in the
Indus script and it’s only in this text that
it occurs as a doubled pair so why is
that the case we went back and looked at
where these particular texts were found
it turns out that they were found very
very far away from the Indus Valley they
were found in present-day Iraq and Iran
and why were they found there so what
I haven’t told you is that the Indus
people were very very enterprising they
used to trade with people pretty far
away from where they lived and so in
this case they were traveling by sea all
the way to Mesopotamia present-day Iraq
and what seems to happen here is that
the Indus traders the merchants were
using the script to write a foreign
language it’s just like our English and
Dutch example and that would explain why
we have these strange patterns that are
very different from the kinds of
patterns you see in the texts that are
found within the Indus Valley so this
suggests that the same script the Indus
script could be used to write different
languages the results we have so far
seem to point to the conclusion that the
Indus script probably does represent
language so if it does represent language
then how do we read the symbols that’s
the next big challenge so you’ll notice
that many of the symbols look like
pictures of humans of insects of fishes
of birds so most ancient scripts use the
rebus principle which is using pictures
to represent words so as an example
here’s a word can you write it using
pictures I’ll give you a couple of
seconds got it okay great so here’s my
solution so you could use the picture of
a bee followed by a picture of a leaf and
that’s belief right there could be other
solutions now in the case of the Indus
script the problem is the reverse so
you have to figure out the sounds of
each of these pictures such that the
entire sequence makes sense so this is
just like a crossword puzzle except that
this is the mother of all crossword
puzzles because the stakes are so high
if you solve it now my colleagues
Iravatham Mahadevan and Asko Parpola
have been making some headway on this
particular problem and I’d like to give
you a quick example of Parpola’s work so
here’s a really short text it contains
seven vertical strokes followed by this
fish-like sign and I already mentioned
that these seals were used for stamping
clay tags that were attached to
bundles of goods so it’s quite likely
that these texts at least some of them
contain names of merchants and it turns
out that in India
there’s a long tradition of names being
based on horoscopes and the star
constellations present at the time of
birth in Dravidian languages the word
for fish is meen which happens to sound
just like the word for star and so seven
stars would stand for elu meen which is
the Dravidian word for the Big Dipper
star constellation now similarly there’s
another sequence of six stars and that
translates to aru meen which is the
old Dravidian name for the star
constellation Pleiades and finally
there’s other combinations such as this
fish sign with something that looks like
a roof on top of it and that could be
translated into mey meen which is the
old Dravidian name for the planet Saturn
so this is pretty exciting it looks like
we’re getting somewhere but does this
prove that these seals contain
Dravidian names based on planets and
star constellations
well not yet so we have no way
of validating these particular readings
but if more and more of these readings
start making sense and if longer
and longer sequences appear to be
correct then we know that we’re on the
right track so today we can write a word
such as TED in Egyptian hieroglyphics
and in the cuneiform script because both
of these were deciphered in the 19th
century the decipherment of these two
scripts enabled these civilizations to
speak to us again directly now the
Mayans started speaking to us in the
20th century but the Indus civilization
remains silent so why should we care
well the Indus civilization does not
belong to just the South Indians or the
North Indians or the Pakistanis it
belongs to all of us so these are our
ancestors yours and mine they were
silenced by an unfortunate accident of
history now if you decipher the script
you would enable them to speak to us
again so what would they tell us what
would we find out about them about us I
can’t wait to find out thank you
[Applause]