The Turing test: Can a computer pass for a human? (Alex Gendler)

What is consciousness?

Can an artificial machine really think?

Does the mind just consist of neurons
in the brain,

or is there some intangible spark
at its core?

For many, these have been
vital considerations

for the future of artificial intelligence.

But British computer scientist Alan Turing
decided to disregard all these questions

in favor of a much simpler one:

can a computer talk like a human?

This question led to an idea for measuring
artificial intelligence

that would famously come to be known
as the Turing test.

In his 1950 paper, “Computing Machinery
and Intelligence,”

Turing proposed the following game.

A human judge has a text conversation
with unseen players

and evaluates their responses.

To pass the test, a computer must
be able to replace one of the players

without substantially
changing the results.

In other words, a computer would be
considered intelligent

if its conversation couldn’t be easily
distinguished from a human’s.
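
To make the setup concrete, here is a minimal sketch of the game in Python. The judge and player objects and their methods (ask, respond, observe, guess) are hypothetical interfaces invented for illustration, not anything Turing specified:

```python
import random

def imitation_game(judge, human, machine, rounds=5):
    # Hypothetical interfaces: the judge asks questions, reads replies,
    # and finally guesses; each player exposes a respond() method.
    labels = ["A", "B"]
    random.shuffle(labels)
    players = dict(zip(labels, [human, machine]))
    machine_label = labels[1]  # the machine was zipped second

    # The judge converses with both players, seeing only their labels.
    for _ in range(rounds):
        for label, player in players.items():
            reply = player.respond(judge.ask(label))
            judge.observe(label, reply)

    # The machine passes if the judge cannot reliably pick it out.
    return judge.guess() != machine_label
```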

Turing predicted that by the year 2000,

machines with 100 megabytes of memory
would be able to easily pass his test.

But he may have jumped the gun.

Even though today’s computers
have far more memory than that,

few have succeeded,

and those that have done well

focused more on finding clever ways
to fool judges

than on using overwhelming computing power.

Though it was never subjected
to a real test,

the first program with
some claim to success was called ELIZA.

With only a fairly short
and simple script,

it managed to mislead many people
by mimicking a psychologist,

encouraging them to talk more

and reflecting their own questions
back at them.
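
That reflection trick is simple enough to sketch in a few lines of Python. This is not ELIZA’s actual script, just a hypothetical handful of patterns in the same spirit:

```python
import re

# Swap first-person words for second-person ones ("my" -> "your").
REFLECTIONS = {"i": "you", "my": "your", "am": "are", "me": "you"}

# A few pattern/response pairs that turn statements into questions.
RULES = [
    (re.compile(r"i feel (.*)", re.I), "Why do you feel {0}?"),
    (re.compile(r"i am (.*)", re.I), "How long have you been {0}?"),
    (re.compile(r"my (.*)", re.I), "Tell me more about your {0}."),
]

def reflect(phrase):
    return " ".join(REFLECTIONS.get(w, w) for w in phrase.lower().split())

def respond(text):
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    # Fallback that simply encourages the speaker to keep talking.
    return "Please, go on."

print(respond("I feel anxious about my exams"))
# -> "Why do you feel anxious about your exams?"
```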

Another early script, PARRY,
took the opposite approach

by imitating a paranoid schizophrenic

who kept steering the conversation
back to his own preprogrammed obsessions.

Their success in fooling people
highlighted one weakness of the test.

Humans regularly attribute intelligence
to a whole range of things

that are not actually intelligent.

Nonetheless, annual competitions
like the Loebner Prize

have made the test more formal

with judges knowing ahead of time

that some of their conversation partners
are machines.

But while the quality has improved,

many chatbot programmers have used
strategies similar to those of ELIZA and PARRY.

1997’s winner, Catherine,

could carry on amazingly focused
and intelligent conversations,

but mostly if the judge wanted
to talk about Bill Clinton.

And a more recent winner,
Eugene Goostman,

was given the persona of a
13-year-old Ukrainian boy,

so judges interpreted its non sequiturs
and awkward grammar

as language and culture barriers.

Meanwhile, other programs like Cleverbot
have taken a different approach

by statistically analyzing huge databases
of real conversations

to determine the best responses.

Some also store memories
of previous conversations

in order to improve over time.
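
As an illustration of that retrieval idea, a bot can score logged exchanges against the user’s input and reply with whatever followed the closest match. This word-overlap sketch is an assumption for illustration, not Cleverbot’s actual algorithm:

```python
# A toy memory of logged (prompt, reply) exchanges.
corpus = [
    ("how are you", "I'm doing well, thanks. And you?"),
    ("what is your name", "People call me Cleverbot."),
    ("do you like music", "I love music. What do you listen to?"),
]

def best_response(user_input, memory=corpus):
    words = set(user_input.lower().split())
    # Pick the logged prompt sharing the most words with the input.
    prompt, reply = max(memory, key=lambda pair: len(words & set(pair[0].split())))
    return reply

def remember(user_input, reply, memory=corpus):
    # Storing each new exchange lets the bot reuse it later on,
    # which is how such systems improve over time.
    memory.append((user_input.lower(), reply))

print(best_response("how are you today"))
# -> "I'm doing well, thanks. And you?"
```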

But while Cleverbot’s individual responses
can sound incredibly human,

its lack of a consistent personality

and inability to deal
with brand new topics

are a dead giveaway.

Who in Turing’s day could have predicted
that today’s computers

would be able to pilot spacecraft,

perform delicate surgeries,

and solve massive equations,

but still struggle with
the most basic small talk?

Human language turns out to be
an amazingly complex phenomenon

that can’t be captured by even
the largest dictionary.

Chatbots can be baffled by simple pauses,
like “umm…”

or questions with no correct answer.

And a simple conversational sentence,

like, “I took the juice out of the fridge
and gave it to him,

but forgot to check the date,”

requires a wealth of underlying knowledge
and intuition to parse.

It turns out that simulating
a human conversation

takes more than just increasing
memory and processing power,

and as we get closer to Turing’s goal,

we may have to deal with all those big
questions about consciousness after all.