The human insights missing from big data
Tricia Wang

In ancient Greece,

when anyone from slaves to soldiers,
poets and politicians,

needed to make a big decision
on life’s most important questions,

like, “Should I get married?”

or “Should we embark on this voyage?”

or “Should our army
advance into this territory?”

they all consulted the oracle.

So this is how it worked:

you would bring her a question
and you would get on your knees,

and then she would go into this trance.

It would take a couple of days,

and then eventually
she would come out of it,

giving you her predictions as your answer.

From the oracle bones of ancient China

to ancient Greece to Mayan calendars,

people have craved prophecy

in order to find out
what’s going to happen next.

And that’s because we all want
to make the right decision.

We don’t want to miss something.

The future is scary,

so it’s much nicer
knowing that we can make a decision

with some assurance of the outcome.

Well, we have a new oracle,

and its name is big data,

or we call it “Watson”
or “deep learning” or “neural net.”

And these are the kinds of questions
we ask of our oracle now,

like, “What’s the most efficient way
to ship these phones

from China to Sweden?”

Or, “What are the odds

of my child being born
with a genetic disorder?”

Or, “What sales volume
can we predict for this product?”

I have a dog. Her name is Elle,
and she hates the rain.

And I have tried everything
to untrain her.

But because I have failed at this,

I also have to consult
an oracle, called Dark Sky,

every time before we go on a walk,

for very accurate weather predictions
for the next 10 minutes.

She’s so sweet.

So because of all of this,
our oracle is a $122 billion industry.

Now, despite the size of this industry,

the returns are surprisingly low.

Investing in big data is easy,

but using it is hard.

Over 73 percent of big data projects
aren’t even profitable,

and I have executives
coming up to me saying,

“We’re experiencing the same thing.

We invested in some big data system,

and our employees aren’t making
better decisions.

And they’re certainly not coming up
with more breakthrough ideas.”

So this is all really interesting to me,

because I’m a technology ethnographer.

I study and I advise companies

on the patterns
of how people use technology,

and one of my interest areas is data.

So why is having more data
not helping us make better decisions,

especially for companies
who have all these resources

to invest in these big data systems?

Why isn’t it getting any easier for them?

So, I’ve witnessed the struggle firsthand.

In 2009, I started
a research position with Nokia.

And at the time,

Nokia was one of the largest
cell phone companies in the world,

dominating emerging markets
like China, Mexico and India –

all places where I had done
a lot of research

on how low-income people use technology.

And I spent a lot of extra time in China

getting to know the informal economy.

So I did things like working
as a street vendor

selling dumplings to construction workers.

Or I did fieldwork,

spending nights and days
in internet cafés,

hanging out with Chinese youth,
so I could understand

how they were using
games and mobile phones

and using them as they moved
from the rural areas to the cities.

Through all of this qualitative evidence
that I was gathering,

I was starting to see so clearly

that a big change was about to happen
among low-income Chinese people.

Even though they were surrounded
by advertisements for luxury products

like fancy toilets –
who wouldn’t want one? –

and apartments and cars,

through my conversations with them,

I found out that the ads
that actually enticed them the most

were the ones for iPhones,

promising them this entry
into this high-tech life.

And even when I was living with them
in urban slums like this one,

I saw people investing
over half of their monthly income

into buying a phone,

and increasingly, these phones were “shanzhai,”

which are affordable knock-offs
of iPhones and other brands.

They’re very usable.

Does the job.

And after years of living
with migrants and working with them

and just really doing everything
that they were doing,

I started piecing
all these data points together –

from the things that seem random,
like me selling dumplings,

to the things that were more obvious,

like tracking how much they were spending
on their cell phone bills.

And I was able to create
this much more holistic picture

of what was happening.

And that’s when I started to realize

that even the poorest in China
would want a smartphone,

and that they would do almost anything
to get their hands on one.

You have to keep in mind,

iPhones had just come out, it was 2009,

so this was, like, eight years ago,

and Androids had just started
looking like iPhones.

And a lot of very smart
and realistic people said,

“Those smartphones – that’s just a fad.

Who wants to carry around
these heavy things

where batteries drain quickly
and they break every time you drop them?”

But I had a lot of data,

and I was very confident
about my insights,

so I was very excited
to share them with Nokia.

But Nokia was not convinced,

because it wasn’t big data.

They said, “We have
millions of data points,

and we don’t see any indicators
of anyone wanting to buy a smartphone,

and your data set of 100,
as diverse as it is, is too weak

for us to even take seriously.”

And I said, “Nokia, you’re right.

Of course you wouldn’t see this,

because you’re sending out surveys
assuming that people don’t know

what a smartphone is,

so of course you’re not going
to get any data back

about people wanting to buy
a smartphone in two years.

Your surveys, your methods
have been designed

to optimize an existing business model,

and I’m looking
at these emergent human dynamics

that haven’t happened yet.

We’re looking outside of market dynamics

so that we can get ahead of it.”

Well, you know what happened to Nokia?

Their business fell off a cliff.

This – this is the cost
of missing something.

It was unfathomable.

But Nokia’s not alone.

I see organizations
throwing out data all the time

because it doesn’t come from a quant model

or it doesn’t fit in one.

But it’s not big data’s fault.

It’s the way we use big data;
it’s our responsibility.

Big data’s reputation for success

comes from quantifying
very specific environments,

like electrical power grids
or delivery logistics or genetic code,

when we’re quantifying in systems
that are more or less contained.

But not all systems
are as neatly contained.

When you’re quantifying
and systems are more dynamic,

especially systems
that involve human beings,

forces are complex and unpredictable,

and these are things
that we don’t know how to model so well.

Once you predict something
about human behavior,

new factors emerge,

because conditions
are constantly changing.

That’s why it’s a never-ending cycle.

You think you know something,

and then something unknown
enters the picture.

And that’s why just relying
on big data alone

increases the chance
that we’ll miss something,

while giving us this illusion
that we already know everything.

And what makes it really hard
to see this paradox

and even wrap our brains around it

is that we have this thing
that I call the quantification bias,

which is the unconscious tendency
to value the measurable

over the immeasurable.

And we often experience this at our work.

Maybe we work alongside
colleagues who are like this,

or even our entire
company may be like this,

where people become
so fixated on that number,

that they can’t see anything
outside of it,

even when you present them evidence
right in front of their face.

And this is a very appealing message,

because there’s nothing
wrong with quantifying;

it’s actually very satisfying.

I get a great sense of comfort
from looking at an Excel spreadsheet,

even very simple ones.

(Laughter)

It’s just kind of like,

“Yes! The formula worked. It’s all OK.
Everything is under control.”

But the problem is

that quantifying is addictive.

And when we forget that

and when we don’t have something
to kind of keep that in check,

it’s very easy to just throw out data

because it can’t be expressed
as a numerical value.

It’s very easy just to slip
into silver-bullet thinking,

as if some simple solution existed.

Because this is a great moment of danger
for any organization,

because oftentimes,
the future we need to predict –

it isn’t in that haystack,

but it’s that tornado
that’s bearing down on us

outside of the barn.

There is no greater risk

than being blind to the unknown.

It can cause you to make
the wrong decisions.

It can cause you to miss something big.

But we don’t have to go down this path.

It turns out that the oracle
of ancient Greece

holds the secret key
that shows us the path forward.

Now, recent geological research has shown

that the Temple of Apollo,
where the most famous oracle sat,

was actually built
over two earthquake faults.

And these faults would release
these petrochemical fumes

from underneath the Earth’s crust,

and the oracle literally sat
right above these faults,

inhaling enormous amounts
of ethylene gas from these fissures.

(Laughter)

It’s true.

(Laughter)

It’s all true, and that’s what made her
babble and hallucinate

and go into this trance-like state.

She was high as a kite!

(Laughter)

So how did anyone –

How did anyone get
any useful advice out of her

in this state?

Well, you see those people
surrounding the oracle?

You see those people holding her up,

because she’s, like, a little woozy?

And you see that guy
on your left-hand side

holding the orange notebook?

Well, those were the temple guides,

and they worked hand in hand
with the oracle.

When inquisitors would come
and get on their knees,

that’s when the temple guides
would get to work,

because after the inquisitors
asked her their questions,

the guides would observe
their emotional state,

and then they would ask them
follow-up questions,

like, “Why do you want to know
this prophecy? Who are you?

What are you going to do
with this information?”

And then the temple guides would take
this more ethnographic,

this more qualitative information,

and interpret the oracle’s babblings.

So the oracle didn’t stand alone,

and neither should our big data systems.

Now to be clear,

I’m not saying that big data systems
are huffing ethylene gas,

or even that they’re giving
invalid predictions.

The total opposite.

But what I am saying

is that in the same way
that the oracle needed her temple guides,

our big data systems need them, too.

They need people like ethnographers
and user researchers

who can gather what I call thick data.

This is precious data from humans,

like stories, emotions and interactions
that cannot be quantified.

It’s the kind of data
that I collected for Nokia

that comes in the form
of a very small sample size,

but delivers incredible depth of meaning.

And what makes it so thick and meaty

is the experience of understanding
the human narrative.

And that’s what helps us see
what’s missing in our models.

Thick data grounds our business questions
in human questions,

and that’s why integrating
big and thick data

forms a more complete picture.

Big data is able to offer
insights at scale

and leverage the best
of machine intelligence,

whereas thick data can help us
recover the context that gets lost

when we make big data usable,

and leverage the best
of human intelligence.

And when you actually integrate the two,
that’s when things get really fun,

because then you’re no longer
just working with data

you’ve already collected.

You get to also work with data
that hasn’t been collected.

You get to ask questions about why:

Why is this happening?

Now, when Netflix did this,

they unlocked a whole new way
to transform their business.

Netflix is known for their really great
recommendation algorithm,

and they had this $1 million prize
for anyone who could improve it.

And there were winners.

But Netflix discovered
the improvements were only incremental.

So to really find out what was going on,

they hired an ethnographer,
Grant McCracken,

to gather thick data insights.

And what he discovered was something
that they hadn’t seen initially

in the quantitative data.

He discovered that people loved
to binge-watch.

In fact, people didn’t even
feel guilty about it.

They enjoyed it.

(Laughter)

So Netflix was like,
“Oh. This is a new insight.”

So they went to their data science team,

and they were able to scale
this thick data insight

with their quantitative data.

And once they verified it
and validated it,

Netflix decided to do something
very simple but impactful.

They said, instead of offering
the same show from different genres

or more of the different shows
watched by similar users,

we’ll just offer more of the same show.

We’ll make it easier
for you to binge-watch.

And they didn’t stop there.

They did all these things

to redesign their entire
viewer experience,

to really encourage binge-watching.

It’s why people and their friends disappear
for whole weekends at a time,

catching up on shows
like “Master of None.”

By integrating big data and thick data,
they not only improved their business,

but they transformed how we consume media.

And now their stock is projected
to double in the next few years.

But this isn’t just about
watching more videos

or selling more smartphones.

For some, integrating thick data
insights into the algorithm

could mean life or death,

especially for the marginalized.

All around the country,
police departments are using big data

for predictive policing,

to set bond amounts
and sentencing recommendations

in ways that reinforce existing biases.

The NSA’s Skynet machine learning algorithm

has possibly aided in the deaths
of thousands of civilians in Pakistan

by misreading cellular device metadata.

As all of our lives become more automated,

from automobiles to health insurance
or to employment,

it is likely that all of us

will be impacted
by the quantification bias.

Now, the good news
is that we’ve come a long way

from huffing ethylene gas
to make predictions.

We have better tools,
so let’s just use them better.

Let’s integrate the big data
with the thick data.

Let’s pair our temple guides
with the oracles,

and whether this work happens
in companies or nonprofits

or government or even in the software,

all of it matters,

because that means
we’re collectively committed

to making better data,

better algorithms, better outputs

and better decisions.

This is how we’ll avoid
missing that something.

(Applause)