Bias in Tech Algorithms

It’s that time of week again,
Friday night, pandemic style.

Around this time last year,
like most of you,

I decided to settle in for a cozy night
on the couch with the TV remote

instead of any wild Friday night
adventures with COVID hanging around.

As I browsed through Netflix,

I passed the movie I watched yesterday,

a cheesy rom-com recommended by a friend,

completely not my style.

I quickly moved past it and landed
on the Recommended For You category.

I eagerly began browsing through this one,

since this is where I found
most of the good movies I ended up liking.

Suddenly, though, I sat bolt upright.

Since yesterday, almost all of the movies
in my Recommended For You

were now rom-coms,
each one cheesier than the last.

I’m sure I’m not the only one who feels

like Netflix seems to know me
better than I know myself sometimes,

but this scenario got me thinking.

How did Netflix know
what I watched yesterday

and how were they able to recommend movies
similar to that one for me to watch today?

To answer that question,

we need to think about
what an algorithm is.

This is a pretty popular buzzword
many of you might’ve heard before,

but what actually is an algorithm?

An algorithm is simply
a list of steps to solve a problem.

In the technology world, algorithms
consist of computer-implementable commands

that allow you to perform computations.
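To make that concrete, here is a toy sketch
of what such a list of steps might look like
in Python (a made-up example for illustration):

```python
# A minimal example of an algorithm: a list of steps a computer can follow.
# Here, the "problem" is finding the longest movie title in a watch list.

def longest_title(titles):
    """Walk through the list once, remembering the longest title seen so far."""
    longest = ""
    for title in titles:               # Step 1: examine each title in turn
        if len(title) > len(longest):  # Step 2: compare against the best so far
            longest = title            # Step 3: update if this one is longer
    return longest                     # Step 4: report the answer

print(longest_title(["Up", "To All the Boys I've Loved Before", "Heat"]))
```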

In fact, algorithms are what Netflix uses

to generate the movies
in the Recommended For You category

that I had such a bitter
experience with last year.

Oftentimes, algorithms that are used

in artificial intelligence
or machine learning,

where we try to get the computer
to imitate human behavior,

are created using data.

The best way to mimic human behavior

is to analyze how we humans
behave through our actions.

For instance, when Netflix writes
their movie recommendation algorithm,

they likely harness data

from the millions of people
who use their services

to predict how humans behave
when it comes to watching movies.

For example, if one user, such as myself,
watches the first To All the Boys movie,

then they’ll likely watch
the second one and then the third one.

What Netflix does is compile this trend
of movie watching into a big database,

which it then sends to the computer
to look for patterns.

Most people, after they’ve seen
the first part of a movie series,

will likely watch the second.

The algorithm’s job is
to recognize this pattern from the data,

use it to create a model of what
the correct output is, given an input,

and then apply this model to other users.
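As a rough illustration of that idea,
here is a toy Python sketch, with made-up data
(nothing like Netflix’s real system),
that mines watch histories for a
“what comes next” pattern and applies it:

```python
from collections import Counter, defaultdict

# Hypothetical watch histories, standing in for Netflix's big database.
histories = [
    ["To All the Boys 1", "To All the Boys 2", "To All the Boys 3"],
    ["To All the Boys 1", "To All the Boys 2"],
    ["Heat", "To All the Boys 1", "To All the Boys 2"],
]

# "Training": count which movie most often follows each movie.
followers = defaultdict(Counter)
for history in histories:
    for current, following in zip(history, history[1:]):
        followers[current][following] += 1

def recommend(last_watched):
    """The model: given an input (last movie watched), output the most common follow-up."""
    if followers[last_watched]:
        return followers[last_watched].most_common(1)[0][0]
    return None

print(recommend("To All the Boys 1"))  # -> "To All the Boys 2"
```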

This is one of the ways
in which Netflix’s algorithm

produces the movies we see
in the Recommended For You section.

That’s why those movies are often
similar to ones we’ve seen previously,

whether that is by content,
genre or actors.

The idea is that computers
can learn on their own

when given instructions
to follow or data to mine,

and thus mimic human behavior.

But what this idea doesn’t encompass

is the fact that computers
are fundamentally not humans.

As such, they lack many of the things
we humans take for granted

when it comes to making decisions,

such as common sense,
ethics and logical reasoning.

This is where the problem arises.

Before algorithms existed,
humans made all the decisions.

Now computers
either make decisions for us

or heavily influence
how we make decisions.

Whether it’s something simple,
like deciding which movie to watch next

or something more serious,

like determining the creditworthiness
of an individual

to give them a loan.

But these computers
aren’t learning from thin air.

They’re learning from us.

Machine learning algorithms
rely on human data.

They need data to understand
how we humans behave and act

in terms of concrete, observable behaviors.

They then use this data to replicate
human behavior and make tasks simpler.

The problem arises when the data that is
being passed to these machines is flawed.

This is when we run the risk
of teaching a computer the wrong thing,

or in other words,
of creating a flawed algorithm.

The fundamental cause behind
why such a phenomenon occurs is this:

computers rely on data
and data comes from human behavior.

However, human behavior
often represents what is,

rather than what should be,

and “what is” is often racist,
sexist, xenophobic and so on.

As recent events in the US
have illustrated, our society

holds systems of institutionalized
discrimination and oppression

that have been at play for generations.

As such, it isn’t shocking

that any data we collect from people
in our society will be biased,

whether that is against certain ideas,
beliefs, groups of people or traditions.

The machines we are relying on
more heavily day by day are biased.

Why?

Because they’re learning from us
and we are biased.

It may seem like this issue
is far removed from our own lives.

“What does bias in tech algorithms
have to do with us?”

Well, aside from influencing
the movies Netflix recommends for us,

there are other uses for algorithms
that can impact our lives quite heavily.

One such example
is employee hiring algorithms.

A recent prime example
of how such an algorithm became flawed

lies with tech giant Amazon.

Taken together, Amazon’s global workforce
is over 60% male,

with 75% of managerial positions
also held by men.

The data that was fed
to their employee hiring algorithm

allowed it to learn that women
were a minority at this company,

and thus it penalized
any resumé it came across

that contained the word “woman” in it,

leading to gender bias in how employees
were hired at this top company.
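A deliberately simplified Python sketch
of this failure mode (hypothetical résumés
and scoring, not Amazon’s actual system)
shows how skewed historical data alone
can teach a model to penalize a word:

```python
# Hypothetical past hiring outcomes from a mostly male workforce.
hired = ["java developer", "managed team", "java engineer"]
rejected = ["woman in tech club", "java developer woman", "sales associate"]

def word_weight(word):
    """Positive if the word appeared more often in hired resumes, negative otherwise."""
    in_hired = sum(word in resume.split() for resume in hired)
    in_rejected = sum(word in resume.split() for resume in rejected)
    return in_hired - in_rejected

def score(resume):
    """Score a new resume by summing the learned weights of its words."""
    return sum(word_weight(word) for word in resume.split())

# "woman" never appears among the (mostly male) past hires,
# so the model penalizes any resume containing it.
print(word_weight("woman"))           # -> -2
print(score("java developer woman"))  # -> -1, vs.
print(score("java developer"))        # ->  1
```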

Another example can be seen
with offender risk assessments.

What are these?

Well, US judges use automated risk
assessments to determine things

like bail and sentencing limits
for individuals accused of a crime.

These assessments rely
on large data sets that go back ages

and include variables like arrest records,
demographics,

financial stability and so on.

The algorithms that these
assessments are based on

have been found to be inherently biased
against African-Americans

by recommending things
such as detainments,

longer prison sentences
and higher bails for them,

compared to a white counterpart
who was equally likely to reoffend.
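The kind of audit that surfaces this disparity
can be sketched in a few lines of Python
(hypothetical records, not the real data):
compare how often non-reoffenders in each group
were wrongly flagged as high risk.

```python
# Each record: (group, predicted_high_risk, actually_reoffended)
records = [
    ("Black", True, False), ("Black", True, False), ("Black", False, False),
    ("Black", True, True),
    ("white", False, False), ("white", False, False), ("white", True, False),
    ("white", True, True),
]

def false_positive_rate(group):
    """Among people in this group who did NOT reoffend, how many were flagged high risk?"""
    non_reoffenders = [r for r in records if r[0] == group and not r[2]]
    flagged = [r for r in non_reoffenders if r[1]]
    return len(flagged) / len(non_reoffenders)

for group in ("Black", "white"):
    print(group, f"{false_positive_rate(group):.0%}")
# In this toy data, non-reoffending Black defendants are wrongly flagged
# twice as often as white ones -- the pattern of disparity described above.
```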

When we think about it,
this isn’t entirely surprising,

though completely atrocious.

There has been a long history of
marginalization and racism in our society

that the data sets used to create
this algorithm most likely reflect.

As such, the algorithm
has learned to recommend actions

that continue to oppress African-Americans

because that is what the data set
it was trained on shows.

An example closer to home can be seen
with facial recognition software.

We use this algorithm
every time we unlock our iPhones.

However, what we may not know
is that facial recognition,

though reported to be
over 90% accurate,

is actually not this accurate
for everyone.

As the image illustrates,
facial recognition varies in accuracy

depending on the demographic of the user

and is most inaccurate
for darker-skinned women,

illustrating the bias hidden
within this algorithm.
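In Python, the arithmetic behind that gap
might look like this (hypothetical numbers,
in the spirit of published audits):

```python
# When the test data skews toward some groups, a strong overall
# accuracy can hide much lower accuracy for others.
results = {
    # group: (correct_matches, attempts)
    "lighter-skinned men":   (4900, 5000),
    "lighter-skinned women": (1880, 2000),
    "darker-skinned men":    (880, 1000),
    "darker-skinned women":  (650, 1000),
}

correct = sum(c for c, _ in results.values())
attempts = sum(n for _, n in results.values())
print(f"overall: {correct / attempts:.0%}")  # 92% -- sounds great

for group, (c, n) in results.items():
    print(f"{group}: {c / n:.0%}")  # but only 65% for darker-skinned women
```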

Though companies that make the software
have since announced commitments

to modify testing
and improve data collection,

it’s important to note

the widespread prevalence of facial
recognition in our society today,

and as such, the damage that this
flawed algorithm has already caused.

Not only do we use it in our phones,

but law enforcement, employment offices,
airports, security firms and more

all use facial recognition
in multiple capacities.

What if law enforcement incorrectly
flagged an innocent woman as a criminal

while failing to identify
the real perpetrator?

What if this happened over
and over and over again?

Can we imagine the terrible
effects that would have?

It may seem bleak
to begin scratching the surface

of how technology can be taught
to imitate human fallacies and bias,

creating long-lasting
and far-reaching negative impacts.

But there are ways we can
improve the situation.

One way is to widen the breadth of data
used to create these algorithms.

If we give computers more varied
and diverse data to learn from,

we can help ensure that they are learning
correct patterns within human behavior

that accurately reflect
what we want them to do.
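One simple version of this idea is rebalancing.
Here is a short Python sketch (hypothetical data;
real fixes need far more care) that oversamples
an under-represented group before training:

```python
import random

# A training set where group "B" is badly under-represented.
dataset = [{"group": "A"}] * 900 + [{"group": "B"}] * 100

def rebalance(records, key="group"):
    """Resample so every group contributes equally to the training set."""
    groups = {}
    for record in records:
        groups.setdefault(record[key], []).append(record)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        # Sample with replacement so small groups reach the target size.
        balanced.extend(random.choices(members, k=target))
    return balanced

balanced = rebalance(dataset)
print(sum(r["group"] == "B" for r in balanced), "of", len(balanced))  # 900 of 1800
```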

Another way is to rally together
in demanding increased transparency

when it comes
to creating these algorithms.

Within the past few decades,
tech giants like Facebook and Google

have harnessed tremendous
amounts of personal data

to create powerful
machine learning technology

with little insight or oversight
from the public and government.

This means that only
a small subset of people

oversaw the creation of something
that a large subset of society is using,

which can inherently lead to bias.

We can fix this by instituting policies

that govern when and how
personal data can be used,

as well as by incorporating the work
of diversity and equity leaders

in the creation of these algorithms.

A final way we can improve the situation
is to delve more deeply

into the idea of teaching machines
societal ethics.

For instance, widely held beliefs
like “innocent until proven guilty”,

common sense reasoning and the elimination
of logical or emotional fallacies.

It is more important now than ever

for us to educate ourselves
about bias in machine learning algorithms,

a dangerous phenomenon
that gives us a dark insight

into how technology, often thought
to be created by humans for humans,

can turn ugly.

As we’ve seen, our society has
systems of bias embedded within it.

It is our job to ensure
that this inequity and unfairness

does not widen and spread
into the realm of technology,

specifically within
artificial intelligence.

I hope for you all today to leave
not only with new knowledge

about an issue
within the technology world,

but also with a heart filled with empathy

towards those oppressed
by systems in place,

and an understanding of what we can do
to combat those systems

and create meaningful change
using computer science.

Technology is a powerful field.

One that only seems to be getting
stronger and stronger

as our society pivots to and remains
in a remote workspace.

It is incredibly important for us
to recognize and understand this power

so we can use it to our advantage.

It is up to us to ensure

that technology and artificial
intelligence are working for us,

not against us.

Thank you.