Why I Want my Voice Assistant to Speak Spanglish
greetings
computers phones smart voice assistants
and other devices use human language to
communicate with us
humans that language exchange can be in
audio form
but more often it’s in written and
textual form today i’m going to show you
examples of how human computer
interaction breaks down
when the user is a bilingual individual
computing is struggling to diversify its
ranks
and without a diverse workforce we run
the risk of creating products that
poorly serve a diverse user base
as you will see in my talk products that
on the surface seem fine
for monolingual users break down when
used by bilingual users
i claim this is a side effect of a
homogeneous workforce that is
inadvertently
ignoring the richness that exists in the
majority of the users of computing
technology
let me start with a quick definition
bilinguals are those
individuals who use two or more
languages or dialects
in their everyday lives it’s worth
pointing out that while my presentation
talks about bilinguals everything
that i’m going to show and talk about
here is also
true for people who speak more
than two languages
it turns out there are a lot of myths and
misinformation about bilingualism
one is that bilingualism is not a common
phenomenon that it’s an exception to the
rule so to speak
the reality is that 60 percent of the world
population speaks more than one language
and in europe close to 30 percent speak
at least three languages
here in the u.s around 20 percent of the
population speak more than one language
but some believe that this estimate is
low it’s an undercount
because of how the data is collected but
just as important the percentage of
bilingual users in the u.s
has been steadily increasing from 11 percent
in 1980 to just over 20 percent in 2011
and we will have to wait for the results from the
2020 census to see where the number is
today
we also know that bilinguals follow some
very well known patterns of
communication for example
bilinguals code switch meaning that
they often change
language in mid-sentence to the
untrained eye
this seems random and people thought it
was a sign of a lack of
education we now know that code
switching follows
well-defined patterns and that the
phrases that are inserted from one
language into the other
often conform to the grammatical rules
of both languages
today we understand that bilinguals that
code switch demonstrate
a level of linguistic sophistication
that goes beyond
someone speaking a single language
bilinguals also borrow words and phrases
from one language to use in another
and sometimes they change the base
language altogether
there’s even what is called
domain-specific language
where somebody speaking in one language
uses words from another language for
a particular domain
for example when i taught computer
science at the university of puerto rico
in mayaguez
my classes were in spanish but a lot of
my technical language of computing
which i learned in the u.s was in
english
so my domain-specific language for computing
was in english and i used it to
complement my classes in spanish
the key message is the following
bilinguals have more than one language
at their disposal
when they’re communicating when they’re
interacting with others and
all those languages are active at the
same time we don’t switch from one to the
other we just have them all available at
the same time
and that statement should stand in stark
contrast with how computers work
if you have played with a computer lately you
will notice that the language setting is
typically a choice
of the form select one of these languages
from that point forward the computer is
monolingual
the systems are designed to communicate
in only one language at a time
and that places them at odds with how
bilinguals communicate
and these settings are typically
available at installation time
or buried deep in some rarely used
settings
that you would never see another area where
we encounter this is in keyboard
settings
in word processors and text messaging
applications
if you’re bilingual and communicating
in both languages
you end up doing a dance between
keyboards to be able to communicate
effectively
some bilinguals simply forgo the use of
features like autocorrect
because it creates more problems than it
solves as a bilingual communicating with
other bilinguals
the computer gets in the way in the last
few years software designers have added
the ability to change dictionaries and
language settings
in specific applications and contexts
for example browsers
but the choice is still largely you use
one language or the other
changing the language of a user
interface is often a task that is
considered
part of the globalization and localization
of interfaces
globalization is the process of
designing a user interface
by identifying and separating those
parts that are different for each
country or culture
language is often one of those features
localization on the other hand
is the process of making that change
taking out the part for one country and
putting the part for the other in its place
it is no surprise then that the notion
of using multiple languages
is often paired up with country or
nationality
multilingual support is sometimes found under
international features
treating bilingual individuals as
people who must be international
it’s almost as if the interface is
saying you speak more than one language
you must not be from around here
allow me to show you some more examples
of how technology ignores bilinguals
the examples are all based on personal
experience
as an english spanish bilingual user
they come in four areas of language use
reading writing listening and speaking
let’s start with reading the spanish
language uses a few characters
outside of those used in the english
language
spanish keyboards have a separate key
for the eñe
the n with the tilde and have
support for the question and exclamation
marks as well as the accents for the
vowels that are typical in spanish
by the way the eñe is not just a funny
n with a mexican hat like some
students call it it really is a
different character and it makes a
difference in the words
words that have an n versus
an ñ
change meaning completely mono and moño
is one example mono is monkey
moño is a hair bun if you think that
little hat
the tilde makes no difference you might
want to ask npr why they had to delete
this tweet
software often requires a special
setting to display these characters
correctly
needless to say my name pérez-quiñones
is often printed in some creative ways
the funny thing is that this is an easy
fix
all these interfaces are displayed over
the web and there’s one line of code
that is needed to make that
display correctly rather than what you
see on the screen
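The talk does not say which line of code is meant. On the web a common candidate is a character-encoding declaration such as `<meta charset="utf-8">`, which is an assumption here. The Python sketch below only illustrates the underlying mechanism: the same bytes read with the wrong encoding come out garbled, and read with the right one come out intact. The spelling of the surname is also assumed.

```python
# Illustration only: the spelling of the name and both encodings are assumptions.
name = "Pérez-Quiñones"

# A server sends the name as UTF-8 bytes.
utf8_bytes = name.encode("utf-8")

# A client that wrongly assumes Latin-1 turns each accented letter
# into two unrelated characters.
garbled = utf8_bytes.decode("latin-1")

# A client that is told the correct encoding recovers the name exactly.
correct = utf8_bytes.decode("utf-8")

print(garbled)  # PÃ©rez-QuiÃ±ones
print(correct)  # Pérez-Quiñones
```

Declaring the encoding once, at the top of the page or in the response headers, is what tells the client to take the second path.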
conversely writing the spanish
characters can also trip up software
entering my name into software often
triggers
errors because i am using quote unquote
illegal characters
and keep in mind this is just spanish
which has only a handful of characters
different from english
can you imagine what it would be like
with other languages like arabic
russian and chinese
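A minimal sketch of how a name field could accept letters from any script instead of rejecting them as illegal. The function names, the ASCII-only rule shown for contrast, and the small set of allowed punctuation are all hypothetical; real forms may need different rules.

```python
import re
import unicodedata

# Hypothetical ASCII-only rule of the kind that rejects Spanish names.
ASCII_ONLY = re.compile(r"[A-Za-z .'-]+")

# Assumed set of punctuation that may appear inside a name.
ALLOWED_PUNCTUATION = {" ", "-", "'", "."}


def ascii_only_rule(name: str) -> bool:
    """Accept a name only if every character is plain ASCII."""
    return ASCII_ONLY.fullmatch(name) is not None


def unicode_aware_rule(name: str) -> bool:
    """Accept letters and combining marks from any script."""
    name = unicodedata.normalize("NFC", name).strip()
    if not name:
        return False
    for ch in name:
        category = unicodedata.category(ch)
        # "L*" categories are letters, "M*" are combining marks (accents).
        if category[0] in ("L", "M") or ch in ALLOWED_PUNCTUATION:
            continue
        return False
    return True


for candidate in ["Manuel Pérez-Quiñones", "Ольга", "محمد", "李小龙"]:
    print(candidate, ascii_only_rule(candidate), unicode_aware_rule(candidate))
```

Checking Unicode character categories rather than a fixed alphabet lets the same rule cover Spanish, Russian, Arabic, and Chinese names without a per-language list.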
another issue is the pronunciation
particularly of names
your name is part of your identity in a
society like the u.s with so many
multinational influences
it can be tricky to pronounce names
correctly i know this because i’ve heard
my name pronounced in many different
ways
but i’m very accommodating because
people try they’re trying to pronounce
my name correctly
and there are phonetic sounds in spanish
that are foreign
to english speakers just like there are
sounds in english that are foreign to me
difficult for me to pronounce but when
it comes to my computer pronouncing my
name it really shouldn’t be an issue of
what language i’m using
or even what language the computer is
using the computer should be able to
pronounce my name correctly
let me play you my name pronounced in
english and spanish
using the apple speech synthesizer the
first is an english pronunciation
the next is spanish pronunciation manuel
manuel both of those sounds were
generated from the very same computer
using the very same software
if nothing else the system should not
change how it pronounces my name
depending on the language it’s using
my name has one pronunciation and
there’s no technological reason why it’s
pronouncing it wrong
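One way a system could keep a name's pronunciation fixed while the rest of the sentence stays in another language is to tag the name with its own language. The sketch below builds such markup with the `<lang>` element from the W3C SSML 1.1 specification. It is not what any particular assistant does, support for `<lang>` varies between speech engines, and the function name, parameters, and language codes are hypothetical.

```python
from xml.sax.saxutils import escape

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def speak_with_name(before: str, name: str, name_lang: str,
                    after: str = "", sentence_lang: str = "en-US") -> str:
    """Build SSML where only the name is marked with its own language."""
    return (
        f'<speak version="1.1" xmlns="{SSML_NS}" xml:lang="{sentence_lang}">'
        f'{escape(before)}'
        f'<lang xml:lang="{name_lang}">{escape(name)}</lang>'
        f'{escape(after)}'
        '</speak>'
    )


# The sentence is English; the name is tagged as Spanish (assumed code es-US).
ssml = speak_with_name("Calling ", "Manuel", "es-US", " now.")
print(ssml)
```

The resulting string would be handed to a synthesizer that accepts SSML; how the engine renders the tagged span is up to that engine.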
if you’re from spain or latin america
you know that people named manuel like
me
often go by an apodo a nickname of manolo
i won’t even insult the manolos watching
this by playing the english
pronunciation of my apodo
it’s awful i cringe every time i hear
one of these systems
pronounce my name this is not a
technological problem as i showed you in
the audio
the system can pronounce my name
correctly this is a socio-technical
problem that we need to address
you can imagine that if a system has
been fine-tuned to pronounce
names one way voice recognition follows
closely behind
try to use one of these smart assistants
to call someone in my family
and what you get is often a comedy
sketch similar to
abbott and costello’s who’s on first when
trying to call olga
my wife i have to pronounce her name as
alga
and when she tries to call me by saying
siri call manolo
siri often responds by saying i can’t
find madonna in your address book
note that if i switch the device to
spanish all these problems go away
but i am forced to use a device in
spanish only for everything and i live
in north carolina
and there are a lot of businesses
whose names i can’t pronounce in spanish
because their names are in english
how about accents another myth about
bilinguals is that if you have an accent
you’re somehow
part of a different social class or your
intelligence is in question
will siri alexa and google understand
accents and make some weird inferences
based on the accents
let me play you this audio from my
friend dr carlos evia
alexa qué hora es diez
cincuenta alexa what time is it
the time is 10:58 a.m
alexa qué hora es
it’s 10:59 in this example the system
replied in spanish to a spanish question
in english to an english question and in
english to a spanish question that was
asked with a heavy fake
accent the system understood one
language and replied in another
and alexa is not the only one google
home is also known for some odd
exchanges across languages
i’ve seen this exchange happen hey
google qué hora es
and google replies qué hora es is what time is it in
spanish
google understood the question but
proceeded to
translate it for me not to answer it
the good news is these systems
understand both languages
that’s pretty cool the bad news is
they’re responding in a way that is not
socially
expected by bilinguals and it doesn’t
follow the language conventions used by
bilinguals
so i’ve shown you how bilinguals are
disadvantaged when it comes to reading
writing listening and speaking with
computing systems
i wish i could tell you the problem ends
there
if it did the next update of the
software would fix this and we would be done
but the problem is deeper once you start
looking at information organization and
classifications
as a bilingual person i don’t live in
two distinct worlds
instead i experience the world from two
points of view at the same time
i can understand multiple languages
consider the following situation
i’m going to buy a birthday card for a
bilingual member of my family
i go to a famous online store looking
for an e-card
i would have to navigate to
occasions then birthday then
daughter but i would only find
english cards there because the spanish cards
are under a separate top-level
classification
so i would also have to search under
spanish but there is no set of
occasions under spanish you see all the
cards in spanish
and i would have to find the ones
that are about birthdays
all of them are in one category so for me a
bilingual person
buying a card for another bilingual
person i have to look for cards in two
different places
and i cannot even compare the two i
cannot even say this is the spanish this
is the english which one i like better
because they’re not in the same place so
it seems like the use case for building
this interface was more influenced by
monolingual users
ignoring the fact that 60 percent of the world
population
can use two or more languages these
things add up
and it’s easy to see how this type of
language disenfranchisement
of bilinguals can spill from technology
to other social and political systems
let me give you one last example google
news displays news
in a combination of language and region
this is a step in the right direction
it used to be that you selected español
and you would get news feeds
from españa and latin america even when
you were in the united states
today at least you get to pick the
region and the language so i could get
news
from the united states in english or
news from estados unidos
in español why not united states
in english and español why not both i speak
both why can’t i see the news together
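A minimal sketch of the alternative asked for here: treat the reader's languages as a set and interleave stories from every language in that set into one feed. The `Story` type, the `merged_feed` function, and the sample stories are hypothetical and are not how Google News or any real product works.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Story:
    title: str
    language: str      # language code such as "en" or "es"
    source: str
    published: datetime


def merged_feed(stories, reader_languages):
    """Keep every story the reader can read, newest first, in one list."""
    readable = [s for s in stories if s.language in reader_languages]
    return sorted(readable, key=lambda s: s.published, reverse=True)


# Hypothetical sample data, for illustration only.
STORIES = [
    Story("Story A", "en", "Source 1", datetime(2020, 8, 16, 9, 0)),
    Story("Historia B", "es", "Fuente 2", datetime(2020, 8, 16, 10, 0)),
    Story("Story C", "en", "Source 3", datetime(2020, 8, 16, 11, 0)),
    Story("Article D", "fr", "Source 4", datetime(2020, 8, 16, 12, 0)),
]

# A bilingual reader sees both languages mixed; a monolingual reader sees one.
for story in merged_feed(STORIES, {"en", "es"}):
    print(story.published.time(), story.language, story.title)
```

The only design change is that the language preference is a set rather than a single value, so a monolingual reader gets the same experience as before.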
you might be surprised that selecting
the language still shows some
algorithmic bias and editorial decisions
based on the language choice even though
i’m in the united states
i’m still going to see news that is
influenced by the type of language i’m
looking at
basically i get a different set of news
depending on which language i want to
use
let me show you some examples march 13
2020
the top story on both sides is about the
early days of the covid-19 pandemic
the english side however is about the
conflict between the house democrats and
the white house
but the spanish side is all about the
meaning of the declaration of a national
emergency in the united states by the
president
that’s two very different perspectives
and the rest of the stories have little
to do with each other across the
languages
this week august 16 2020 i captured the
top stories
a couple of days ago on the english side
the top story is about the turmoil with
the postal service
on the spanish side the top story is
about accusations against a church
leader
in the hispanic community the rest of
the stories have some commonality
but the organization of the
stories makes sense from a one-language
point of view
if you only speak spanish then the
likelihood that the stories on the right
are right for you is pretty high
if you speak both why don’t you see both
why do i have to see stories in one
language or the other i can read them
both
and even if the stories are in common
but have different points of view
i would like to see them all without
having to switch interfaces
i don’t want to have to go into
different rooms to read stories in
different languages
if there is such a significant difference
between two languages
that is yet one more reason that i want
both languages intermixed
i’m bilingual i can read them both let
me decide if i want to read news stories
from
univision or fox news my choice
if you only read english then you don’t
get to see univision and that’s fine
if you only read spanish then you don’t
get to see fox news and that’s fine
but if you can read them both let me see
them both let me be more informed
by seeing all the news stories that i can
read with the languages that i
can read and write linguist max weinreich
said
that a language is a dialect with an
army and a navy
to my friends building computer systems
we’re becoming the military force
imposing language use on the bilingual
world
we’re creating products that force
bilingual users to use one language at a
time
and ignore the cultural wealth that
bilinguals like me provide
to the educators training the future
computer scientists and software
developers please
insist that your students at least at
the very least take some foreign
language classes maybe even do a minor
we’re building products used by the
world population at large
and that population is 60 percent bilingual the
least our developers should do is have a
passing knowledge of how bilingual
people live and communicate
imagine if my experience reading news
and searching for information
was not limited by the computer
interface imagine that i could interact
with others
using the natural mixing of languages
that is humanly possible
imagine if the computer as a mediator
would present me information
in all the languages i can understand
instead of feeding me
piecemeal information and hiding the rest
because i haven’t switched to the right
language
i would like technology to help us close
the gap that already exists in society
between us
and them i would like to see a future
where my smart assistant
can pronounce my name correctly and
understand it if it’s pronounced using
the pronunciation of its language of
origin
i would like all software developers to
understand
that languages aren’t foreign and the
character sets aren’t illegal
many of our neighbors family and friends
often use more than one language it’s
time our technology does too
thank you