FORUM
COMMENTARY ON SCIENCE IN THE NEWS FROM THE EXPERTS

Siri Is a Biased Listener
Most popular speech-recognition software has trouble with minority voices
By Claudia Lopez-Lloreda

Claudia Lopez-Lloreda is a freelance science writer and neuroscience graduate student at the University of Pennsylvania.

Illustration by Martin Gee

“Clow-dia,” I say once. Twice. A third time. Defeated, I say the Americanized version of my name: “Claw-dee-ah.” Finally, Siri recognizes it. Having to adapt our way of speaking to interact with speech-recognition technologies is a familiar experience for people whose first language is not English or who do not have conventionally American-sounding names. I have now stopped using Siri, Apple’s voice-based virtual assistant, because of it.
The growth of this tech in the past decade, not just Siri but also Alexa, Cortana and others, has revealed a problem: racial bias. One recent study, published in the Proceedings of the National Academy of Sciences USA, showed that speech-recognition programs are biased against Black speakers. On average, the authors found, all five programs from leading technology companies, including Apple and Microsoft, showed significant race disparities; they were roughly twice as likely to incorrectly transcribe audio from Black speakers as from white speakers.
This effectively censors voices that are not part of the “standard” languages or accents used to create these technologies. “I don’t get to negotiate with these devices unless I adapt my language patterns,” says Halcyon Lawrence, an assistant professor of technical communication and information design at Towson University, who was not part of the study. “That is problematic.” For Lawrence, who has a Trinidad and Tobagonian accent, or for me as a Puerto Rican, part of our identity comes from speaking a particular language, having an accent or using a set of speech forms such as African American Vernacular English (AAVE). Having to change such an integral part of an identity in order to be recognized is inherently cruel.
The inability to be understood also affects other marginalized communities, such as people with visual or movement disabilities who rely on voice recognition and speech-to-text tools, says Allison Koenecke, a computational graduate student and first author of the PNAS study. For someone with a disability who depends on these technologies, being misunderstood could have serious consequences. There are probably many culprits for these disparities, but Koenecke points to the most likely: the data used for training, which come predominantly from white, native speakers of American English. By using databases that are narrow both in the words they contain and in how those words are said, training systems exclude accents and other ways of speaking that have unique linguistic features. Humans, presumably including those who create these technologies, have accent and language biases. For example, research shows that the presence of an accent affects whether jurors find people guilty and whether patients find their doctors competent.
Recognizing these biases would be an important way to avoid building them into the technologies. But developing more inclusive technology takes time, effort and money, and the decision to invest those resources is often market-driven. (Of several companies queried, only Google responded in time for publication; a spokesperson said, in part, “We’ve been working on the challenge of accurately recognizing variations of speech for several years and will continue to do so.”)
Safiya Noble, an associate professor of information studies at the University of California, Los Angeles, admits that it’s a tricky challenge. “Language is contextual,” says Noble, who was not involved in the study. “But that doesn’t mean that companies shouldn’t strive to decrease bias and disparities.” To do this, they need the input of humanists and social scientists who understand how language actually works.
From the tech side, feeding more diverse training data into the programs could close this gap, Koenecke says. Noble adds that tech companies should also test their products more widely and have more diverse workforces so people from different backgrounds and perspectives can directly influence the design of speech technologies. Koenecke suggests that automated speech-recognition companies use the PNAS study as a preliminary benchmark and keep using it to assess their systems over time.
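
Such a benchmark comes down to comparing transcription error rates across groups of speakers; the PNAS authors used word error rate, a standard measure of transcription accuracy. The short Python sketch below shows one hypothetical way a company could track such a gap over time. It is not taken from the study: the sample sentences, group labels and function names are invented for illustration.

# Hypothetical sketch of a fairness check for a speech-to-text system.
# References would be human transcripts; hypotheses would come from the
# recognizer being audited. The data below are invented for illustration.
from collections import defaultdict

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance (insertions, deletions, substitutions)
    divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table for Levenshtein distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

def group_error_rates(samples):
    """Average WER per speaker group for a list of
    (group, reference_transcript, system_transcript) tuples."""
    rates = defaultdict(list)
    for group, reference, hypothesis in samples:
        rates[group].append(word_error_rate(reference, hypothesis))
    return {group: sum(r) / len(r) for group, r in rates.items()}

# Invented audit set standing in for recordings from two speaker groups.
samples = [
    ("group_a", "turn the kitchen lights off", "turn the kitchen lights off"),
    ("group_a", "set a timer for ten minutes", "set a timer for ten minutes"),
    ("group_b", "turn the kitchen lights off", "turn the chicken lights of"),
    ("group_b", "set a timer for ten minutes", "set a time for ten minute"),
]

for group, rate in sorted(group_error_rates(samples).items()):
    print(f"{group}: WER = {rate:.2f}")
# A persistent gap between groups, tracked release after release, is the
# kind of disparity a benchmark like the PNAS study makes visible.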
In the meantime, many of us will continue to struggle between identity and being understood when interacting with Alexa, Cortana or Siri. But Lawrence chooses identity every time: “I’m not switching,” she says. “I’m not doing it.”

JOIN THE CONVERSATION ONLINE
Visit Scientific American on Facebook and Twitter
or send a letter to the editor: [email protected]

© 2020 Scientific American