
Yet such systems are still far from perfect, and emotion AI tackles a particularly formidable task. Algorithms are supposed to reflect a “ground truth” about the world: they should identify an apple as an apple, not as a peach. The “learning” in machine learning consists of repeatedly comparing raw data—often from images but also from video, audio, and other sources—to training data labeled with the desired feature. This is how the system learns to extract the underlying commonalities, such as the “appleness” from images of apples. Once the training is finished, an algorithm can identify apples in any image.
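
To make that training step concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the color-like feature vectors, the labels and the simple nearest-centroid rule (standing in for a deep network) are assumptions, not anything described in the article. The point is only to show how repeated comparison with labeled examples yields a model that can then label new inputs.

```python
# Minimal sketch of supervised learning: a hypothetical "apple vs. peach" classifier.
# Feature vectors and labels are invented; a real system would extract features
# from image pixels with a deep network rather than use hand-picked numbers.
from statistics import mean

# Training data: (feature vector, label) pairs supplied by human labelers.
training_data = [
    ((0.9, 0.1, 0.2), "apple"),   # mostly red
    ((0.8, 0.2, 0.1), "apple"),
    ((0.9, 0.7, 0.4), "peach"),   # red-orange
    ((0.8, 0.6, 0.5), "peach"),
]

def centroid(vectors):
    """Average each feature across a class: the 'commonality' the model extracts."""
    return tuple(mean(v[i] for v in vectors) for i in range(len(vectors[0])))

# "Training": summarize what the labeled apples and peaches each have in common.
centroids = {}
for label in {lab for _, lab in training_data}:
    centroids[label] = centroid([vec for vec, lab in training_data if lab == label])

def classify(vector):
    """Label a new image's features by the closest class summary."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(vector, centroids[label]))

print(classify((0.85, 0.15, 0.2)))  # prints "apple"
```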
But when the task is identifying hard-to-define qualities such as personality or emotion, ground truth becomes more elusive. What does “happiness” or “neuroticism” look like? Emotion-AI algorithms cannot directly intuit emotions, personality or intentions. Instead they are trained, through a kind of computational crowdsourcing, to mimic the judgments humans make about other humans. Critics say that process introduces too many subjective variables. “There is a profound slippage between what these things show us and what might be going on in somebody’s mind or emotional space,” says Kate Crawford of the University of Southern California Annenberg School for Communication and Journalism, who studies the social consequences of artificial intelligence. “That is the profound and dangerous leap that some of these technologies are making.”
The process that generates those judgments is complicated, and each stage has potential pitfalls. Deep learning, for example, is notoriously data-hungry. For emotion AI, it requires huge data sets that combine thousands or sometimes billions of human judgments—images of people labeled as “happy” or “smiling” by data workers, for instance. But algorithms can inadvertently “learn” the collective, systematic biases of the people who assembled the data. That bias may come from skewed demographics in training sets, unconscious attitudes of the labelers, or other sources.
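
To see how labelers’ judgments become a training target, here is a hedged sketch of the common majority-vote aggregation step. The photos, labels and vote counts are hypothetical; the takeaway is that if most labelers share the same systematic bias, the aggregated “ground truth” simply inherits it.

```python
# Hypothetical sketch: turning crowd labels into "ground truth" by majority vote.
# The judgments below are invented; real pipelines collect thousands per category.
from collections import Counter

# Three labelers' judgments for each photo (invented).
judgments = {
    "photo_001": ["happy", "happy", "neutral"],
    "photo_002": ["smiling", "smiling", "smiling"],
    "photo_003": ["neutral", "angry", "angry"],   # an ambiguous expression
}

def majority_label(votes):
    """Pick the most common judgment; ties go to the label counted first."""
    return Counter(votes).most_common(1)[0][0]

ground_truth = {photo: majority_label(votes) for photo, votes in judgments.items()}
print(ground_truth)
# {'photo_001': 'happy', 'photo_002': 'smiling', 'photo_003': 'angry'}
# If the labeler pool systematically reads one group's ambiguous faces as "angry",
# that skew becomes the target the model is rewarded for copying.
```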
Even identifying a smile is far from a straightforward task. A 2020 study by Carsten Schwemmer of the GESIS–Leibniz Institute for the Social Sciences in Cologne, Germany, and his colleagues ran pictures of members of Congress through cloud-based emotion-recognition apps by Amazon, Microsoft and Google. The scientists’ own review found 86 percent of men and 91 percent of women were smiling—but the apps were much more likely to find women smiling. Google Cloud Vision, for instance, applied the “smile” label to more than 90 percent of the women but to less than 25 percent of the men. The authors suggested gender bias might be present in the training data. They also wrote that in their own review of the images, ambiguity—ignored by the machines—was common: “Many facial expressions seemed borderline. Was that really a smile? Do smirks count? What if teeth are showing, but they do not seem happy?”
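
An audit like Schwemmer’s boils down to comparing label rates across groups. The sketch below is illustrative only: the records are invented stand-ins for what a cloud API would actually return, and the field names are assumptions.

```python
# Hypothetical audit sketch: how often does an API apply a "smile" label per group?
# Records are invented; a real check would use the labels an API actually returned.
records = [
    {"gender": "woman", "api_says_smile": True},
    {"gender": "woman", "api_says_smile": True},
    {"gender": "woman", "api_says_smile": False},
    {"gender": "man",   "api_says_smile": False},
    {"gender": "man",   "api_says_smile": True},
    {"gender": "man",   "api_says_smile": False},
]

def smile_rate(records, gender):
    """Fraction of a group's photos the API labeled as smiling."""
    group = [r for r in records if r["gender"] == gender]
    return sum(r["api_says_smile"] for r in group) / len(group)

for gender in ("woman", "man"):
    print(f"{gender}: {smile_rate(records, gender):.0%} labeled as smiling")
# A large gap between the two rates, against roughly equal rates in a human review,
# is the kind of skew the study attributed to biased training data.
```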
Facial-recognition systems, most also based on deep learning, have been widely criticized for bias. Researchers at the M.I.T. Media Lab, for instance, found these systems were less accurate when matching the identities of nonwhite, nonmale faces. Typically these errors arise from using training data sets that skew white and male. Identifying emotional expressions adds additional layers of complexity: these expressions are dynamic, and faces in posed photos can have subtle differences from those in spontaneous snapshots.

Rhue, the University of Maryland researcher, used a public data set of pictures of professional basketball players to test two emotion-recognition services, one from Microsoft and one from Face++, a facial-recognition company based in China. Both consistently ascribed more negative emotions to Black players than to white players, although each did it differently: Face++ saw Black players as
angry twice as often as white players; Microsoft viewed Black players as more contemptuous.
CONTEXT COUNTS: A woman looks upset in a cropped photo from 1964 (left). But the complete image shows she is part of a joyous crowd (above). These are ecstatic Beatles fans outside the band’s hotel in New York City. Credit: John Pedin, NY Daily News Archive and Getty Images