books in brief
strategy+business issue 96
tion — ones in which personified AIs become our helpers,
watchdogs, oracles, and friends.”
This fanciful future is barely evident in the present
relationship I have with the cylindrical smart speaker
sitting on my desk. The ring around the top of the
speaker is usually glowing red. That’s because it’s on
mute, which, I am assured (but not completely reassured),
precludes its maker’s employees from listening
to me at will. But I do get a glimmer of how valuable
that relationship could become when I unmute the device
and ask it for the weather forecast, or order up a
particularly tasty Grateful Dead jam. Voice is the simplest,
most natural interface with technology yet invented.
“With voice,” Vlahos explains, “computers are finally doing it our way. They
are learning our preferred way of communication: through language. Voice, optimally
realized, has the potential to be so easy to use that it hardly feels like an
interface at all. We know how to speak because we’ve been doing it for all of our
lives.”
The key words here are “optimally realized.” It is abundantly clear that voice
technology is far from that state. Vlahos describes why in the remaining two-thirds
of Talk to Me, which is devoted to explaining the technology and to exploring
the challenges and decisions that lie ahead.
Voice computing is enabled by a mashup of technologies. “The sound waves
emanating from your mouth must be converted into words, a process known as
automated speech recognition,” writes Vlahos. “Determining what you were trying
to communicate with those words is called natural-language understanding.
Formulating a suitable reply is natural-language generation. And finally, speech
synthesis allows voice-computing devices to audibly reply.”
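For readers who think in code, the four stages Vlahos names can be sketched as a simple chain of functions. This is only an illustrative toy, not how any real assistant is built: the class name VoicePipeline, the stub functions, and the trick of treating "audio" as UTF-8 bytes are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical container for the four stages named in the quote."""
    recognize: Callable[[bytes], str]    # automated speech recognition
    understand: Callable[[str], dict]    # natural-language understanding
    generate: Callable[[dict], str]      # natural-language generation
    synthesize: Callable[[str], bytes]   # speech synthesis

    def respond(self, audio: bytes) -> bytes:
        words = self.recognize(audio)    # sound waves -> words
        intent = self.understand(words)  # words -> intended meaning
        reply = self.generate(intent)    # meaning -> suitable reply
        return self.synthesize(reply)    # reply -> audible output


# Toy stand-ins (assumptions): real systems use trained models at each stage.
def fake_recognize(audio: bytes) -> str:
    # Pretend the audio is already text encoded as UTF-8 bytes.
    return audio.decode("utf-8").lower()


def fake_understand(words: str) -> dict:
    # Crude keyword matching in place of real language understanding.
    return {"intent": "weather" if "weather" in words else "unknown"}


def fake_generate(intent: dict) -> str:
    if intent["intent"] == "weather":
        return "Here is today's forecast."
    return "Sorry, I did not catch that."


def fake_synthesize(reply: str) -> bytes:
    # Pretend that encoding the text produces audio.
    return reply.encode("utf-8")


pipeline = VoicePipeline(fake_recognize, fake_understand,
                         fake_generate, fake_synthesize)
spoken_reply = pipeline.respond(b"What is the weather today")
```

The point of the sketch is the shape, not the stubs: each stage hands its output to the next, so a weakness at any one stage limits the quality of the whole exchange.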
The reason we’re seeing an explosion in voice computing now is that deep
learning has enabled researchers to overcome a host of challenges in the above
technologies. For instance, instead of having a person author every line a computer
speaks, recently developed generative methods for training neural networks
enable computers to come up with responses on their own.