PC Magazine - 09.2019

Here we demonstrate real-time decoding of perceived and produced speech from high-density ECoG activity in humans during a task that mimics natural question-and-answer dialogue. While this task still provides explicit external cueing and timing to participants, the interactive and goal-oriented aspects of a question-and-answer paradigm represent a major step towards more naturalistic applications. During ECoG recording, participants first listened to a set of prerecorded questions and then verbally produced a set of answer responses. These data served as input to train speech detection and decoding models. After training, participants performed a task in which, during each trial, they listened to a question and responded aloud with an answer of their choice. Using only neural signals, we detect when participants are listening or speaking and predict the identity of each detected utterance using phone-level Viterbi decoding. Because certain answers are valid responses only to certain questions, we integrate the question and answer predictions by dynamically updating the prior probabilities of each answer using the preceding predicted question likelihoods.
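The last step of that description, using the predicted question to reweight the possible answers, can be illustrated with a small Python sketch. It is not the researchers' code. The function names, the example questions and answers, and the assumption that every valid answer to a question is equally likely are all hypothetical choices made for illustration.

```python
def normalize(scores):
    """Scale non-negative scores so they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {key: value / total for key, value in scores.items()}


def answer_priors(question_likelihoods, valid_answers):
    """Turn decoded question likelihoods into prior probabilities over answers.

    Assumption (not from the source): each valid answer to a question is
    equally likely, so P(answer) = sum over questions of
    P(question) / (number of valid answers to that question).
    """
    question_probs = normalize(question_likelihoods)
    priors = {}
    for question, p_question in question_probs.items():
        answers = valid_answers[question]
        for answer in answers:
            priors[answer] = priors.get(answer, 0.0) + p_question / len(answers)
    return priors


def integrate(answer_likelihoods, priors):
    """Combine decoded answer likelihoods with the context-based priors."""
    weighted = {
        answer: likelihood * priors.get(answer, 0.0)
        for answer, likelihood in answer_likelihoods.items()
    }
    return normalize(weighted)


# Hypothetical decoder outputs for one trial.
question_likelihoods = {"How is your room?": 0.8, "Which instrument?": 0.2}
valid_answers = {
    "How is your room?": ["fine", "bad"],
    "Which instrument?": ["violin", "drums"],
}
answer_likelihoods = {"fine": 0.3, "bad": 0.1, "violin": 0.4, "drums": 0.2}

priors = answer_priors(question_likelihoods, valid_answers)
posterior = integrate(answer_likelihoods, priors)
best_answer = max(posterior, key=posterior.get)
```

In this made-up trial the answer decoder alone slightly favors "violin", but because the decoded question was most likely about the room, the combined prediction shifts to "fine".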

Participants provided live answers to prerecorded questions, and researchers
used their brain-signal data to train models to decode both what they heard
and what they said. On average, the software correctly identified the question
76 percent of the time and the participant's response at a lower rate, 61
percent. While it’s easy to concoct theories of nefarious uses for this technology,
it shows promise for communicating with non-verbal people who have injuries or
neurodegenerative disorders.

