Researchers at UC San Francisco have successfully developed a “speech neuroprosthesis” that has enabled a man with severe paralysis to communicate in sentences, translating signals from his brain to his vocal tract directly into words that appear as text on a screen.
Developed in collaboration with the first participant in a clinical research study, the achievement builds on more than a decade of effort by UCSF neurosurgeon Edward Chang, MD, to develop technology that enables paralyzed people to communicate even when they are unable to speak on their own. The study was published July 15 in the New England Journal of Medicine.
“To the best of our knowledge, this is the first successful demonstration of direct decoding of full words from the brain activity of someone who is paralyzed and unable to speak,” said Chang, the Joan and Sanford Weill Chair of Neurological Surgery at UCSF, Jeanne Robertson Distinguished Professor, and senior author of the study. “It shows strong promise to restore communication by tapping into the brain’s natural language machinery.”
Every year thousands of people lose the ability to speak due to a stroke, accident, or illness. As the approach outlined in this study evolves, one day these people may be able to communicate fully.
Translation of brain signals into speech
Previously, work in the field of communication neuroprosthetics has focused on restoring communication through spelling-based approaches, typing out letters one at a time as text. Chang’s study differs significantly from these efforts: his team translates signals intended to control the muscles of the vocal system for speaking words, rather than signals to move the arm or hand to enable typing. Chang said this approach taps into the natural and fluid aspects of speech and holds the promise of faster, more organic communication.
“With speech, we normally communicate information at a very high rate, up to 150 or 200 words per minute,” he said, noting that spelling-based approaches using typing, writing, and controlling a cursor are considerably slower and more laborious. “Going straight to words, as we are doing here, has great advantages because it is closer to how we normally speak.”
Over the past decade, Chang’s progress toward this goal was facilitated by patients at the UCSF Epilepsy Center who were undergoing neurosurgery to pinpoint the origins of their seizures using electrode arrays placed on the surface of their brains. These patients, all of whom had normal speech, volunteered to have their brain recordings analyzed for speech-related activity. Early success with these volunteers paved the way for the current study in people with paralysis.
Previously, Chang and colleagues at the UCSF Weill Institute for Neurosciences mapped the cortical activity patterns associated with movements of the vocal tract that produce each consonant and vowel. To translate these results into whole-word speech recognition, David Moses, PhD, a postdoctoral fellow in the Chang Laboratory and one of the lead authors on the new study, developed new methods for real-time decoding of these patterns and statistical language models to improve accuracy.
But their success in decoding speech in participants who were able to speak did not guarantee the technology would work in a person whose vocal tract is paralyzed. “Our models needed to learn the mapping between complex patterns of brain activity and intended speech,” said Moses. “That poses a major challenge when the participant can’t speak.”
In addition, the team didn’t know whether the brain signals controlling the vocal tract would still be intact in people who haven’t been able to move their vocal muscles for many years. “The best way to find out whether this could work was to try it,” said Moses.
The first 50 words
To investigate the potential of this technology in patients with paralysis, Chang partnered with colleague Karunesh Ganguly, MD, PhD, Associate Professor of Neurology, to launch a study known as BRAVO (Brain-Computer Interface Restoration of Arm and Voice). The first participant in the study is a man in his late 30s who suffered a devastating stroke more than 15 years ago that severely damaged the connection between his brain and his vocal tract and limbs. Since his injury, he has had extremely limited head, neck, and limb movements, and communicates by using a pointer attached to a baseball cap to poke letters on a screen.
The participant, who asked to be referred to as BRAVO1, worked with the researchers to create a 50-word vocabulary that Chang’s team could recognize from brain activity using advanced computer algorithms. The vocabulary – which includes words such as “water,” “family,” and “good” – was sufficient to create hundreds of sentences expressing concepts applicable to BRAVO1’s daily life.
For the study, Chang surgically implanted a high-density electrode array over BRAVO1’s speech motor cortex. After the participant made a full recovery, his team recorded 22 hours of neural activity in this brain region over 48 sessions spanning several months. In each session, BRAVO1 attempted to say each of the 50 vocabulary words many times while the electrodes recorded brain signals from his speech cortex.
Translating attempted speech into text
To translate the patterns of recorded neural activity into specific intended words, the study’s other two lead authors, Sean Metzger, MS, and Jessie Liu, BS, both bioengineering doctoral students in the Chang Lab, used custom neural network models, a form of artificial intelligence. When the participant attempted to speak, these networks distinguished subtle patterns in brain activity to detect speech attempts and identify which word he was trying to say.
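The study’s actual models are neural networks trained on high-density cortical recordings, which are not reproduced here. Purely as an illustrative sketch of the word-classification idea – with all data, channel counts, and the nearest-centroid method invented for this example – a minimal decoder might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a tiny 3-word vocabulary and 16 electrode channels.
VOCAB = ["water", "family", "good"]
N_CHANNELS = 16

# Simulate a distinct average activity pattern ("template") per word,
# standing in for patterns a model would learn from real recordings.
templates = {w: rng.normal(size=N_CHANNELS) for w in VOCAB}

def simulate_trial(word, noise=0.3):
    """Return a noisy neural-activity window for one attempted word."""
    return templates[word] + rng.normal(scale=noise, size=N_CHANNELS)

def classify(window):
    """Nearest-centroid decoding: pick the word whose template is closest."""
    return min(VOCAB, key=lambda w: np.linalg.norm(window - templates[w]))

trial = simulate_trial("family")
print(classify(trial))  # decodes the attempted word from the noisy window
```

A real system replaces the templates with learned network weights and operates on streaming data, but the core step – mapping a window of multichannel activity to the most likely vocabulary word – is the same.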
To test their approach, the team first presented BRAVO1 with short sentences constructed from the 50 vocabulary words and asked him to try saying them several times. As he made his attempts, the words were decoded from his brain activity, one by one, on a screen.
Then the team switched to asking him questions such as “How are you today?” and “Would you like some water?” As before, BRAVO1’s attempted speech appeared on the screen: “I’m very good” and “No, I’m not thirsty.”
The team found that the system could decode words from brain activity at a rate of up to 18 words per minute with up to 93 percent accuracy (75 percent median). Contributing to this success was a language model applied by Moses that implemented an “auto-correct” function, similar to those used by consumer texting and speech recognition software.
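The paper’s decoding details are not given here; as a hedged illustration of how a language model can “auto-correct” a classifier’s output, the toy example below rescores noisy per-word probabilities with invented bigram probabilities using a simple Viterbi-style search. All words, probabilities, and function names are hypothetical:

```python
import math

# Toy classifier output: a probability over candidate words at each position.
emissions = [
    {"i'm": 0.8, "am": 0.2},
    {"very": 0.4, "vary": 0.6},   # classifier slightly favors the wrong word
    {"good": 0.7, "god": 0.3},
]

# Invented bigram language-model probabilities; "<s>" marks sentence start.
bigram = {
    ("<s>", "i'm"): 0.5, ("<s>", "am"): 0.1,
    ("i'm", "very"): 0.4, ("i'm", "vary"): 0.01,
    ("am", "very"): 0.2, ("am", "vary"): 0.01,
    ("very", "good"): 0.5, ("very", "god"): 0.01,
    ("vary", "good"): 0.05, ("vary", "god"): 0.01,
}

def decode(emissions, bigram):
    """Pick the word sequence maximizing classifier x language-model score."""
    beams = [(0.0, ("<s>",))]  # (log-probability, path so far)
    for step in emissions:
        new_beams = []
        for word, p_emit in step.items():
            # Best predecessor path for this candidate word.
            score, path = max(
                (lp + math.log(bigram.get((p[-1], word), 1e-6)), p)
                for lp, p in beams
            )
            new_beams.append((score + math.log(p_emit), path + (word,)))
        beams = new_beams
    return list(max(beams)[1][1:])

print(decode(emissions, bigram))  # the language model overrides "vary"
```

Even though the classifier alone prefers “vary” at the second position, the bigram prior makes “I’m very good” the highest-scoring sentence, which is the essence of language-model auto-correction.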
Moses described the early trial results as a proof of principle. “We were thrilled to see the accurate decoding of a variety of meaningful sentences,” he said. “We’ve shown that it is actually possible to facilitate communication this way, and that it has potential for use in conversational settings.”
Looking ahead, Chang and Moses said they will expand the study to include more participants with severe paralysis and communication deficits. The team is currently working on increasing the number of words in the available vocabulary and improving speaking speed.
Both said that while the study involved a single participant and a limited vocabulary, those limitations do not diminish the achievement. “This is an important technological milestone for a person who cannot communicate naturally,” said Moses, “and it demonstrates the potential of this approach to give a voice to people with severe paralysis and speech loss.”
Authors: The full list of authors is David A. Moses, PhD*; Sean L. Metzger, MS*; Jessie R. Liu, BS*; Gopala K. Anumanchipalli, PhD; Joseph G. Makin, PhD; Pengfei F. Sun, PhD; Josh Chartier, PhD; Maximilian E. Dougherty, BA; Patricia M. Liu, MA; Gary M. Abrams, MD; Adelyn Tu-Chan, DO; Karunesh Ganguly, MD, PhD; and Edward F. Chang, MD, all of UCSF. Funding sources included the National Institutes of Health (U01 NS098971-01), philanthropy, and a sponsored research agreement with Facebook Reality Labs (FRL) that concluded in early 2021. * Denotes equal contribution.
Funding: Supported by a research contract under Facebook’s Sponsored Academic Research Agreement, the National Institutes of Health (grant NIH U01 DC018671-01A1), Joan and Sandy Weill and the Weill Family Foundation, the Bill and Susan Oberndorf Foundation, the William K. Bowes, Jr. Foundation, and the Shurl and Kay Curci Foundation. UCSF researchers conducted all clinical trial design, execution, data analysis, and reporting. Research participant data were collected solely by UCSF, are held confidentially, and are not shared with third parties. FRL provided high-level feedback and machine learning advice.
About UCSF: The University of California, San Francisco (UCSF) is exclusively focused on the health sciences and is dedicated to promoting health worldwide through advanced biomedical research, graduate-level education in the life sciences and health professions, and excellence in patient care. UCSF Health, which serves as UCSF’s primary academic medical center, includes top-ranked specialty hospitals and other clinical programs, and has affiliations across the Bay Area. The UCSF School of Medicine also has a regional campus in Fresno. Learn more at ucsf.edu, or see our Fact Sheet.