Researchers have developed an experimental device that translates thoughts about speech into spoken words in near real time. Though still in the experimental phase, this brain-computer interface holds promise for giving a voice to people who are unable to speak. A recent study describes testing the technology on a 47-year-old woman with quadriplegia who lost the ability to speak 18 years ago after a stroke; doctors implanted the device in her brain during surgery as part of a clinical trial.
The device works by converting her intention to speak into fluent sentences, said Gopala Anumanchipalli, a co-author of the study recently published in the journal Nature Neuroscience. Conventional brain-computer interfaces, or BCIs, for speech typically have a slight delay between the thought of a sentence and its computer-generated vocalization. Such delays can disrupt the natural flow of conversation, the researchers noted, potentially leading to miscommunication and frustration.
Jonathan Brumberg of the Speech and Applied Neuroscience Lab at the University of Kansas, who was not involved in the study, called the work a significant advance in the field. For the study, a California-based research team recorded the participant's brain activity with electrodes while she silently spoke sentences in her mind. The scientists also built a synthesizer that used a recording of her pre-injury voice to recreate how she would have sounded, then trained an AI model to convert her neural activity into units of sound.
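The study does not detail the model's architecture, but the core training idea, learning a mapping from recorded neural features to sound units, can be illustrated with a deliberately simplified stand-in. The sketch below is hypothetical: it uses made-up 2-D "neural feature" vectors and a minimal nearest-centroid classifier, not the study's actual AI model or data.

```python
# Toy illustration (assumption-laden): learning to map neural activity
# to sound segments. The real system used a trained AI model; here a
# nearest-centroid classifier over fabricated 2-D features stands in.
import math

def train(examples):
    """examples: list of (feature_vector, sound_label) pairs.
    Returns one centroid (mean feature vector) per sound label."""
    sums, counts = {}, {}
    for vec, label in examples:
        s = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in s]
            for label, s in sums.items()}

def decode(centroids, vec):
    """Pick the sound segment whose centroid is nearest the features."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], vec))

# Fabricated training data: two imagined-speech sound classes.
data = [([0.9, 0.1], "ah"), ([1.1, 0.0], "ah"),
        ([0.0, 1.0], "ee"), ([0.2, 0.9], "ee")]
model = train(data)
print(decode(model, [1.0, 0.1]))  # a new vector near the "ah" cluster
```

In practice such a mapping would be learned by a neural network over high-dimensional electrode recordings; the nearest-centroid rule here only conveys the supervised train-then-decode structure.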
Anumanchipalli, of the University of California, Berkeley, noted that the technology works much like systems that transcribe meetings or phone calls in real time. The implant sits over the brain's speech center, capturing the necessary signals, which are then converted into pieces of speech that together form sentences. Anumanchipalli described it as a "streaming approach," in which each 80-millisecond chunk of speech, roughly half a syllable, is fed into a recorder as it occurs.
The system's quick processing shows promise for keeping pace with the rapid flow of natural speech, Brumberg said. He added that using the participant's own voice samples considerably improves the naturalness of the synthesized speech. The work was funded in part by the National Institutes of Health (NIH); Anumanchipalli said it was not affected by recent NIH funding cuts. He projected that, with sustained investment, the technology could become widely available to patients within a decade, though more research is needed before it can be broadly implemented.