Photo credit: d3sign/Getty Images
What if with a simple analysis of your voice, technology could detect the potential of current or future illnesses? The modifications in voice, usually undetectable by human hearing, are now being researched to find early signs of health issues.
Patients affected by Parkinson’s disease for example experience loss of voice volume and begin to speak with haste. In schizophrenia, the words in sentences lose the connection between each other. Patients suffering from Alzheimer’s disease express themselves with longer pauses between words, or have trouble in finding their words and often replace the nouns with pronouns such as "that" and "it."
Using machine learning algorithms and vocal analysis technology, researchers, medical specialists and health tech companies are introducing to the world the future of voice biomarkers.
What are voice biomarkers?
Voice biomarkers represent voice data used for detecting alterations in health. The voice is a complex result of our muscles and brain working together with maximum precision. Mild or severe modifications in voice and language can be an indicator of various diseases, making vocal biomarkers a noninvasive tool for detecting and tracking these diseases.
The science behind vocal biomarkers:
In 2007, a team of researchers selected 34 teenagers who presented as being at high risk for developing psychosis and monitored them for three years. At the beginning of the study, voice samples of the participants were recorded and researchers used voice analysis technology to identify the modifications known to manifest in early stages of psychosis.
The results of the study, published in 2015, indicated that researchers were able to predict with 100% accuracy the participants who developed psychosis, based exclusively on the complex analysis of their voices. Although the number of participants used for this study was small, its findings promoted the use of voice analysis technology for predicting human behavior in psychiatric health disorders.
In 2015, specialists from a biotech company began to use voice samples provided via an iPhone app by patients with and without Parkinson’s disease for developing machine learning patterns that could turn voice data into a tool for diagnosing neurological affections.
In 2019, the researchers processed the 10-seconds per patient voice recordings from the ongoing study and then placed these audio bits into their machine learning patterns. The machine was able to identify from the entire data available those patterns that were significant, and by doing so was capable to differentiate 85% of the time between the people affected by Parkinson’s disease and those from the control group.
The results were better than the accuracy of clinical diagnosis set by nonspecialist doctors (which at that time was 74%) and better than the accuracy of diagnosis set by disorder specialists (which in average was 80%).
The findings of the study indicated that voice biomarkers could be used to provide valuable insights regarding neurological diseases and, in this specific case, regarding Parkinson’s disease, a disorder that accounts for 60,000 new diagnoses every year in the U.S.
In 2020, a team of medical specialists from Mayo Clinic identified a voice biomarker for pulmonary hypertension (PH), which is highly present in patients affected by heart failure (HF). Heart failure affects more than six million Americans every year and translates into one million hospitalizations per year.
Based on their findings, the team from Mayo clinic conducted a study on a selected sample of 83 patients. The patients were split into those with high pulmonary arterial pressure – PAP, a form of PH – and those without.
Both groups were required to record their voices on a smartphone while reading a given text, then recount a positive experience and a negative one. The results concluded there was a link between the patients with high PAP and the voice biomarker used in the study.
Health tech companies have developed computer vision methods to detect and "read" what vocal biomarkers are saying with these steps:
1) A person’s voice is recorded and then transferred into a spectrogram (an image) alongside medical data of that person.
2) Machine learning algorithms analyze the voice samples and detect in the recording patterns specific to certain symptoms or illnesses. These patterns appear as small changes in the spectrogram, identified with the help of computer vision methods.
Challenges faced by voice biomarker use
Data interpretation
Although emotions experienced by patients during the recording of their voices don’t have a significant impact on the voice biomarkers' performance, it seems that changes in speech do. That is why data interpretation is vital: Researchers or medical specialists must rule out parameters irrelevant for algorithms designed for detecting specific diseases or symptoms.
For example, after a stroke, the algorithm designed to track cognitive deficiencies should search for changes in the words spoken by the patient who suffered the stroke, instead of focusing on parameters such as frequency or tone of speech.
Complexity of voice characteristics
Respiratory disorders also determine modifications of speech. Some may speak slower than a healthy patient, or might take longer pauses in order to breathe before speaking. They might even speak softer compared to those without respiratory problems.
Although breathiness is translatable into a measurable parameter, its characteristics, such as acoustics or frequency, are highly complex and to some extent versatile. Their analysis requires specifically designed algorithms, proper data interpretation methodology and skilled personnel to conduct this interpretation.
Patient reluctance
Many patients are reluctant to provide voice samples or allow applications to track their regular activities because of personal data protection concerns. This makes the work of researchers, medical specialists and biotech developers difficult, since without large voice data samples, biomarkers models cannot be used to detect potential illnesses.
Present and future of voice biomarkers
Machine learning algorithms and vocal analysis technology detect voice patterns specific to various diseases or symptoms. Voice biomarkers are already enabling the detection of neurological diseases like Parkinson’s disease.
For clinicians, it could be difficult to detect signs or symptoms of neurological diseases during one appointment (one moment in time). Voice samples can also ease the path to establishing an accurate and timely diagnosis for these diseases.
COVID-19 is proof that specific vocal biomarkers can assist medical specialists in detecting disorders that affect respiratory organs. For COVID-19, biotech companies have created a composite image based on a large database of voice samples collected from COVID-19 positive patients.
When patients are screened for COVID-19, their voice sample is compared to this composite image. Based on the existing connection between the two, it can be established with high accuracy if the tested person is or not at risk of being COVID-19 positive.
However, vocal biomarkers are not providing a final diagnosis, but are meant to smooth the triage of people who are potentially infected with COVID-19. A definitive diagnosis can be established only by further testing, by specialized medical care or even by placing the tested persons into quarantine.
Vocal biomarkers are being investigated for a safe return of employees to their offices with the help of an application that records a voice sample, analyzes it and provides a risk score. If this COVID-19 score is low, based also on coughing analyses, employees can return to work. If the score is high, further medical investigations are required in order to establish a definitive diagnosis.
On the long term, voice biomarkers can also:
1) Reduce the pressure placed on the medical system and medical staff by enabling remote triage of most frequent affections.
2) Help patients learn if they are affected by a medical condition.
3) Support telehealth thanks to voice samples recorded by patients in their homes.
4) Save costs in healthcare by improving diagnosis, customizing prescribed treatments, adjusting medication, recommending appropriate injury prevention measures and monitoring the evolution of patients involved in studies for drug development.
Conclusion
In order for more specific vocal biomarkers to be developed, tested and successfully used, researchers and health tech companies can work together in creating a national database of voice recordings that would allow the tracking of more specific affections.
As a noninvasive method of remotely detecting new cases of various diseases, voice biomarkers can be a key to unlock disease diagnoses for earlier adequate interventions and improved health outcomes for patients.
About the Author
Dr. Liz Kwo a serial healthcare entrepreneur, physician and Harvard Medical School faculty lecturer. She received an MD from Harvard Medical School, an MBA from Harvard Business School and an MPH from the Harvard T.H. Chan School of Public Health.