The covid-19 crisis has tested healthcare systems throughout the world. Access to vaccines against covid-19 is making the situation more stable by the day. However, mass population screening must continue in order to detect positive cases and break possible chains of virus transmission. New techniques are therefore needed to reduce the time and cost of diagnostic tests, so that they can be carried out on a large scale in an accessible, efficient and economical manner.
In the framework of the Interspeech 2021 international congress, a team of researchers has submitted a system to the Cough Sound Track of the Diagnosing COVID-19 using Acoustics (DiCOVA) Challenge. The article describing their contribution has been accepted for the Interspeech scientific programme.
This research is led by Adrià Mallol, a UPF audiovisual systems engineering alumnus and researcher at the University of Augsburg (Germany), and Helena Cuesta, with the participation of Emilia Gómez (Joint Research Centre, European Commission) and Björn Schuller, a researcher at the University of Augsburg and at Imperial College London (UK). Cuesta and Gómez are both members of the Music Technology research group (MTG) of the UPF Department of Information and Communication Technologies (DTIC).
"We set ourselves the challenge of investigating how to use Artificial Intelligence techniques to detect the disease"
Previous AI-based systems have proved effective at detecting coughing and sneezing and at recognizing respiratory anomalies. AI has also been used in the field of mental health to identify patients with depression or PTSD. Following these advances in digital health, Helena Cuesta, a member of the research team, explains: "Inspired by these studies, and based on the respiratory disorders caused by covid-19, we set ourselves the challenge of investigating whether AI techniques can detect disorders related to the virus through automatic cough analysis".
The cough signal is altered in patients testing positive for covid-19
In this paper, the authors investigate two different neural network architectures, but with a common structure: a first block that processes the input spectrograms and extracts a set of embedded features, and a second block that classifies these features according to whether they correspond to a patient testing positive for covid-19 or a healthy patient.
"Our models use the spectrogram, a time-frequency representation of the audio signal as input"
The first step is pre-processing the input data. In general, the recordings in the database contain several coughs separated by silences (the typical pattern when we cough). "To preserve solely the parts of the recording containing relevant information, i.e., coughing, we use a sound activity detector (SAD) based on the energy of the signal", Cuesta explains. Once the data have been filtered, the next step is to extract the features and then segment them. "Our models use the spectrogram, a time-frequency representation of the audio signal, as input. First, we calculate the spectrogram of each recording in the database, and then segment it into fragments of one second each", she adds.
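For readers who want a concrete picture, the three steps described above (energy-based silence removal, spectrogram computation, and segmentation into one-second fragments) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the energy threshold, the Hann window, the frame sizes and the toy sample rate are all hypothetical choices that the article does not specify, and a naive Fourier transform stands in for the optimized routines a real system would use.

```python
import cmath
import math


def remove_silence(signal, frame_len, threshold_ratio=0.1):
    """Energy-based activity detection (assumed rule): keep the frames whose
    mean energy exceeds threshold_ratio times the loudest frame's energy."""
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    energies = [sum(x * x for x in signal[s:s + frame_len]) / frame_len
                for s in starts]
    if not energies:
        return []
    threshold = threshold_ratio * max(energies)
    kept = []
    for s, energy in zip(starts, energies):
        if energy > threshold:
            kept.extend(signal[s:s + frame_len])
    return kept


def magnitude_spectrogram(signal, win_len, hop):
    """Naive short-time Fourier transform magnitude with a Hann window.
    Rows are time frames, columns are frequency bins (0 .. win_len // 2)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_len)
              for n in range(win_len)]
    n_bins = win_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(win_len)]
        frames.append([
            abs(sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                    for n in range(win_len)))
            for k in range(n_bins)
        ])
    return frames


def segment(frames, frames_per_segment):
    """Split the spectrogram into consecutive, non-overlapping segments,
    dropping any incomplete segment at the end."""
    last_start = len(frames) - frames_per_segment
    return [frames[i:i + frames_per_segment]
            for i in range(0, last_start + 1, frames_per_segment)]


def preprocess(signal, sample_rate, win_len, hop):
    """Silence removal, then spectrogram, then one-second segments."""
    active = remove_silence(signal, frame_len=win_len)
    spectrogram = magnitude_spectrogram(active, win_len, hop)
    # About sample_rate / hop spectrogram frames cover one second of audio.
    return segment(spectrogram, frames_per_segment=sample_rate // hop)


# Toy recording (assumption): a 50 Hz tone stands in for coughing,
# sampled at an unrealistically low 400 Hz to keep the example fast.
sample_rate = 400
tone = [math.sin(2 * math.pi * 50 * n / sample_rate)
        for n in range(sample_rate)]
silence = [0.0] * sample_rate
recording = tone + silence + tone + tone

segments = preprocess(recording, sample_rate, win_len=40, hop=20)
print(len(segments), len(segments[0]), len(segments[0][0]))
```

In this toy case the silent second is discarded before the spectrogram is computed, so only the active audio contributes fragments. Each fragment is a small time-frequency matrix of the kind that would then be passed to the first block of the network.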
Patient's gender is important
An interesting contribution of the project is the study of different versions of the neural networks to investigate whether the patient's gender is a factor to consider when analysing the cough. "Intuitively, when we approached this work, one of our hypotheses was that men's and women's coughs should have different features, as their vocal tracts differ in size and shape", the authors comment.
Coughs generated by a man and a woman are not necessarily equivalent from the point of view of the spectrogram
One of the most remarkable findings of the experiments is that the models that incorporate information on the patient's gender (gender-based, gender-specific) obtain better predictions in most of the scenarios assessed by the authors. This supports the hypothesis that coughs generated by a man and those generated by a woman are not necessarily equivalent from the point of view of the spectrogram.
Cough Sound Track - DiCOVA Challenge
The organizers of the DiCOVA Challenge provide participants with a database (the Coswara dataset) containing 1,040 audio recordings, each between 1 and 15 seconds long, of people coughing. Together with the recordings, the database provides metadata associated with each recording: positive/negative for covid-19, and the individual's gender and nationality. "Based on these data, we have developed and evaluated two different neural networks that, using one second of audio, predict positive or negative covid-19", the authors point out.
Although this work is just a first approach to detecting covid-19 via automatic cough analysis, the experiments presented by the authors offer a number of clues to be followed in the next steps of this research. It remains to be understood how the cough signal is altered in covid-19-positive patients. With that understanding, features could be extracted and specific neural networks designed to improve the quality of the models.
Adrià Mallol-Ragolta, Helena Cuesta, Emilia Gómez, and Björn Schuller (2021), "Cough-based COVID-19 Detection with Contextual Attention Convolutional Neural Networks and Gender Information", Proceedings of Interspeech. Brno (Czechia): ISCA, 2021, in press.
Funding: This project is funded by the EU's Horizon 2020 research and innovation programme via the projects sustAGE (ID 826506) and TROMPA (ID 770376), and by the AGAUR, Catalan Government (ID 2018FI-B01015).