A Deep Learning-Based Multimodal Architecture to predict Signs of Dementia

Academic Article

Publication Date:

2023

Abstract:

This paper proposes a multimodal deep learning architecture that combines text and audio information to predict dementia, a disease that affects around 55 million people worldwide and, in some cases, leaves them dependent on others. The system was evaluated on the DementiaBank Pitt Corpus, which includes audio recordings and their transcriptions for healthy people and people with dementia. Several models were tested, including Convolutional Neural Networks (CNNs) for audio classification, Transformers for text classification, and a combination of both in a multimodal ensemble. On the test set, the text modality gave the best results, achieving 90.36% accuracy on the task of detecting dementia. Additionally, the corpus was analyzed for the sake of explainability, with the aim of understanding how the models generate their predictions and of identifying patterns in the data.
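
The abstract does not say how the text and audio models are combined in the ensemble. As an illustration only, the sketch below shows one common option, late fusion by weighted averaging of class probabilities. The function names, the label names, the default weight, and the fusion rule itself are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def late_fusion(p_text: Sequence[float],
                p_audio: Sequence[float],
                w_text: float = 0.5) -> List[float]:
    """Weighted average of two class-probability vectors.

    Hypothetical fusion rule: the paper's actual ensemble may differ.
    """
    if len(p_text) != len(p_audio):
        raise ValueError("both models must output the same number of classes")
    if not 0.0 <= w_text <= 1.0:
        raise ValueError("w_text must lie in [0, 1]")
    return [w_text * t + (1.0 - w_text) * a for t, a in zip(p_text, p_audio)]


def predict_label(probs: Sequence[float],
                  labels: Sequence[str] = ("control", "dementia")) -> str:
    """Return the label with the highest fused probability (label names are assumed)."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best]


# Example: the text model leans towards "dementia", the audio model towards "control".
fused = late_fusion([0.2, 0.8], [0.6, 0.4], w_text=0.5)
label = predict_label(fused)
```

Raising `w_text` gives the text model more influence, which would be one way to reflect the abstract's finding that the text modality performed best.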

Iris type:

01.01 Journal article

Keywords:

deep learning; dementia

List of contributors:

Leo, Marco

Authors of the University:

LEO MARCO

Handle:

https://iris.cnr.it/handle/20.500.14243/452254

Published in:

NEUROCOMPUTING

Journal