Publication Date:
2021
Abstract:
In this paper, we present tools for addressing noisy keyword issues in digital libraries. Two tasks are addressed, language detection and misspelling detection and correction, using both machine learning and deep learning techniques. To train and validate the models, various datasets were used, created, or scraped. Encouraging preliminary results are presented and discussed.
Iris type:
04.01 Contribution in conference proceedings
Keywords:
Content based retrieval; Digital library; Noisy data; Tags; Unsupervised tools
List of contributors:
Gagliardi, Isabella; Artese, Maria Teresa
Book title:
Digital Presentation and Preservation of Cultural and Scientific Heritage
Published in: