Machine learning tools to improve the quality of imperfect keywords

Conference proceedings contribution
Publication date:
2022
Abstract:
Keywords that describe the content of a text or document are essential for effective and efficient content-based retrieval. However, spelling variants, synonyms, near-synonyms, and spelling errors lower keyword quality and make keywords less effective. Here we present a set of tools we are developing for the management of tags. These tools are intended to improve the quality of textual features and to enhance traditional ways of searching and browsing data on the web. The approach integrates three kinds of methods: word embedding models, which capture the semantics of words and their context; clustering algorithms, which identify and group semantically related terms; and methods that compute the syntactic similarity between strings. The work is still under development, and the paper presents preliminary qualitative results that demonstrate the feasibility of the approach.
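To make the syntactic-similarity component concrete, here is a minimal sketch, not the authors' implementation. It assumes Python's standard-library `difflib.SequenceMatcher` as a stand-in string similarity measure and a simple greedy grouping in place of the clustering algorithms mentioned in the abstract; the word-embedding step is omitted entirely. The function names, the 0.8 threshold, and the sample tags are all hypothetical.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the lowercased strings are identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_similar_tags(tags, threshold=0.8):
    """Greedy single-pass grouping (an assumed stand-in for real clustering).

    Each tag joins the first group whose representative (its first member)
    is at least `threshold` similar; otherwise it starts a new group.
    """
    groups = []
    for tag in tags:
        for group in groups:
            if string_similarity(tag, group[0]) >= threshold:
                group.append(tag)
                break
        else:
            groups.append([tag])
    return groups


# Hypothetical tags containing spelling variants and errors.
tags = ["embroidery", "embroidary", "Embroidery", "weaving", "weavng", "pottery"]
for group in group_similar_tags(tags):
    print(group)
```

A sketch like this only catches spelling variants and typos. Synonyms and near-synonyms share little surface form, which is where the embedding-based semantic relatedness described in the abstract would be needed.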
CRIS type:
04.01 Conference proceedings contribution
Keywords:
Clustering; Content based retrieval; Multilingual tags; Natural language processing; Quality of data; Semantic relatedness; Syntactic similarity; Word embedding models
Author list:
Gagliardi, Isabella; Artese, Maria Teresa
Institutional authors:
ARTESE MARIA TERESA
GAGLIARDI ISABELLA
Link to the full record:
https://iris.cnr.it/handle/20.500.14243/420202
Book title:
The Future of Heritage Science and Technologies: ICT and Digital Heritage. Florence Heri-Tech 2022.
Published in:
COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE (PRINT)
Series
General data

URL

https://link.springer.com/chapter/10.1007/978-3-031-20302-2_8