Preprocessing pipeline for Italian cultural heritage multimedia datasets

Conference proceedings contribution
Publication date:
2019
Abstract:
Preprocessing is an important task and a fundamental step in Information Retrieval, Text Mining, and Natural Language Processing (NLP). While datasets in the English language can rely on well-established tools and methods for text preprocessing, the situation for the Italian language is more nuanced, owing to several factors, not least that fewer experiments and studies have been carried out and fewer algorithms developed. Here we present an experiment, a work in progress, whose purpose is to define a pipeline able to preprocess texts. The different steps of the pipeline have been implemented and tested individually on Cultural Heritage datasets. The results obtained have been evaluated in the context of unsupervised automatic keyword extraction algorithms, such as RAKE or TextRank.
CRIS type:
04.01 Conference proceedings contribution
Keywords:
N/A
Author list:
Gagliardi, Isabella; Artese, Maria Teresa
Institutional authors:
ARTESE MARIA TERESA
GAGLIARDI ISABELLA
Link to the full record:
https://iris.cnr.it/handle/20.500.14243/368582
Book title:
ARCHIVING 2019: Digitization, Preservation, and Access - Final Program and Proceedings
General data

URL

https://www.ingentaconnect.com/content/ist/ac/2019/00002019/00000001/art00018