
Harmonizing and merging Italian treebanks: Towards a merged Italian dependency treebank and beyond

Book chapter
Publication date:
2015
Abstract:
In this paper we address the challenge of combining existing CoNLL-compliant dependency-annotated corpora with the final aim of constructing a bigger treebank for the Italian language. To this end, we defined a methodology for mapping different annotation schemes, based on: (i) the analysis of the similarities and differences between the source and target dependency annotation schemes considered; (ii) the analysis of the performance of state-of-the-art dependency parsers trained on the source and target treebanks; (iii) the mapping of the source annotation scheme(s) onto a set of target (possibly underspecified) data categories. This methodology was applied in two different case studies. The first was aimed at constructing a "Merged Italian Dependency Treebank" (MIDT) starting from existing Italian dependency treebanks, namely TUT and ISST-TANL. The second case study, still ongoing, consists of converting the MIDT resource into the Stanford Dependencies de facto standard, with the final aim of developing an "Italian Stanford Dependency Treebank" (ISDT).
CRIS type:
02.01 Contribution in volume (Chapter or Essay)
Keywords:
Harmonization and merging of resources; Italian; Dependency Treebank
Author list:
Montemagni, Simonetta
Institutional authors:
MONTEMAGNI SIMONETTA
Link to the full record:
https://iris.cnr.it/handle/20.500.14243/297500
Book title:
Harmonization and Development of Resources and Tools for Italian Natural Language Processing within the PARLI Project
General data

URL

http://www.scopus.com/inward/record.url?eid=2-s2.0-84927143016&partnerID=q2rCbXpz