Harmonizing and merging Italian treebanks: Towards a merged Italian dependency treebank and beyond

Chapter
Publication Date:
2015
abstract:
In this paper we address the challenge of combining existing CoNLL-compliant dependency-annotated corpora with the final aim of constructing a bigger treebank for the Italian language. To this end, we defined a methodology for mapping different annotation schemes, based on: (i) the analysis of similarities and differences between the source and target dependency annotation schemes considered; (ii) the analysis of the performance of state-of-the-art dependency parsers trained on the source and target treebanks; (iii) the mapping of the source annotation scheme(s) onto a set of target (possibly underspecified) data categories. This methodology was applied in two different case studies. The first was aimed at constructing a "Merged Italian Dependency Treebank" (MIDT) starting from existing Italian dependency treebanks, namely TUT and ISST-TANL. The second case study, still ongoing, consists of the conversion of the MIDT resource into the Stanford Dependencies de facto standard, with the final aim of developing an "Italian Stanford Dependency Treebank" (ISDT).
Iris type:
02.01 Contribution in volume (Chapter or Essay)
Keywords:
Harmonization and merging of resources; Italian; Dependency Treebank
List of contributors:
Montemagni, Simonetta
Authors of the University:
MONTEMAGNI SIMONETTA
Handle:
https://iris.cnr.it/handle/20.500.14243/297500
Book title:
Harmonization and Development of Resources and Tools for Italian Natural Language Processing within the PARLI Project
URL

http://www.scopus.com/inward/record.url?eid=2-s2.0-84927143016&partnerID=q2rCbXpz