Atypical or underrepresented? A pilot study on small treebanks

Conference Paper
Publication Date:
2021
abstract:
We illustrate an approach to multilingual treebank exploration by introducing a novel adaptation to small treebanks of a methodology for identifying cross-lingual quantitative trends in the distribution of dependency relations. By relying on the principles of cross-validation, we reduce the amount of data required to execute the method, paving the way to expanding its use to low-resource languages. We validated the approach on 8 small treebanks, each containing fewer than 100,000 tokens and representing typologically different languages. We also show preliminary but promising evidence on the use of the proposed methodology for treebank expansion.
Iris type:
04.01 Contribution in conference proceedings
Keywords:
universal dependency; language resources; quality check; treebank expansion
List of contributors:
Alzetta, Chiara
Handle:
https://iris.cnr.it/handle/20.500.14243/446049
Published in:
CEUR Workshop Proceedings
Series

URL:
http://www.scopus.com/record/display.url?eid=2-s2.0-85121279597&origin=inward