Making Italian Parliamentary Records Machine-Actionable: the Construction of the ParlaMint-IT corpus

Conference proceedings contribution
Publication date:
2022
Abstract:
This paper describes the acquisition, cleaning, interpretation, encoding and linguistic annotation of a collection of parliamentary debates from the Senate of the Italian Republic, carried out according to the CLARIN ParlaMint prescriptions. The collection covers the COVID-19 pandemic emergency period and an earlier period for reference and comparison. The corpus contains 1,199 sessions and 79,373 speeches, for a total of about 31 million words, and is encoded in the ParlaCLARIN TEI XML format. It includes extensive metadata about the speakers, sessions, political parties and parliamentary groups. As required by the ParlaMint initiative, the corpus is also linguistically annotated for sentences, tokens, POS tags, lemmas and dependency syntax according to the Universal Dependencies guidelines. Named entity annotation and classification are also included. All linguistic annotation was performed automatically using state-of-the-art NLP technology, with no manual revision. The Italian dataset is freely available as part of the larger ParlaMint 2.1 corpus, deposited and archived in the CLARIN repository together with all other national corpora. It is also available for direct analysis and inspection via various CLARIN services and has already been used for both research and educational purposes.
CRIS type:
04.01 Conference proceedings contribution
Keywords:
parliamentary debates; CLARIN ParlaMint; corpus creation; corpus annotation
Author list:
Montemagni, Simonetta; Bartolini, Roberto; Agnoloni, Tommaso; Quochi, Valeria; Frontini, Francesca; Venturi, Giulia
Institutional authors:
AGNOLONI TOMMASO
BARTOLINI ROBERTO
FRONTINI FRANCESCA
MONTEMAGNI SIMONETTA
QUOCHI VALERIA
VENTURI GIULIA
Link to the full record:
https://iris.cnr.it/handle/20.500.14243/446358
General data

URL

https://aclanthology.org/2022.parlaclarin-1.17/