Publication Date:
2021
Abstract:
In this paper, we propose an evaluation of a Transformer-based punctuation restoration model for the Italian language. Experimenting with a BERT-base model, we perform several fine-tuning runs with different training data and sizes and test the resulting models in in-domain and cross-domain scenarios. Moreover, we offer a comparison in a multilingual setting with the same model fine-tuned on English transcriptions. Finally, we conclude with an error analysis of the main weaknesses of the model related to specific punctuation marks.
CRIS Type:
04.01 Conference proceedings contribution
Keywords:
transformer models; NLP; punctuation restoration
Author List:
Miaschi, Alessio; Ravelli, Andrea Amelio; Dell'Orletta, Felice
Link to Full Record:
Published in: