Converting and structuring a digital historical dictionary of Italian: a case study
Conference Paper
Publication Date:
2019
Abstract:
The paper describes ongoing work on the digitization of an authoritative historical
Italian dictionary, namely Il Grande Dizionario della Lingua Italiana (GDLI), with a
specific view to creating the prerequisites for advanced human-oriented querying. After
discussing the general approach taken to extract and structure the GDLI contents, we
report the encouraging results of a case study carried out on two volumes, which were
selected for the different conversion issues they raise. Dictionary
content extraction and structuring is being carried out through an iterative process
based on hand-coded patterns: starting from the recognition of the entry headword, a
series of truth conditions is tested, which allows the whole lexical entry to be built
and progressively structured in successive steps. We have also started to design the
representation of extracted and structured entries in a standard format, encoded in
TEI. An example entry is outlined and illustrated to show what the end result will
look like.
Iris type:
04.01 Contribution in conference proceedings (Contributo in Atti di convegno)
Keywords:
historical dictionaries; automatic acquisition; TEI representation
List of contributors: