Publication date:
2012
Abstract:
The present paper describes LMF LExical MErger (L-LEME), an architecture for combining two lexicons to obtain one or more new resources. L-LEME relies on standards, exploiting the ISO Lexical Markup Framework (LMF) to ensure interoperability. L-LEME is meant to be dynamic and highly adaptable: users can configure it to meet their specific needs. The architecture is composed of two main modules, the Mapper and the Builder. The Mapper takes as input two lexicons, A and B, together with a set of user-defined rules and instructions that guide the mapping process (Directives D), and outputs all matching entries. The algorithm also calculates a cosine similarity score. The Builder takes as input the Mapper's results and a set of Directives D1, and produces a new LMF lexicon C. The Directives allow users to define their own building rules and different merging scenarios. L-LEME is applied to a concrete task within the PANACEA project, namely the merging of two Italian SubCategorization Frame (SCF) lexicons. The experiment is interesting in that A and B rest on different philosophies: A was built by human introspection, whereas B was automatically extracted. Ultimately, L-LEME has interesting repercussions for many language technology applications.
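As a rough illustration of the mapping step described in the abstract, the following minimal sketch scores pairs of entries from two lexicons with cosine similarity and keeps those above a threshold. The abstract does not say which features L-LEME compares or how the Directives are encoded, so everything here is an assumption made for illustration: the feature strings, the `map_entries` helper, and the `threshold` parameter (a stand-in for a user directive) are hypothetical and are not the authors' implementation.

```python
import math
from collections import Counter


def cosine_similarity(features_a, features_b):
    """Cosine similarity between two bags of feature strings."""
    vec_a, vec_b = Counter(features_a), Counter(features_b)
    # Dot product over the features the two entries share.
    dot = sum(vec_a[k] * vec_b[k] for k in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an entry without features matches nothing
    return dot / (norm_a * norm_b)


def map_entries(lexicon_a, lexicon_b, threshold=0.5):
    """Return (id_a, id_b, score) for every pair scoring at or above threshold.

    Each lexicon is a dict mapping an entry id to a list of feature strings.
    The threshold is a hypothetical stand-in for a user-defined directive.
    """
    matches = []
    for id_a, feats_a in lexicon_a.items():
        for id_b, feats_b in lexicon_b.items():
            score = cosine_similarity(feats_a, feats_b)
            if score >= threshold:
                matches.append((id_a, id_b, score))
    return matches


# Hypothetical subcategorization-frame entries, invented for this sketch.
lexicon_a = {"A1": ["subj:NP", "obj:NP"]}
lexicon_b = {
    "B1": ["subj:NP", "obj:NP"],
    "B2": ["subj:NP", "comp:PP"],
}
matches = map_entries(lexicon_a, lexicon_b, threshold=0.6)
```

In this sketch a Builder-like step would then consume `matches` to decide, under its own rules, which entries to merge into the output lexicon.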
CRIS type:
04.01 Conference proceedings contribution
Keywords:
LMF; Lexicon mapping; similarity score
Authors:
Abrate, Matteo; Frontini, Francesca; Rubino, Francesco; Lo Duca, Angelica; Monachini, Monica; Quochi, Valeria; Del Gratta, Riccardo
Book title:
Proceedings of the LREC 2012 Workshop on Language Resource Merging