A graph neural network approach for evaluating correctness of groups of duplicates
Conference proceedings contribution
Publication date:
2023
Abstract:
Unlabeled entity deduplication is a relevant task that has already been studied in the recent literature. Most methods follow the same workflow: an entity blocking phase, in-block pairwise comparisons between entities to draw similarity relations, closure of the resulting meshes to create groups of duplicate entities, and merging of the entities in each group to resolve the ambiguity. Such methods are effective but still not good enough whenever a very low false positive rate is required. In this paper, we present an approach for evaluating the correctness of "groups of duplicates", which can be used to measure a group's accuracy and hence the likelihood that it is a false positive. Our novel approach is based on a Graph Neural Network that combines Graph Attention and Long Short-Term Memory (LSTM). The accuracy of the proposed approach is verified in the context of Author Name Disambiguation, applied to a curated dataset obtained as a subset of the OpenAIRE Graph that includes PubMed publications with at least one ORCID identifier.
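The generic workflow named in the abstract (blocking, in-block pairwise comparison, closure into groups) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the blocking key (`block_key`), the string-similarity test (`similar`), the 0.85 threshold, and the use of union-find for the closure step. The paper's actual contribution, a Graph Neural Network that scores the correctness of the resulting groups, is not sketched here.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def block_key(name):
    # Hypothetical blocking key: first token plus the initial of the second.
    parts = name.lower().replace(",", " ").split()
    return (parts[0] + "_" + parts[1][0]) if len(parts) > 1 else parts[0]


def similar(a, b, threshold=0.85):
    # Hypothetical pairwise comparison: plain string similarity ratio.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def group_duplicates(names, threshold=0.85):
    # 1. Blocking: only entities sharing a key are ever compared.
    blocks = defaultdict(list)
    for idx, name in enumerate(names):
        blocks[block_key(name)].append(idx)

    # 2. In-block pairwise comparisons draw similarity relations.
    # 3. Closure: connected entities are unioned into one group.
    parent = list(range(len(names)))
    for members in blocks.values():
        for i, j in combinations(members, 2):
            if similar(names[i], names[j], threshold):
                parent[find(parent, i)] = find(parent, j)

    groups = defaultdict(list)
    for idx in range(len(names)):
        groups[find(parent, idx)].append(names[idx])

    # Only groups with more than one member are candidate duplicates.
    return [group for group in groups.values() if len(group) > 1]
```

Calling `group_duplicates` on a list of author name strings returns the candidate groups of duplicates. Because the closure step is transitive, a single wrong similarity relation can pull unrelated entities into one group, which is the kind of false positive that a group-level correctness check is meant to catch.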
CRIS type:
04.01 Conference proceedings contribution
Keywords:
Disambiguation; Graph neural network; Scholarly knowledge graphs
Authors:
DE BONIS, Michele; Manghi, Paolo; Falchi, Fabrizio
Book title:
Linking Theory and Practice of Digital Libraries