Publication date:
2022
Abstract:
GDup is an integrated, scalable, general-purpose software system for entity deduplication over big information graphs. It gives practitioners the functionality needed to build a fully fledged entity deduplication workflow over a generic input graph, including ground-truth support, end-user feedback, and strategies for identifying and merging duplicates to produce a disambiguated output graph. GDup is today one of the core components of the OpenAIRE infrastructure production system, which monitors Open Science trends on behalf of the European Commission.
CRIS type:
05.11 Software
Keywords:
Deduplication; Framework; Java; Spark; Hadoop
Authors:
De Bonis, Michele; Manghi, Paolo; Artini, Michele; Dell'Amico, Andrea; Bardi, Alessia; La Bruzzo, Sandro Fabrizio; Baglioni, Miriam; Atzori, Claudio