Effective Incremental Clustering for Duplicate Detection in Large Databases

Conference proceedings contribution
Publication date:
2006
Abstract:
We propose an incremental algorithm for discovering clusters of duplicate tuples in large databases. The core of the approach is an indexing technique that, for any newly arrived tuple μ, efficiently retrieves a set of tuples in the database that are most similar to μ and are therefore likely to refer to the same real-world entity as μ. The proposed index is based on a hashing approach that tends to assign similar objects to the same buckets. Empirical and analytical evaluation demonstrates that the proposed approach achieves satisfactory efficiency at the cost of a small loss in accuracy.
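The abstract does not say which hash family the index uses, so the following is only a minimal sketch of the general idea, assuming MinHash signatures with banding as a stand-in for the paper's scheme. The class name `IncrementalDeduplicator`, the whitespace tokenizer, the Jaccard threshold, and the band and row counts are all hypothetical choices made for illustration, not details taken from the paper.

```python
import hashlib


def tokens(record: str) -> set:
    """Lower-cased whitespace tokens of a record (a tuple rendered as text)."""
    return set(record.lower().split())


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def minhash_signature(toks: set, num_hashes: int) -> list:
    """One minimum per seeded hash; similar token sets tend to share minima."""
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            (int.from_bytes(
                hashlib.blake2b(f"{seed}:{t}".encode(), digest_size=8).digest(),
                "big")
             for t in toks),
            default=0,  # empty records get a constant signature
        ))
    return sig


class IncrementalDeduplicator:
    """Assumed design: banded MinHash buckets plus an exact Jaccard check."""

    def __init__(self, bands: int = 16, rows: int = 2, threshold: float = 0.5):
        self.bands, self.rows, self.threshold = bands, rows, threshold
        self.buckets = {}      # (band index, band values) -> list of record ids
        self.records = []      # token set of each record, indexed by record id
        self.cluster_of = []   # cluster id of each record, indexed by record id

    def add(self, record: str) -> int:
        """Insert one new record and return the id of the cluster it joins."""
        toks = tokens(record)
        sig = minhash_signature(toks, self.bands * self.rows)
        keys = [(b, tuple(sig[b * self.rows:(b + 1) * self.rows]))
                for b in range(self.bands)]

        # Only records sharing at least one bucket are compared exactly.
        candidates = {rid for k in keys for rid in self.buckets.get(k, ())}
        best, best_sim = None, self.threshold
        for rid in candidates:
            sim = jaccard(toks, self.records[rid])
            if sim >= best_sim:
                best, best_sim = rid, sim

        new_id = len(self.records)
        self.records.append(toks)
        # Join the most similar neighbour's cluster, or start a new one.
        self.cluster_of.append(
            self.cluster_of[best] if best is not None else new_id)
        for k in keys:
            self.buckets.setdefault(k, []).append(new_id)
        return self.cluster_of[new_id]
```

Each call to `add` touches only the buckets the new record hashes into, so the cost of an insertion depends on bucket sizes, not on the size of the database. The trade-off is the one the abstract names: a true duplicate that shares no bucket with the new record is never compared, which costs some accuracy in exchange for speed.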
CRIS type:
04.01 Conference proceedings contribution
Author list:
Manco, Giuseppe; Pontieri, Luigi; Folino, Francesco Paolo
Institution authors:
FOLINO FRANCESCO PAOLO
MANCO GIUSEPPE
PONTIERI LUIGI
Link to the full record:
https://iris.cnr.it/handle/20.500.14243/70001