Publication date:
2000
Abstract:
A novel access structure for similarity search in metric data, called Similarity Hashing (SH), is proposed. It is a multi-level hash structure with separable buckets on each level, which supports easy insertion and bounded search costs: for range queries whose search radius does not exceed a pre-defined value, at most one bucket needs to be accessed at each level. At the same time, the number of distance computations is always significantly reduced by using pre-computed distances obtained at insertion time. Buckets of static files can be arranged in such a way that the I/O costs never exceed the costs of scanning a compressed sequential file. Experimental results demonstrate that the performance of SH is superior to that of the available tree-based structures. Unlike tree organizations, the SH structure is suitable for distributed and parallel implementations.
CRIS type:
04.01 Contribution in conference proceedings
Keywords:
Similarity search; Metric space; Information search and retrieval
Authors:
Zezula, Pavel; Savino, Pasquale; Gennaro, Claudio
Book title:
Proceedings of the first DELOS workshop on Information Seeking, Searching and Querying in Digital Libraries