Publication date:
2007
Abstract:
In this paper we study the trade-offs in designing efficient caching systems for Web search engines. We explore the impact of different approaches, such as static vs. dynamic caching, and caching query results vs. caching posting lists. Using a query log spanning a whole year, we explore the limitations of caching and demonstrate that caching posting lists can achieve higher hit rates than caching query answers. We propose a new algorithm for static caching of posting lists, which outperforms previous methods. We also study the problem of finding the optimal way to split the static cache between answers and posting lists. Finally, we measure how changes in the query log affect the effectiveness of static caching, given our observation that the distribution of the queries changes slowly over time. Our results and observations are applicable to different levels of the data-access hierarchy, for instance, a memory/disk layer or a broker/remote server layer.
CRIS type:
04.01 Conference proceedings contribution
Keywords:
H.3.3 Information Search and Retrieval. Search process; H.3.4 Systems and Software. Distributed systems; H.3.4 Systems and Software. Performance evaluation (efficiency and effectiveness); Caching; Web search
Authors:
Silvestri, Fabrizio
Book title:
SIGIR '07 The 30th Annual International SIGIR Conference Amsterdam -- July 23 - 27, 2007