Integration of Deep Web Sources: A Distributed Information Retrieval Approach
Conference proceedings contribution
Publication date:
2017
Abstract:
The Deep Web consists of those structured data that are available as
dynamically generated pages, typically requested through HTML forms. Deep
Web pages cannot be indexed by search engines, and are notoriously difficult
to query and integrate due to the limited access that they offer.
We propose a novel framework for integrating Deep Web sources by means of a
mediated schema that represents the underlying, distributed sources. Our
goal is to compute answers to queries posed on the mediated schema. To this
aim, we propose the use of techniques from the area of Distributed
Information Retrieval. We discuss a novel approach to automated sampling,
size estimation and selection of Deep Web sources, as well as a technique
for merging result lists.
CRIS type:
04.01 Conference proceedings contribution
Keywords:
Deep web; information integration
Author list:
Straccia, Umberto