Publication Date:
2005
Abstract:
In this paper the problem of indexing heterogeneous structured documents and of retrieving semi-structured documents is considered. We propose a flexible paradigm for both indexing such documents and formulating user queries specifying soft constraints on both documents' structure and content. At the indexing level we propose a model that achieves flexibility by constructing personalised document representations based on users' views of the documents. This is obtained by allowing users to specify their preferences on the documents' sections that they estimate to bear the most interesting information, as well as to linguistically quantify the number of sections which determine the global potential interest of the documents. At the query language level, a flexible query language for expressing soft selection conditions on both the documents' structure and content is proposed.
Iris type:
01.01 Journal article
Keywords:
indexing; semi-structured documents; aggregation operators; query languages; information retrieval
List of contributors:
Pasi, Gabriella; Bordogna, Gloria
Published in: