Publication date:
2019
Abstract:
We observe that the most relevant terms in unstructured news articles are concentrated towards the beginning and the end of the document. Exploiting this observation, we propose a novel version of the classical BM25 weighting model, called BM25 Passage (BM25P), which scores query results by computing a linear combination of term statistics in the different portions of news articles. Our experiments, conducted on three publicly available news datasets, demonstrate that BM25P markedly outperforms BM25 in terms of effectiveness, by up to 17.44% in NDCG@5 and 85% in NDCG@1.
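The idea in the abstract can be illustrated with a minimal sketch. It is one plausible reading, not the authors' implementation: it assumes the document is split into equal-sized passages, that the "linear combination of term statistics" is a weighted sum of per-passage term frequencies, and that this sum replaces the raw term frequency in a standard BM25 formula. The function names, the passage weights, and the defaults `k1=1.2` and `b=0.75` are illustrative assumptions that do not come from this record.

```python
import math


def split_passages(tokens, n_passages):
    """Split a token list into n_passages contiguous, near-equal chunks (assumed scheme)."""
    size = math.ceil(len(tokens) / n_passages) if tokens else 1
    return [tokens[i * size:(i + 1) * size] for i in range(n_passages)]


def weighted_tf(term, tokens, weights):
    """Linear combination of per-passage term frequencies, one weight per passage."""
    passages = split_passages(tokens, len(weights))
    return sum(w * p.count(term) for w, p in zip(weights, passages))


def bm25p_score(query, doc, doc_freq, n_docs, avg_len, weights, k1=1.2, b=0.75):
    """BM25-style score in which the raw tf is replaced by the passage-weighted tf."""
    # Standard BM25 length normalisation, computed on the whole document.
    norm = k1 * (1 - b + b * len(doc) / avg_len)
    score = 0.0
    for term in query:
        tf = weighted_tf(term, doc, weights)
        if tf == 0:
            continue
        df = doc_freq.get(term, 0)
        # Non-negative idf variant, chosen here for simplicity.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        score += idf * tf * (k1 + 1) / (tf + norm)
    return score


# Hypothetical toy collection: the same term at the start vs. the end of a document.
doc_start = ["election", "x", "y", "z"]
doc_end = ["x", "y", "z", "election"]
stats = dict(doc_freq={"election": 1}, n_docs=2, avg_len=4)

# Weighting the first passage more heavily favours doc_start.
s_start = bm25p_score(["election"], doc_start, weights=[2.0, 1.0], **stats)
s_end = bm25p_score(["election"], doc_end, weights=[2.0, 1.0], **stats)
```

With uniform weights the weighted sum equals the plain term frequency, so the sketch reduces to ordinary BM25; non-uniform weights are what allow terms in the opening or closing passages to count more.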
CRIS type:
04.01 Contribution in conference proceedings
Keywords:
Information Retrieval
Author list:
Nardini, Franco Maria; Catena, Matteo; Tonellotto, Nicola; Muntean, Cristina-Ioana; Perego, Raffaele