Active learning strategies for multi-label text classification

Conference proceedings contribution

Publication date:

2009

Abstract:

Active learning refers to the task of devising a ranking function that, given a classifier trained from relatively few training examples, ranks a set of additional unlabeled examples in terms of how much further information they would carry, once manually labeled, for retraining a (hopefully) better classifier. Research on active learning in text classification has so far concentrated on single-label classification; active learning for multi-label classification, instead, has either been tackled in a simulated (and, we contend, non-realistic) way, or neglected tout court. In this paper we aim to fill this gap by examining a number of realistic strategies for tackling active learning for multi-label classification. Each such strategy consists of a rule for combining the outputs returned by the individual binary classifiers as a result of classifying a given unlabeled document. We present the results of extensive experiments in which we test these strategies on two standard text classification datasets.
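To make the idea of a combination rule concrete, the following is a minimal sketch and not the paper's method. It assumes each binary classifier returns a signed confidence score per document (sign is the decision, magnitude is the confidence), that uncertainty is measured as the negated absolute score, and that the rule is either the maximum or the average over labels. The abstract names none of these choices; the function name `rank_for_labeling` and the sample scores are hypothetical.

```python
from statistics import mean


def rank_for_labeling(scores, combine=max):
    """Return document indices, most informative first.

    scores[i][j] is the signed confidence of binary classifier j on
    unlabeled document i (assumption: sign = decision, magnitude = confidence).
    combine is the rule that merges per-label uncertainties into one value.
    """
    def informativeness(doc_scores):
        # Assumed uncertainty measure: a score close to 0 means high uncertainty.
        uncertainties = [-abs(s) for s in doc_scores]
        return combine(uncertainties)

    return sorted(
        range(len(scores)),
        key=lambda i: informativeness(scores[i]),
        reverse=True,
    )


# Hypothetical scores for 3 unlabeled documents and 3 labels.
scores = [
    [0.9, -0.8, 0.7],   # confident on every label
    [0.1, -0.9, 0.8],   # very uncertain on one label only
    [0.4, -0.3, 0.5],   # moderately uncertain on all labels
]

ranking_max = rank_for_labeling(scores, combine=max)   # most uncertain label decides
ranking_avg = rank_for_labeling(scores, combine=mean)  # average uncertainty decides
print(ranking_max, ranking_avg)
```

With these illustrative scores the two rules disagree on the top document: the maximum rule favors the document with a single highly uncertain label, while the average rule favors the one that is moderately uncertain across all labels. This is the kind of difference a comparison of combination strategies would examine.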

CRIS type:

04.01 Conference proceedings contribution

Keywords:

Information Storage and Retrieval; Learning; Classifier design and evaluation; Active learning; Text classification

List of authors: