Publication date:
2006
Abstract:
Automatic text classification (ATC) is a discipline at the crossroads of information retrieval (IR), machine learning (ML), and computational linguistics (CL). It is concerned with building text classifiers, i.e. software systems capable of assigning texts to one or more categories, or classes, from a predefined set. Applications range from the automated indexing of scientific articles to e-mail routing, spam filtering, authorship attribution, and automated survey coding. This article focuses on the ML approach to ATC, whereby a software system (called the learner) automatically builds a classifier for the categories of interest by generalizing from a 'training' set of pre-classified texts.
CRIS type:
02.01 Contribution in volume (Chapter or Essay)
Keywords:
Text classification; Text categorization; Supervised learning
Authors:
Sebastiani, Fabrizio
Book title:
The Encyclopedia of Language and Linguistics