Publication date:
2004
Abstract:
Text Categorization (TC) is the discipline concerned with the construction of automatic text classifiers, i.e. programs capable of assigning to a document one or more among a set of predefined categories based on the content of the document. Building these classifiers is itself done automatically, by means of a general inductive process that learns the characteristics of the categories from a set of preclassified documents. In this paper we discuss a class of applications, automatic indexing with controlled vocabularies, that is of direct concern to organizing digital libraries. We exemplify this class of applications by discussing an ongoing project aimed at classifying scientific papers about computer science with respect to the ACM Classification Scheme.
CRIS type:
04.01 Contribution in conference proceedings
Keywords:
Hierarchical text classification; Hierarchical clustering
Author list:
Avancini, Henri Hector; Sebastiani, Fabrizio