Publication Date:
2006
Abstract:
This paper focuses on how data representation
influences the generalization error of kernel-based learning
machines, such as Support Vector Machines (SVMs), for classification.
Frame theory provides a well-founded mathematical framework for
representing data in many different ways. We analyze the effects
of sparse and dense data representations on the generalization
error of such learning machines, measured by the leave-one-out
error given a finite amount of training data. We show that, in the
case of sparse data representations, the generalization error of
an SVM trained with polynomial or Gaussian kernel functions is
equal to that of a linear SVM. Equivalently, the capacity of
functions belonging to hypothesis spaces induced by polynomial or
Gaussian kernel functions to separate points reduces to the
capacity of a separating hyperplane in the input space. Moreover,
we show that, in general, sparse data representations increase or
leave unchanged the generalization error of kernel-based methods.
Dense data representations, on the contrary, reduce the
generalization error in the case of very large frames. We use two
different schemes for representing data in overcomplete systems of
Haar and Gabor functions, and measure the SVM generalization error
on benchmark data sets.
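A minimal sketch of the kind of comparison the abstract describes: estimating the leave-one-out error of a linear SVM versus a Gaussian-kernel SVM on a standard benchmark dataset. This is purely illustrative (scikit-learn, a small data subset, and default hyperparameters are assumptions, not the authors' experimental setup).

```python
# Illustrative sketch, not the authors' code: compare leave-one-out
# error of a linear SVM and a Gaussian (RBF) SVM on benchmark data.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X, y = X[:100], y[:100]  # small subset keeps LOO affordable

loo = LeaveOneOut()
loo_error = {}
for kernel in ("linear", "rbf"):
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    # Mean accuracy over all n leave-one-out splits; error = 1 - accuracy.
    acc = cross_val_score(clf, X, y, cv=loo).mean()
    loo_error[kernel] = 1.0 - acc
    print(f"{kernel}: LOO error = {loo_error[kernel]:.3f}")
```

Whether the two errors coincide depends on the representation of the input data, which is the paper's central point.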
CRIS Type:
01.01 Journal article
Author list:
Maglietta, Rosalia; Ancona, Nicola; Stella, Ettore
Link to full record:
Published in: