Publication Date:
2006
abstract:
In a wide variety of real-world situations a functional dependence y = g(x) has to be estimated
from a set of observations (x^L, y^L) = {(x_l, y_l), l = 0, …, L − 1} concerning a phenomenon of
interest. This is the case when the behavior of a continuous signal has to be forecast starting
from its previous history, or when the value of an unmeasurable quantity has to be inferred
from the measurements of other related variables.
If an insufficient amount of a priori information about the form of the functional dependence g
is available, its estimation involves two different actions:
1. first, a sufficiently large class Γ of functions must be properly selected (model selection);
2. then, the best element g ∈ Γ must be retrieved by adopting a suitable optimization
algorithm (training phase).
The model selection task is usually performed by adopting a very general paradigm whose
complexity can be controlled by acting on a small number of constant values. For example, the
usual polynomial series expansion can approximate arbitrarily well every measurable function,
i.e., polynomials are universal approximators. However, by including in Γ only the functions
whose polynomial series expansion contains no terms with exponent greater than a
prescribed maximum k, we can control the richness of the class Γ. In particular, if k = 1 only
linear functions are included in Γ; if k = 2 the expansion can realize only linear and quadratic
functions; and so on.
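As a minimal sketch of this idea (an illustration, not taken from the chapter: the target g(x) = sin(x), the sample size, and all variable names are assumptions), the snippet below fits nested polynomial classes of increasing degree k and shows that the training error can only decrease as the class Γ is enlarged.

```python
import numpy as np

# Hypothetical illustration: the degree bound k controls the richness of
# the model class Gamma.  Observations come from an (unknown) target
# g(x) = sin(x), sampled at 20 points on [0, pi].
x = np.linspace(0.0, np.pi, 20)
y = np.sin(x)

# k = 1 restricts Gamma to linear functions; k = 2 adds quadratics, etc.
for k in (1, 2, 5):
    coeffs = np.polyfit(x, y, deg=k)                 # least-squares fit within Gamma_k
    residual = np.sum((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {k}: training error {residual:.6f}")
```

Because the classes are nested (Γ_1 ⊂ Γ_2 ⊂ Γ_5), the least-squares training error is non-increasing in k; this is exactly the sense in which k controls the richness of Γ.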
Other general paradigms have been extensively used for model selection: neural networks have
been shown to possess the universal approximation property [1, 2, 3, 4] and have been
successfully applied in many different fields. In this case, the complexity of the class Γ can be
controlled by acting on the architecture of the network (the number of layers and the number
of neurons in the feedforward structure).
Once the class Γ has been chosen, the optimization algorithm to be employed in the training
phase is selected accordingly. For example, the back-propagation technique (and its
modifications) is often adopted to retrieve the function in Γ that best fits the collection
(x^L, y^L) of observations at our disposal, usually called the training set.
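The training phase can be sketched with gradient descent, the idea underlying back-propagation. As an assumption for illustration (not the chapter's method), Γ here is the class of linear functions g(x) = w·x + b, fitted to noisy observations by minimizing the mean squared error on the training set; backprop applies the same gradient idea layer by layer in a network.

```python
import numpy as np

# Sketch of a training phase: gradient descent on the mean squared error,
# retrieving the element of Gamma that best fits the training set.
# Gamma = linear functions g(x) = w*x + b (illustrative assumption).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)
y = 3.0 * x - 0.5 + rng.normal(scale=0.05, size=50)   # noisy observations

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    err = w * x + b - y                  # residuals on the training set
    w -= lr * 2.0 * np.mean(err * x)     # gradient of the MSE w.r.t. w
    b -= lr * 2.0 * np.mean(err)         # gradient of the MSE w.r.t. b
print(f"estimated w = {w:.2f}, b = {b:.2f}")
```

With this learning rate the iteration converges to (approximately) the least-squares solution, i.e., the best element of Γ for this training set.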
However, the basic goal to be pursued is to obtain a function g that generalizes well, i.e., that
behaves correctly also at points of the domain not included in the training set. How can it be
guaranteed that the element of Γ that best fits our observations also generalizes well? This is a
fundamental question in the context of learning theory.
Since in many practical situations the input vectors x_l cannot be freely chosen and the
training set (x^L, y^L) can be corrupted by noise, most results in learning theory are based on a
statistical framework, which arose in the pattern recognition community [5, 6, 7, 8] and has
been naturally extended to other inductive problems, like regression estimation [9, 10, 11] and
probability density reconstruction [10]. In this framework, called Statistical Learning (SL), the
input vectors x_l, for l = 0, …, L − 1, are viewed as realizations of a random variable,
generated according to an unknown (but fixed) probability density p.
On the other hand, there are several cases where the position of the points x_l in the input
space can be suitably selected for the problem at hand. If a deterministic algorithm is
employed to choose the input vectors x_l, SL is no longer the most appropriate approach. In this
case a new framework, called Deterministic Learning (DL), captures the peculiarities of the
situation at hand, thus providing precise conditions on the generalization ability of the
function g ∈ Γ that best fits the observations of the training set (x^L, y^L).
This chapter pres
Iris type:
02.01 Contribution in a volume (Chapter or Essay)
List of contributors:
Cervellera, Cristiano; Muselli, Marco
Published in: