Improved Treatment of the Independent Variables for the Deployment of Model Selection Criteria in the Analysis of Complex Systems
Academic Article
Publication Date:
2021
abstract:
Model selection criteria are widely used to identify the model that best represents the
data among a set of potential candidates. Amidst the different model selection criteria, the Bayesian
information criterion (BIC) and the Akaike information criterion (AIC) are the most popular and
best understood. In the derivation of these indicators, it was assumed that the model's dependent
variables have already been properly identified and that the entries are not affected by significant
uncertainties. These are issues that can become quite serious when investigating complex systems,
especially when variables are highly correlated and the measurement uncertainties associated with
them are not negligible. More sophisticated versions of these criteria, capable of better detecting
spurious relations between variables when non-negligible noise is present, are proposed in this paper.
Their derivation is obtained starting from a Bayesian statistics framework and adding an a priori
Chi-squared probability distribution function of the model, dependent on a specifically defined
information theoretic quantity that takes into account the redundancy between the dependent
variables. The performance of the proposed versions of these criteria is assessed through a series
of systematic simulations, using synthetic data for various classes of functions and noise levels. The
results show that the upgraded formulation of the criteria clearly outperforms the traditional ones in
most of the cases reported.
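
For orientation, the traditional criteria mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard least-squares forms, AIC = n ln(MSE) + 2k and BIC = n ln(MSE) + k ln(n), under the assumption of Gaussian residuals; it is not the upgraded formulation proposed in the paper. The function name `aic_bic` and the residual values are hypothetical and were invented purely for illustration.

```python
import math


def aic_bic(residuals, k):
    """Return (AIC, BIC) in their standard least-squares forms.

    residuals: differences between the data and the model predictions
    k:         number of free parameters in the model
    Assumes Gaussian residuals; additive constants are dropped.
    """
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    aic = n * math.log(mse) + 2 * k
    bic = n * math.log(mse) + k * math.log(n)
    return aic, bic


# Hypothetical residuals of two candidate models on the same 8 data points
# (made-up numbers, not taken from the paper).
simple_residuals = [0.11, -0.09, 0.10, -0.12, 0.08, -0.10, 0.09, -0.11]
complex_residuals = [0.10, -0.09, 0.10, -0.11, 0.08, -0.10, 0.09, -0.10]

simple_scores = aic_bic(simple_residuals, k=2)    # 2 free parameters
complex_scores = aic_bic(complex_residuals, k=5)  # 5 free parameters

# Lower values are better for both criteria.
print("simple  (AIC, BIC):", simple_scores)
print("complex (AIC, BIC):", complex_scores)
```

In this toy comparison the more complex model fits slightly better, but the penalty terms (2k for AIC, k ln(n) for BIC) outweigh the small gain in goodness of fit. Neither penalty accounts for measurement noise in the inputs or for redundancy between variables, which are the issues the abstract says the proposed versions address.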
Iris type:
01.01 Journal article
Keywords:
model selection criteria; Akaike information criterion; Bayesian information criterion; overfitting; redundancy; variable selection; complexity; information theory; relevance
List of contributors:
Murari, Andrea
Published in: