Regressor and Structure Selection: Uses of ANOVA in System Identification

University dissertation from Institutionen för systemteknik

Abstract: Identification of nonlinear dynamical models of a black-box nature involves both structure decisions (i.e., which regressors to use and the selection of a regressor function) and the estimation of the parameters involved. The typical approach in system identification is often a mix of all these steps, which means, for example, that the selection of regressors is based on the fits that are achieved for different choices. Alternatively, one could interpret the regressor selection as based on hypothesis tests (F-tests) at a certain confidence level that depends on the data. In many cases it would be desirable to decide which regressors to use independently of the other steps. A survey of regressor selection methods used for linear regression and nonlinear identification problems is given.

In this thesis we investigate what the well-known method of analysis of variance (ANOVA) can offer for this problem. System identification applications violate many of the ideal conditions for which ANOVA was designed, and we study how the method performs under such non-ideal conditions. It turns out that ANOVA gives better and more homogeneous results compared to several other regressor selection methods. Some practical aspects are discussed, especially how to categorise the data set for the use of ANOVA, and whether or not to balance the data set used for structure identification.

An ANOVA-based method, Test of Interactions using Layout for Intermixed ANOVA (TILIA), for regressor selection in typical system identification problems with many candidate regressors is developed and tested with good performance on a variety of simulated and measured data sets.

Typical system identification applications of ANOVA, such as guiding the choice of linear terms in the regression vector and the choice of regime variables in local linear models, are investigated.

It is also shown that the ANOVA problem can be recast as an optimisation problem. Two modified, convex versions of the ANOVA optimisation problem are then proposed, and it turns out that they are closely related to the nn-garrote and wavelet shrinkage methods, respectively. In the case of balanced data, it is also shown that the methods have a nice orthogonality property in the sense that different groups of parameters can be computed independently.
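As a rough illustration of the idea of categorising the data and using F-tests to screen regressors, the sketch below computes a one-way ANOVA F statistic for each candidate regressor separately. It is not the TILIA method and not the multi-way ANOVA with interaction tests studied in the thesis; the simulated system, the number of levels, and the names categorise, one_way_f and f_stats are all assumptions made for this example.

```python
import random


def categorise(values, n_levels):
    """Map each value to an interval index 0..n_levels-1 over its observed range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_levels
    # The maximum value would land in bin n_levels, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_levels - 1) for v in values]


def one_way_f(y, labels):
    """F statistic of a one-way ANOVA of y grouped by the category labels."""
    groups = {}
    for yi, lab in zip(y, labels):
        groups.setdefault(lab, []).append(yi)
    n, k = len(y), len(groups)
    grand_mean = sum(y) / n
    ss_between = 0.0
    ss_within = 0.0
    for g in groups.values():
        g_mean = sum(g) / len(g)
        ss_between += len(g) * (g_mean - grand_mean) ** 2
        ss_within += sum((yi - g_mean) ** 2 for yi in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Assumed test system: y(t) = u(t-1)^2 + noise, with a white uniform input u.
# Only u(t-1) affects the output; u(t-2) is an irrelevant candidate.
random.seed(0)
N = 2000
u = [random.uniform(-1.0, 1.0) for _ in range(N + 2)]
ts = range(2, N + 2)
y = [u[t - 1] ** 2 + 0.1 * random.gauss(0.0, 1.0) for t in ts]

candidates = {
    "u(t-1)": [u[t - 1] for t in ts],
    "u(t-2)": [u[t - 2] for t in ts],
}

# One F statistic per candidate, each regressor categorised into 4 levels.
f_stats = {name: one_way_f(y, categorise(vals, 4))
           for name, vals in candidates.items()}

for name, f in f_stats.items():
    print(name, round(f, 1))
```

In this setup the F statistic for u(t-1) should come out far larger than the one for u(t-2), even though the dependence is purely quadratic and a linear correlation between y(t) and u(t-1) would be close to zero. Comparing each statistic against an F-distribution quantile at a chosen significance level would turn the screening into a formal test.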
