Statistics Seminar, "Optimal Pooling of Covariance Matrix Estimates Across Multiple Classes", Elias Raninen, Aalto University
We consider the problem of estimating the covariance matrices of multiple classes under low sample support, where the data dimensionality is comparable to, or larger than, the sample sizes of the available data sets. In such conditions, a common approach is to shrink the class sample covariance matrices (SCMs) towards the pooled SCM. The success of this approach hinges upon the ability to choose the optimal regularization parameter. Typically, a common regularization level is shared among the classes and determined via a procedure based on cross-validation. We instead use class-specific regularization levels, since this enables the derivation of the optimal regularization parameter for each class in the minimum mean squared error (MMSE) sense. The optimal parameters depend on the true unknown class population covariances. Consistent estimators of the parameters can, however, be easily constructed under the assumption that the class populations follow (unspecified) elliptically symmetric distributions. The performance of the proposed method is demonstrated via a simulation study as well as via an application to discriminant analysis using both synthetic and real data sets.
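As a rough illustration of the shrinkage idea described in the abstract, the following NumPy sketch shrinks each class SCM towards the pooled SCM using a separate regularization level per class. Several details are assumptions made for illustration and are not taken from the talk: the convex-combination form beta * S_k + (1 - beta) * S_pooled, the n_k / n weighting of the pooled SCM, the hypothetical function names, and above all the beta values, which are arbitrary placeholders rather than the MMSE-optimal parameters the talk derives.

```python
import numpy as np


def pooled_scm(samples):
    """Return the class SCMs and their sample-size-weighted average.

    `samples` is a list of (n_k, p) arrays, one per class.
    The n_k / n weighting is an assumption made for this sketch.
    """
    n_total = sum(x.shape[0] for x in samples)
    scms = [np.cov(x, rowvar=False) for x in samples]
    pooled = sum((x.shape[0] / n_total) * s for x, s in zip(samples, scms))
    return scms, pooled


def shrink_towards_pooled(scms, pooled, betas):
    """Shrink each class SCM towards the pooled SCM.

    Assumed form: beta_k * S_k + (1 - beta_k) * S_pooled, with one
    beta_k in [0, 1] per class.
    """
    return [b * s + (1.0 - b) * pooled for s, b in zip(scms, betas)]


# Synthetic data: dimension p = 10 with small class sample sizes.
rng = np.random.default_rng(0)
p = 10
samples = [rng.standard_normal((n, p)) for n in (8, 12, 15)]

scms, pooled = pooled_scm(samples)

# Placeholder regularization levels, NOT the optimal ones from the talk.
betas = [0.3, 0.5, 0.7]
estimates = shrink_towards_pooled(scms, pooled, betas)
```

In this toy setup the first class has fewer samples than dimensions, so its SCM is singular, while the shrunk estimate is positive definite because the pooled SCM has full rank. Choosing each beta well is the problem the talk addresses.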