The effect of across-location heteroscedasticity on the classification of mixed categorical and continuous data |
| |
Authors: | Chi-Ying Leung |
| |
Affiliation: | Department of Statistics, The Chinese University of Hong Kong, Shatin NT, Hong Kong |
| |
Abstract: | Classification of mixed categorical and continuous data is often performed using the location linear discriminant function, which assumes across-location homoscedasticity. In this paper, we investigate the hazard arising from a routine application of the classifier under across-location heteroscedasticity. A limiting and a first-order asymptotic performance index are proposed and studied in a general setting. The first index studies the limiting behavior. The second index corrects the bias due to the finite sample size. Both indexes are illustrated under the assumption of unequal spherical covariance matrices across all the locations. This is likely to be the case in most classification problems dealing with mixed categorical and continuous data. Results of a numerical study are reported. |
| |
Keywords: | Location linear discriminant function; Across-location heteroscedasticity; Expected overall error rate; Performance index; Asymptotic expansions |
This article is indexed in ScienceDirect and other databases. |
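For readers unfamiliar with the classifier named in the abstract, the following is a minimal sketch of a location linear discriminant rule in its usual textbook form. It is not taken from the paper. It assumes two populations, equal misclassification costs, known parameters, and a single covariance matrix shared by every location (the homoscedasticity assumption the abstract questions). The names `location_ldf_score`, `classify`, `params`, and `pooled_cov`, and all numeric values, are hypothetical and chosen only for illustration.

```python
import numpy as np

def location_ldf_score(x, mu1, mu2, pooled_cov, p1, p2):
    """Linear discriminant score for one location (one cell of the categorical variables).

    mu1, mu2   : mean vectors of the continuous variables in populations 1 and 2
                 at this location
    pooled_cov : covariance matrix assumed common to all locations (assumption)
    p1, p2     : probabilities of this location in populations 1 and 2
    A score >= 0 allocates the observation to population 1.
    """
    diff = mu1 - mu2
    midpoint = 0.5 * (mu1 + mu2)
    w = np.linalg.solve(pooled_cov, diff)  # Sigma^{-1} (mu1 - mu2)
    return float(w @ (x - midpoint) - np.log(p2 / p1))

def classify(x, location, params, pooled_cov):
    """Pick the parameters for the observed location, then apply the linear rule."""
    mu1, mu2, p1, p2 = params[location]
    score = location_ldf_score(x, mu1, mu2, pooled_cov, p1, p2)
    return 1 if score >= 0 else 2

# Hypothetical example: two locations, two continuous variables.
params = {
    0: (np.array([1.0, 0.0]), np.array([-1.0, 0.0]), 0.3, 0.2),
    1: (np.array([0.0, 2.0]), np.array([0.0, -2.0]), 0.2, 0.3),
}
pooled_cov = np.eye(2)  # one covariance matrix used at every location

label_a = classify(np.array([1.0, 0.0]), 0, params, pooled_cov)
label_b = classify(np.array([0.0, -2.0]), 1, params, pooled_cov)
```

In this sketch the same `pooled_cov` enters the rule at every location. Under across-location heteroscedasticity the true covariance matrix would differ from location to location, which is the situation whose effect on the error rate the abstract describes.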