Similar Documents
20 similar documents found.
1.
A general methodology for selecting predictors for Gaussian generative classification models is presented. The problem is regarded as a model selection problem. Three different roles are considered for each candidate predictor: a variable can be a relevant classification predictor or not, and the irrelevant classification variables can be either linearly dependent on a subset of the relevant predictors or independent of them. This variable selection model was inspired by previous work on variable selection in model-based clustering. A BIC-like model selection criterion is proposed. It is optimized through two embedded forward stepwise variable selection algorithms, one for classification and one for linear regression. The identifiability of the model and the consistency of the variable selection criterion are proved. Numerical experiments on simulated and real data sets illustrate the value of this variable selection methodology. In particular, it is shown that this well-grounded variable selection model can substantially improve the classification performance of quadratic discriminant analysis in high-dimensional settings.
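As a rough illustration of the selection loop described above, the sketch below performs forward stepwise selection of classification predictors under a plain BIC score for a full-covariance Gaussian generative (QDA-like) model. It is not the paper's exact criterion, which additionally models the regression of irrelevant variables on relevant ones; all function names are our own.

# Hedged sketch: forward stepwise predictor selection under a plain BIC
# for a Gaussian generative classifier; assumes each class has enough
# observations to estimate a full covariance on the selected columns.
import numpy as np
from scipy.stats import multivariate_normal

def qda_bic(X, y, idx):
    n, d = X.shape[0], len(idx)
    classes = np.unique(y)
    loglik, n_params = 0.0, len(classes) - 1        # class priors
    for c in classes:
        Xc = X[y == c][:, idx]
        mu = Xc.mean(axis=0)
        cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(d)  # ridge for stability
        prior = len(Xc) / n
        loglik += multivariate_normal.logpdf(Xc, mu, cov).sum() \
                  + len(Xc) * np.log(prior)
        n_params += d + d * (d + 1) // 2            # mean and covariance per class
    return -2.0 * loglik + n_params * np.log(n)

def forward_select(X, y):
    selected, best = [], np.inf
    while len(selected) < X.shape[1]:
        cands = [j for j in range(X.shape[1]) if j not in selected]
        scores = {j: qda_bic(X, y, selected + [j]) for j in cands}
        j_star = min(scores, key=scores.get)
        if scores[j_star] >= best:
            break                                   # no candidate improves BIC
        selected.append(j_star)
        best = scores[j_star]
    return selected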

2.
Classification models can be developed by statistical or mathematical programming discriminant analysis techniques. Variable selection extensions of these techniques allow the development of classification models with a limited number of variables. Although stepwise statistical variable selection methods are widely used, the performance of the resultant classification models may not be optimal because of the stepwise selection protocol and the nature of the group separation criterion. A mixed integer programming approach for selecting variables for maximum classification accuracy is developed in this paper and the performance of this approach, measured by the leave-one-out hit rate, is compared with the published results from a statistical approach in which all possible variable subsets were considered. Although this mixed integer programming approach can only be applied to problems with a relatively small number of observations, it may be of great value where classification decisions must be based on a limited number of observations.
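A hedged sketch of an MIP in this spirit is shown below: it selects at most p_max variables and a linear discriminant score minimizing the number of misclassified training observations for two groups. The big-M value, separation gap eps, and coefficient bound W are illustrative modelling choices, not the paper's exact formulation; PuLP stands in for whatever MIP solver is at hand.

import pulp

def mip_select(X1, X2, p_max, M=100.0, eps=1.0, W=10.0):
    # X1, X2: lists/arrays of feature vectors for the two groups.
    d = len(X1[0])
    prob = pulp.LpProblem("min_misclassified", pulp.LpMinimize)
    w = [pulp.LpVariable(f"w{j}", -W, W) for j in range(d)]
    c = pulp.LpVariable("cutoff")
    z = [pulp.LpVariable(f"z{j}", cat="Binary") for j in range(d)]
    m = [pulp.LpVariable(f"m{i}", cat="Binary")
         for i in range(len(X1) + len(X2))]
    prob += pulp.lpSum(m)                       # minimize misclassification count
    for j in range(d):                          # w_j may be nonzero only if z_j = 1
        prob += w[j] <= W * z[j]
        prob += w[j] >= -W * z[j]
    prob += pulp.lpSum(z) <= p_max              # budget on selected variables
    for i, x in enumerate(X1):                  # group 1 must score below the cutoff
        prob += pulp.lpSum(w[j] * x[j] for j in range(d)) <= c - eps + M * m[i]
    for i, x in enumerate(X2):                  # group 2 must score above the cutoff
        prob += pulp.lpSum(w[j] * x[j] for j in range(d)) >= c + eps - M * m[len(X1) + i]
    prob.solve()
    return [j for j in range(d) if z[j].value() > 0.5]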

3.
We propose a criterion for variable selection in discriminant analysis. This criterion permits the variables to be arranged in decreasing order of adequacy for discrimination, so that the variable selection problem reduces to estimating a suitable permutation and dimensionality. Estimators for these parameters are then proposed, and the resulting variable selection method is shown to be consistent. In a simulation study, we compute proportions of correct classification after variable selection in order to assess the performance of our proposal and to compare it with existing methods.

4.
Block clustering aims to reveal homogeneous block structures in a data table. Among the different approaches to block clustering, we consider here a model-based method: the Gaussian latent block model for continuous data, which is an extension of the Gaussian mixture model for one-way clustering. For a given data table, several candidate models are usually examined, differing for example in the number of clusters, so model selection becomes a critical issue. To this end, we develop a criterion based on an approximation of the integrated classification likelihood for the Gaussian latent block model, and propose a Bayesian information criterion-like variant following the same pattern. We also propose a non-asymptotic exact criterion, thus circumventing the controversial definition of the asymptotic regime arising from the dual nature of the rows and columns in co-clustering. The experimental results show the stable performance of these criteria for medium to large data tables.
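To make the asymptotic-regime issue concrete, a generic BIC-like criterion for a latent block model with g row clusters and m column clusters on an n-by-d table has roughly the following shape (a hedged sketch; the paper's exact penalty may differ):

% nu_{g,m} denotes the number of per-block parameters
\mathrm{BIC}(g,m) \;\approx\; \log \hat{L}
  \;-\; \frac{g-1}{2}\log n         % row-proportion parameters
  \;-\; \frac{m-1}{2}\log d         % column-proportion parameters
  \;-\; \frac{\nu_{g,m}}{2}\log(nd) % block parameters

The ambiguity lies in which sample size (n, d, or nd) should enter each logarithm as both grow; the non-asymptotic exact criterion sidesteps this choice entirely.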

5.
We propose a new binary classification and variable selection technique especially designed for high-dimensional predictors. Among many predictors, typically only a small fraction have a significant impact on prediction. In such a situation, more interpretable models with better prediction accuracy can be obtained by performing variable selection along with classification. By adding an ℓ1-type penalty to the loss function, common classification methods such as logistic regression or support vector machines (SVM) can perform variable selection. Existing penalized SVM methods attempt to solve for all the parameters of the penalized problem jointly. When the data dimension is very high, this joint optimization problem is very complex and requires a great deal of memory. In this article, we propose a new penalized forward search technique that reduces the high-dimensional optimization problem to a sequence of one-dimensional optimizations by iterating the selection steps. The new algorithm can be regarded as a forward selection version of the penalized SVM and its variants. The advantage of optimizing in one dimension is that the location of the optimum can be found by intelligent search, exploiting the convexity and piecewise linear or quadratic structure of the criterion function. At each step, the predictor most predictive of the outcome is added to the model, and the search is repeated iteratively until convergence. Comparison of our new classification rule with the ℓ1-SVM and other common methods shows very promising performance, in that the proposed method leads to much leaner models without compromising misclassification rates, particularly for high-dimensional predictors.
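The sketch below illustrates the forward search idea under stated assumptions: at each step, every unselected predictor's coefficient is fit by a one-dimensional penalized hinge-loss minimization, with previously selected coefficients held fixed. The paper exploits the piecewise linear/quadratic structure for an exact scalar search; a generic scalar optimizer stands in for that step here, and all names are our own.

import numpy as np
from scipy.optimize import minimize_scalar

def hinge_obj(b, f0, xj, y, lam):
    # penalized hinge loss as a function of one candidate coefficient b
    margins = 1.0 - y * (f0 + b * xj)
    return np.maximum(margins, 0.0).sum() + lam * abs(b)

def forward_l1_svm(X, y, lam=1.0, max_steps=10, tol=1e-6):
    # y in {-1, +1}; returns selected coefficients as {column: value}
    n, d = X.shape
    coef, f = {}, np.zeros(n)                   # f holds current scores X w
    base = hinge_obj(0.0, f, np.zeros(n), y, lam)
    for _ in range(max_steps):
        best = None
        for j in set(range(d)) - set(coef):
            res = minimize_scalar(hinge_obj, args=(f, X[:, j], y, lam))
            if best is None or res.fun < best[1]:
                best = (j, res.fun, res.x)
        if best is None or best[1] >= base - tol:
            break                               # no predictor reduces the loss
        j, base, b = best
        coef[j] = b
        f = f + b * X[:, j]
    return coef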

6.
Algebras over estimation algorithms in the set of regular problems with nonoverlapping classes are considered. A correctness criterion for the arbitrary degree algebraic closure of the model of estimation algorithms in the classification problems of this type is proposed; this criterion can be efficiently verified. An estimate of the minimal degree of the algebraic closure that is sufficient for constructing a correct classifier in an arbitrary regular problem with nonoverlapping classes is found.

7.
In developing a classification model for assigning observations of unknown class to one of a number of specified classes using the values of a set of features associated with each observation, it is often desirable to base the classifier on a limited number of features. Mathematical programming discriminant analysis methods for developing classification models can be extended for feature selection. Classification accuracy can be used as the feature selection criterion by using a mixed integer programming (MIP) model in which a binary variable is associated with each training sample observation, but the binary variable requirements limit the size of problems to which this approach can be applied. Heuristic feature selection methods for problems with large numbers of observations are developed in this paper. These heuristic procedures, which are based on the MIP model for maximizing classification accuracy, are then applied to three credit scoring data sets.

8.
This paper presents a generalization of Rao's covariance structure. In a general linear regression model, we classify the error covariance structure into several categories and investigate the efficiency of the ordinary least squares estimator (OLSE) relative to the Gauss–Markov estimator (GME). The classification criterion considered here is the rank of the covariance matrix of the difference between the OLSE and the GME. Hence our classification includes Rao's covariance structure. The results are applied to models with special structures: a general multivariate analysis of variance model, a seemingly unrelated regression model, and a serial correlation model.

9.
A new definition of information is proposed that is minimalistic and incorporates a lifetime requirement for conditions (which we define here) applied to anything that can be considered information. The definition explicitly treats a state in thermodynamic equilibrium as an effectively zero-information state. It includes the absolute requirement of selection for achieving information, the selection criterion being that the information directly or indirectly contributes to its own replication. The definition also explicitly incorporates the laws of thermodynamics, the Second Law leading to the requirement for replication. Finally, the definition explains the origin of information and predicts the monotonic increase of information with time.

10.
The curse of dimensionality refers to the fact that high-dimensional data are often difficult to work with: a large number of features can increase the noise in the data and thus the error of a learning algorithm. Feature selection addresses such problems by reducing the data dimensionality. Different feature selection algorithms may yield feature subsets that are only local optima in the space of feature subsets. Ensemble feature selection combines independent feature subsets and may give a better approximation to the optimal subset of features. We propose an ensemble feature selection approach based on an assessment of the feature selectors' reliability. It aims to provide a unique and stable feature selection without sacrificing predictive accuracy. A classification algorithm is used as an evaluator to assign a confidence to the features selected by each ensemble member, based on the associated classification performance. We compare our proposed approach to several existing techniques and to individual feature selection algorithms. Results show that our approach often improves both classification performance and feature selection stability on high-dimensional data sets.
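A hedged sketch of the reliability idea, with the weighting scheme as our own illustrative assumption: each base selector proposes a subset, a classifier's cross-validated accuracy on that subset serves as the selector's confidence, and features are kept when their confidence-weighted vote is high enough.

import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def ensemble_select(X, y, k=10, threshold=0.5):
    selectors = [SelectKBest(f_classif, k=k),
                 SelectKBest(mutual_info_classif, k=k)]
    votes, total = np.zeros(X.shape[1]), 0.0
    for sel in selectors:
        mask = sel.fit(X, y).get_support()      # boolean subset from this member
        acc = cross_val_score(LogisticRegression(max_iter=1000),
                              X[:, mask], y, cv=5).mean()
        votes += acc * mask                     # confidence-weighted vote
        total += acc
    return np.where(votes / total >= threshold)[0]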

11.
A method for feature selection in linear regression based on an extension of Akaike's information criterion is proposed. Using the classical Akaike information criterion (AIC) for feature selection requires an exhaustive search through all subsets of features, which is computationally prohibitive. A new information criterion is proposed that is a continuous extension of AIC, so the feature selection problem is reduced to a smooth optimization problem, and an efficient procedure for solving it is derived. Experiments show that the proposed method selects features in linear regression efficiently. In the experiments, the proposed procedure is compared with the relevance vector machine, a Bayesian feature selection method, and the two procedures are shown to yield similar results. The main distinction of the proposed method is that certain regularization coefficients are exactly zero, which makes it possible to avoid the underfitting effect characteristic of the relevance vector machine. A special case (the so-called nondiagonal regularization) is considered in which the two methods coincide.
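For contrast, here is a hedged sketch of the classical exhaustive AIC subset search that the smooth criterion is designed to avoid; it requires a least-squares fit for each of the 2^d - 1 nonempty subsets.

import itertools
import numpy as np

def aic_linear(X, y):
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + 2 * X.shape[1]  # Gaussian AIC up to constants

def best_subset_aic(X, y):
    d = X.shape[1]
    best = (np.inf, ())
    for r in range(1, d + 1):
        for subset in itertools.combinations(range(d), r):
            best = min(best, (aic_linear(X[:, subset], y), subset))
    return best                                  # (AIC value, column indices)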

12.
In automobile insurance, policyholders are generally classified according to rating factors such as age, territory, and type of vehicle, and charged premiums appropriate to their risk class. The problem of designing an efficient classification system can be viewed as a contiguous clustering problem. A least squares decision criterion is employed and illustrated for both the unidimensional and multidimensional cases. Both the benefit and the cost of classifying risks are functions of the system's degree of complexity, and the least squares criterion is used as a decision support mechanism for aiding system design decisions.
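For the unidimensional case, a hedged sketch follows: risks are ordered along a single rating factor and the within-class sum of squares is minimized over contiguous partitions by dynamic programming. This is a standard formulation of least squares contiguous clustering, not necessarily the paper's.

import numpy as np

def contiguous_lsq_clusters(x, k):
    # Partition sorted values x into k contiguous classes minimizing SSE.
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    def sse(i, j):                               # cost of one class on x[i:j]
        seg = x[i:j]
        return ((seg - seg.mean()) ** 2).sum()
    INF = float("inf")
    cost = [[INF] * (k + 1) for _ in range(n + 1)]
    cut = [[0] * (k + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for j in range(1, n + 1):
        for c in range(1, min(j, k) + 1):
            for i in range(c - 1, j):            # last class covers x[i:j]
                cand = cost[i][c - 1] + sse(i, j)
                if cand < cost[j][c]:
                    cost[j][c], cut[j][c] = cand, i
    bounds, j = [], n
    for c in range(k, 0, -1):                    # walk back to recover cut points
        bounds.append(cut[j][c])
        j = cut[j][c]
    return cost[n][k], sorted(bounds)[1:]        # total SSE, interior cut indices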

13.
In this Note, we consider the problem of order selection for vector autoregressive moving-average (VARMA) models under the assumption that the errors are uncorrelated but not necessarily independent. These models are called weak VARMA, in contrast to the standard VARMA models, also called strong VARMA models, in which the error terms are assumed to be iid. The selection is based on minimizing an information criterion, in particular the one introduced by Akaike. The theoretical foundations of the Akaike information criterion (AIC) are no longer established once the iid assumption on the noise is relaxed. We propose a modified AIC, which may differ considerably from the standard criterion.

14.
An algorithm of simplicial optimization is proposed in which a bi-criteria selection of simplices for bisection is applied. The first criterion is the minimum of the estimated Lipschitz lower bound over the considered simplex; the second is the diameter of the simplex. Results of experimental testing are included.
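One plausible reading of the bi-criteria selection, sketched below under stated assumptions: among the current simplices, those that are Pareto-optimal with respect to (estimated lower bound, diameter) are selected for bisection. The bound formula used here (best vertex value minus L times the diameter) is a common Lipschitz construction assumed for illustration, not necessarily the paper's.

import numpy as np

def diameter(S):
    return max(np.linalg.norm(a - b) for a in S for b in S)

def lower_bound(S, fvals, L):
    return min(fvals) - L * diameter(S)          # Lipschitz bound over the simplex

def select_for_bisection(simplices, fvals_list, L):
    # criteria to minimize: (lower bound, negative diameter)
    crit = [(lower_bound(S, fv, L), -diameter(S))
            for S, fv in zip(simplices, fvals_list)]
    keep = []
    for i, ci in enumerate(crit):
        dominated = any(cj[0] <= ci[0] and cj[1] <= ci[1] and cj != ci
                        for j, cj in enumerate(crit) if j != i)
        if not dominated:
            keep.append(i)                       # Pareto-optimal: bisect these
    return keep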

15.
In the qualitative theory of differential equations, the classification of simple (first-order) singular points, the classification of higher-order singular points, and the stability of limit cycles are all important problems, and they have traditionally been solved by different methods. The distinction between a focus and a center at a higher-order singular point has remained an open problem. In this paper we show theoretically that all of the above problems can be treated in a unified way using the concept of an integrating factor. In addition, we give a method for distinguishing a center from a focus that applies equally to simple and higher-order singular points, thereby solving the center-focus problem for higher-order singular points.

16.
This paper discusses model selection for finite-dimensional normal regression models. We compare model selection criteria according to prediction errors based on prediction with refitting and prediction without refitting. We provide a new lower bound for prediction without refitting, while a lower bound for prediction with refitting was given by Rissanen. Moreover, we specify a set of sufficient conditions for a model selection criterion to achieve these bounds. The achievability of the two bounds by the following selection rules is then addressed: Rissanen's accumulated prediction error criterion (APE), his stochastic complexity criterion, AIC, BIC, and the FPE criteria. In particular, we provide upper bounds on the overfitting and underfitting probabilities needed for achievability. Finally, we offer a brief discussion of finite-dimensional versus infinite-dimensional model assumptions.

17.
The efficacy of family-based approaches to mixture model-based clustering and classification depends on the selection of parsimonious models. Current wisdom suggests the Bayesian information criterion (BIC) for mixture model selection. However, the BIC has well-known limitations, including a tendency to overestimate the number of components as well as a proclivity for underestimating, often drastically, the number of components in higher dimensions. While the former problem might be soluble by merging components, the latter is impossible to mitigate in clustering and classification applications. In this paper, a LASSO-penalized BIC (LPBIC) is introduced to overcome this problem. This approach is illustrated based on applications of extensions of mixtures of factor analyzers, where the LPBIC is used to select both the number of components and the number of latent factors. The LPBIC is shown to match or outperform the BIC in several situations.
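As a point of reference for the baseline being improved upon, here is a hedged sketch of standard BIC-based selection of the number of mixture components, using scikit-learn's plain Gaussian mixture rather than mixtures of factor analyzers:

import numpy as np
from sklearn.mixture import GaussianMixture

def select_by_bic(X, max_components=10, seed=0):
    fits = [GaussianMixture(n_components=g, random_state=seed).fit(X)
            for g in range(1, max_components + 1)]
    bics = [m.bic(X) for m in fits]              # sklearn's BIC: lower is better
    return int(np.argmin(bics)) + 1, bics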

18.
Variable Selection in Joint Generalized Linear Models
In joint generalized linear models, both the mean and the dispersion are modelled with generalized linear model structures. This paper considers variable selection for such models. Using the extended quasi-likelihood, we propose a new variable selection criterion (EAIC) suited to joint generalized linear models, which generalizes the Akaike information criterion. A simulation study and a real-data example verify the effectiveness of the criterion.
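Assuming the criterion follows the usual AIC pattern, a plausible form is sketched below, with Q^+ the extended quasi-likelihood and p and q the numbers of mean- and dispersion-model parameters; the paper's exact definition may differ.

% hedged sketch: AIC-style criterion built on the extended quasi-likelihood
\mathrm{EAIC} \;=\; -2\,Q^{+}\!\left(\hat{\beta}, \hat{\gamma}\right) \;+\; 2\,(p + q)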

19.
This paper deals with the unsupervised classification of univariate observations. Given a set of observations originating from a K-component mixture, we focus on estimating the component expectations. We propose an algorithm based on minimizing the “K-product” (KP) criterion introduced in previous work. We show that the global minimum of this criterion can be reached by first solving a linear system and then computing the roots of a polynomial of order K. The KP global minimum provides a first raw estimate of the component expectations; a nearest-neighbour classification then refines this estimate. Our method's relevance is illustrated through simulations of various mixtures. When the mixture components do not strongly overlap, the KP algorithm provides better estimates than the expectation-maximization algorithm.
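A hedged sketch of the refinement stage described above (the KP minimization itself is not reproduced here): each observation is assigned to the nearest current mean and the means are re-estimated as class averages, i.e., Lloyd-style iterations in one dimension.

import numpy as np

def nn_refine(x, means, n_iter=10):
    # x: univariate observations; means: raw component-expectation estimates
    x, means = np.asarray(x, float), np.asarray(means, float)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - means[None, :]), axis=1)
        means = np.array([x[labels == k].mean() if np.any(labels == k)
                          else means[k] for k in range(len(means))])
    return means, labels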

20.
This paper compares heuristic criteria used for extracting a pre-specified number of fuzzy classification rules from numerical data. We examine the performance of each heuristic criterion through computational experiments on well-known test problems. Experimental results show that composite criteria built from confidence and support measures give better results than either measure used individually. It is also shown that genetic algorithm-based rule selection can improve the classification ability of the extracted fuzzy rules by searching for good rule combinations. This observation suggests the importance of taking into account the combinatorial effect of fuzzy rules (i.e., the interaction among them).
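A hedged sketch of the two measures discussed above for a fuzzy rule 'if x is A then class c', using a generic membership function mu_A. The definitions follow common fuzzy extensions of association-rule confidence and support, and the composite criterion shown is one simple combination of the kind compared.

import numpy as np

def fuzzy_confidence_support(mu_A, X, y, c):
    y = np.asarray(y)
    compat = np.array([mu_A(x) for x in X])      # rule compatibility per pattern
    total = compat.sum()
    in_class = compat[y == c].sum()
    confidence = in_class / total if total > 0 else 0.0
    support = in_class / len(X)
    return confidence, support

def composite(mu_A, X, y, c):
    conf, supp = fuzzy_confidence_support(mu_A, X, y, c)
    return conf * supp                           # product of confidence and support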

