Similar Literature
Found 20 similar documents; search took 15 ms.
1.
In this paper, we consider the expected probabilities of misclassification (EPMC) of the linear discriminant function (LDF) based on two-step monotone missing samples and derive an asymptotic approximation for the EPMC with an explicit form for the considered LDF. For this purpose, we also provide some results on the expectations of inverted Wishart matrices. Finally, we conduct a Monte Carlo simulation to evaluate our result.

2.
A class of discriminant rules which includes Fisher’s linear discriminant function and the likelihood ratio criterion is defined. Using asymptotic expansions of the distributions of the discriminant functions in this class, we derive a formula for cut-off points which satisfy some conditions on misclassification probabilities, and derive the optimal rules for some criteria. Some numerical experiments are carried out to examine the performance of the optimal rules for finite numbers of samples.

3.
Robust S-estimation is proposed for multivariate Gaussian mixture models, generalizing the work of Hastie and Tibshirani (J. Roy. Statist. Soc. Ser. B 58 (1996) 155). In Gaussian mixture models, the unknown location and scale parameters are estimated by the EM algorithm. In the presence of outliers, the maximum likelihood estimators of the unknown parameters are affected, resulting in the misclassification of observations. The robust S-estimators of the unknown parameters replace the non-robust estimators from the M-step of the EM algorithm. The results were compared with the standard mixture discriminant analysis approach using the probability of misclassification criterion. This comparison showed a slight reduction in the average probability of misclassification using robust S-estimators as compared to the standard maximum likelihood estimators.

4.
In this paper we present a general notion of Fisher's linear discriminant analysis that extends the classical multivariate concept to situations that allow for function-valued random elements. The development uses a bijective mapping that connects a second order process to the reproducing kernel Hilbert space generated by its within class covariance kernel. This approach provides a seamless transition between Fisher's original development and infinite dimensional settings that lends itself well to computation via smoothing and regularization. Simulation results and real data examples are provided to illustrate the methodology.

5.
Much work in discriminant analysis and statistical pattern recognition has been performed in the former Soviet Union. However, most results derived by former Soviet Union researchers are unknown to statisticians and statistical pattern recognition researchers in the West. We attempt to give a succinct overview of important contributions by Soviet Bloc researchers to several topics in the discriminant analysis literature concerning the small training-sample size problem. We also include a partial review of corresponding work done in the West.

6.
In this article, an unconstrained Taylor series expansion is constructed for scalar-valued functions of vector-valued arguments that are subject to nonlinear equality constraints. The expansion is made possible by first reparameterizing the constrained argument in terms of identified and implicit parameters and then expanding the function solely in terms of the identified parameters. Matrix expressions are given for the derivatives of the function with respect to the identified parameters. The expansion is employed to construct an unconstrained Newton algorithm for optimizing the function subject to constraints. Parameters in statistical models often are estimated by solving statistical estimating equations. It is shown how the unconstrained Newton algorithm can be employed to solve constrained estimating equations. Also, the unconstrained Taylor series is adapted to construct Edgeworth expansions of scalar functions of the constrained estimators. The Edgeworth expansion is illustrated on maximum likelihood estimators in an exploratory factor analysis model in which an oblique rotation is applied after Kaiser row-normalization of the factor loading matrix. A simulation study illustrates the superiority of the two-term Edgeworth approximation compared to the asymptotic normal approximation when sampling from multivariate normal or nonnormal distributions.

7.
In this paper, we analyze matrix dynamics for online linear discriminant analysis (online LDA). Convergence of the dynamics has been studied for nonsingular cases; our main contribution is an analysis of singular cases, which is key to efficient calculation without full-size square matrices. All fixed points of the dynamics are identified and their stability is examined.

8.
The concept of quadratic subspace is introduced as a helpful tool for dimension reduction in quadratic discriminant analysis (QDA). It is argued that an adequate representation of the quadratic subspace may lead to better methods for both data representation and classification. Several theoretical results describe the structure of the quadratic subspace, which is shown to contain some of the subspaces previously proposed in the literature for finding differences between the class means and covariances. A suitable assumption of orthogonality between location and dispersion subspaces allows us to derive a convenient reduced version of the full QDA rule. The behavior of these ideas in practice is illustrated with three real data examples.

9.
The limit behavior of the conditional probability of error of linear and quadratic discriminant analyses is studied under wide assumptions on the class conditional distributions. Results obtained may help to explain analytically the behavior in applications of linear and quadratic discrimination techniques.

10.
This paper explores some properties of the quadratic subspace, a tool for dimension reduction in discriminant analysis (Velilla, 2008; Velilla, 2010). This linear manifold has a fairly complex structure, and it may sometimes include components with both mean and covariance separation properties. In this case, an assumption of orthogonality between the leading location directions and the bulk of the dispersion subspaces can help to find an adequate directional representation of it in practice. Two real data sets are analyzed.

11.
Let $\Lambda = |S_e| / |S_e + S_h|$, where $S_h$ and $S_e$ are independently distributed as Wishart distributions $W_p(q, \Sigma)$ and $W_p(n, \Sigma)$, respectively. Then $\Lambda$ has Wilks’ lambda distribution $\Lambda_{p,q,n}$, which appears as the distribution of various multivariate likelihood ratio tests. This paper is concerned with the theoretical accuracy of asymptotic expansions of the distribution of $T = -n \log \Lambda$. We derive error bounds for the approximations. Notably, our error bounds are given in explicit and computable forms.
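As a rough numerical illustration of the statistic above, the following sketch simulates $T = -n \log \Lambda$ for $p = 2$ and compares its sample mean with $pq$, the mean of the classical chi-square limit with $pq$ degrees of freedom. It is not the paper's expansion or its error bounds. The helper names (`wishart2`, `wilks_lambda`, `simulate_T`), the choice $\Sigma = I$, and the restriction to 2x2 matrices are assumptions made for brevity.

```python
import math
import random


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def wishart2(df, rng):
    """Draw a 2x2 Wishart(df, I) matrix as a sum of df outer products z z'."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(df):
        z = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        for i in range(2):
            for j in range(2):
                s[i][j] += z[i] * z[j]
    return s


def wilks_lambda(se, sh):
    """Lambda = |Se| / |Se + Sh| for 2x2 matrices."""
    total = [[se[i][j] + sh[i][j] for j in range(2)] for i in range(2)]
    return det2(se) / det2(total)


def simulate_T(n, q, reps, seed=0):
    """Simulate T = -n log(Lambda) with p = 2 and Sigma = I (assumed)."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        se = wishart2(n, rng)  # error matrix, n degrees of freedom
        sh = wishart2(q, rng)  # hypothesis matrix, q degrees of freedom
        values.append(-n * math.log(wilks_lambda(se, sh)))
    return values


if __name__ == "__main__":
    ts = simulate_T(n=50, q=3, reps=2000)
    print("sample mean of T:", sum(ts) / len(ts), "chi-square mean pq:", 2 * 3)
```

Taking $\Sigma = I$ loses no generality here, since the distribution of $\Lambda$ does not depend on $\Sigma$.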

12.
This paper examines asymptotic distributions of the canonical correlations between two random vectors $x_1$ ($q \times 1$) and $x_2$ ($p \times 1$) with $q \le p$, based on a sample of size $N = n + 1$. The asymptotic distributions of the canonical correlations have been studied extensively when the dimensions $q$ and $p$ are fixed and the sample size $N$ tends toward infinity. However, these approximations worsen when $q$ or $p$ is large in comparison to $N$. To overcome this weakness, this paper first derives asymptotic distributions of the canonical correlations under a high-dimensional framework such that $q$ is fixed, $m = n - p \to \infty$ and $c = p/n \to c_0 \in [0, 1)$, assuming that $x_1$ and $x_2$ have a joint $(q+p)$-variate normal distribution. An extended Fisher’s z-transformation is proposed. Then, the asymptotic distributions are improved further by deriving their asymptotic expansions. Numerical simulations revealed that our approximations are more accurate than the classical approximations for a large range of $p$, $q$, and $n$ and the population canonical correlations.
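For orientation, the classical Fisher z-transformation that the paper extends can be sketched as follows. This is the fixed-dimension textbook version, $z = \operatorname{artanh}(r)$ with approximate variance $1/(N-3)$, applied to an ordinary sample correlation coefficient. It is not the extended high-dimensional transformation proposed in the paper, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_z(r):
    """Classical Fisher z-transformation of a sample correlation r."""
    return math.atanh(r)


def correlation_interval(r, n_obs, level=0.95):
    """Approximate confidence interval for a correlation via Fisher's z.

    Uses the textbook large-sample variance 1 / (n_obs - 3) on the z scale,
    then maps the interval back to the correlation scale with tanh.
    """
    z = fisher_z(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2.0) / math.sqrt(n_obs - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


if __name__ == "__main__":
    print(correlation_interval(0.5, 103))
```

The back-transformation with `tanh` keeps the interval inside (-1, 1), which is one practical reason for working on the z scale.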

13.
In many real-world classification problems, class-conditional classification noise (CCC-Noise) frequently deteriorates the performance of a classifier that is naively built by ignoring it. In this paper, we investigate the impact of CCC-Noise on the quality of a popular generative classifier, normal discriminant analysis (NDA), and its corresponding discriminative classifier, logistic regression (LR). We consider the problem of two multivariate normal populations having a common covariance matrix. We compare the asymptotic distribution of the misclassification error rate of these two classifiers under CCC-Noise. We show that when the noise level is low, the asymptotic error rates of both procedures are only slightly affected. We also show that LR is degraded less by CCC-Noise than NDA. Under CCC-Noise contexts, the Mahalanobis distance between the populations plays a vital role in determining the relative performance of these two procedures. In particular, when this distance is small, LR tends to be more tolerant of CCC-Noise than NDA.
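The role of the Mahalanobis distance can be made concrete in the noise-free baseline case. For two normal populations with a common covariance matrix, known parameters, and equal priors, the optimal linear rule has error rate $\Phi(-\Delta/2)$, where $\Delta$ is the Mahalanobis distance. The sketch below computes $\Delta$ and this error rate for bivariate populations. It does not model CCC-Noise or reproduce the paper's asymptotic results, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def mahalanobis(mu1, mu2, cov):
    """Mahalanobis distance between two bivariate means under covariance cov."""
    d0, d1 = mu1[0] - mu2[0], mu1[1] - mu2[1]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    # Quadratic form d' cov^{-1} d, using the explicit 2x2 inverse.
    quad = (cov[1][1] * d0 * d0
            - (cov[0][1] + cov[1][0]) * d0 * d1
            + cov[0][0] * d1 * d1) / det
    return math.sqrt(quad)


def optimal_error_rate(delta):
    """Error rate Phi(-delta / 2) of the optimal rule with equal priors."""
    return NormalDist().cdf(-delta / 2.0)


if __name__ == "__main__":
    delta = mahalanobis((0.0, 0.0), (2.0, 0.0), [[1.0, 0.0], [0.0, 1.0]])
    print(delta, optimal_error_rate(delta))
```

A small $\Delta$ means heavily overlapping populations, so the baseline error rate is already close to 0.5 before any label noise is introduced.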

14.
General formulas of the asymptotic cumulants of a studentized parameter estimator are given up to the fourth order with the added higher-order asymptotic variance. Using the sample counterparts of the asymptotic cumulants, formulas for the Cornish-Fisher expansions with third-order accuracy are obtained. Some new methods of monotonic transformations of the studentized estimator are presented. In addition, similar transformations of a fixed normal deviate are proposed up to the same order with some asymptotic comparisons to the transformations of the studentized estimator. Applications to a mean and a binomial proportion are shown with simulations for estimation of the proportion.
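To make the idea of a Cornish-Fisher expansion concrete, here is a minimal sketch of the textbook expansion for the quantile of a standardized variable in terms of its skewness and excess kurtosis. It is a generic illustration, not the formulas for studentized estimators derived in the paper; the function name and argument names are hypothetical.

```python
from statistics import NormalDist


def cornish_fisher_quantile(p, skew, ex_kurt):
    """Approximate p-quantile of a standardized variable (mean 0, variance 1).

    Adjusts the normal quantile z with the standard Cornish-Fisher terms
    in the skewness and the excess kurtosis.
    """
    z = NormalDist().inv_cdf(p)
    return (z
            + (z ** 2 - 1.0) * skew / 6.0
            + (z ** 3 - 3.0 * z) * ex_kurt / 24.0
            - (2.0 * z ** 3 - 5.0 * z) * skew ** 2 / 36.0)


if __name__ == "__main__":
    print(cornish_fisher_quantile(0.975, 0.0, 0.0))  # plain normal quantile
    print(cornish_fisher_quantile(0.975, 0.5, 0.0))  # right-skewed case
```

With zero skewness and zero excess kurtosis the correction terms vanish and the function returns the normal quantile unchanged.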

15.
Let $\alpha(n_1, n_2)$ be the probability of classifying an observation from population $\Pi_1$ into population $\Pi_2$ using Fisher's linear discriminant function based on samples of size $n_1$ and $n_2$. A standard estimator of $\alpha$, denoted by $T_1$, is the proportion of observations in the first sample misclassified by the discriminant function. A modification of $T_1$, denoted by $T_2$, is obtained by eliminating the observation being classified from the calculation of the discriminant function. The UMVU estimators, T11 and T21, of $E T_1 = \tau_1(n_1, n_2)$ and $E T_2 = \tau_2(n_1, n_2) = \alpha(n_1 - 1, n_2)$ are derived for the case when the populations have multivariate normal distributions with common dispersion matrix. It is shown that T11 and T21 are nonincreasing functions of $D^2$, the Mahalanobis sample distance. This result is used to derive the sampling distributions and moments of T11 and T21. It is also shown that $\alpha$ is a decreasing function of $\Delta^2 = (\mu_1 - \mu_2)' \Sigma^{-1} (\mu_1 - \mu_2)$. Hence, by truncating T11 and T21 (or any estimator) at the value of $\alpha$ for $\Delta = 0$, new estimators are obtained which, for all samples, are as close or closer to $\alpha$.
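The two basic estimators described above, the resubstitution proportion $T_1$ and its leave-one-out modification $T_2$, can be sketched for bivariate data as follows. This is a minimal illustration under assumed conventions (pooled sample covariance, cut-off at the midpoint of the sample means, ties assigned to $\Pi_2$). It does not implement the UMVU estimators derived in the paper, and all function names are hypothetical.

```python
def mean2(rows):
    """Sample mean of a list of 2-D points."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]


def scatter2(rows, m):
    """Sum of squares and cross-products matrix about the mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = (r[0] - m[0], r[1] - m[1])
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_rule(s1, s2):
    """Return a function that is True when x is assigned to population 1."""
    m1, m2 = mean2(s1), mean2(s2)
    a, b = scatter2(s1, m1), scatter2(s2, m2)
    dof = len(s1) + len(s2) - 2
    c = [[(a[i][j] + b[i][j]) / dof for j in range(2)] for i in range(2)]
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    d = (m1[0] - m2[0], m1[1] - m2[1])
    # w = (pooled covariance)^{-1} (m1 - m2), via the explicit 2x2 inverse.
    w = ((c[1][1] * d[0] - c[0][1] * d[1]) / det,
         (-c[1][0] * d[0] + c[0][0] * d[1]) / det)
    mid = ((m1[0] + m2[0]) / 2.0, (m1[1] + m2[1]) / 2.0)
    return lambda x: w[0] * (x[0] - mid[0]) + w[1] * (x[1] - mid[1]) > 0


def resubstitution(s1, s2):
    """T1: share of sample 1 misclassified by the rule built on all data."""
    rule = fisher_rule(s1, s2)
    return sum(not rule(x) for x in s1) / len(s1)


def leave_one_out(s1, s2):
    """T2: as T1, but each point is left out when the rule is computed."""
    errors = 0
    for i, x in enumerate(s1):
        rule = fisher_rule(s1[:i] + s1[i + 1:], s2)
        errors += not rule(x)
    return errors / len(s1)


if __name__ == "__main__":
    s1 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.5, 5.5)]
    s2 = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]
    print(resubstitution(s1, s2), leave_one_out(s1, s2))
```

Leaving out the observation being classified removes its influence on the sample mean and pooled covariance, which is what distinguishes $T_2$ from $T_1$.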

16.
A general methodology for selecting predictors for Gaussian generative classification models is presented. The problem is regarded as a model selection problem. Three different roles for each possible predictor are considered: a variable can be a relevant classification predictor or not, and the irrelevant classification variables can be linearly dependent on a part of the relevant predictors or independent variables. This variable selection model was inspired by a previous work on variable selection in model-based clustering. A BIC-like model selection criterion is proposed. It is optimized through two embedded forward stepwise variable selection algorithms for classification and linear regression. The model identifiability and the consistency of the variable selection criterion are proved. Numerical experiments on simulated and real data sets illustrate the interest of this variable selection methodology. In particular, it is shown that this well-grounded variable selection model can be of great interest to improve the classification performance of quadratic discriminant analysis in a high-dimensional context.

17.
Classical discriminant analysis focusses on Gaussian and nonparametric models, where in the second case the unknown densities are replaced by kernel densities based on the training sample. In the present article we assume that it suffices to base the classification on exceedances above higher thresholds, which can be interpreted as observations in a conditional framework. Therefore, only the statistical modeling of truncated distributions is required. In this context, a nonparametric modeling is not adequate because the kernel method is inaccurate in the upper tail region. Yet one may deal with truncated parametric distributions like the Gaussian ones. Our primary aim is to replace truncated Gaussian distributions by appropriate generalized Pareto distributions and to explore properties and the relationship of discriminant functions in both models.

18.
This paper analyzes the problem of using the sample covariance matrix to detect the presence of clustering in p-variate data in the special case when the component covariance matrices are known up to a constant multiplier. For the case of testing one population against a mixture of two populations, tests are derived and shown to be optimal in a certain sense. Some of their distribution properties are derived exactly. Some remarks on the extensions of these tests to mixtures of kp populations are included. The paper is essentially a formal treatment (in a special case) of some well-known procedures. The methods used in deriving the distribution properties are applicable to a variety of other situations involving mixtures.

19.
Asymptotic expansions of the distributions of parameter estimators in mean and covariance structures are derived. The parameters may be common to, or specific to, the means and covariances of observable variables. The means are possibly structured by the common/specific parameters. First, the distributions of the parameter estimators standardized by the population asymptotic standard errors are expanded using the single- and the two-term Edgeworth expansions. In practice, the pivotal statistic or the Studentized estimator with the asymptotically distribution-free standard error is of interest. An asymptotic distribution of the pivotal statistic is also derived by the Cornish-Fisher expansion. Simulations are performed for a factor analysis model with nonzero factor means to see the accuracy of the asymptotic expansions in finite samples.

20.
This paper deals with two criteria for selection of variables for discriminant analysis in the case of two multivariate normal populations with different means and a common covariance matrix. One is based on the estimated error rate of misclassification. The other uses Akaike's information criterion. The asymptotic distributions and error rate risks of the criteria are obtained. The results show that the two criteria are asymptotically equivalent in the sense that their asymptotic distributions and error rate risks are identical.
