Similar literature
 Found 20 similar documents (search time: 210 ms)
1.
We propose a new definition of the Neyman chi-square divergence between distributions. Based on convexity properties and duality, this version of the χ² is well suited both for the classical applications of the χ² for the analysis of contingency tables and for the statistical tests in parametric models, for which it is advocated to be robust against outliers. We present two applications in testing. In the first one, we deal with goodness-of-fit tests for finite and infinite numbers of linear constraints; in the second one, we apply χ²-methodology to parametric testing against contamination.

2.
In this article we develop a nonparametric methodology for estimating the mean change for matched samples on a Lie group. We then notice that for k≥5, a manifold of projective shapes of k-ads in 3D has the structure of a (3k−15)-dimensional Lie group that is equivariantly embedded in a Euclidean space; therefore, testing for mean change amounts to a one sample test for extrinsic means on this Lie group. The Lie group technique leads to a large sample and a nonparametric bootstrap test for one population extrinsic mean on a projective shape space, as recently developed by Patrangenaru, Liu and Sughatadasa. On the other hand, in the absence of occlusions, the 3D projective shape of a spatial k-ad can be recovered from a stereo pair of images, thus allowing one to test for mean glaucomatous 3D projective shape change from standard stereo pair eye images.

3.
Model identification and discrimination are two major statistical challenges. In this paper we consider a set of models M_k for factorial experiments with the parameters representing the general mean, main effects, and only k out of all two-factor interactions. We consider the class D of all fractional factorial plans with the same number of runs having the ability to identify all the models in M_k, i.e., the full estimation capacity. The fractional factorial plans in D with the full estimation capacity for k ≥ 2 are able to discriminate between models in M_u for u ≤ k*, where k* = k/2 when k is even and k* = (k−1)/2 when k is odd. We obtain fractional factorial plans in D satisfying the six optimality criterion functions AD, AT, AMCR, GD, GT, and GMCR for 2^m factorial experiments when m=4 and 5. Both single stage and multi-stage (hierarchical) designs are given. Some results on estimation capacity of a fractional factorial plan for identifying models in M_k are also given. Our designs D4.1 and D10 stand out in their performances relative to the designs given in Li and Nachtsheim [Model-robust factorial designs, Technometrics 42(4) (2000) 345-352] for m=4 and 5 with respect to the criterion functions AD, AT, AMCR, GD, GT, and GMCR. Our design D4.2 stands out in its performance relative to the Li-Nachtsheim design for m=4 with respect to the four criterion functions AT, AMCR, GT, and GMCR. However, the Li-Nachtsheim design for m=4 stands out in its performance relative to our design D4.2 with respect to the criterion functions AD and GD. Our design D14 does have the full estimation capacity for k=5 but the twelve run Li-Nachtsheim design does not have the full estimation capacity for k=5.

4.
Two robustness criteria are presented that are applicable to general clustering methods. Robustness and stability in cluster analysis are not only data dependent, but even cluster dependent. In the present paper, robustness is defined as a property of not only the clustering method, but also of every individual cluster in a data set. The main principles are: (a) dissimilarity measurement of an original cluster with the most similar cluster in the induced clustering obtained by adding data points, (b) the dissolution point, which is an adaptation of the breakdown point concept to single clusters, (c) isolation robustness: given a clustering method, is it possible to join, by addition of g points, arbitrarily well separated clusters? Results are derived for k-means, k-medoids (k estimated by average silhouette width), trimmed k-means, mixture models (with and without noise component, with and without estimation of the number of clusters by BIC), single and complete linkage.

5.
For normally distributed data from k populations with m×m covariance matrices Σ_1,…,Σ_k, we test the hypothesis H: Σ_1 = ⋯ = Σ_k vs the alternative A ≠ H when the numbers of observations N_i, i=1,…,k, from each population are less than or equal to the dimension m, N_i ≤ m, i=1,…,k. Two tests are proposed and compared with two other tests proposed in the literature. These tests, however, do not require that N_i ≤ m, and thus can be used in all situations, including when the likelihood ratio test is available. The asymptotic distributions of the test statistics are given, and the power is compared by simulations with other test statistics proposed in the literature. The proposed tests perform well and better in several cases than the other two tests available in the literature.

6.
Empirical likelihood (EL) ratio tests are developed for testing for or against the hypothesis that k-population means μ_1, μ_2, …, μ_k are isotonic with respect to some quasi-order ≼ on {1,2,…,k}. The null asymptotic distributions are derived and are shown to be of chi-bar squared type. The asymptotic power of the proposed test for testing for equality of these means against the order restriction is derived under contiguous alternatives and a simulation study is carried out to investigate the finite sample behaviors of this test. In addition, an adjusted EL test is used to improve the small size performance of our test and an example is also discussed to illustrate the theoretical results.

7.
Autoregressive time series models of order p have p+2 parameters: the mean, the variance of the white noise and the p autoregressive parameters. Change in any of these over time is a sign of disturbance that is important to detect. The methods of this paper can test for change in any one of these p+2 parameters separately, or in any collection of them. They are available in forms that make one-sided tests possible; furthermore, they can be used to test for a temporary change. The test statistics are based on the efficient score vector. The large sample properties of the change-point estimator are also explored.

8.
Empirical Bayes estimators are given for the mean of a k-dimensional normal distribution, k ≥ 3. We assume that y ~ N_k(θ, V_1), V_1 = diag(v_i), v_i known (i = 1, 2,…, k); also, θ ~ N_k(0, V_2), with V_2 defined by one or more unknown parameters. Of particular interest is V_2 generated by an autoregressive process. A recent result of Efron and Morris is used to obtain necessary and sufficient conditions for the minimaxity of our estimators. Practical sufficient conditions (for minimaxity) are obtained by exploiting the structure of V_2. Another result shows that our estimators have good Bayesian properties. Estimates of the exact size of Pearson's chi-square test are given in an example; the autoregressive prior is very natural in this situation.

9.
Testing for the independence between two categorical variables R and S forming a contingency table is a well-known problem: the classical chi-square and likelihood ratio tests are used. Suppose now that for each individual a set of p characteristics is also observed. Those explanatory variables, likely to be associated with R and S, can play a major role in their possible association, and it can therefore be interesting to test the independence between R and S conditionally on them. In this paper, we propose two nonparametric tests which generalise the chi-square and the likelihood ratio ideas to this case. The procedure is based on a kernel estimator of the conditional probabilities. The asymptotic law of the proposed test statistics under the conditional independence hypothesis is derived; the finite sample behaviour of the procedure is analysed through some Monte Carlo experiments and the approach is illustrated with a real data example.
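
As a point of reference for the abstract above, the following is a minimal sketch of the classical (unconditional) Pearson chi-square statistic for an r × c contingency table. It is not the kernel-based conditional test the paper proposes; the function name `pearson_chi_square` and the example table are illustrative assumptions, and the sketch assumes every row and column total is positive.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts.

    Assumes every row total and column total is positive, so that all
    expected counts under independence are non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical 2 x 2 table of counts for R (rows) and S (columns).
stat, df = pearson_chi_square([[10, 20], [20, 10]])
```

Under the independence hypothesis the statistic is compared with a chi-square distribution with `df` degrees of freedom; a table whose rows are proportional to each other gives a statistic of zero.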

10.
We consider the problem of testing whether the common mean of a single n-vector of multivariate normal random variables with known variance and unknown common correlation ρ is zero. We derive the standardized likelihood ratio test for known ρ and explore different ways of proceeding with ρ unknown. We evaluate the performance of the standardized statistic where ρ is replaced with an estimate of ρ and determine the critical value c_n that controls the type I error rate for the least favorable ρ in [0,1]. The constant c_n increases with n and this procedure has pathological behavior if ρ depends on n and ρ_n converges to zero at a certain rate. As an alternate approach, we replace ρ with the upper limit of a (1−β_n) confidence interval chosen so that c_n=c for all n. We determine β_n so that the type I error rate is exactly controlled for all ρ in [0,1]. We also investigate a simpler approach where we bound the type I error rate. The former method performs well for all n while the less powerful bound method may be useful in some settings as a simple approach. The proposed tests can be used in different applications, including within-cluster resampling and combining exchangeable p-values.

11.
We prove, in an axiomatic way, a compactness theorem for singular cardinals. We apply it to prove that, for singular λ, every λ-free algebra is free; and similar compactness results for transversals and colouring numbers. For the general result on free algebras, we develop some filters on S_k(A). As an application we conclude that V=L implies that every Whitehead group is free.

12.
We consider Bayesian analysis of data from multivariate linear regression models whose errors have a distribution that is a scale mixture of normals. Such models are used to analyze data on financial returns, which are notoriously heavy-tailed. Let π denote the intractable posterior density that results when this regression model is combined with the standard non-informative prior on the unknown regression coefficients and scale matrix of the errors. Roughly speaking, the posterior is proper if and only if n ≥ d+k, where n is the sample size, d is the dimension of the response, and k is the number of covariates. We provide a method of making exact draws from π in the special case where n=d+k, and we study Markov chain Monte Carlo (MCMC) algorithms that can be used to explore π when n>d+k. In particular, we show how the Haar PX-DA technology studied in Hobert and Marchev (2008) [11] can be used to improve upon Liu’s (1996) [7] data augmentation (DA) algorithm. Indeed, the new algorithm that we introduce is theoretically superior to the DA algorithm, yet equivalent to DA in terms of computational complexity. Moreover, we analyze the convergence rates of these MCMC algorithms in the important special case where the regression errors have a Student’s t distribution. We prove that, under conditions on n, d, k, and the degrees of freedom of the t distribution, both algorithms converge at a geometric rate. These convergence rate results are important from a practical standpoint because geometric ergodicity guarantees the existence of central limit theorems which are essential for the calculation of valid asymptotic standard errors for MCMC based estimates.

13.
We analyze k-stage formality and relate resonance to formality properties of this type. For instance, we show that, for a finitely generated nilpotent group that is k-stage formal, the resonance varieties are trivial up to degree k. We also show that the cohomology ring, truncated up to degree k+1, of a finitely generated nilpotent, k-stage formal group is generated in degree 1; this criterion is necessary and sufficient for a finitely generated, 2-step nilpotent group to be k-stage formal. We compute resonance varieties for Heisenberg-type groups and deduce the degree of partial formality for this class of groups.

14.
Recently, we proposed variants as a statistical model for treating ambiguity. If data are extracted from an object with a machine then it might not be able to give a unique safe answer due to ambiguity about the correct interpretation of the object. On the other hand, the machine is often able to produce a finite number of alternative feature sets (of the same object) that contain the desired one. We call these feature sets variants of the object. Data sets that contain variants may be analyzed by means of statistical methods and all chapters of multivariate analysis can be seen in the light of variants. In this communication, we focus on point estimation in the presence of variants and outliers. Besides robust parameter estimation, this task also requires selecting the regular objects and their valid feature sets (regular variants). We determine the mixed MAP-ML estimator for a model with spurious variants and outliers as well as estimators based on the integrated likelihood. We also prove asymptotic results which show that the estimators are nearly consistent. The problem of variant selection turns out to be computationally hard; therefore, we also design algorithms for efficient approximation. We finally demonstrate their efficacy with a simulated data set and a real data set from genetics.

15.
In this paper, we consider sequences of vector martingale differences of increasing dimension. We show that the Kantorovich distance from the distribution of the k(n)-dimensional average of n martingale differences to the corresponding Gaussian distribution satisfies certain inequalities. As a consequence, if the growth of k(n) is not too fast, then the Kantorovich distance converges to zero. Two applications of this result are presented. The first is a precise proof of the asymptotic distribution of the multivariate portmanteau statistic applied to the residuals of an autoregressive model and the second is a proof of the asymptotic normality of the estimates of a finite autoregressive model when the process is an AR(∞) and the order of the model grows with the length of the series.

16.
A test for the mean vector with fewer observations than the dimension
In this paper, we consider a test for the mean vector of independent and identically distributed multivariate normal random vectors where the dimension p is larger than or equal to the number of observations N. This test is invariant under scalar transformations of each component of the random vector. Theory and simulation results show that the proposed test is superior to two other tests available in the literature. Interest in such a significance test for high-dimensional data is motivated by DNA microarrays. However, the methodology is valid for any application which involves high-dimensional data.

17.
This paper studies the simultaneous selection of extreme populations from a set of independent populations. Two types of subset selection rules for k populations are proposed and studied. The first type selects one subset of populations that should contain the population with the smallest, and another subset of populations that should contain the population with the largest, φ-entropy. The second type selects analogously, but in terms of the extreme φ-divergences with respect to a known control population. Properties of the proposed procedures are stated and studied. Examples are presented in order to illustrate the results.

18.
For k_n-nearest neighbor estimates of a regression of Y on X (d-dimensional random vector X, integrable real random variable Y) based on observed independent copies of (X,Y), strong universal pointwise consistency is shown, i.e., strong consistency P_X-almost everywhere for general distribution of (X,Y). With tie-breaking by indices, this means validity of a universal strong law of large numbers for conditional expectations E(Y|X=x).
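
To make the estimator in the abstract above concrete, here is a minimal sketch of a k-nearest-neighbor regression estimate at a point x, with ties in distance broken by sample index. The function name `knn_regression_estimate`, the use of Euclidean distance, and the toy data are assumptions for illustration only; the abstract does not specify them.

```python
import math


def knn_regression_estimate(x, xs, ys, k):
    """Average the responses of the k sample points nearest to x.

    x  : query point, a sequence of d floats
    xs : list of sample points, each a sequence of d floats
    ys : list of real responses, same length as xs
    Ties in distance are broken by index (the smaller index comes first).
    """
    if not 1 <= k <= len(xs):
        raise ValueError("k must satisfy 1 <= k <= len(xs)")
    # Sort indices by (Euclidean distance to x, index) so ties resolve by index.
    order = sorted(range(len(xs)), key=lambda i: (math.dist(x, xs[i]), i))
    nearest = order[:k]
    return sum(ys[i] for i in nearest) / k


# Hypothetical one-dimensional sample; points 1 and 2 are equidistant from 0.
xs = [(0.0,), (1.0,), (-1.0,), (3.0,)]
ys = [1.0, 2.0, 10.0, 5.0]
estimate = knn_regression_estimate((0.0,), xs, ys, k=2)
```

With k=2 the tie between the second and third sample points is resolved in favor of the smaller index, so the estimate averages the first two responses.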

19.
For two multivariate normal populations with unequal covariance matrices, a procedure is developed for testing the equality of the mean vectors based on the concept of generalized p-values. The generalized p-values we have developed are functions of the sufficient statistics. The computation of the generalized p-values is discussed and illustrated with an example. Numerical results show that one of our generalized p-value tests has a type I error probability not exceeding the nominal level. A formula involving only a finite number of chi-square random variables is provided for computing this generalized p-value. The formula is useful in a Bayesian solution as well. The problem of constructing a confidence region for the difference between the mean vectors is also addressed using the concept of generalized confidence regions. Finally, using the generalized p-value approach, a solution is developed for the heteroscedastic MANOVA problem.

20.
In the estimation of parametric models for stationary spatial or spatio-temporal data on a d-dimensional lattice, for d ≥ 2, the achievement of asymptotic efficiency under Gaussianity, and asymptotic normality more generally, with standard convergence rate, faces two obstacles. One is the “edge effect”, which worsens with increasing d. The other is the possible difficulty of computing a continuous-frequency form of Whittle estimate or a time domain Gaussian maximum likelihood estimate, due mainly to the Jacobian term. This is especially a problem in “multilateral” models, which are naturally expressed in terms of lagged values in both directions for one or more of the d dimensions. An extension of the discrete-frequency Whittle estimate from the time series literature deals conveniently with the computational problem, but when subjected to a standard device for avoiding the edge effect has disastrous asymptotic performance, along with finite sample numerical drawbacks, the objective function lacking a minimum-distance interpretation and losing any global convexity properties. We overcome these problems by first optimizing a standard, guaranteed non-negative, discrete-frequency, Whittle function, without edge-effect correction, providing an estimate with a slow convergence rate, then improving this by a sequence of computationally convenient approximate Newton iterations using a modified, almost-unbiased periodogram, the desired asymptotic properties being achieved after finitely many steps. The asymptotic regime allows increase in both directions of all d dimensions, with the central limit theorem established after re-ordering as a triangular array. However, our work offers something new for “unilateral” models also. When the data are non-Gaussian, asymptotic variances of all parameter estimates may be affected, and we propose consistent, non-negative definite estimates of the asymptotic variance matrix.
