Similar literature
1.
This paper examines asymptotic distributions of the canonical correlations between a q-dimensional and a p-dimensional random vector with q ≤ p, based on a sample of size N = n + 1. The asymptotic distributions of the canonical correlations have been studied extensively when the dimensions q and p are fixed and the sample size N tends toward infinity. However, these approximations worsen when q or p is large in comparison to N. To overcome this weakness, this paper first derives asymptotic distributions of the canonical correlations under a high-dimensional framework in which q is fixed, m = n − p → ∞ and c = p/n → c0 ∈ [0,1), assuming that the two vectors have a joint (q+p)-variate normal distribution. An extended Fisher’s z-transformation is proposed. Then, the asymptotic distributions are improved further by deriving their asymptotic expansions. Numerical simulations show that the proposed approximations are more accurate than the classical approximations over a wide range of p, q, and n and of the population canonical correlations.

2.
This paper investigates, in the multivariate linear regression model (Part I) and the GMANOVA model (Part II), the effect of nonnormality on the nonnull distributions of some multivariate test statistics derived under normality. It is shown that, whatever the underlying distributions, the difference of local powers up to order N⁻¹ after either Bartlett’s type adjustment or Cornish-Fisher’s type size adjustment under nonnormality coincides with that in Anderson [An Introduction to Multivariate Statistical Analysis, 2nd ed. and 3rd ed., Wiley, New York, 1984, 2003] under normality. The derivation of asymptotic expansions is based on the differential operator associated with the multivariate linear regression model under general distributions. The performance of higher-order results in finite samples, including monotone Bartlett’s type adjustment and monotone Cornish-Fisher’s type size adjustment, is examined using simulation studies.

3.
This article analyzes whether some existing tests for the p×p covariance matrix Σ of N independent identically distributed observation vectors work under non-normality. We focus on three hypothesis testing problems: (1) testing for sphericity, that is, the covariance matrix Σ is proportional to an identity matrix Ip; (2) the covariance matrix Σ is an identity matrix Ip; and (3) the covariance matrix is a diagonal matrix. It is shown that the tests proposed by Srivastava (2005) for the above three problems are robust under the non-normality assumption made in this article irrespective of whether N ≤ p or N ≥ p, provided that (N,p) → ∞; N/p may go to zero or infinity. The results are asymptotic and may not hold for finite (N,p).

4.
This paper examines asymptotic expansions of test statistics for dimensionality and additional information in canonical correlation analysis based on a sample of size N = n + 1 on two sets of variables, of dimensions p1 and p2. These problems are related to dimension reduction. The asymptotic approximations of the statistics have been studied extensively when the dimensions p1 and p2 are fixed and the sample size N tends to infinity. However, the approximations worsen as p1 and p2 increase. This paper derives asymptotic expansions of the test statistics when both the sample size and dimension are large, assuming that the two sets of variables have a joint (p1+p2)-variate normal distribution. Numerical simulations show that this approximation is more accurate than the classical approximation as the dimension increases.

5.
We consider block thresholding wavelet-based density estimators with randomly right-censored data and investigate their asymptotic convergence rates. Unlike in the complete data case, the empirical wavelet coefficients in the censored data case are constructed through the Kaplan-Meier estimators of the distribution functions. On the basis of a result of Stute [W. Stute, The central limit theorem under random censorship, Ann. Statist. 23 (1995) 422-439] that approximates the Kaplan-Meier integrals as averages of i.i.d. random variables with a certain rate in probability, we show that these empirical wavelet coefficients can be approximated by averages of i.i.d. random variables with a certain error rate in L². Consequently, these estimators, based on block thresholding of empirical wavelet coefficients, achieve optimal convergence rates over a large range of Besov function classes with p ≥ 2, q ≥ 1, and nearly optimal convergence rates when 1 ≤ p < 2. We also show that these estimators achieve optimal convergence rates over a large class of functions that involve many irregularities of a wide variety of types, including chirp and Doppler functions, and jump discontinuities. Therefore, in the presence of random censoring, wavelet estimators still provide extensive adaptivity to many irregularities of large function classes. The performance of the estimators is tested via a modest simulation study.

6.
A weighted multivariate signed-rank test is introduced for an analysis of multivariate clustered data. Observations in different clusters may then get different weights. The test provides a robust and efficient alternative to normal theory based methods. Asymptotic theory is developed to find the approximate p-value as well as to calculate the limiting Pitman efficiency of the test. A conditionally distribution-free version of the test is also discussed. The finite-sample behavior of different versions of the test statistic is explored by simulations and the new test is compared to the unweighted and weighted versions of Hotelling’s T² test and the multivariate spatial sign test introduced in [D. Larocque, J. Nevalainen, H. Oja, A weighted multivariate sign test for cluster-correlated data, Biometrika 94 (2007) 267-283]. Finally, a real data example is used to illustrate the theory.

7.
This paper proposes the corrected likelihood ratio test (LRT) and a large-dimensional trace criterion to test the independence of two large sets of multivariate variables of dimensions p1 and p2 when the dimension p = p1 + p2 and the sample size n tend to infinity simultaneously and proportionally. Both theoretical and simulation results demonstrate that the traditional χ² approximation of the LRT performs poorly when the dimension p is large relative to the sample size n, while the corrected LRT and the large-dimensional trace criterion behave well when the dimension is either small or large relative to the sample size. Moreover, the trace criterion can be used in the case p > n, where the corrected LRT is infeasible because it is no longer defined.

8.
For normally distributed data from k populations with m×m covariance matrices Σ1,…,Σk, we test the hypothesis H: Σ1 = ⋯ = Σk vs the alternative A ≠ H when the numbers of observations Ni, i = 1,…,k, from each population are less than or equal to the dimension m, that is, Ni ≤ m, i = 1,…,k. Two tests are proposed and compared with two other tests proposed in the literature. These tests, however, do not require that Ni ≤ m, and thus can be used in all situations, including when the likelihood ratio test is available. The asymptotic distributions of the test statistics are given, and the power is compared by simulations with that of other test statistics proposed in the literature. The proposed tests perform well, and in several cases better than the other two tests available in the literature.

9.
The exact distribution of Mauchly's sphericity test criterion W = |S| / [tr S / p]^p, when S is the sum of product matrix from a sample of size N taken from a p-variate normal population, is obtained using contour integration and methods similar to those of Nair and Box. Tables of percentage points for p = 4(1)10, α = 0.01 and 0.05, and various values of N (including small) are given and comparisons made with approximate percentage points using methods of Box, Mauchly, Tukey and Wilks, and Davis.
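As an illustration of the criterion in this abstract, the following minimal sketch evaluates W = |S| / [tr S / p]^p for a given matrix S. It is not taken from the paper: the function names (determinant, sum_of_products_matrix, mauchly_w) are hypothetical, S is assumed to be the mean-centered sum of products matrix, and the sketch only computes the statistic, not its exact distribution or percentage points.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def sum_of_products_matrix(sample):
    """S = sum_i (x_i - xbar)(x_i - xbar)^T for a list of p-vectors (assumed centering)."""
    n_obs = len(sample)
    p = len(sample[0])
    mean = [sum(x[j] for x in sample) / n_obs for j in range(p)]
    return [
        [sum((x[j] - mean[j]) * (x[k] - mean[k]) for x in sample) for k in range(p)]
        for j in range(p)
    ]


def mauchly_w(s):
    """Mauchly's criterion W = |S| / (tr S / p)^p for a p x p matrix S."""
    p = len(s)
    trace = sum(s[j][j] for j in range(p))
    return determinant(s) / (trace / p) ** p
```

W equals 1 when S is proportional to the identity matrix and shrinks toward 0 as the eigenvalues of S become more unequal, which is why small values of W count as evidence against sphericity.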

10.
The so-called independent component (IC) model states that the observed p-vector X is generated via X = ΛZ + μ, where μ is a p-vector, Λ is a full-rank matrix, and the centered random vector Z has independent marginals. We consider the problem of testing the null hypothesis H0: μ = 0 on the basis of i.i.d. observations X1,…,Xn generated by the symmetric version of the IC model above (for which all ICs have a symmetric distribution about the origin). In the spirit of [M. Hallin, D. Paindaveine, Optimal tests for multivariate location based on interdirections and pseudo-Mahalanobis ranks, Annals of Statistics, 30 (2002), 1103-1133], we develop nonparametric (signed-rank) tests, which are valid without any moment assumption and are, for adequately chosen scores, locally and asymptotically optimal (in the Le Cam sense) at given densities. Our tests are measurable with respect to the marginal signed ranks computed in the collection of null residuals obtained from a suitable estimate of Λ. Provided that this estimate is affine-equivariant, the proposed tests, unlike the standard marginal signed-rank tests developed in [M.L. Puri, P.K. Sen, Nonparametric Methods in Multivariate Analysis, Wiley & Sons, New York, 1971] or any of their obvious generalizations, are affine-invariant. Local powers and asymptotic relative efficiencies (AREs) with respect to Hotelling’s T² test are derived. Quite remarkably, when Gaussian scores are used, these AREs are always greater than or equal to one, with equality in the multinormal model only. Finite-sample efficiencies and robustness properties are investigated through a Monte Carlo study.

11.
Consider the model Y = m(X) + ε, where m(⋅) = med(Y|⋅) is unknown but smooth. It is often assumed that ε and X are independent. However, in practice this assumption is violated in many cases. In this paper we propose modeling the dependence between ε and X by means of a copula model, i.e. (ε,X) ∼ Cθ(Fε(⋅),FX(⋅)), where Cθ is a copula function depending on an unknown parameter θ, and Fε and FX are the marginals of ε and X. Since many parametric copula families contain the independent copula as a special case, the so-obtained regression model is more flexible than the ‘classical’ regression model. We estimate the parameter θ via a pseudo-likelihood method and prove the asymptotic normality of the estimator, based on delicate empirical process theory. We also study the estimation of the conditional distribution of Y given X. The procedure is illustrated by means of a simulation study, and the method is applied to data on food expenditures in households.

12.
In this paper we study the properties of a kurtosis matrix and propose its eigenvectors as interesting directions to reveal the possible cluster structure of a data set. Under a mixture of elliptical distributions with proportional scatter matrices, it is shown that a subset of the eigenvectors of the fourth-order moment matrix corresponds to Fisher’s linear discriminant subspace. The eigenvectors of the estimated kurtosis matrix are consistent estimators of this subspace, and their calculation is easy to implement and computationally efficient, which is particularly favourable when the ratio n/p is large.

13.
In this paper we aim to construct an adaptive confidence region for the direction of ξ in semiparametric models of the form Y = G(ξ^T X, ε), where G(⋅) is an unknown link function, ε is an independent error, and ξ is a p_n × 1 vector. To recover the direction of ξ, we first propose an inverse regression approach that does not depend on the link function G(⋅); to construct a data-driven confidence region for the direction of ξ, we implement the empirical likelihood method. Unlike much of the existing literature, we need not estimate the link function G(⋅) or its derivative. When p_n remains fixed, the empirical likelihood ratio without bias correction can be asymptotically standard chi-square. Moreover, the asymptotic normality of the empirical likelihood ratio holds true even when the dimension p_n grows at the rate p_n = o(n^(1/4)), where n is the sample size. Simulation studies are carried out to assess the performance of our proposal, and a real data set is analyzed for further illustration.

14.
The ratio of the largest eigenvalue to the trace of a p×p random Wishart matrix with n degrees of freedom and an identity covariance matrix plays an important role in various hypothesis testing problems, both in statistics and in signal processing. In this paper we derive an approximate explicit expression for the distribution of this ratio, by considering the joint limit as both p, n → ∞ with p/n → c. Our analysis reveals that even though asymptotically in this limit the ratio follows a Tracy-Widom (TW) distribution, one of the leading error terms depends on the second derivative of the TW distribution, and is non-negligible for practical values of p, in particular for determining tail probabilities. We thus propose to explicitly include this term in the approximate distribution for the ratio. We illustrate empirically using simulations that adding this term to the TW distribution yields a quite accurate approximation to the empirical distribution of the ratio, even for small values of p, n.
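To make the statistic in this abstract concrete, the sketch below simulates a Wishart matrix with identity covariance and computes the ratio of its largest eigenvalue to its trace. It is only an illustration under stated assumptions, not the paper's method: the function names (wishart_identity, largest_eigenvalue, eigenvalue_trace_ratio) are hypothetical, the largest eigenvalue is approximated by plain power iteration, and neither the Tracy-Widom limit nor the proposed correction term is computed.

```python
import random


def wishart_identity(p, n, rng):
    """Draw W = X^T X, where X is n x p with i.i.d. N(0, 1) entries."""
    x = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    return [[sum(row[j] * row[k] for row in x) for k in range(p)] for j in range(p)]


def mat_vec(w, v):
    """Matrix-vector product for a square matrix stored as nested lists."""
    return [sum(w[j][k] * v[k] for k in range(len(v))) for j in range(len(w))]


def largest_eigenvalue(w, iterations=500):
    """Approximate the largest eigenvalue of a symmetric PSD matrix.

    Uses power iteration from the uniform unit vector; this assumes the
    start vector is not orthogonal to the leading eigenvector, which holds
    with probability one for a simulated Wishart matrix.
    """
    p = len(w)
    v = [1.0 / p ** 0.5] * p
    for _ in range(iterations):
        wv = mat_vec(w, v)
        norm = sum(c * c for c in wv) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [c / norm for c in wv]
    # Rayleigh quotient v^T W v of the (unit-norm) final iterate.
    return sum(vj * wj for vj, wj in zip(v, mat_vec(w, v)))


def eigenvalue_trace_ratio(w):
    """Ratio of the largest eigenvalue of W to its trace."""
    trace = sum(w[j][j] for j in range(len(w)))
    return largest_eigenvalue(w) / trace
```

Calling eigenvalue_trace_ratio(wishart_identity(p, n, random.Random(seed))) repeatedly for different seeds gives an empirical distribution of the ratio, which always lies between 1/p and 1.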

15.
Accurate distributions of the estimator of the tetrachoric correlation coefficient and, more generally, functions of sample proportions for the 2 by 2 contingency table are derived. The results are obtained given the definitions of the estimators even when some marginal cell(s) are empty. Then, asymptotic expansions of the distributions of the parameter estimators standardized by the population asymptotic standard errors up to order O(1/n) and those of the studentized ones up to the order next beyond the conventional normal approximation are derived. The asymptotic results can be obtained in a much shorter computation time than the accurate ones. Numerical examples were used to illustrate advantages of the studentized estimator of Fisher’s z transformation of the tetrachoric correlation coefficient.

16.
Semiparametric models with both nonparametric and parametric components have become increasingly useful in many scientific fields, due to their appropriate representation of the trade-off between flexibility and efficiency of statistical models. In this paper we focus on semi-varying coefficient models (a.k.a. varying coefficient partially linear models) in a “large n, diverging p” situation, when the numbers of both parametric and nonparametric components diverge at appropriate rates, and we only consider the case p = o(n). Consistency of the estimator based on B-splines and asymptotic normality of the linear components are established under suitable assumptions. Interestingly (although not surprisingly), our analysis shows that the number of parametric components can diverge at a faster rate than the number of nonparametric components, and that the divergence rate of the number of nonparametric components constrains the allowable divergence rate of the number of parametric components, which is a new phenomenon not established in the existing literature as far as we know. Finally, the finite sample behavior of the estimator is evaluated by some Monte Carlo studies.

17.
Let F be a distribution function in the maximal domain of attraction of the Gumbel distribution such that −log(1−F(x)) = x^(1/θ) L(x) for a positive real number θ, called the Weibull tail index, and a slowly varying function L. It is well known that the estimators of θ have a very slow rate of convergence. We establish here a sharp optimality result in the minimax sense, that is, when L is treated as an infinite dimensional nuisance parameter belonging to some functional class. We also establish the rate optimal asymptotic property of a data-driven choice of the sample fraction that is used for estimation.

18.
We investigate the estimation problem of parameters in a two-sample semiparametric model. Specifically, let X1,…,Xn be a sample from a population with distribution function G and density function g. Independent of the Xi’s, let Z1,…,Zm be another random sample with distribution function H and density function h(x) = exp[α + r(x)β] g(x), where α and β are unknown parameters of interest and g is an unknown density. This model has wide applications in logistic discriminant analysis, case-control studies, and analysis of receiver operating characteristic curves. Furthermore, it can be considered as a biased sampling model with weight function depending on unknown parameters. In this paper, we construct minimum Hellinger distance estimators of α and β. The proposed estimators are chosen to minimize the Hellinger distance between a semiparametric model and a nonparametric density estimator. Theoretical properties such as the existence, strong consistency and asymptotic normality are investigated. Robustness of proposed estimators is also examined using a Monte Carlo study.

19.
For independently distributed observables Xi ∼ N(θi, σ²), i = 1,…,p, we consider estimating the vector θ = (θ1,…,θp) with loss ‖d − θ‖² under a constraint on θ, with known τ1,…,τp, σ², m. In comparing the risk performance of Bayesian estimators δα associated with uniform priors on spheres of radius α centered at (τ1,…,τp) with that of the maximum likelihood estimator, we make use of Stein’s unbiased estimate of risk technique, Karlin’s sign change arguments, and a conditional risk analysis to obtain, for fixed (m,p), necessary and sufficient conditions on α for δα to dominate the maximum likelihood estimator. Large sample determinations of these conditions are provided. Both cases where all such δα’s and cases where no such δα’s dominate the maximum likelihood estimator are elicited. We establish, as a particular case, that the boundary uniform Bayes estimator δm dominates the maximum likelihood estimator if and only if m ≤ k(p), improving on the previously known sufficient condition of Marchand and Perron (2001) [3]. Finally, we improve upon a universal dominance condition due to Marchand and Perron, by establishing that all Bayesian estimators δπ with π spherically symmetric and supported on the parameter space dominate the maximum likelihood estimator whenever m ≤ c1(p).

20.
This paper deals with the bias correction of the cross-validation (CV) criterion to estimate the predictive Kullback-Leibler information. A bias-corrected CV criterion is proposed by replacing the ordinary maximum likelihood estimator with the maximizer of the adjusted log-likelihood function. The adjustment is slight and simple, but the improvement of the bias is remarkable. The bias of the ordinary CV criterion is O(n⁻¹), but that of the bias-corrected CV criterion is O(n⁻²). We verify by numerical experiments that our criterion has smaller bias than the AIC, TIC, EIC and the ordinary CV criterion.
