Similar literature
 20 similar documents found (search time: 964 ms)
1.
In this article we study the test of sphericity for a high-dimensional covariance matrix in a general population, based on random matrix theory. When the sample size is smaller than the data dimension, the classical likelihood ratio test performs poorly as a test of sphericity. We therefore propose a new statistic for the test of sphericity that uses the higher moments of the spectral distribution of the sample covariance matrix, and we derive the asymptotic distribution of the statistic under the null hypothesis. Simulation results show that, while controlling the type I error probability, the proposed statistic effectively improves the power of the test of sphericity for high-dimensional data, with an especially marked improvement for the spiked model.

2.

We consider hypothesis testing for high-dimensional covariance structures in which the covariance matrix is (i) a scaled identity matrix, (ii) a diagonal matrix, or (iii) an intraclass covariance matrix. Our purpose is to systematically establish a nonparametric approach for testing the high-dimensional covariance structures (i)–(iii). We produce a new common test statistic for each covariance structure and show that the test statistic is an unbiased estimator of its corresponding test parameter. We prove that the test statistic is asymptotically normal. We propose a new test procedure for (i)–(iii) and evaluate its asymptotic size and power theoretically when both the dimension and the sample size increase. We investigate the performance of the proposed test procedure in simulations. As an application of testing the covariance structures, we give a test procedure to identify an eigenvector. Finally, we demonstrate the proposed test procedure by using a microarray data set.

3.
We propose an empirical likelihood method to test whether the coefficients in a possibly high-dimensional linear model are equal to given values. The asymptotic distribution of the test statistic is independent of the number of covariates in the linear model.

4.
This paper introduces the scale-shape mixtures of skew-normal (SSMSN) distributions which provide alternative candidates for modeling asymmetric data in a wide variety of settings. We obtain the moments and study some characterizations of the SSMSN distributions. Instead of resorting to numerical optimization procedures, two variants of EM algorithms are developed for carrying out maximum likelihood estimation. Our algorithms are analytically simple because closed-form expressions of conditional expectations in the E-step as well as the updating estimators in the M-step can be explicitly obtained. The observed information matrix is derived for approximating the asymptotic covariance matrix of parameter estimates. A simulation study is conducted to examine the finite sample properties of ML estimators. The utility of the proposed methodology is illustrated by analyzing a real example.

5.
We propose a model selection algorithm for high-dimensional clustered data. Our algorithm combines a classical penalized likelihood method with a composite likelihood approach in the framework of colored graphical Gaussian models. Our method is designed to identify high-dimensional dense networks with a large number of edges but sparse edge classes. Its empirical performance is demonstrated through simulation studies and a network analysis of a gene expression dataset.

6.
This article is concerned with the calibration of the empirical likelihood (EL) for high-dimensional data where the data dimension may increase as the sample size increases. We analyze the asymptotic behavior of the EL under a general multivariate model and provide weak conditions under which the best rate for the asymptotic normality of the empirical likelihood ratio (ELR) is achieved. In addition, there is usually substantial lack of fit when the ELR is calibrated by the usual normal distribution in high dimensions, producing tests with type I errors much larger than nominal levels. We find that this is mainly due to the underestimation of the centralized and normalized quantities of the ELR. By examining the connection between the ELR and the classical Hotelling’s $T^2$ statistic, we propose an effective calibration method which works much better in most situations.

7.
For multivariate copula-based models for which maximum likelihood is computationally difficult, a two-stage estimation procedure has been proposed previously; the first stage involves maximum likelihood from univariate margins, and the second stage involves maximum likelihood of the dependence parameters with the univariate parameters held fixed from the first stage. Using the theory of inference functions, a partitioned matrix in a form amenable to analysis is obtained for the asymptotic covariance matrix of the two-stage estimator. The asymptotic relative efficiency of the two-stage estimation procedure compared with maximum likelihood estimation is studied. Analysis of the limiting cases of the independence copula and the Fréchet upper bound helps to determine common patterns in the efficiency as the dependence in the model increases. For the Fréchet upper bound, the two-stage estimation procedure can sometimes be equivalent to maximum likelihood estimation for the univariate parameters. Numerical results are shown for some models, including multivariate ordinal probit and bivariate extreme value distributions, to indicate the typical level of asymptotic efficiency for discrete and continuous data.

8.
Yang Yuehan, Zhu Ji. Science China Mathematics, 2020, 63(6): 1203-1218
The problem of estimating high-dimensional Gaussian graphical models has gained much attention in recent years. Most existing methods can be considered one-step approaches, being either regression-based or likelihood-based. In this paper, we propose a two-step method for estimating the high-dimensional Gaussian graphical model. Specifically, the first step serves as a screening step, in which many entries of the concentration matrix are identified as zeros and thus removed from further consideration. Then, in the second step, we focus on the remaining entries of the concentration matrix and perform selection and estimation for its nonzero entries. Since the dimension of the parameter space is effectively reduced by the screening step, the accuracy of the estimated concentration matrix can potentially be improved. We show that the proposed method enjoys desirable asymptotic properties. Numerical comparisons of the proposed method with several existing methods indicate that the proposed method works well. We also apply the proposed method to a breast cancer microarray data set and obtain some biologically meaningful results.

9.
Data in social and behavioral sciences are often hierarchically organized. Multilevel statistical methodology was developed to analyze such data. Most of the procedures for analyzing multilevel data are derived from maximum likelihood based on the normal distribution assumption. Standard errors for parameter estimates in these procedures are obtained from the corresponding information matrix. Because practical data typically contain heterogeneous marginal skewnesses and kurtoses, this paper studies how nonnormally distributed data affect the standard errors of parameter estimates in a two-level structural equation model. Specifically, we study how skewness and kurtosis in one level affect standard errors of parameter estimates within its level and outside its level. We also show that, parallel to asymptotic robustness theory in conventional factor analysis, conditions exist for asymptotic robustness of standard errors in a multilevel factor analysis model.

10.
Gaussian graphical models represent the underlying graph structure of conditional dependence between random variables, which can be determined using their partial correlation or precision matrix. In a high-dimensional setting, the precision matrix is estimated using penalized likelihood by adding a penalization term, which controls the amount of sparsity in the precision matrix and fully characterizes the complexity and structure of the graph. The most commonly used penalization term is the L1 norm of the precision matrix scaled by the regularization parameter, which determines the trade-off between sparsity of the graph and fit to the data. In this article, we propose several procedures for selecting the regularization parameter in the estimation of graphical models, with a focus on reliably recovering the appropriate network structure of the graph. We conduct an extensive simulation study to show that the proposed methods produce useful results for different network topologies. The approaches are also applied in a high-dimensional case study of gene expression data with the aim of discovering the genes relevant to colon cancer. Using these data, we find graph structures that are verified to display significant biological gene associations. Supplementary material is available online.
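
For reference, the L1-penalized likelihood described in this abstract is commonly written in the following form. This is a sketch in standard graphical-lasso notation, not a formula taken from the article: the symbols $S$ (sample covariance matrix), $\Theta$ (precision matrix), and $\lambda$ (regularization parameter) are assumptions, and some formulations penalize only the off-diagonal entries of $\Theta$.

$$\hat{\Theta}_\lambda = \arg\max_{\Theta \succ 0} \left\{ \log\det\Theta - \operatorname{tr}(S\Theta) - \lambda \lVert \Theta \rVert_1 \right\}$$

A larger $\lambda$ shrinks more entries of $\hat{\Theta}_\lambda$ to zero and so yields a sparser graph, which is the trade-off between sparsity and fit that the abstract refers to.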

11.
An extended growth curve model is considered which, among other things, is useful when linear restrictions exist on the mean in the ordinary growth curve model. The maximum likelihood estimators consist of complicated stochastic expressions. It is shown how, with the aid of fairly elementary calculations, the dispersion matrix for the estimator of the mean and the expectation of the estimated dispersion matrix are obtained. Results for Wishart, inverted Wishart, and inverse beta variables are utilized. Additionally, some asymptotic results are presented.

12.
We present a very fast algorithm for general matrix factorization of a data matrix for use in the statistical analysis of high-dimensional data via latent factors. Such data are prevalent across many application areas and generate an ever-increasing demand for methods of dimension reduction in order to undertake the statistical analysis of interest. Our algorithm uses a gradient-based approach which can be used with an arbitrary loss function provided the latter is differentiable. The speed and effectiveness of our algorithm for dimension reduction are demonstrated in the context of supervised classification of some real high-dimensional data sets from the bioinformatics literature.
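
The general idea of gradient-based factorization with a pluggable differentiable loss can be sketched as below. This is a minimal illustration only, not the authors' algorithm: it uses plain full-batch gradient descent, and the names `factorize`, `squared_loss_grad`, and `total_squared_error`, as well as the step size and iteration count, are hypothetical choices made for this example.

```python
import random


def factorize(X, rank, loss_grad, steps=500, lr=0.01, seed=0):
    """Approximate X (n x p, list of lists) by A @ B with plain gradient descent.

    loss_grad(x, fitted) must return d loss / d fitted for a single entry,
    so any differentiable entrywise loss can be plugged in (assumption of
    this sketch, not a detail taken from the abstract).
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    # Small random starting values for the two factors.
    A = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n)]
    B = [[rng.gauss(0.0, 0.1) for _ in range(p)] for _ in range(rank)]
    for _ in range(steps):
        # G[i][j] is the loss derivative with respect to the fitted entry (i, j).
        G = [[loss_grad(X[i][j], sum(A[i][k] * B[k][j] for k in range(rank)))
              for j in range(p)] for i in range(n)]
        # Chain rule: dL/dA = G B^T and dL/dB = A^T G.
        grad_A = [[sum(G[i][j] * B[k][j] for j in range(p)) for k in range(rank)]
                  for i in range(n)]
        grad_B = [[sum(A[i][k] * G[i][j] for i in range(n)) for j in range(p)]
                  for k in range(rank)]
        # Simultaneous update of both factors.
        for i in range(n):
            for k in range(rank):
                A[i][k] -= lr * grad_A[i][k]
        for k in range(rank):
            for j in range(p):
                B[k][j] -= lr * grad_B[k][j]
    return A, B


def squared_loss_grad(x, fitted):
    """Derivative of (fitted - x)**2 with respect to fitted."""
    return 2.0 * (fitted - x)


def total_squared_error(X, A, B):
    """Sum of squared residuals between X and the product A @ B."""
    rank = len(B)
    return sum((X[i][j] - sum(A[i][k] * B[k][j] for k in range(rank))) ** 2
               for i in range(len(X)) for j in range(len(X[0])))


# Toy rank-1 data matrix (illustrative only).
X = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
A, B = factorize(X, rank=1, loss_grad=squared_loss_grad)
err = total_squared_error(X, A, B)
```

Swapping `squared_loss_grad` for the derivative of another differentiable loss changes the fitting criterion without touching the update loop, which is the flexibility the abstract describes.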

13.
In this paper, the authors derived asymptotic expressions for the null distributions of the likelihood ratio test statistics for multiple independence and multiple homogeneity of the covariance matrices when the underlying distributions are complex multivariate normal. Also, asymptotic expressions are obtained in the non-null cases for the likelihood ratio test statistics for independence of two sets of variables and the equality of two covariance matrices. The expressions obtained in this paper are in terms of beta series. In the null cases, the accuracy of the first terms alone is sufficient for many practical purposes.

14.
In this paper, we place a non-concave penalty on the local conditional likelihood. We obtain the oracle property and the asymptotic normality of the parameter estimates in the Ising model. With a union bound, we obtain the sign consistency of the estimator of the parameter matrix, and its convergence rate under the matrix $L_1$ norm. The results of the simulation studies and a real data analysis show that the non-concave penalized estimator has higher sensitivity.

15.
In this paper the distribution of the likelihood ratio test for testing the reality of the covariance matrix of a complex multivariate normal distribution is investigated. Some simplifications in the noncentral distribution are made and the noncentral distribution is derived for the special case where the rank of the noncentrality matrix is two. In the null case exact expressions for the distribution are given up to p = 6, and percentage points are tabulated. These percentage points were compared with percentage points derived from an asymptotic expansion of the distribution, and the accuracy of the approximation was found to be sufficient for several practical situations.

16.
We consider asymptotic distributions of maximum deviations of sample covariance matrices, a fundamental problem in high-dimensional inference of covariances. Under mild dependence conditions on the entries of the data matrices, we establish the Gumbel convergence of the maximum deviations. Our result substantially generalizes earlier ones where the entries are assumed to be independent and identically distributed, and it provides a theoretical foundation for high-dimensional simultaneous inference of covariances.

17.
For the analysis of time-to-event data with incomplete information beyond right-censoring, many generalizations of inference for the distribution and the regression model have been proposed. However, the development of martingale approaches in this area has not progressed greatly, whereas for right-censored data such approaches are widely used to study the asymptotic properties of estimators and to derive regression diagnostic methods. In this paper, focusing on doubly censored data, we discuss a martingale approach for inference of the nonparametric maximum likelihood estimator (NPMLE). We formulate a martingale structure of the NPMLE using a score function of the semiparametric profile likelihood. Finally, an expression of the asymptotic distribution of the NPMLE is derived more conveniently, without depending on an infinite matrix expression as in previous research. A further useful point is that a variance-covariance formula of the NPMLE computable in a larger sample is obtained as an empirical version of the limit form presented here.

18.
Cui Juan KONG, Han Ying LIANG. Empirical Likelihood of Quantile Difference with Missing Response When High-dimensional Covariates Are Present
Abstract: In this paper, we investigate the two-sample quantile difference by the empirical likelihood method when the responses with high-dimensional covariates of the two populations are missing at random. In particular, based on the sufficient dimension reduction technique, we construct three empirical log-likelihood ratios for the quantile difference between two samples by using inverse probability weighting imputation, regression imputation, and augmented inverse probability weighting imputation, respectively, and prove their asymptotic distributions.

19.
In this paper, we consider a scale adjusted-type distance-based classifier for high-dimensional data. We first give such a classifier that can ensure high accuracy in misclassification rates for two-class classification. We show that the classifier is not only consistent but also asymptotically normal for high-dimensional data. We provide a sample size determination so that misclassification rates are no more than a prespecified value. We propose a classification procedure called the misclassification rate adjusted classifier. We further extend the classifier to multiclass classification. We show that the classifier still enjoys asymptotic properties and ensures high accuracy in misclassification rates for multiclass classification. Finally, we demonstrate the proposed classifier in actual data analyses by using a microarray data set.

20.
We consider one-step estimation of parameters that represent the strength of spatial dependence in a geostatistical or lattice spatial model. While the maximum likelihood estimators (MLE) of spatial dependence parameters are known to have various desirable properties, they do not have closed-form expressions. Therefore, we consider a one-step alternative to maximum likelihood estimation based on solving an approximate (i.e., one-step) profile likelihood estimating equation. The resulting approximate profile likelihood estimator (APLE) has a closed-form representation, making it a suitable alternative to the widely used Moran’s I statistic. Since the finite-sample and asymptotic properties of one-step estimators of covariance-function parameters have not been studied rigorously, we explore these properties for the APLE of the spatial dependence parameter in the simultaneous autoregressive (SAR) model. Motivated by the APLE statistic’s closed form, we develop exploratory spatial data analysis tools that capture regions of local clustering or the extent to which the strength of spatial dependence varies across space. We illustrate these exploratory tools using both simulated data and observed crime rates in Columbus, OH.
