1.

Bayesian Lasso with neighborhood regression method for Gaussian graphical model

Fan-qun Li Xin-sheng Zhang 《Acta Mathematicae Applicatae Sinica (English Series)》2017,33(2):485-496

In this paper, we consider the problem of estimating a high-dimensional precision matrix of a Gaussian graphical model. Taking advantage of the connection between multivariate linear regression and the entries of the precision matrix, we propose a Bayesian Lasso combined with neighborhood regression to estimate the Gaussian graphical model. This method performs parameter estimation and model selection simultaneously. Moreover, the proposed method provides symmetric confidence intervals for all entries of the precision matrix.
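The connection between regression and the precision matrix mentioned in this abstract can be illustrated with a minimal sketch. It uses plain population least squares rather than the paper's Bayesian Lasso, and the function name and the 3-variable example are hypothetical. For a Gaussian vector with precision matrix Ω, regressing variable j on the others gives coefficients −ω_jk/ω_jj and residual variance 1/ω_jj, so the regressions determine Ω row by row.

```python
import numpy as np

def precision_from_neighborhood_regressions(cov):
    """Rebuild a precision matrix from population regressions of each variable on the rest."""
    p = cov.shape[0]
    omega = np.zeros((p, p))
    for j in range(p):
        rest = [k for k in range(p) if k != j]
        # Population regression coefficients of X_j on the remaining variables.
        beta = np.linalg.solve(cov[np.ix_(rest, rest)], cov[rest, j])
        # The residual variance of that regression equals 1 / omega_jj.
        resid_var = cov[j, j] - cov[j, rest] @ beta
        omega[j, j] = 1.0 / resid_var
        # Off-diagonal entries follow from beta_jk = -omega_jk / omega_jj.
        omega[j, rest] = -beta * omega[j, j]
    return omega

# Hypothetical example: a sparse (tridiagonal) precision matrix.
true_omega = np.array([[2.0, 0.5, 0.0],
                       [0.5, 2.0, 0.5],
                       [0.0, 0.5, 2.0]])
cov = np.linalg.inv(true_omega)
rebuilt = precision_from_neighborhood_regressions(cov)
print(np.round(rebuilt, 6))
```

A zero regression coefficient corresponds to a zero entry of the precision matrix, which is why a sparsity-inducing regression method can serve for model selection in this setting.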

2.

Selection of the Regularization Parameter in Graphical Models Using Network Characteristics

Adria Caballe Mestres Natalia Bochkina Claus Mayer 《Journal of computational and graphical statistics》2018,27(2):323-333

Gaussian graphical models represent the underlying graph structure of conditional dependence between random variables, which can be determined using their partial correlation or precision matrix. In a high-dimensional setting, the precision matrix is estimated using penalized likelihood by adding a penalization term, which controls the amount of sparsity in the precision matrix and totally characterizes the complexity and structure of the graph. The most commonly used penalization term is the L1 norm of the precision matrix scaled by the regularization parameter, which determines the trade-off between sparsity of the graph and fit to the data. In this article, we propose several procedures to select the regularization parameter in the estimation of graphical models that focus on reliably recovering the appropriate network structure of the graph. We conduct an extensive simulation study to show that the proposed methods produce useful results for different network topologies. The approaches are also applied in a high-dimensional case study of gene expression data with the aim of discovering the genes relevant to colon cancer. Using these data, we find graph structures that are verified to display significant biological gene associations. Supplementary material is available online.
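The penalized likelihood described in this abstract can be written out as a short sketch. The objective below is the standard L1-penalized Gaussian negative log-likelihood, −log det Θ + tr(SΘ) + λ‖Θ‖₁; the function name is hypothetical, and penalizing every entry (rather than only the off-diagonal ones, as some implementations do) is an assumption made to match the abstract's wording.

```python
import numpy as np

def penalized_neg_loglik(theta, sample_cov, lam):
    """L1-penalized Gaussian negative log-likelihood of a candidate precision matrix."""
    # The objective is only defined for positive definite theta.
    if np.linalg.eigvalsh(theta).min() <= 0:
        return np.inf
    _, logdet = np.linalg.slogdet(theta)
    fit = -logdet + np.trace(sample_cov @ theta)
    # Assumption: the penalty covers all entries, diagonal included.
    penalty = lam * np.abs(theta).sum()
    return fit + penalty

# Hypothetical example: identity sample covariance and identity candidate.
sample_cov = np.eye(3)
value_small = penalized_neg_loglik(np.eye(3), sample_cov, lam=0.1)
value_large = penalized_neg_loglik(np.eye(3), sample_cov, lam=1.0)
print(value_small, value_large)
```

Larger values of `lam` weight the penalty more heavily relative to the fit term, which is the trade-off the regularization parameter controls.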

3.

Regularized Estimation of Piecewise Constant Gaussian Graphical Models: The Group-Fused Graphical Lasso

Alexander J. Gibberd James D. B. Nelson 《Journal of computational and graphical statistics》2017,26(3):623-634

The time-evolving precision matrix of a piecewise-constant Gaussian graphical model encodes the dynamic conditional dependency structure of a multivariate time-series. Traditionally, graphical models are estimated under the assumption that data are drawn identically from a generating distribution. Introducing sparsity and sparse-difference inducing priors, we relax these assumptions and propose a novel regularized M-estimator to jointly estimate both the graph and changepoint structure. The resulting estimator can therefore favor sparse dependency structures and/or smoothly evolving graph structures, as required. Moreover, our approach extends current methods to allow estimation of changepoints that are grouped across multiple dependencies in a system. An efficient algorithm for estimating structure is proposed. We study the empirical recovery properties in a synthetic setting. The qualitative effect of grouped changepoint estimation is then demonstrated by applying the method on a genetic time-course dataset. Supplementary material for this article is available online.

4.

Sparse Steinian Covariance Estimation

Brett Naul Jonathan Taylor 《Journal of computational and graphical statistics》2017,26(2):355-366

We consider a new method for sparse covariance matrix estimation which is motivated by previous results for the so-called Stein-type estimators. Stein proposed a method for regularizing the sample covariance matrix by shrinking together the eigenvalues; the amount of shrinkage is chosen to minimize an unbiased estimate of the risk (UBEOR) under the entropy loss function. The resulting estimator has been shown in simulations to yield significant risk reductions over the maximum likelihood estimator. Our method extends the UBEOR minimization problem by adding an ℓ₁ penalty on the entries of the estimated covariance matrix, which encourages a sparse estimate. For a multivariate Gaussian distribution, zeros in the covariance matrix correspond to marginal independences between variables. Unlike the ℓ₁-penalized Gaussian likelihood function, our penalized UBEOR objective is convex and can be minimized via a simple block coordinate descent procedure. We demonstrate via numerical simulations and an analysis of microarray data from breast cancer patients that our proposed method generally outperforms other methods for sparse covariance matrix estimation and can be computed efficiently even in high dimensions.

5.

A two-stage sequential conditional selection approach to sparse high-dimensional multivariate regression models

Chen Zehua Jiang Yiwei 《Annals of the Institute of Statistical Mathematics》2020,72(1):65-90

In this article, we deal with sparse high-dimensional multivariate regression models. The models distinguish themselves from ordinary multivariate regression models in two aspects: (1) the dimension of the response vector and the number of covariates diverge to infinity; (2) the nonzero entries of the coefficient matrix and the precision matrix are sparse. We develop a two-stage sequential conditional selection (TSCS) approach to the identification and estimation of the nonzeros of the coefficient matrix and the precision matrix. It is established that the TSCS is selection consistent for the identification of the nonzeros of both the coefficient matrix and the precision matrix. Simulation studies are carried out to compare TSCS with the existing state-of-the-art methods, which demonstrates that the TSCS approach outperforms the existing methods. As an illustration, the TSCS approach is also applied to a real dataset.

6.

Graphical Models for Ordinal Data

Jian Guo Elizaveta Levina George Michailidis Ji Zhu 《Journal of computational and graphical statistics》2013,22(1):183-204

This article considers a graphical model for ordinal variables, where it is assumed that the data are generated by discretizing the marginal distributions of a latent multivariate Gaussian distribution. The relationships between these ordinal variables are then described by the underlying Gaussian graphical model and can be inferred by estimating the corresponding concentration matrix. Direct estimation of the model is computationally expensive, but an approximate EM-like algorithm is developed to provide an accurate estimate of the parameters at a fraction of the computational cost. Numerical evidence based on simulation studies shows the strong performance of the algorithm, which is also illustrated on datasets on movie ratings and an educational survey.
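The data-generating assumption in this abstract, ordinal values obtained by cutting a latent Gaussian variable at thresholds, can be sketched as follows. The function name, the threshold values, and the sample points are hypothetical; the abstract does not specify them.

```python
import numpy as np

def discretize_latent(latent, thresholds):
    """Map latent continuous values to ordinal levels 0, 1, ..., len(thresholds)."""
    # np.digitize returns, for each value, how many thresholds lie at or below it
    # (thresholds must be increasing).
    return np.digitize(latent, thresholds)

# Hypothetical example: two thresholds give three ordinal levels.
latent = np.array([-1.5, 0.0, 2.0])
thresholds = np.array([-1.0, 1.0])
levels = discretize_latent(latent, thresholds)
print(levels)
```

Only the ordinal levels would be observed in practice, which is why the latent Gaussian structure has to be inferred rather than read off directly.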

7.

Shrinkage estimation analysis of correlated binary data with a diverging number of parameters

XU PeiRong FU WenJiang ZHU LiXing 《Science China Mathematics》2013,56(2):359-377

For analyzing correlated binary data with high-dimensional covariates, we, in this paper, propose a two-stage shrinkage approach. First, we construct a weighted least-squares (WLS) type function using a special weighting scheme on the non-conservative vector field of the generalized estimating equations (GEE) model. Second, we define a penalized WLS in the spirit of the adaptive LASSO for simultaneous variable selection and parameter estimation. The proposed procedure enjoys the oracle properties in a high-dimensional framework where the number of parameters grows to infinity with the number of clusters. Moreover, we prove the consistency of the sandwich formula of the covariance matrix even when the working correlation matrix is misspecified. For the selection of the tuning parameter, we develop a consistent penalized quadratic form (PQF) function criterion. The performance of the proposed method is assessed through a comparison with the existing methods and through an application to a crossover trial in a pain relief study.

8.

High-Dimensional Covariance Matrix Estimation Based on Network

WANG Xuzhen JIN Baisuo 《Chinese Journal of Applied Probability and Statistics》2006,36(4):342-354

A new method for estimating a high-dimensional covariance matrix based on network structure with heteroscedasticity of response variables is proposed in this paper. This method greatly reduces the computational complexity by transforming the high-dimensional covariance matrix estimation problem into a low-dimensional linear regression problem. Even if the sample size is finite, the estimation method is still effective. The estimation error decreases as the matrix dimension increases. In addition, this paper presents a method of identifying influential nodes in a network via the covariance matrix. This method is well suited to academic cooperation networks because it takes into account both the contribution of the node itself and the impact of the node on other nodes.

9.

Efficient Distributed Estimation of High-dimensional Sparse Precision Matrix for Transelliptical Graphical Models

Wang Guan Peng Cui Heng Jian 《Acta Mathematica Sinica (English Series)》2021,37(5):689-706

In this paper, distributed estimation of a high-dimensional sparse precision matrix is proposed based on the debiased D-trace loss penalized lasso and the hard threshold method when samples are distributed into different machines for transelliptical graphical models. At a certain level of sparseness, this method not only achieves the correct selection of non-zero elements of the sparse precision matrix, but its error rate is also comparable to that of the estimator in a non-distributed setting. The numerical results further show that the proposed distributed method is more effective than the usual average method.

10.

Robust quaternion matrix completion with applications to image inpainting

Zhigang Jia Michael K. Ng Guang‐Jing Song 《Numerical Linear Algebra with Applications》2019,26(4)

In this paper, we study robust quaternion matrix completion and provide a rigorous analysis for provable estimation of a quaternion matrix from a random subset of its corrupted entries. In order to generalize the results from real matrix completion to quaternion matrix completion, we derive some new formulas to handle noncommutativity of quaternions. We solve a convex optimization problem, which minimizes a nuclear norm of the quaternion matrix that is a convex surrogate for the quaternion matrix rank, and the ℓ₁‐norm of sparse quaternion matrix entries. We show that, under incoherence conditions, a quaternion matrix can be recovered exactly with overwhelming probability, provided that its rank is sufficiently small and that the corrupted entries are sparsely located. The quaternion framework can be used to represent red, green, and blue channels of color images. Results on recovering missing/noisy color image pixels, posed as a robust quaternion matrix completion problem, are given to show that the performance of the proposed approach is better than that of the tested methods, including image inpainting methods, the tensor‐based completion method, and the quaternion completion method using semidefinite programming.

11.

Precision Matrix Estimation by Inverse Principal Orthogonal Decomposition

Cheng Yong Tang Yingying Fan Yinfei Kong 《Communications in Mathematical Research》2020,36(1):68-92

We investigate the structure of a large precision matrix in Gaussian graphical models by decomposing it into a low rank component and a remainder part with sparse precision matrix. Based on the decomposition, we propose to estimate the large precision matrix by inverting a principal orthogonal decomposition (IPOD). The IPOD approach has appealing practical interpretations in conditional graphical models given the low rank component, and it connects to Gaussian graphical models with latent variables. Specifically, we show that the low rank component in the decomposition of the large precision matrix can be viewed as the contribution from the latent variables in a Gaussian graphical model. Compared with existing approaches for latent variable graphical models, the IPOD is conveniently feasible in practice where only inverting a low-dimensional matrix is required. To identify the number of latent variables, which is an objective of its own interest, we investigate and justify an approach by examining the ratios of adjacent eigenvalues of the sample covariance matrix. Theoretical properties, numerical examples, and a real data application demonstrate the merits of the IPOD approach in its convenience, performance, and interpretability.
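The idea of choosing the number of latent variables from ratios of adjacent eigenvalues can be sketched as below. The abstract does not give the exact criterion, so picking the index with the largest ratio among the first `max_k` candidates is an assumption, and the function name and example spectrum are hypothetical.

```python
import numpy as np

def estimate_num_factors(sample_cov, max_k):
    """Pick the k that maximizes the ratio of the k-th to the (k+1)-th eigenvalue."""
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eigvals = np.linalg.eigvalsh(sample_cov)[::-1]
    # ratios[i] compares eigenvalue i+1 with eigenvalue i+2 (1-based counting).
    ratios = eigvals[:max_k] / eigvals[1:max_k + 1]
    return int(np.argmax(ratios)) + 1

# Hypothetical example: two dominant eigenvalues followed by a flat tail.
sample_cov = np.diag([50.0, 40.0, 1.2, 1.1, 1.0, 0.9])
num_factors = estimate_num_factors(sample_cov, max_k=4)
print(num_factors)
```

The largest gap in the spectrum separates the directions attributed to latent variables from the remainder, which is the intuition behind ratio-based rules of this kind.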

12.

Improving the Graphical Lasso Estimation for the Precision Matrix Through Roots of the Sample Covariance Matrix

Vahe Avagyan Andrés M. Alonso Francisco J. Nogales 《Journal of computational and graphical statistics》2017,26(4):865-872

In this article, we focus on the estimation of a high-dimensional inverse covariance (i.e., precision) matrix. We propose a simple improvement of the graphical Lasso (glasso) framework that is able to attain better statistical performance without increasing significantly the computational cost. The proposed improvement is based on computing a root of the sample covariance matrix to reduce the spread of the associated eigenvalues. Through extensive numerical results, using both simulated and real datasets, we show that the proposed modification improves the glasso procedure. Our results reveal that the square-root improvement can be a reasonable choice in practice. Supplementary material for this article is available online.
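The effect of taking a root of the sample covariance matrix on the spread of its eigenvalues can be seen in a small sketch. It only computes the symmetric matrix root through an eigendecomposition; how the root is then used inside glasso is not described in the abstract and is not shown here. The function name and the 2×2 example are hypothetical.

```python
import numpy as np

def matrix_root(sample_cov, power=0.5):
    """Symmetric matrix power of a positive semidefinite matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(sample_cov)
    # Guard against tiny negative eigenvalues caused by rounding.
    eigvals = np.clip(eigvals, 0.0, None)
    # Scale each eigenvector column by its powered eigenvalue, then recombine.
    return (eigvecs * eigvals**power) @ eigvecs.T

# Hypothetical example: eigenvalues 1 and 9, so the spread (max/min) is 9.
sample_cov = np.array([[5.0, 4.0],
                       [4.0, 5.0]])
root = matrix_root(sample_cov)
spread_before = np.linalg.cond(sample_cov)
spread_after = np.linalg.cond(root)
print(np.round(root, 6), spread_before, spread_after)
```

In this example the square root has eigenvalues 1 and 3, so the ratio of largest to smallest eigenvalue drops from 9 to 3.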

13.

A Robust Model-Free Feature Screening Method for Ultrahigh-Dimensional Data

Jingnan Xue 《Journal of computational and graphical statistics》2017,26(4):803-813

Feature screening plays an important role in dimension reduction for ultrahigh-dimensional data. In this article, we introduce a new feature screening method and establish its sure independence screening property under the ultrahigh-dimensional setting. The proposed method works based on the nonparanormal transformation and Henze–Zirkler’s test, that is, it first transforms the response variable and features to Gaussian random variables using the nonparanormal transformation and then tests the dependence between the response variable and features using the Henze–Zirkler’s test. The proposed method enjoys at least two merits. First, it is model-free, which avoids the specification of a particular model structure. Second, it is condition-free, which does not require any extra conditions except for some regularity conditions for high-dimensional feature screening. The numerical results indicate that, compared to the existing methods, the proposed method is more robust to the data generated from heavy-tailed distributions and/or complex models with interaction variables. The proposed method is applied to screening of anticancer drug response genes. Supplementary material for this article is available online.

14.

Efficient parameter estimation via modified Cholesky decomposition for quantile regression with longitudinal data

Jing Lv Chaohui Guo 《Computational Statistics》2017,32(3):947-975

It is well known that specifying a covariance matrix is difficult in the quantile regression with longitudinal data. This paper develops a two step estimation procedure to improve estimation efficiency based on the modified Cholesky decomposition. Specifically, in the first step, we obtain the initial estimators of regression coefficients by ignoring the possible correlations between repeated measures. Then, we apply the modified Cholesky decomposition to construct the covariance models and obtain the estimator of within-subject covariance matrix. In the second step, we construct unbiased estimating functions to obtain more efficient estimators of regression coefficients. However, the proposed estimating functions are discrete and non-convex. We utilize the induced smoothing method to achieve the fast and accurate estimates of parameters and their asymptotic covariance. Under some regularity conditions, we establish the asymptotically normal distributions for the resulting estimators. Simulation studies and the longitudinal progesterone data analysis show that the proposed approach yields highly efficient estimators.
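The modified Cholesky decomposition named in this abstract can be sketched for a known covariance matrix. The sketch assumes the usual form TΣTᵀ = D, with T unit lower triangular and D diagonal, and derives it from the ordinary Cholesky factor; the function name and the 3×3 example are hypothetical, and the paper's regression models for the entries of T and D are not shown.

```python
import numpy as np

def modified_cholesky(cov):
    """Return (T, d) with T unit lower triangular and T @ cov @ T.T = diag(d)."""
    # Ordinary Cholesky factor: cov = chol @ chol.T, chol lower triangular.
    chol = np.linalg.cholesky(cov)
    scale = np.diag(chol)
    # Dividing column j by chol[j, j] gives a unit lower triangular factor L
    # such that cov = L @ diag(scale**2) @ L.T.
    unit_lower = chol / scale
    innovation_var = scale**2
    # T is the inverse of L, which is again unit lower triangular.
    t_matrix = np.linalg.inv(unit_lower)
    return t_matrix, innovation_var

# Hypothetical within-subject covariance for three repeated measures.
cov = np.array([[4.0, 2.0, 1.0],
                [2.0, 3.0, 1.0],
                [1.0, 1.0, 2.0]])
t_matrix, innovation_var = modified_cholesky(cov)
print(np.round(t_matrix, 4), np.round(innovation_var, 4))
```

Because the entries of T and D are unconstrained (apart from positivity of D), any values assigned to them yield a positive definite covariance matrix, which is what makes this parameterization convenient for modeling.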

15.

High-Dimensional Mixed Graphical Models

Jie Cheng Tianxi Li Elizaveta Levina Ji Zhu 《Journal of computational and graphical statistics》2017,26(2):367-378

While graphical models for continuous data (Gaussian graphical models) and discrete data (Ising models) have been extensively studied, there is little work on graphical models for datasets with both continuous and discrete variables (mixed data), which are common in many scientific applications. We propose a novel graphical model for mixed data, which is simple enough to be suitable for high-dimensional data, yet flexible enough to represent all possible graph structures. We develop a computationally efficient regression-based algorithm for fitting the model by focusing on the conditional log-likelihood of each variable given the rest. The parameters have a natural group structure, and sparsity in the fitted graph is attained by incorporating a group lasso penalty, approximated by a weighted lasso penalty for computational efficiency. We demonstrate the effectiveness of our method through an extensive simulation study and apply it to a music annotation dataset (CAL500), obtaining a sparse and interpretable graphical model relating the continuous features of the audio signal to binary variables such as genre, emotions, and usage associated with particular songs. While we focus on binary discrete variables for the main presentation, we also show that the proposed methodology can be easily extended to general discrete variables.

16.

A simultaneous reconstruction of missing data in DNA microarrays

Shmuel Friedland Amir Niknejad Laura Chihara 《Linear algebra and its applications》2006,416(1):8-28

We suggest here a new method for estimating missing entries in a gene expression matrix, which is done simultaneously; that is, the estimation of one missing entry influences the estimation of other entries. Our method is closely related to the methods and techniques used for solving inverse eigenvalue problems.

17.

The Graphical Horseshoe Estimator for Inverse Covariance Matrices

Yunfan Li Bruce A. Craig Anindya Bhadra 《Journal of computational and graphical statistics》2013,22(3):747-757

We develop a new estimator of the inverse covariance matrix for high-dimensional multivariate normal data using the horseshoe prior. The proposed graphical horseshoe estimator has attractive properties compared to other popular estimators, such as the graphical lasso and the graphical smoothly clipped absolute deviation. The most prominent benefit is that when the true inverse covariance matrix is sparse, the graphical horseshoe provides estimates with small information divergence from the sampling model. The posterior mean under the graphical horseshoe prior can also be almost unbiased under certain conditions. In addition to these theoretical results, we also provide a full Gibbs sampler for implementing our estimator. MATLAB code is available for download from github at http://github.com/liyf1988/GHS. The graphical horseshoe estimator compares favorably to existing techniques in simulations and in a human gene network data analysis. Supplementary materials for this article are available online.

18.

Positive Semidefinite Rank-Based Correlation Matrix Estimation With Application to Semiparametric Graph Estimation

Tuo Zhao Kathryn Roeder Han Liu 《Journal of computational and graphical statistics》2013,22(4):895-922

Many statistical methods gain robustness and flexibility by sacrificing convenient computational structures. In this article, we illustrate this fundamental tradeoff by studying a semiparametric graph estimation problem in high dimensions. We explain how novel computational techniques help to solve this type of problem. In particular, we propose a nonparanormal neighborhood pursuit algorithm to estimate high-dimensional semiparametric graphical models with theoretical guarantees. Moreover, we provide an alternative view to analyze the tradeoff between computational efficiency and statistical error under a smoothing optimization framework. Though this article focuses on the problem of graph estimation, the proposed methodology is widely applicable to other problems with similar structures. We also report thorough experimental results on text, stock, and genomic datasets.

19.

Weak signals in high‐dimensional regression: Detection, estimation and prediction

Yanming Li Hyokyoung G. Hong S. Ejaz Ahmed Yi Li 《Applied Stochastic Models in Business and Industry》2019,35(2):283-298

Regularization methods, including Lasso, group Lasso, and SCAD, typically focus on selecting variables with strong effects while ignoring weak signals. This may result in biased prediction, especially when weak signals outnumber strong signals. This paper aims to incorporate weak signals in variable selection, estimation, and prediction. We propose a two‐stage procedure, consisting of variable selection and postselection estimation. The variable selection stage involves a covariance‐insured screening for detecting weak signals, whereas the postselection estimation stage involves a shrinkage estimator for jointly estimating strong and weak signals selected from the first stage. We term the proposed method as the covariance‐insured screening‐based postselection shrinkage estimator. We establish asymptotic properties for the proposed method and show, via simulations, that incorporating weak signals can improve estimation and prediction performance. We apply the proposed method to predict the annual gross domestic product rates based on various socioeconomic indicators for 82 countries.

20.

Estimation of Symmetry-Constrained Gaussian Graphical Models: Application to Clustered Dense Networks

Xin Gao Hélène Massam 《Journal of computational and graphical statistics》2013,22(4):909-929

We propose a model selection algorithm for high-dimensional clustered data. Our algorithm combines a classical penalized likelihood method with a composite likelihood approach in the framework of colored graphical Gaussian models. Our method is designed to identify high-dimensional dense networks with a large number of edges but sparse edge classes. Its empirical performance is demonstrated through simulation studies and a network analysis of a gene expression dataset.