Similar Articles
20 similar articles found.
1.
In this paper, a Bayesian hierarchical model for variable selection and estimation in the context of binary quantile regression is proposed. Existing approaches to variable selection in a binary classification context are sensitive to outliers, heteroskedasticity, or other anomalies of the latent response. The method proposed in this study overcomes these problems in an attractive and straightforward way. A Laplace likelihood and Laplace priors for the regression parameters are proposed and estimated with Bayesian Markov chain Monte Carlo; the resulting model is equivalent to the frequentist lasso procedure. A conceptual result is that, in doing so, the binary regression model is moved from a Gaussian to a fully Laplacian framework without sacrificing much computational efficiency. In addition, an efficient Gibbs sampler to estimate the model parameters is proposed that is superior to the Metropolis algorithm used in previous studies on Bayesian binary quantile regression. Both the simulation studies and the real data analysis indicate that the proposed method performs well in comparison to the other methods. Moreover, because the base model is binary quantile regression, the approach provides much more detailed insight into the effects of the covariates. An implementation of the lasso procedure for binary quantile regression models is available in the R package bayesQR.
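The construction rests on a standard identity: maximizing an asymmetric Laplace likelihood under independent Laplace priors is the same as minimizing the quantile check loss plus an \(\ell_1\) penalty. A minimal numpy sketch of that objective (the function names and latent-response argument are illustrative, not taken from the paper or from bayesQR):

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def neg_log_posterior(beta, X, z, tau, lam):
    """Negative log-posterior (up to constants) under an asymmetric Laplace
    likelihood for the latent response z and Laplace(0, 1/lam) priors on
    beta: quantile check loss plus an l1 penalty, i.e. lasso-penalized
    quantile regression. In the binary model, z is the latent continuous
    response imputed inside the Gibbs sampler."""
    resid = z - X @ beta
    return check_loss(resid, tau).sum() + lam * np.abs(beta).sum()
```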

2.
The least angle regression (LAR) was proposed by Efron, Hastie, Johnstone, and Tibshirani in 2004 for continuous model selection in linear regression. It is motivated by a geometric argument and tracks a path along which the predictors enter successively and the active predictors always maintain the same absolute correlation (angle) with the residual vector. Although it quickly gained popularity, extensions of it remain rare compared with penalty methods. In this expository article, we show that the powerful geometric idea of LAR can be generalized in a fruitful way. We propose a ConvexLAR algorithm that works for any convex loss function and naturally extends to group selection and data-adaptive variable selection. After simple modification, it also yields new exact path algorithms for certain penalty methods, such as a convex loss function with a lasso or group lasso penalty. Variable selection in recurrent event and panel count data analysis, AdaBoost, and Gaussian graphical models is reconsidered from the ConvexLAR angle. Supplementary materials for this article are available online.
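The classical LAR path that ConvexLAR generalizes can be traced with scikit-learn; a small sketch on simulated data (the data are illustrative):

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 8))
y = X @ np.array([3.0, -2.0, 1.5, 0, 0, 0, 0, 0]) + rng.standard_normal(100)

# alphas[k] is the common absolute correlation with the residual at the
# k-th breakpoint; coefs[:, k] are the coefficients there; `active` lists
# the order in which predictors enter the path.
alphas, active, coefs = lars_path(X, y, method="lar")
print(active)
```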

3.
While graphical models for continuous data (Gaussian graphical models) and discrete data (Ising models) have been extensively studied, there is little work on graphical models for datasets with both continuous and discrete variables (mixed data), which are common in many scientific applications. We propose a novel graphical model for mixed data, which is simple enough to be suitable for high-dimensional data, yet flexible enough to represent all possible graph structures. We develop a computationally efficient regression-based algorithm for fitting the model by focusing on the conditional log-likelihood of each variable given the rest. The parameters have a natural group structure, and sparsity in the fitted graph is attained by incorporating a group lasso penalty, approximated by a weighted lasso penalty for computational efficiency. We demonstrate the effectiveness of our method through an extensive simulation study and apply it to a music annotation dataset (CAL500), obtaining a sparse and interpretable graphical model relating the continuous features of the audio signal to binary variables such as genre, emotions, and usage associated with particular songs. While we focus on binary discrete variables for the main presentation, we also show that the proposed methodology can be easily extended to general discrete variables.
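As a rough illustration of the regression-based fitting idea, the sketch below runs plain node-wise \(\ell_1\)-penalized regressions (lasso for continuous nodes, \(\ell_1\) logistic regression for binary nodes) and symmetrizes with an OR rule; it deliberately drops the paper's group-lasso structure, so it is a simplification, not the proposed estimator:

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression

def nodewise_graph(data, is_binary, alpha=0.1):
    """Simplified neighborhood selection for mixed data: regress each
    variable on all others with an l1 penalty and connect j -- k if
    either conditional regression assigns a nonzero coefficient."""
    p = data.shape[1]
    adj = np.zeros((p, p), dtype=bool)
    for j in range(p):
        rest = np.delete(np.arange(p), j)
        X, y = data[:, rest], data[:, j]
        if is_binary[j]:
            coef = LogisticRegression(penalty="l1", C=1.0 / alpha,
                                      solver="liblinear").fit(X, y).coef_.ravel()
        else:
            coef = Lasso(alpha=alpha).fit(X, y).coef_
        adj[j, rest] = coef != 0
    return adj | adj.T  # OR-rule symmetrization
```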

4.
何晓霞, 徐伟, 李缓, 吴传菊. 数学杂志 (Journal of Mathematics), 2017, 37(5): 1101–1110
This paper studies variable selection in quantile regression models for panel data. By adding a modified adaptive lasso penalty term, quantile regression and variable selection for fixed-effects panel data are carried out simultaneously, and selection consistency and asymptotic normality of the model parameters are established. Simulation studies confirm the effectiveness of the method, which extends the results of reference [14].
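The adaptive lasso penalty weights each coefficient by an initial estimate. For the plain (non-panel) case, the weighted problem reduces to an ordinary lasso after rescaling the columns, as in this hedged sketch (least squares rather than quantile loss, for brevity):

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

def adaptive_lasso(X, y, alpha=0.1, gamma=1.0, eps=1e-6):
    """Adaptive lasso via column rescaling: with weights
    w_j = 1/|beta_init_j|^gamma from an initial OLS fit, the weighted-l1
    problem equals an ordinary lasso on X / w, followed by rescaling
    the fitted coefficients back by 1 / w."""
    beta_init = LinearRegression().fit(X, y).coef_
    w = 1.0 / (np.abs(beta_init) ** gamma + eps)
    fit = Lasso(alpha=alpha).fit(X / w, y)
    return fit.coef_ / w
```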

5.
In many problems involving generalized linear models, the covariates are subject to measurement error. When the number of covariates p exceeds the sample size n, regularized methods like the lasso or Dantzig selector are required. Several recent papers have studied methods which correct for measurement error in the lasso or Dantzig selector for linear models in the p > n setting. We study a correction for generalized linear models, based on Rosenbaum and Tsybakov's matrix uncertainty selector. By not requiring an estimate of the measurement error covariance matrix, this generalized matrix uncertainty selector has a great practical advantage in problems involving high-dimensional data. We further derive an alternative method based on the lasso, and develop efficient algorithms for both methods. In our simulation studies of logistic and Poisson regression with measurement error, the proposed methods outperform the standard lasso and Dantzig selector with respect to covariate selection, by reducing the number of false positives considerably. We also consider classification of patients on the basis of gene expression data with noisy measurements. Supplementary materials for this article are available online.

6.
This paper considers the problem of spatio-temporal extreme value prediction for precipitation data. It presents methods that predict monthly extremes over the next 20 years, corresponding to the 0.998 quantile, at several stations over a certain region. The proposed methods are based on a novel combination of quantile regression forests and a circular transformation. At the core of the methodology, quantile regression forests may improve prediction performance by combining many decorrelated bootstrapped trees, and the circular transformation is used to build circular predictors from the months, which are fed into the quantile regression forests model for prediction. The empirical performance of the proposed methods is evaluated through real data analysis, which demonstrates promising results for the proposed approaches.
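The circular transformation is the easy part to show in code: mapping month onto the unit circle makes December and January neighbors. Since scikit-learn has no quantile regression forest, the hedged sketch below substitutes gradient boosting with the quantile (pinball) loss as a stand-in; the data are simulated placeholders:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def circular_encode(month):
    """Map month 1..12 to (sin, cos) coordinates on the unit circle."""
    angle = 2 * np.pi * (month - 1) / 12
    return np.column_stack([np.sin(angle), np.cos(angle)])

rng = np.random.default_rng(1)
month = rng.integers(1, 13, size=500)
other = rng.standard_normal((500, 3))            # placeholder covariates
X = np.hstack([circular_encode(month), other])
y = rng.gamma(shape=2.0, scale=10.0, size=500)   # placeholder precipitation maxima

# Stand-in for quantile regression forests: boosting with the pinball
# loss, targeting the 0.998 quantile used in the paper.
model = GradientBoostingRegressor(loss="quantile", alpha=0.998).fit(X, y)
```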

7.
A Frisch-Newton Algorithm for Sparse Quantile Regression
Recent experience has shown that interior-point methods using a log barrier approach are far superior to classical simplex methods for computing solutions to large parametric quantile regression problems. In many large empirical applications, the design matrix has a very sparse structure. A typical example is the classical fixed-effect model for panel data, where the parametric dimension of the model can be quite large, but the number of non-zero elements is quite small. Adopting recent developments in sparse linear algebra, we introduce a modified version of the Frisch-Newton algorithm for quantile regression described in Portnoy and Koenker [28]. The new algorithm substantially reduces the storage (memory) requirements and increases computational speed. The modified algorithm also facilitates the development of nonparametric quantile regression methods. The pseudo design matrices employed in nonparametric quantile regression smoothing are inherently sparse in both the fidelity and roughness penalty components. Exploiting the sparse structure of these problems opens up a whole range of new possibilities for multivariate smoothing on large data sets via ANOVA-type decomposition and partial linear models.
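Quantile regression is itself a linear program, and the sparsity the paper exploits lives in the constraint matrix. A hedged sketch of the primal LP using scipy's HiGHS solver as a stand-in for the interior-point Frisch-Newton method:

```python
import numpy as np
from scipy import sparse
from scipy.optimize import linprog

def quantile_regression_lp(X, y, tau):
    """Quantile regression as a linear program:
        min  tau*1'u + (1-tau)*1'v
        s.t. X beta + u - v = y,  u >= 0, v >= 0, beta free.
    The constraint matrix [X | I | -I] is kept sparse, which is the
    structure sparse interior-point implementations exploit."""
    n, p = X.shape
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = sparse.hstack([sparse.csc_matrix(X), sparse.eye(n), -sparse.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]
```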

8.
A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This article introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. In the proposed method, which we call principal coordinate ridge regression, one regresses the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, principal coordinate ridge regression, with dynamic time warping distance used to define the principal coordinates, is shown to outperform a functional generalized linear model. Supplementary materials for this article are available online.
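Principal coordinates are classical multidimensional scaling of a distance matrix. A hedged sketch with a Euclidean placeholder for the dynamic time warping distance, and plain ridge in place of the paper's GAM-based tuning:

```python
import numpy as np
from sklearn.linear_model import Ridge

def principal_coordinates(D, k):
    """Classical MDS: double-center the squared distances and take the
    top-k eigenvectors scaled by the square roots of their eigenvalues."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

rng = np.random.default_rng(2)
curves = rng.standard_normal((60, 50))   # placeholder functional predictors
D = np.linalg.norm(curves[:, None, :] - curves[None, :, :], axis=-1)
y = rng.standard_normal(60)              # placeholder scalar response
Z = principal_coordinates(D, k=10)
fit = Ridge(alpha=1.0).fit(Z, y)          # ridge on the leading coordinates
```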

9.
We propose a multinomial logistic regression model for link prediction in a time series of directed binary networks. To account for the dynamic nature of the data, we employ a dynamic model for the model parameters that is strongly connected with the fused lasso penalty. In addition to promoting sparseness, this prior allows us to explore the presence of change points in the structure of the network. We introduce fast computational algorithms for estimation and prediction using both optimization and Bayesian approaches. The performance of the model is illustrated using simulated data and data from a financial trading network in the NYMEX natural gas futures market. Supplementary material containing the trading network dataset and code to implement the algorithms is available online.

10.
We consider a regularized least squares problem, with regularization by structured sparsity-inducing norms that extend the usual \(\ell_1\) and group lasso penalties by allowing the subsets to overlap. Such regularizations lead to nonsmooth problems that are difficult to optimize, and we propose in this paper a suitable version of an accelerated proximal method to solve them. We prove convergence of a nested procedure, obtained by composing an accelerated proximal method with an inner algorithm for computing the proximity operator. By exploiting the geometrical properties of the penalty, we devise a new active set strategy, thanks to which the inner iteration is relatively fast, guaranteeing good computational performance of the overall algorithm. Our approach handles high-dimensional problems without pre-processing for dimensionality reduction, leading to better computational and prediction performance than state-of-the-art methods, as shown empirically on both toy and real data.
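For the non-overlapping \(\ell_1\) special case the proximity operator has a closed form (soft-thresholding), which makes the accelerated proximal scheme easy to sketch; with overlapping groups the prox must itself be computed by an inner algorithm, which is the nested procedure the paper analyzes. A minimal FISTA sketch for the lasso case:

```python
import numpy as np

def soft_threshold(z, t):
    """Proximity operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def fista_lasso(X, y, lam, n_iter=200):
    """Accelerated proximal (FISTA) iteration for
    0.5*||y - X b||^2 + lam*||b||_1. With overlapping group penalties the
    soft-threshold step would be replaced by an inner prox computation."""
    p = X.shape[1]
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    beta, z, t = np.zeros(p), np.zeros(p), 1.0
    for _ in range(n_iter):
        grad = X.T @ (X @ z - y)
        beta_new = soft_threshold(z - grad / L, lam / L)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = beta_new + ((t - 1) / t_new) * (beta_new - beta)
        beta, t = beta_new, t_new
    return beta
```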

11.
Multivariate nonparametric quantile regression is often hard to estimate. To reduce the dimension while preserving the flexibility of nonparametric estimation, single-index methods are commonly used to model the conditional quantile of the response variable. This paper studies variable selection for single-index quantile regression. Building on minimum average loss estimation, we perform variable selection and parameter estimation by minimizing the average loss with a SCAD penalty. Under regularity conditions, the oracle property of SCAD variable selection for single-index quantile regression is established, a computational method for SCAD variable selection is given, and the finite-sample selection properties of the proposed method are illustrated through simulation studies.
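For reference, a numpy sketch of the SCAD penalty of Fan and Li (2001) in its standard form with a = 3.7 (the paper's estimator minimizes the average quantile loss plus this penalty):

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty: l1-like near zero, a quadratic transition on
    (lam, a*lam], then constant, so large coefficients are left unshrunk."""
    t = np.abs(theta)
    linear = lam * t
    quad = (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1))
    const = lam ** 2 * (a + 1) / 2
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))
```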

12.
Bridge regression, a special family of penalized regressions with penalty function \(\sum_j |\beta_j|^\gamma\), \(\gamma \le 1\), is considered. A general approach to solving for the bridge estimator is developed. A new algorithm for the lasso (γ = 1) is obtained by studying the structure of the bridge estimators. The shrinkage parameter γ and the tuning parameter λ are selected via generalized cross-validation (GCV). A comparison between the bridge model (γ ≤ 1) and several other shrinkage models, namely ordinary least squares regression (λ = 0), the lasso (γ = 1), and ridge regression (γ = 2), is made through a simulation study. It is shown that bridge regression performs well compared to the lasso and ridge regression. The methods are demonstrated through an analysis of prostate cancer data. Some computational advantages and limitations are discussed.

13.
We propose a new binary classification and variable selection technique especially designed for high-dimensional predictors. Among many predictors, typically only a small fraction have a significant impact on prediction. In such a situation, more interpretable models with better prediction accuracy can be obtained by variable selection along with classification. By adding an \(\ell_1\)-type penalty to the loss function, common classification methods such as logistic regression or support vector machines (SVM) can perform variable selection. Existing penalized SVM methods all attempt to solve for all the parameters of the penalized problem jointly. When the data dimension is very high, the joint optimization problem is very complex and involves a lot of memory allocation. In this article, we propose a new penalized forward search technique that reduces high-dimensional optimization problems to one-dimensional optimization by iterating the selection steps. The new algorithm can be regarded as a forward selection version of the penalized SVM and its variants. The advantage of optimizing in one dimension is that the location of the optimal solution can be obtained with an intelligent search by exploiting convexity and the piecewise linear or quadratic structure of the criterion function. In each step, the predictor most able to predict the outcome is added to the model. The search is then repeated in an iterative fashion until convergence occurs. Comparison of our new classification rule with the \(\ell_1\)-SVM and other common methods shows very promising performance, in that the proposed method leads to much leaner models without compromising misclassification rates, particularly for high-dimensional predictors.
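The flavor of a forward search that only ever solves one-dimensional problems can be conveyed with a plain greedy loop; this hedged sketch uses unpenalized logistic regression and exhaustive scoring rather than the paper's penalized piecewise-linear search:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def forward_select(X, y, max_vars=10):
    """Greedy forward selection: at each step add the single predictor
    that most reduces the in-sample logistic loss, so every step is a
    search over one candidate coordinate at a time."""
    p = X.shape[1]
    active = []
    for _ in range(min(max_vars, p)):
        best_j, best_loss = None, np.inf
        for j in set(range(p)) - set(active):
            cols = active + [j]
            proba = LogisticRegression().fit(X[:, cols], y).predict_proba(X[:, cols])
            loss = log_loss(y, proba)
            if loss < best_loss:
                best_j, best_loss = j, loss
        active.append(best_j)
    return active
```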

14.
The statistics literature of the past 15 years has established many favorable properties for sparse diminishing-bias regularization: techniques that can roughly be understood as providing estimation under penalty functions spanning the range of concavity between \(\ell_0\) and \(\ell_1\) norms. However, lasso \(\ell_1\)-regularized estimation remains the standard tool for industrial Big Data applications because of its minimal computational cost and the presence of easy-to-apply rules for penalty selection. In response, this article proposes a simple new algorithm framework that requires no more computation than a lasso path: the path of one-step estimators (POSE) does \(\ell_1\)-penalized regression estimation on a grid of decreasing penalties, but adapts coefficient-specific weights that decrease as a function of the coefficient estimated in the previous path step. This provides sparse diminishing-bias regularization at no extra cost over the fastest lasso algorithms. Moreover, our gamma lasso implementation of POSE is accompanied by a reliable heuristic for the fit degrees of freedom, so that standard information criteria can be applied in penalty selection. We also provide novel results on the distance between weighted-\(\ell_1\) and \(\ell_0\) penalized predictors; this allows us to build intuition about POSE and other diminishing-bias regularization schemes. The methods and results are illustrated in extensive simulations and in an application of logistic regression to evaluating the performance of hockey players. Supplementary materials for this article are available online.
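The mechanics of a weighted-\(\ell_1\) path are easy to sketch: each grid point re-solves a lasso whose per-coefficient weights shrink with the previous step's estimates, implemented below by column rescaling. The specific weight rule here is an illustrative choice, not the paper's gamma-lasso rule:

```python
import numpy as np
from sklearn.linear_model import Lasso

def pose_like_path(X, y, alphas, gamma=1.0):
    """Path-of-one-step-estimators sketch: run down a grid of decreasing
    penalties; weights w_j = 1/(1 + gamma*|beta_j|) computed from the
    previous step shrink the penalty on coefficients already in the model.
    A weighted lasso equals an ordinary lasso on X / w after rescaling."""
    p = X.shape[1]
    beta, path = np.zeros(p), []
    for a in np.sort(alphas)[::-1]:
        w = 1.0 / (1.0 + gamma * np.abs(beta))
        fit = Lasso(alpha=a).fit(X / w, y)
        beta = fit.coef_ / w
        path.append(beta)
    return np.array(path)
```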

15.
A quantile regression model estimates the relationship between a quantile of the response distribution and the regression parameters, and was originally developed for linear models with continuous responses. In this paper, we apply a Bayesian quantile regression model to Malaysian motor insurance claim count data to study how changes in the estimates of the regression parameters (the rating factors) affect the magnitude of the response variable (the claim count). We also compare the results of quantile regression models from the Bayesian and frequentist approaches, and the results of mean regression models from the Poisson and negative binomial. Comparison of the Poisson and Bayesian quantile regression models shows that the effect of vehicle year decreases as the quantile increases, suggesting that this rating factor carries lower risk for higher claim counts. On the other hand, the effect of vehicle type increases as the quantile increases, indicating that this rating factor carries higher risk for higher claim counts.

16.
We propose and study a new iterative coordinate descent algorithm (QICD) for solving nonconvex penalized quantile regression in high dimension. By permitting different subsets of covariates to be relevant for modeling the response variable at different quantiles, nonconvex penalized quantile regression provides a flexible approach for modeling high-dimensional data with heterogeneity. Although its theory has been investigated recently, its computation remains highly challenging when p is large due to the nonsmoothness of the quantile loss function and the nonconvexity of the penalty function. Existing coordinate descent algorithms for penalized least-squares regression cannot be directly applied. We establish the convergence property of the proposed algorithm under some regularity conditions for a general class of nonconvex penalty functions including popular choices such as SCAD (smoothly clipped absolute deviation) and MCP (minimax concave penalty). Our Monte Carlo study confirms that QICD substantially improves the computational speed in the p ≫ n setting. We illustrate the application by analyzing a microarray dataset.

17.
An efficient algorithm is derived for solving the quantile regression problem combined with a group-sparsity-promoting penalty. Group sparsity of the regression parameters is achieved by using an \(\ell_{1,\infty}\)-norm penalty (or constraint) on the regression parameters. The algorithm is efficient in the sense that it obtains the regression parameters for a wide range of penalty parameters, enabling easy application of a model selection criterion afterwards. A Matlab implementation of the proposed algorithm is provided and some applications of the method are studied.

18.
Locally weighted regression is a technique that predicts the response for new data items from their neighbors in the training data set, where closer data items are assigned higher weights in the prediction. However, the original method may suffer from overfitting and fail to select the relevant variables. In this paper we propose combining a regularization approach with locally weighted regression to achieve sparse models. Specifically, the lasso is a shrinkage and selection method for linear regression. We present an algorithm that embeds the lasso in an iterative procedure that alternately computes weights and performs lasso-wise regression. The algorithm is tested on three synthetic scenarios and two real data sets. Results show that the proposed method outperforms linear and local models in several kinds of scenarios.
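One pass of the weights-then-lasso alternation amounts to fitting a lasso with kernel weights around the query point. A hedged single-iteration sketch (the paper iterates this step; passing sample_weight to Lasso requires a reasonably recent scikit-learn):

```python
import numpy as np
from sklearn.linear_model import Lasso

def local_lasso_predict(X, y, x0, bandwidth=1.0, alpha=0.05):
    """Predict at query point x0: weight training points by a Gaussian
    kernel centered at x0, then fit a lasso with those observation
    weights so that irrelevant variables are dropped locally."""
    w = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2 * bandwidth ** 2))
    model = Lasso(alpha=alpha)
    model.fit(X, y, sample_weight=w)
    return model.predict(x0.reshape(1, -1))[0]
```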

19.
This paper formulates the quadratic penalty function for the dual of the linear program associated with the \(L_1\)-constrained linear quantile regression model. We prove that the solution of the original linear program can be obtained by minimizing the quadratic penalty function, and derive the corresponding formulas. The resulting quadratic penalty function is unconstrained, and thus can be minimized efficiently by a generalized Newton algorithm with Armijo step size. The algorithm is easy to implement, requiring no sophisticated optimization package beyond a linear equation solver. The approach generalizes, with slight modification, to the quantile regression model in a reproducing kernel Hilbert space. Extensive experiments on simulated and real-world data show that the proposed Newton quantile regression algorithms achieve performance comparable to the state of the art.
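The Armijo rule referred to here is a generic backtracking line search; a minimal sketch (the Newton direction computation, which in the paper reduces to solving a linear system, is left abstract):

```python
import numpy as np

def armijo_step(f, grad, x, direction, s=1.0, sigma=1e-4, rho=0.5):
    """Armijo backtracking: shrink the step t until the sufficient-decrease
    condition f(x + t*d) <= f(x) + sigma * t * grad(x)'d holds."""
    fx, g = f(x), grad(x)
    t = s
    while f(x + t * direction) > fx + sigma * t * (g @ direction):
        t *= rho
    return t
```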

20.

This paper considers estimation and inference in semiparametric quantile regression models when the response variable is subject to random censoring. Both independent and dependent censoring are considered, and three iterative estimators are proposed based on inverse probability weighting, where the weights are estimated from the censoring distribution using, respectively, the Kaplan–Meier estimator, a fully parametric estimator, and the conditional Kaplan–Meier estimator. A computationally simple resampling technique is proposed that can be used to approximate the finite sample distribution of the parametric estimator. The paper also considers inference for both the parametric and nonparametric components of the quantile regression model. Monte Carlo simulations show that the proposed estimators and test statistics have good finite sample properties. Finally, the paper contains a real data application, which illustrates the usefulness of the proposed methods.
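The first estimator's weights can be sketched directly: estimate the censoring survivor function G by Kaplan–Meier (treating censoring as the event) and weight each uncensored observation by 1/G(T_i). A simplified numpy sketch that ignores ties:

```python
import numpy as np

def censoring_km(times, events):
    """Kaplan-Meier estimate of the censoring survivor function G(t):
    the curve steps down at censored observations (events == 0)."""
    order = np.argsort(times)
    t, d = times[order], events[order]
    at_risk = len(t) - np.arange(len(t))          # n, n-1, ..., 1
    return t, np.cumprod(1.0 - (1.0 - d) / at_risk)

def ipw_weights(times, events):
    """Inverse-probability-of-censoring weights w_i = delta_i / G(T_i):
    zero for censored observations, as in the Kaplan-Meier-based estimator."""
    t_sorted, G = censoring_km(times, events)
    G_at = np.interp(times, t_sorted, G)
    return np.where(events == 1, 1.0 / np.clip(G_at, 1e-10, None), 0.0)
```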

