Similar Articles
20 similar articles found (search time: 125 ms).
1.
Variable selection is an important aspect of high-dimensional statistical modeling, particularly in regression and classification. In the regularization framework, various penalty functions are used to perform variable selection by putting relatively large penalties on small coefficients. The L1 penalty is a popular choice because of its convexity, but it produces biased estimates for the large coefficients. The L0 penalty is attractive for variable selection because it directly penalizes the number of nonzero coefficients. However, the optimization involved is discontinuous and nonconvex, and therefore very challenging to implement; moreover, its solution may not be stable. In this article, we propose a new penalty that combines the L0 and L1 penalties. We implement this new penalty by developing a global optimization algorithm using mixed integer programming (MIP). We compare this combined penalty with several other penalties via simulated examples as well as real applications. The results show that the new penalty outperforms both the L0 and L1 penalties in terms of variable selection while maintaining good prediction accuracy.
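The abstract does not spell out the MIP formulation, but the standard way to encode an L0 term is through big-M indicator variables. A minimal sketch with cvxpy, where the penalty levels lam0 and lam1, the bound M, and the toy data are illustrative assumptions rather than values from the article:

```python
import cvxpy as cp
import numpy as np

# Toy data: two truly nonzero coefficients.
rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.normal(size=(n, p))
beta_true = np.array([3.0, -2.0] + [0.0] * (p - 2))
y = X @ beta_true + rng.normal(size=n)

lam0, lam1, M = 2.0, 0.5, 10.0          # illustrative choices
beta = cp.Variable(p)
z = cp.Variable(p, boolean=True)        # z_j = 1 iff beta_j is allowed to be nonzero
objective = (cp.sum_squares(y - X @ beta)
             + lam1 * cp.norm1(beta)    # L1 part: convex shrinkage
             + lam0 * cp.sum(z))        # L0 part: count of nonzero coefficients
problem = cp.Problem(cp.Minimize(objective), [cp.abs(beta) <= M * z])
problem.solve()                         # requires a MIP-capable solver, e.g. SCIP or GUROBI
```

The big-M constraint forces beta_j = 0 whenever z_j = 0, so the sum of the binary variables counts the nonzeros exactly, provided M bounds the true coefficient magnitudes.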

2.
High angular resolution diffusion imaging (HARDI) has recently been of great interest in mapping the orientation of intravoxel crossing fibers, and such orientation information allows one to infer the connectivity patterns prevalent among different brain regions and possible changes in such connectivity over time for various neurodegenerative and neuropsychiatric diseases. The aim of this article is to propose a penalized multiscale adaptive regression model (PMARM) framework to spatially and adaptively infer the orientation distribution function (ODF) of water diffusion in regions with complex fiber configurations. In PMARM, we reformulate the HARDI imaging reconstruction as a weighted regularized least-squares regression (WRLSR) problem. Similarity and distance weights are introduced to account for spatial smoothness of HARDI, while preserving the unknown discontinuities (e.g., edges between white matter and gray matter) of HARDI. The L1 penalty function is introduced to ensure sparse solutions of ODFs, while a scaled L1 weighted estimator is calculated to correct the bias introduced by the L1 penalty at each voxel. In PMARM, we integrate the multiscale adaptive regression models, the propagation-separation method, and Lasso (least absolute shrinkage and selection operator) to adaptively estimate ODFs across voxels. Experimental results indicate that PMARM can reduce angle detection errors in fiber-crossing areas and provide more accurate reconstruction than standard voxel-wise methods. Supplementary materials for this article are available online.
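The per-voxel computational kernel here is a weighted L1-regularized least-squares fit; the spatial weighting scheme and the bias-correction step are specific to PMARM. The weighted Lasso itself can be sketched by absorbing the weights into the data, assuming placeholder inputs X, y, w, and lam:

```python
import numpy as np
from sklearn.linear_model import Lasso

def weighted_lasso(X, y, w, lam):
    """Minimize sum_i w_i (y_i - x_i' b)^2 + lam * ||b||_1 by rescaling
    rows with sqrt(w_i) and calling an ordinary Lasso solver."""
    sw = np.sqrt(w)
    n = len(y)
    # sklearn's Lasso minimizes (1/(2n)) ||y - Xb||^2 + alpha * ||b||_1,
    # so alpha = lam / (2n) matches the objective above after rescaling.
    model = Lasso(alpha=lam / (2 * n), fit_intercept=False)
    model.fit(X * sw[:, None], y * sw)
    return model.coef_
```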

3.
Many least-square problems involve affine equality and inequality constraints. Although there are a variety of methods for solving such problems, most statisticians find constrained estimation challenging. The current article proposes a new path-following algorithm for quadratic programming that replaces hard constraints by what are called exact penalties. Similar penalties arise in l1 regularization in model selection. In the regularization setting, penalties encapsulate prior knowledge, and penalized parameter estimates represent a trade-off between the observed data and the prior knowledge. Classical penalty methods of optimization, such as the quadratic penalty method, solve a sequence of unconstrained problems that put greater and greater stress on meeting the constraints. In the limit as the penalty constant tends to ∞, one recovers the constrained solution. In the exact penalty method, squared penalties are replaced by absolute value penalties, and the solution is recovered for a finite value of the penalty constant. The exact path-following method starts at the unconstrained solution and follows the solution path as the penalty constant increases. In the process, the solution path hits, slides along, and exits from the various constraints. Path following in Lasso penalized regression, in contrast, starts with a large value of the penalty constant and works its way downward. In both settings, inspection of the entire solution path is revealing. Just as with the Lasso and generalized Lasso, it is possible to plot the effective degrees of freedom along the solution path. For a strictly convex quadratic program, the exact penalty algorithm can be framed entirely in terms of the sweep operator of regression analysis. A few well-chosen examples illustrate the mechanics and potential of path following. This article has supplementary materials available online.
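The defining property of an exact penalty, recovering the constrained solution at a finite penalty constant, is easy to see on a toy quadratic program. A sketch (not the article's sweep-operator path algorithm) using cvxpy, with the data vector a chosen for illustration:

```python
import cvxpy as cp
import numpy as np

a = np.array([-1.0, 2.0, -0.5])

# Constrained QP: min ||x - a||^2 subject to x >= 0 (projection onto the orthant).
x = cp.Variable(3)
cp.Problem(cp.Minimize(cp.sum_squares(x - a)), [x >= 0]).solve()
x_constrained = x.value

# Exact penalty: replace the constraint by rho * sum_j max(0, -x_j).
for rho in [0.5, 1.0, 4.0]:
    xp = cp.Variable(3)
    cp.Problem(cp.Minimize(cp.sum_squares(xp - a)
                           + rho * cp.sum(cp.pos(-xp)))).solve()
    print(rho, xp.value)   # matches x_constrained once rho exceeds the largest multiplier
```

For this instance the Lagrange multipliers are (2, 0, 1), so rho = 4 already reproduces the constrained solution exactly, with no limit process needed.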

4.
The local discontinuous Galerkin method has been developed recently by Cockburn and Shu for convection-dominated convection-diffusion equations. In this article, we consider versions of this method with interior penalties for the numerical solution of transport equations, and derive a priori error estimates. We consider two interior penalty methods, one that penalizes jumps in the solution across interelement boundaries, and another that also penalizes jumps in the diffusive flux across such boundaries. For the first penalty method, we demonstrate convergence of order k in the L∞(L2) norm when polynomials of minimal degree k are used, and for the second penalty method, we demonstrate convergence of order k+1/2. Through a parabolic lift argument, we show improved convergence of order k+1/2 (respectively k+1) in the L2(L2) norm for the first penalty method with a penalty parameter of order one (respectively h−1). © 2001 John Wiley & Sons, Inc. Numer Methods Partial Differential Eq 17: 545–564, 2001

5.
This article proposes a practical modeling approach that can accommodate a rich variety of predictors, united in a generalized linear model (GLM) setting. In addition to the usual ANOVA-type or covariate linear (L) predictors, we consider modeling any combination of smooth additive (G) components, varying coefficient (V) components, and (discrete representations of) signal (S) components. We assume that G is, and the coefficients of V and S are, inherently smooth; each of these is projected onto B-spline bases using a modest number of equally spaced knots. Enough knots are used to ensure more flexibility than needed; further smoothness is achieved through a difference penalty on adjacent B-spline coefficients (P-splines). This linear re-expression allows all of the parameters associated with these components to be estimated simultaneously in one large GLM through penalized likelihood. Thus, we have the advantage of avoiding both the backfitting algorithm and complex knot selection schemes. We regulate the flexibility of each component through a separate penalty parameter that is optimally chosen based on cross-validation or an information criterion.
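For a single smooth component with Gaussian errors, the P-spline recipe reduces to one penalized least-squares solve: an over-generous equally spaced B-spline basis plus a second-order difference penalty. A minimal sketch, where the basis size, degree, and smoothing parameter lam are illustrative choices:

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(x, n_bases=20, degree=3):
    """Design matrix of equally spaced, clamped B-splines on the range of x."""
    n_inner = n_bases - degree - 1
    inner = np.linspace(x.min(), x.max(), n_inner + 2)[1:-1]
    t = np.r_[[x.min()] * (degree + 1), inner, [x.max()] * (degree + 1)]
    return np.column_stack(
        [BSpline(t, np.eye(n_bases)[j], degree)(x) for j in range(n_bases)]
    )

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, x.size)

B = bspline_basis(x)
D = np.diff(np.eye(B.shape[1]), n=2, axis=0)  # second differences of adjacent coefficients
lam = 1.0                                      # tune by cross-validation or an information criterion
coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
yhat = B @ coef
```

In the full GLM setting of the article, the same penalized normal equations appear inside iteratively reweighted least squares, with one block and one lam per component.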

6.
We consider the problem of estimating the slope parameter in circular functional linear regression, where scalar responses Y1, ..., Yn are modeled in dependence on 1-periodic, second-order stationary random functions X1, ..., Xn. We consider an orthogonal series estimator of the slope function β, obtained by replacing the first m theoretical coefficients of its development in the trigonometric basis by adequate estimators. We propose a model selection procedure for m in a set of admissible values, by defining a contrast function minimized by our estimator and a theoretical penalty function; this first step assumes the degree of ill-posedness to be known. Then we generalize the procedure to a random set of admissible m's and a random penalty function. The resulting estimator is completely data driven and automatically attains what is known to be the optimal minimax rate of convergence, in terms of a general weighted L2-risk. This means that we provide adaptive estimators of both β and its derivatives.

7.

We study the asymptotic properties of a new version of the Sparse Group Lasso estimator (SGL), called adaptive SGL. This new version includes two distinct regularization parameters, one for the Lasso penalty and one for the Group Lasso penalty, and we consider the adaptive version of this regularization, where both penalties are weighted by preliminary random coefficients. The asymptotic properties are established in a general framework, where the data are dependent and the loss function is convex. We prove that this estimator satisfies the oracle property: the sparsity-based estimator recovers the true underlying sparse model and is asymptotically normally distributed. We also study its asymptotic properties in a double-asymptotic framework, where the number of parameters diverges with the sample size. We show, in simulations and on real data, that the adaptive SGL outperforms other oracle-like methods in terms of estimation precision and variable selection.

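The abstract does not give an algorithm, but the adaptive SGL penalty has a closed-form proximal operator, coordinatewise soft-thresholding followed by groupwise shrinkage, which is the building block of a proximal-gradient solver. A sketch where the group structure, the two regularization levels, and the adaptive weights w1, w2 (e.g., inverse powers of a pilot estimate) are assumed inputs:

```python
import numpy as np

def prox_adaptive_sgl(v, groups, step, lam1, lam2, w1, w2):
    """Prox of step * ( lam1 * sum_j w1[j] * |b_j|
                      + lam2 * sum_g w2[g] * ||b_g||_2 ):
    soft-threshold each coordinate, then shrink each group toward zero."""
    s = np.sign(v) * np.maximum(np.abs(v) - step * lam1 * w1, 0.0)
    out = np.zeros_like(s)
    for g, idx in enumerate(groups):
        norm = np.linalg.norm(s[idx])
        if norm > 0.0:
            out[idx] = s[idx] * max(0.0, 1.0 - step * lam2 * w2[g] / norm)
    return out

# Example: 6 coefficients in two groups, weights from a pilot estimate.
pilot = np.array([2.0, 0.1, 0.05, 1.5, 0.02, 0.8])
w1 = 1.0 / np.abs(pilot)   # adaptive: heavier penalty on small pilot coefficients
w2 = np.array([1.0, 1.0])
groups = [np.arange(3), np.arange(3, 6)]
print(prox_adaptive_sgl(pilot, groups, 0.1, 1.0, 1.0, w1, w2))
```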

8.
In this article, for Lasso penalized linear regression models in high-dimensional settings, we propose a modified cross-validation (CV) method for selecting the penalty parameter. The methodology extends to other penalties, such as the Elastic Net. We conduct extensive simulation studies and real data analysis to compare the performance of the modified CV method with other methods. It is shown that the popular K-fold CV method includes many noise variables in the selected model, while the modified CV works well in a wide range of coefficient and correlation settings. Supplementary materials containing the computer code are available online.
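The modified CV rule itself is not described in this abstract; the baseline it is compared against, ordinary K-fold CV over the Lasso penalty path, looks like this (the data are simulated purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Sparse high-dimensional setting: 5 true signals among 200 predictors.
rng = np.random.default_rng(1)
n, p = 100, 200
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 2.0
y = X @ beta + rng.normal(size=n)

# Ordinary 10-fold CV over a grid of penalty values.
cv = LassoCV(cv=10).fit(X, y)
print("lambda chosen by 10-fold CV:", cv.alpha_)
print("selected variables:", np.flatnonzero(cv.coef_ != 0))
```

Running this typically shows the over-selection the article criticizes: the CV-optimal penalty keeps the true signals but also admits a number of noise variables.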

9.
Wavelet-based denoising techniques are well suited to estimating spatially inhomogeneous signals. WaveShrink (Donoho and Johnstone) assumes independent Gaussian errors and equispaced sampling of the signal. Various articles have relaxed some of these assumptions, but a systematic generalization to distributions such as Poisson, binomial, or Bernoulli has been missing. We consider a unifying l1-penalized likelihood approach that regularizes maximum likelihood estimation by adding an l1 penalty on the wavelet coefficients. Our approach works for all types of wavelets and for a range of noise distributions. We develop both an algorithm to solve the estimation problem and rules to select the smoothing parameter automatically. In particular, using results from Poisson processes, we give an explicit formula for the universal smoothing parameter to denoise Poisson measurements. Simulations show that the procedure improves on other methods. An astronomy example is given.
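In the Gaussian special case with an orthonormal wavelet basis, the l1-penalized likelihood estimate reduces to soft-thresholding of the wavelet coefficients. A sketch with PyWavelets; the threshold below is the Gaussian universal choice sigma*sqrt(2 log n), not the article's Poisson-specific formula:

```python
import numpy as np
import pywt

def wavelet_denoise(y, wavelet="db4", sigma=1.0):
    """Soft-threshold detail coefficients at the Gaussian universal threshold."""
    coeffs = pywt.wavedec(y, wavelet)
    lam = sigma * np.sqrt(2.0 * np.log(len(y)))
    coeffs = [coeffs[0]] + [pywt.threshold(c, lam, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)
```

For non-Gaussian likelihoods the penalized problem no longer decouples into scalar thresholds, which is where the article's algorithm and its Poisson smoothing-parameter formula come in.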

10.
Xin Jia, Herbert A. Mang, PAMM 2011, 11(1): 957–958
Unless the hangers of arch bridges are sufficiently stiff, such bridges are imperfection sensitive [1]. As the stiffness of the hangers increases, such structures eventually become imperfection insensitive. The mathematical definition of imperfection insensitivity follows from a series expansion of the dimensionless load parameter Δλ(κ, η) relative to the stability limit λ = λS, given as [2] Δλ(κ, η) = λ1(κ)η + λ2(κ)η² + λ3(κ)η³ + O(η⁴), (1) where λ1, λ2, … are coefficients depending on the stiffness of the hangers, representing the design parameter κ, and η is a path parameter describing the postbuckling path. A necessary condition for imperfection insensitivity is [3] λ1(κ) = 0 ∀κ. (2) If, for a specific value κ̄ of κ, also λ2(κ = κ̄) > 0, (3) then the structure is imperfection insensitive for κ = κ̄. It is shown numerically that increasing the stiffness of the hangers is the remedy addressed in the title of the paper. (© 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim)

11.

An important class of nonparametric signal processing methods entails forming a set of predictors from an overcomplete set of basis functions associated with a fast transform (e.g., wavelet packets). In these methods, the number of basis functions can far exceed the number of sample values in the signal, leading to an ill-posed prediction problem. The "basis pursuit" denoising method of Chen, Donoho, and Saunders regularizes the prediction problem by adding an l1 penalty term on the coefficients for the basis functions. Use of an l1 penalty instead of l2 has significant benefits, including higher resolution of signals close in time/frequency and a more parsimonious representation. The l1 penalty, however, poses a challenging optimization problem that was solved by Chen, Donoho, and Saunders using a novel application of interior-point algorithms (IP). This article investigates an alternative optimization approach based on block coordinate relaxation (BCR) for sets of basis functions that are the finite union of sets of orthonormal basis functions (e.g., wavelet packets). We show that the BCR algorithm is globally convergent, and empirically, the BCR algorithm is faster than the IP algorithm for a variety of signal denoising problems.
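A sketch of the BCR idea for the simplest union of two orthonormal bases, spikes (identity) and an orthonormal DCT: because each block is orthonormal, the block subproblem is solved exactly by soft-thresholding the block's analysis coefficients of the current residual. Wavelet packets would play the role of the DCT in the article's setting:

```python
import numpy as np
from scipy.fft import dct, idct

# Model: y ~ a + idct(b), penalized by lam * (||a||_1 + ||b||_1).
def soft(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def bcr_denoise(y, lam, n_iter=50):
    a = np.zeros_like(y)   # spike (identity-basis) coefficients
    b = np.zeros_like(y)   # orthonormal-DCT coefficients
    for _ in range(n_iter):
        # Each block update is exact because its basis is orthonormal:
        a = soft(y - idct(b, norm="ortho"), lam)        # residual w.r.t. DCT part
        b = soft(dct(y - a, norm="ortho"), lam)         # residual w.r.t. spike part
    return a, b
```

Each sweep decreases the (convex) basis-pursuit objective, which is the mechanism behind the global convergence result cited above.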

12.
This article introduces a smoothing technique for the l1 exact penalty function. An application of the technique yields a twice continuously differentiable penalty function and a smoothed penalty problem. Under some mild conditions, the optimal solution to the smoothed penalty problem becomes an approximate optimal solution to the original constrained optimization problem. Based on the smoothed penalty problem, we propose an algorithm to solve the constrained optimization problem. Every limit point of the sequence generated by the algorithm is an optimal solution. Several numerical examples are presented to illustrate the performance of the proposed algorithm.
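One common way to obtain a smooth approximation of the exact penalty term max(0, g(x)) is the softplus function eps*log(1+exp(g/eps)); the article's particular smoothing may differ, but the effect is the same. A sketch on a toy inequality-constrained problem whose constrained optimum is (1, 0), with rho and eps as illustrative choices:

```python
import numpy as np
from scipy.optimize import minimize

# min (x1 - 2)^2 + (x2 - 1)^2  subject to  x1 + x2 - 1 <= 0.
def smoothed_penalty_obj(x, rho=10.0, eps=1e-2):
    f = (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2
    g = x[0] + x[1] - 1.0
    # Smooth surrogate for rho * max(0, g); logaddexp is a stable softplus.
    return f + rho * eps * np.logaddexp(0.0, g / eps)

res = minimize(smoothed_penalty_obj, x0=np.zeros(2), method="BFGS")
print(res.x)   # close to the constrained optimum (1, 0), since rho exceeds
               # the multiplier (here 2) and eps is small
```

Shrinking eps sharpens the approximation toward the nonsmooth exact penalty, which is the trade-off the article's algorithm manages.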

13.
In this paper, univariate cubic L1 interpolating splines based on the first derivative and on 5-point windows are introduced. Analytical results for minimizing the local spline functional on 5-point windows are presented and, based on these results, an efficient algorithm for calculating the spline coefficients is set up. It is shown that cubic L1 splines based on the first derivative and on 5-point windows preserve linearity of the original data and avoid extraneous oscillation. Computational examples, including comparison with first-derivative-based cubic L1 splines calculated by a primal affine algorithm and with second-derivative-based cubic L1 splines, show the advantages of the first-derivative-based cubic L1 splines calculated by the new algorithm.

14.
While there are many approaches to detecting changes in mean for a univariate time series, the problem of detecting multiple changes in slope has received comparatively little attention. Part of the reason is that detecting changes in slope is much more challenging: simple binary segmentation procedures do not work for this problem, and existing dynamic programming methods that work for the change-in-mean problem cannot be used for detecting changes in slope. We present a novel dynamic programming approach, CPOP, for finding the "best" continuous piecewise-linear fit to data under a criterion that measures fit to data using the residual sum of squares, but penalizes complexity based on an L0 penalty on changes in slope. We prove that detecting changes in this manner can lead to consistent estimation of the number of changepoints, and show empirically that using an L0 penalty is more reliable at estimating changepoint locations than using an L1 penalty. Empirically, CPOP has good computational properties and can analyze a time series with 10,000 observations and 100 changes in a few minutes. Our method is used to analyze data on the motion of bacteria, and provides better and more parsimonious fits than two competing approaches. Supplementary material for this article is available online.
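CPOP's dynamic program is beyond a few lines, but the object it optimizes over is easy to show: a continuous piecewise-linear fit, which for a single candidate change at tau is a least-squares fit on the basis {1, t, (t - tau)_+}. A brute-force single-change sketch (in the full method, an L0 cost per change is added when comparing models with different numbers of changes):

```python
import numpy as np

def best_single_slope_change(t, y):
    """Scan candidate changepoints; return (residual sum of squares, tau)."""
    best = (np.inf, None)
    for tau in t[2:-2]:
        # Hinge column (t - tau)_+ keeps the fitted signal continuous at tau.
        X = np.column_stack([np.ones_like(t), t, np.maximum(t - tau, 0.0)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta
        if (r @ r) < best[0]:
            best = (r @ r, tau)
    return best
```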

15.
We study the problem of learning a sparse linear regression vector under additional conditions on the structure of its sparsity pattern. This problem is relevant in machine learning, statistics and signal processing. It is well known that a linear regression can benefit from knowledge that the underlying regression vector is sparse. The combinatorial problem of selecting the nonzero components of this vector can be "relaxed" by regularizing the squared error with a convex penalty function like the ℓ1 norm. However, in many applications, additional conditions on the structure of the regression vector and its sparsity pattern are available. Incorporating this information into the learning method may lead to a significant decrease of the estimation error. In this paper, we present a family of convex penalty functions, which encode prior knowledge on the structure of the vector formed by the absolute values of the regression coefficients. This family subsumes the ℓ1 norm and is flexible enough to include different models of sparsity patterns, which are of practical and theoretical importance. We establish the basic properties of these penalty functions and discuss some examples where they can be computed explicitly. Moreover, we present a convergent optimization algorithm for solving regularized least squares with these penalty functions. Numerical simulations highlight the benefit of structured sparsity and the advantage offered by our approach over the Lasso method and other related methods.

16.
In this paper, a new sequential penalty algorithm, based on the L∞ exact penalty function, is proposed for a general nonlinear constrained optimization problem. The algorithm has the following characteristics: it can start from an arbitrary initial point; the feasibility of the subproblem is guaranteed; the penalty parameter is adjusted automatically; and global convergence is proved without any regularity assumption. The update formula for the penalty parameter is new. It is proved that the algorithm behaves equivalently to the standard SQP method after sufficiently many iterations; hence, the local convergence results of the standard SQP method apply to this algorithm. Preliminary numerical experiments show the efficiency and stability of the algorithm.

17.
We consider an implicit nonlinear functional model with errors in variables. On the basis of the concept of deconvolution, we propose a new adaptive least-contrast estimator of the regression parameter. We formulate sufficient conditions for the consistency of this estimator and consider several examples within the framework of the L1- and L2-approaches.

18.
Since the pioneering work of Karmarkar, much interest has been directed to penalty algorithms, in particular the log barrier algorithm. We analyze in this paper the asymptotic convergence rate of a barrier algorithm applied to nonlinear programs. More specifically, we consider a variant of the SUMT method, in which so-called extrapolation predictor steps, which allow reducing the penalty parameter sequence (rk), are followed by some Newton correction steps. While obviously related to predictor-corrector interior point methods, the spirit differs, since our point of view is biased toward nonlinear barrier algorithms; we contrast the two points of view in detail. In our context, we identify an asymptotically optimal strategy for reducing the penalty parameter r and show that if r_{k+1} = r_k^θ with θ < 8/5, then asymptotically only 2 Newton corrections are required, and this strategy achieves the best overall average superlinear convergence order ((8/5)^{1/3} ≈ 1.1696).
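A minimal 1-D illustration of the schedule analyzed here (omitting the extrapolation predictor): shrink the barrier parameter by r ← r^θ with θ = 1.5 < 8/5 and apply two damped Newton corrections per outer iteration. The test problem and the step safeguard are my own choices, not the article's:

```python
# min (x - 2)^2  subject to  x >= 3; the constraint is active, so x* = 3.
# Barrier subproblem: min (x - 2)^2 - r * log(x - 3), minimized at x = 3 + O(r).
def newton_corrections(x, r, steps=2):
    for _ in range(steps):
        c = x - 3.0
        grad = 2.0 * (x - 2.0) - r / c
        hess = 2.0 + r / c ** 2
        step = grad / hess
        while x - step <= 3.0:      # damp to keep the iterate strictly feasible
            step *= 0.5
        x -= step
    return x

x, r, theta = 4.0, 0.5, 1.5         # theta < 8/5, per the strategy above
for _ in range(8):
    x = newton_corrections(x, r)
    r = r ** theta                   # r < 1, so r^theta shrinks r superlinearly
print(x)                             # -> 3
```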

19.
In this article we study rank one discrete valuations of the field k((X1,…,Xn)) whose center in k[[X1,…,Xn]] is the maximal ideal. In Sections 2 to 6 we give a construction of a system of parametric equations describing such valuations. This amounts to finding a parameter and a field of coefficients. We devote Section 2 to finding an element of value 1, that is, a parameter. The field of coefficients is the residue field of the valuation, and it is given in Section 5.

The constructions given in these sections are not effective in the general case, because we need either to use Zorn's lemma or to know explicitly a section σ of the natural homomorphism Rv → Δv between the ring and the residue field of the valuation v.

However, as a consequence of this construction, in Section 7 we prove that k((X1,…,Xn)) can be embedded into a field L((Y1,…,Yn)), where L is an algebraic extension of k and the "extended valuation" is as close as possible to the usual order function.

20.
Let T(λ, ε) = λ²I + λC + λεD + K be a perturbed quadratic matrix polynomial, where C, D, and K are n × n Hermitian matrices. Let λ0 be an eigenvalue of the unperturbed matrix polynomial T(λ, 0). Using the falling part of the Newton diagram of det T(λ, ε), we find the number of differentiable eigenvalues. Some results are extended to the general case L(λ, ε) = λ²I + λD(ε) + K, where D(ε) is an analytic Hermitian matrix function. We show that if K is negative definite on Ker L(λ0, 0), then every eigenvalue λ(ε) of L(λ, ε) near λ0 is analytic.
