期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function

Peter Richtárik Martin Takáč 《Mathematical Programming》2014,144(1-2):1-38

In this paper we develop a randomized block-coordinate descent method for minimizing the sum of a smooth and a simple nonsmooth block-separable convex function and prove that it obtains an $\varepsilon $ -accurate solution with probability at least $1-\rho $ in at most $O((n/\varepsilon ) \log (1/\rho ))$ iterations, where $n$ is the number of blocks. This extends recent results of Nesterov (SIAM J Optim 22(2): 341–362, 2012), which cover the smooth case, to composite minimization, while at the same time improving the complexity by the factor of 4 and removing $\varepsilon $ from the logarithmic term. More importantly, in contrast with the aforementioned work in which the author achieves the results by applying the method to a regularized version of the objective function with an unknown scaling factor, we show that this is not necessary, thus achieving first true iteration complexity bounds. For strongly convex functions the method converges linearly. In the smooth case we also allow for arbitrary probability vectors and non-Euclidean norms. Finally, we demonstrate numerically that the algorithm is able to solve huge-scale $\ell _1$ -regularized least squares problems with a billion variables. 相似文献

2.

Parallel distributed block coordinate descent methods based on pairwise comparison oracle

Kota Matsui Wataru Kumagai Takafumi Kanamori 《Journal of Global Optimization》2017,68(1):1-22

The aim of this paper is to find the global solutions of uncertain optimization problems having a quadratic objective function and quadratic inequality constraints. The bounded epistemic uncertainties in the constraint coefficients are represented using either universal or existential quantified parameters and interval parameter domains. This approach allows to model non-controlled uncertainties by using universally quantified parameters and controlled uncertainties by using existentially quantified ones. While existentially quantified parameters could be equivalently considered as additional variables, keeping them as parameters allows maintaining the quadratic problem structure, which is essential for the proposed algorithm. The branch and bound algorithm presented in the paper handles both universally and existentially quantified parameters in a homogeneous way, without branching on their domains, and uses some dedicated numerical constraint programming techniques for finding a robust, global solution. Several examples clarify the theoretical parts and the tests demonstrate the usefulness of the proposed method. 相似文献

3.

On the convergence of inexact block coordinate descent methods for constrained optimization

A. Cassioli D. Di Lorenzo M. Sciandrone 《European Journal of Operational Research》2013

We consider the problem of minimizing a smooth function over a feasible set defined as the Cartesian product of convex compact sets. We assume that the dimension of each factor set is huge, so we are interested in studying inexact block coordinate descent methods (possibly combined with column generation strategies). We define a general decomposition framework where different line search based methods can be embedded, and we state global convergence results. Specific decomposition methods based on gradient projection and Frank–Wolfe algorithms are derived from the proposed framework. The numerical results of computational experiments performed on network assignment problems are reported. 相似文献

4.

On the complexity analysis of randomized block-coordinate descent methods

Zhaosong Lu Lin Xiao 《Mathematical Programming》2015,152(1-2):615-642

相似文献

5.

On optimal probabilities in stochastic coordinate descent methods

Peter Richtárik Martin Takáč 《Optimization Letters》2016,10(6):1233-1243

We propose and analyze a new parallel coordinate descent method—NSync—in which at each iteration a random subset of coordinates is updated, in parallel, allowing for the subsets to be chosen using an arbitrary probability law. This is the first method of this type. We derive convergence rates under a strong convexity assumption, and comment on how to assign probabilities to the sets to optimize the bound. The complexity and practical performance of the method can outperform its uniform variant by an order of magnitude. Surprisingly, the strategy of updating a single randomly selected coordinate per iteration—with optimal probabilities—may require less iterations, both in theory and practice, than the strategy of updating all coordinates at every iteration. 相似文献

6.

Minimization by coordinate descent

T. Abatzoglou B. O'Donnell 《Journal of Optimization Theory and Applications》1982,36(2):163-174

We examine the minimization of anN-dimensional real-valued function using the coordinate descent method. We impose conditions on the function under which the method converges; furthermore, by specializing our class of functions, we obtain the rate of convergence. We also present some examples from classical approximation theory where this method applies. A computational example is also given. 相似文献

7.

Stepsize analysis for descent methods 总被引：4，自引：0，他引：4

A. I. Cohen 《Journal of Optimization Theory and Applications》1981,33(2):187-205

The convergence rates of descent methods with different stepsize rules are compared. Among the stepsize rules considered are: constant stepsize, exact minimization along a line, Goldstein-Armijo rules, and stepsize equal to that which yields the minimum of certain interpolatory polynomials. One of the major results shown is that the rate of convergence of descent methods with the Goldstein-Armijo stepsize rules can be made as close as desired to the rate of convergence of methods that require exact minimization along a line. Also, a descent algorithm that combines a Goldstein-Armijo stepsize rule with a secant-type step is presented. It is shown that this algorithm has a convergence rate equal to the convergence of descent methods that require exact minimization along a line and that, eventually (i.e., near the minimum), it does not require a search to determine an acceptable stepsize. 相似文献

8.

A flexible coordinate descent method

Kimon Fountoulakis Rachael Tappenden 《Computational Optimization and Applications》2018,70(2):351-394

We present a novel randomized block coordinate descent method for the minimization of a convex composite objective function. The method uses (approximate) partial second-order (curvature) information, so that the algorithm performance is more robust when applied to highly nonseparable or ill conditioned problems. We call the method Flexible Coordinate Descent (FCD). At each iteration of FCD, a block of coordinates is sampled randomly, a quadratic model is formed about that block and the model is minimized approximately/inexactly to determine the search direction. An inexpensive line search is then employed to ensure a monotonic decrease in the objective function and acceptance of large step sizes. We present several high probability iteration complexity results to show that convergence of FCD is guaranteed theoretically. Finally, we present numerical results on large-scale problems to demonstrate the practical performance of the method. 相似文献

9.

Local convergence analysis of a grouped variable version of coordinate descent 总被引：1，自引：0，他引：1

J. C. Bezdek R. J. Hathaway R. E. Howard C. A. Wilson M. P. Windham 《Journal of Optimization Theory and Applications》1987,54(3):471-477

LetF(x,y) be a function of the vector variablesxR ⁿ andyR ^m. One possible scheme for minimizingF(x,y) is to successively alternate minimizations in one vector variable while holding the other fixed. Local convergence analysis is done for this vector (grouped variable) version of coordinate descent, and assuming certain regularity conditions, it is shown that such an approach is locally convergent to a minimizer and that the rate of convergence in each vector variable is linear. Examples where the algorithm is useful in clustering and mixture density decomposition are given, and global convergence properties are briefly discussed.This research was supported in part by NSF Grant No. IST-84-07860. The authors are indebted to Professor R. A. Tapia for his help in improving this paper. 相似文献

10.

A block coordinate gradient descent method for regularized convex separable optimization and covariance selection

Sangwoon Yun Paul Tseng Kim-Chuan Toh 《Mathematical Programming》2011,129(2):331-355

We consider a class of unconstrained nonsmooth convex optimization problems, in which the objective function is the sum of a convex smooth function on an open subset of matrices and a separable convex function on a set of matrices. This problem includes the covariance selection problem that can be expressed as an ℓ ₁-penalized maximum likelihood estimation problem. In this paper, we propose a block coordinate gradient descent method (abbreviated as BCGD) for solving this class of nonsmooth separable problems with the coordinate block chosen by a Gauss-Seidel rule. The method is simple, highly parallelizable, and suited for large-scale problems. We establish global convergence and, under a local Lipschizian error bound assumption, linear rate of convergence for this method. For the covariance selection problem, the method can terminate in O(n³/e){O(n^3/\epsilon)} iterations with an e{\epsilon}-optimal solution. We compare the performance of the BCGD method with the first-order methods proposed by Lu (SIAM J Optim 19:1807–1827, 2009; SIAM J Matrix Anal Appl 31:2000–2016, 2010) for solving the covariance selection problem on randomly generated instances. Our numerical experience suggests that the BCGD method can be efficient for large-scale covariance selection problems with constraints. 相似文献

11.

From error bounds to the complexity of first-order descent methods for convex functions

Jérôme Bolte Trong Phong Nguyen Juan Peypouquet Bruce W. Suter 《Mathematical Programming》2017,165(2):471-507

This paper shows that error bounds can be used as effective tools for deriving complexity results for first-order descent methods in convex minimization. In a first stage, this objective led us to revisit the interplay between error bounds and the Kurdyka-?ojasiewicz (KL) inequality. One can show the equivalence between the two concepts for convex functions having a moderately flat profile near the set of minimizers (as those of functions with Hölderian growth). A counterexample shows that the equivalence is no longer true for extremely flat functions. This fact reveals the relevance of an approach based on KL inequality. In a second stage, we show how KL inequalities can in turn be employed to compute new complexity bounds for a wealth of descent methods for convex problems. Our approach is completely original and makes use of a one-dimensional worst-case proximal sequence in the spirit of the famous majorant method of Kantorovich. Our result applies to a very simple abstract scheme that covers a wide class of descent methods. As a byproduct of our study, we also provide new results for the globalization of KL inequalities in the convex framework. Our main results inaugurate a simple method: derive an error bound, compute the desingularizing function whenever possible, identify essential constants in the descent method and finally compute the complexity using the one-dimensional worst case proximal sequence. Our method is illustrated through projection methods for feasibility problems, and through the famous iterative shrinkage thresholding algorithm (ISTA), for which we show that the complexity bound is of the form $O(q^{k})$ where the constituents of the bound only depend on error bound constants obtained for an arbitrary least squares objective with $\ell ^1$ regularization. 相似文献

12.

Cyclic coordinate descent in hydrothermal nonsmooth problems

Luis Bayón Jose M. Grau Maria M. Ruiz Pedro M. Suárez 《Numerical Algorithms》2012,59(2):227-247

In this paper we present an algorithm, inspired by the cyclic coordinate descent method, which allows the solution of hydrothermal optimization problems involving pumped-storage plants. The proof of the convergence of the succession generated by the algorithm was based on the use of an appropriate adaptation of Zangwill’s global theorem of convergence. Finally, the algorithm proposed is implemented using the Mathematica Package and is applied to an example to illustrate the results obtained. 相似文献

13.

Algorithms for nonnegative matrix and tensor factorizations: a unified view based on block coordinate descent framework

Jingu Kim Yunlong He Haesun Park 《Journal of Global Optimization》2014,58(2):285-319

We review algorithms developed for nonnegative matrix factorization (NMF) and nonnegative tensor factorization (NTF) from a unified view based on the block coordinate descent (BCD) framework. NMF and NTF are low-rank approximation methods for matrices and tensors in which the low-rank factors are constrained to have only nonnegative elements. The nonnegativity constraints have been shown to enable natural interpretations and allow better solutions in numerous applications including text analysis, computer vision, and bioinformatics. However, the computation of NMF and NTF remains challenging and expensive due the constraints. Numerous algorithmic approaches have been proposed to efficiently compute NMF and NTF. The BCD framework in constrained non-linear optimization readily explains the theoretical convergence properties of several efficient NMF and NTF algorithms, which are consistent with experimental observations reported in literature. In addition, we discuss algorithms that do not fit in the BCD framework contrasting them from those based on the BCD framework. With insights acquired from the unified perspective, we also propose efficient algorithms for updating NMF when there is a small change in the reduced dimension or in the data. The effectiveness of the proposed updating algorithms are validated experimentally with synthetic and real-world data sets. 相似文献

14.

Block coordinate descent for smooth nonconvex constrained minimization

Birgin E. G. Martínez J. M. 《Computational Optimization and Applications》2022,81(1):1-30

We present a new algorithm for large-scale unconstrained minimization that, at each iteration, minimizes, approximately, a quadratic model of the objective function plus a regularization term, not necessarily based on a norm. We prove convergence assuming only gradient continuity and complexity results assuming Lipschitz conditions. For solving the subproblems in the case of regularizations based on the 3-norm, we introduce a new method that quickly obtains the approximate solutions required by the theory. We present numerical experiments.

相似文献

15.

A coordinate gradient descent method for nonsmooth separable minimization 总被引：1，自引：0，他引：1

Paul Tseng Sangwoon Yun 《Mathematical Programming》2009,117(1-2):387-423

We consider the problem of minimizing the sum of a smooth function and a separable convex function. This problem includes as special cases bound-constrained optimization and smooth optimization with ?₁-regularization. We propose a (block) coordinate gradient descent method for solving this class of nonsmooth separable problems. We establish global convergence and, under a local Lipschitzian error bound assumption, linear convergence for this method. The local Lipschitzian error bound holds under assumptions analogous to those for constrained smooth optimization, e.g., the convex function is polyhedral and the smooth function is (nonconvex) quadratic or is the composition of a strongly convex function with a linear mapping. We report numerical experience with solving the ?₁-regularization of unconstrained optimization problems from Moré et al. in ACM Trans. Math. Softw. 7, 17–41, 1981 and from the CUTEr set (Gould and Orban in ACM Trans. Math. Softw. 29, 373–394, 2003). Comparison with L-BFGS-B and MINOS, applied to a reformulation of the ?₁-regularized problem as a bound-constrained optimization problem, is also reported. 相似文献

16.

Unified complexity analysis for Newton LP methods

James Renegar Michael Shub 《Mathematical Programming》1992,53(1-3):1-16

We show that a theorem of Smale can be used to unify the polynomial-time bound proofs of several of the recent interior algorithms for linear programming and convex quadratic programming.This research is supported by NSF grants. 相似文献

17.

A complexity analysis of Policy Iteration through combinatorial matrices arising from Unique Sink Orientations

《Journal of Discrete Algorithms》2017

相似文献

18.

Iteration methods in eigenvalue problems

E. G. D'yakonov 《Mathematical Notes》1983,34(6):945-953

相似文献

19.

Efficient random coordinate descent algorithms for large-scale structured nonconvex optimization

Andrei Patrascu Ion Necoara 《Journal of Global Optimization》2015,61(1):19-46

相似文献

20.

Most continuous descent methods converge

Sergiu Aizicovici Simeon Reich Alexander J. Zaslavski 《Archiv der Mathematik》2005,85(3):268-277

We consider continuous descent methods for the minimization of convex functions defined on a general Banach space and show that most of them (in the sense of Baire category) converge.Received: 21 July 2004 相似文献