Similar Documents
20 similar documents found, search time 31 ms
1.
In this paper, we study the Kurdyka–Łojasiewicz (KL) exponent, an important quantity for analyzing the convergence rate of first-order methods. Specifically, we develop various calculus rules to deduce the KL exponent of new (possibly nonconvex and nonsmooth) functions formed from functions with known KL exponents. In addition, we show that the well-studied Luo–Tseng error bound, together with a mild assumption on the separation of stationary values, implies that the KL exponent is \(\frac{1}{2}\). The Luo–Tseng error bound is known to hold for a large class of concrete structured optimization problems, and thus we deduce the KL exponent of a large class of functions whose exponents were previously unknown. Building upon this and the calculus rules, we are then able to show that for many convex or nonconvex optimization models arising in applications such as sparse recovery, the objective function's KL exponent is \(\frac{1}{2}\). This includes the least squares problem with smoothly clipped absolute deviation regularization or minimax concave penalty regularization and the logistic regression problem with \(\ell _1\) regularization. Since many existing local convergence rate analyses for first-order methods in the nonconvex scenario rely on the KL exponent, our results enable us to obtain explicit convergence rates for various first-order methods when they are applied to a large variety of practical optimization models. Finally, we further illustrate how our results can be applied to establish local linear convergence of the proximal gradient algorithm and the inertial proximal algorithm with constant step sizes for some specific models that arise in sparse recovery.
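The proximal gradient algorithm with constant step size mentioned above can be illustrated on an \(\ell _1\)-regularized least squares model. This is a minimal sketch with illustrative data and step-size choices, not the paper's analysis:

```python
import numpy as np

def prox_grad_lasso(A, b, lam, step, iters=3000):
    """Proximal gradient with constant step size for the sparse-recovery
    model  min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - b)          # gradient of the smooth part
        z = x - step * g               # forward (gradient) step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # prox of lam*||.||_1
    return x
```

Taking `step` at most the reciprocal of the Lipschitz constant \(\Vert A\Vert _2^{2}\) makes this a descent method; a KL exponent of \(\frac{1}{2}\) for the objective is what upgrades the generic sublinear rate to local linear convergence.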

2.
For strictly increasing concave functions \({\varphi}\) whose inverse functions are log-concave, the \({\varphi}\)-Brunn–Minkowski inequality for planar convex bodies is established. It is shown that for convex bodies in \({\mathbb{R}^n}\) the \({\varphi}\)-Brunn–Minkowski inequality is equivalent to the \({\varphi}\)-Minkowski mixed volume inequalities.

3.
We prove \(L^p\) bounds for partial polynomial Carleson operators along monomial curves \((t,t^m)\) in the plane \(\mathbb {R}^2\) with a phase polynomial consisting of a single monomial. These operators are “partial” in the sense that we consider linearizing stopping-time functions that depend on only one of the two ambient variables. A motivation for studying these partial operators is the curious feature that, despite their apparent limitations, for certain combinations of curve and phase, \(L^2\) bounds for partial operators along curves imply the full strength of the \(L^2\) bound for a one-dimensional Carleson operator, and for a quadratic Carleson operator.

4.
We describe methods for proving bounds on infinite-time averages in differential dynamical systems. The methods rely on the construction of nonnegative polynomials with certain properties, similarly to the way nonlinear stability can be proved using Lyapunov functions. Nonnegativity is enforced by requiring the polynomials to be sums of squares, a condition which is then formulated as a semidefinite program (SDP) that can be solved computationally. Although such computations are subject to numerical error, we demonstrate two ways to obtain rigorous results: using interval arithmetic to control the error of an approximate SDP solution, and finding exact analytical solutions to relatively small SDPs. Previous formulations are extended to allow for bounds depending analytically on parametric variables. These methods are illustrated using the Lorenz equations, a system with three state variables \((x,y,z)\) and three parameters \((\beta ,\sigma ,r)\). Bounds are reported for infinite-time averages of all eighteen moments \(x^ly^mz^n\) up to quartic degree that are symmetric under \((x,y)\mapsto (-x,-y)\). These bounds apply to all solutions regardless of stability, including chaotic trajectories, periodic orbits, and equilibrium points. The analytical approach yields two novel bounds that are sharp: the mean of \(z^3\) can be no larger than its value of \((r-1)^3\) at the nonzero equilibria, and the mean of \(xy^3\) must be nonnegative. The interval arithmetic approach is applied at the standard chaotic parameters to bound eleven average moments that all appear to be maximized on the shortest periodic orbit. Our best upper bound on each such average exceeds its value on the maximizing orbit by less than 1%. Many bounds reported here are much tighter than would be possible without computer assistance.
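The time averages being bounded can be estimated numerically as a rough sanity check. The following is a plain RK4 time-averaging sketch over a single trajectory, not the sum-of-squares/SDP machinery of the paper; the horizon, step size, and transient cutoff are arbitrary choices:

```python
import numpy as np

def lorenz_average(moment, T=150.0, dt=0.01, burn=10.0,
                   sigma=10.0, r=28.0, beta=8.0 / 3.0):
    """Estimate the infinite-time average of moment(x, y, z) along one
    Lorenz trajectory by RK4 integration, discarding an initial transient."""
    def rhs(s):
        x, y, z = s
        return np.array([sigma * (y - x), x * (r - z) - y, x * y - beta * z])
    s = np.array([1.0, 1.0, 1.0])
    total, count = 0.0, 0
    for i in range(int(T / dt)):
        k1 = rhs(s)
        k2 = rhs(s + 0.5 * dt * k1)
        k3 = rhs(s + 0.5 * dt * k2)
        k4 = rhs(s + dt * k3)
        s = s + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        if i * dt >= burn:                 # discard the transient
            total += moment(*s)
            count += 1
    return total / count
```

At the standard chaotic parameters, a finite-time average of \(z^3\) computed this way should sit below the sharp analytical bound \((r-1)^3 = 19683\) reported above.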

5.
New bounds for the \(k\)th-order derivatives of the solutions of the normal and multivariate normal Stein equations are obtained. Our general order bounds involve fewer derivatives of the test function than those in the existing literature. We apply these bounds and local approach couplings to obtain an order \(n^{-(p-1)/2}\) bound, for smooth test functions, for the distance between the distribution of a standardised sum of independent and identically distributed random variables and the standard normal distribution when the first \(p\) moments of these distributions agree. We also obtain a bound on the convergence rate of a sequence of distributions to the normal distribution when the moment sequence converges to normal moments.

6.
The worst case complexity of direct-search methods has been recently analyzed when they use positive spanning sets and impose a sufficient decrease condition to accept new iterates. For smooth unconstrained optimization, it is now known that such methods require at most \(\mathcal {O}(n^2\epsilon ^{-2})\) function evaluations to compute a gradient of norm below \(\epsilon \in (0,1)\), where n is the dimension of the problem. Such a maximal effort is reduced to \(\mathcal {O}(n^2\epsilon ^{-1})\) if the function is convex. The factor \(n^2\) has been derived using the positive spanning set formed by the coordinate vectors and their negatives at all iterations. In this paper, we prove that such a factor of \(n^2\) is optimal in these worst case complexity bounds, in the sense that no other positive spanning set will yield a better order of n. The proof is based on an observation that reveals the connection between the cosine measure of positive spanning sets and sphere covering.
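The class of methods analyzed above, polling the positive spanning set \(\{\pm e_1,\dots ,\pm e_n\}\) and accepting steps only under sufficient decrease, can be sketched as follows. The forcing function \(\alpha ^2\) and the step-halving rule are illustrative choices, not the specific scheme of the paper:

```python
import numpy as np

def direct_search(f, x0, alpha0=1.0, tol=1e-8, max_iter=10000):
    """Directional direct search: poll the positive spanning set
    D = {+-e_1, ..., +-e_n} and accept x + alpha*d only under the
    sufficient-decrease condition f(x + alpha*d) < f(x) - alpha**2."""
    x = np.asarray(x0, dtype=float)
    n, alpha = x.size, alpha0
    D = np.vstack([np.eye(n), -np.eye(n)])   # coordinate vectors and negatives
    for _ in range(max_iter):
        if alpha < tol:
            break
        for d in D:
            if f(x + alpha * d) < f(x) - alpha ** 2:
                x = x + alpha * d            # successful poll step
                break
        else:
            alpha *= 0.5                     # unsuccessful poll: shrink the step
    return x
```

When the poll fails at step size \(\alpha\), the gradient norm is bounded by a multiple of \(\alpha\), which is exactly the mechanism behind the \(\mathcal {O}(n^2\epsilon ^{-2})\) evaluation count.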

7.
We show that the complexity of computing the second order moment bound on the expected optimal value of a mixed integer linear program with a random objective coefficient vector is closely related to the complexity of characterizing the convex hull of the points \(\{\binom{1}{\varvec{x}}\binom{1}{\varvec{x}}' \mid \varvec{x} \in {\mathcal {X}}\}\) where \({\mathcal {X}}\) is the feasible region. In fact, we can replace the completely positive programming formulation for the moment bound on \({\mathcal {X}}\), with an associated semidefinite program, provided we have a linear or a semidefinite representation of this convex hull. As an application of the result, we identify a new polynomial time solvable semidefinite relaxation of the distributionally robust multi-item newsvendor problem by exploiting results from the Boolean quadric polytope. For \({\mathcal {X}}\) described explicitly by a finite set of points, our formulation leads to a reduction in the size of the semidefinite program. We illustrate the usefulness of the reduced semidefinite programming bounds in estimating the expected range of random variables with two applications arising in random walks and best–worst choice models.

8.
In this paper, we prove new complexity bounds for methods of convex optimization based only on computation of the function value. The search directions of our schemes are normally distributed random Gaussian vectors. It appears that such methods usually need at most n times more iterations than the standard gradient methods, where n is the dimension of the space of variables. This conclusion is true for both nonsmooth and smooth problems. For the latter class, we present also an accelerated scheme with the expected rate of convergence \(O\Big ({n^2 \over k^2}\Big )\), where k is the iteration counter. For stochastic optimization, we propose a zero-order scheme and justify its expected rate of convergence \(O\Big ({n \over k^{1/2}}\Big )\). We also give some bounds for the rate of convergence of the random gradient-free methods to stationary points of nonconvex functions, for both smooth and nonsmooth cases. Our theoretical results are supported by preliminary computational experiments.
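One iteration of such a random gradient-free scheme might look like the following sketch. The smoothing parameter `mu` and the fixed step size are illustrative, and this is the plain, non-accelerated variant:

```python
import numpy as np

def gaussian_free_step(f, x, mu=1e-4, step=0.05, rng=None):
    """One step of a random gradient-free method: sample a Gaussian
    direction u and move along the finite-difference estimate
    g = (f(x + mu*u) - f(x)) / mu * u, whose expectation approximates
    the gradient of a Gaussian-smoothed version of f."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(np.size(x))
    g = (f(x + mu * u) - f(x)) / mu * u   # zero-order directional estimate
    return x - step * g
```

Only function values of `f` are ever evaluated, which is what makes the scheme applicable when derivatives are unavailable; the price is the dimension factor n in the iteration counts above.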

9.
Based on work of Atkin and Swinnerton-Dyer on partition rank difference functions, and more recent work of Lovejoy and Osburn, Mao has proved several inequalities between partition ranks modulo 10, and additional results modulo 6 and 10 for the \(M_2\) rank of partitions without repeated odd parts. Mao conjectured some additional inequalities. We prove some of Mao’s rank inequality conjectures for both the rank and the \(M_2\) rank modulo 10 using elementary methods.

10.
W. Hare, Optimization Letters, 2017, 11(7): 1217–1227
Derivative-free optimization (DFO) is the mathematical study of optimization algorithms that do not use derivatives. One branch of DFO focuses on model-based DFO methods, where an approximation of the objective function is used to guide the optimization algorithm. Proving convergence of such methods often applies an assumption that the approximations form fully linear models—an assumption that requires the true objective function to be smooth. However, some recent methods have loosened this assumption and instead worked with functions that are compositions of smooth functions with simple convex functions (the max-function or the \(\ell _1\) norm). In this paper, we examine the error bounds resulting from the composition of a convex lower semi-continuous function with a smooth vector-valued function when it is possible to provide fully linear models for each component of the vector-valued function. We derive error bounds for the resulting function values and subgradient vectors.

11.
Let \(P\) be a set of \(n\) points in the plane. A geometric graph \(G\) on \(P\) is said to be locally Gabriel if for every edge \((u,v)\) in \(G\), the Euclidean disk with the segment joining \(u\) and \(v\) as diameter does not contain any points of \(P\) that are neighbors of \(u\) or \(v\) in \(G\). A locally Gabriel graph (LGG) is a generalization of the Gabriel graph and is motivated by applications in wireless networks. Unlike the Gabriel graph, there is no unique LGG on a given point set since no edge in an LGG is necessarily included or excluded. Thus the edge set of the graph can be customized to optimize certain network parameters depending on the application. The unit distance graph (UDG), introduced by Erdős, is also an LGG. In this paper, we show the following combinatorial bounds on the edge complexity and independent sets of LGGs: (i) For any \(n\), there exists an LGG with \(\Omega (n^{5/4})\) edges. This improves upon the previous best bound of \(\Omega (n^{1+\frac{1}{\log \log n}})\). (ii) For various subclasses of convex point sets, we show tight linear bounds on the maximum edge complexity of LGGs. (iii) For any LGG on any \(n\) point set, there exists an independent set of size \(\Omega (\sqrt{n}\log n)\).
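The locally Gabriel condition is straightforward to verify for a concrete graph. A small sketch, where the point and edge representations are illustrative:

```python
def is_locally_gabriel(points, edges):
    """Check the locally Gabriel condition: for every edge (u, v), the open
    disk with segment uv as diameter contains no point that is a graph
    neighbour of u or of v. `points` are (x, y) pairs, `edges` index pairs."""
    nbrs = {i: set() for i in range(len(points))}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    for u, v in edges:
        cx = (points[u][0] + points[v][0]) / 2.0   # disk centre: edge midpoint
        cy = (points[u][1] + points[v][1]) / 2.0
        r2 = ((points[u][0] - points[v][0]) ** 2
              + (points[u][1] - points[v][1]) ** 2) / 4.0  # squared radius
        for w in (nbrs[u] | nbrs[v]) - {u, v}:
            if (points[w][0] - cx) ** 2 + (points[w][1] - cy) ** 2 < r2:
                return False                        # neighbour inside the disk
    return True
```

Note that only neighbours of the two endpoints are tested, which is exactly how the condition relaxes the classical Gabriel graph (where no point of \(P\) may lie in the disk).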

12.
We prove a strong-type \(A_p\)–\(A_\infty \) estimate for square functions, improving on the \(A_p\) bound due to Lerner. Entropy bounds, a recent innovation of Treil–Volberg, are then proved. The techniques of proof include parallel stopping cubes, pigeon-hole arguments, and the approach to entropy bounds of Lacey–Spencer.

13.
We extend the range of \(N\) to negative values in the \((K,N)\)-convexity (in the sense of Erbar–Kuwada–Sturm), the weighted Ricci curvature \(\mathop {\mathrm {Ric}}\nolimits _N\) and the curvature-dimension condition \(\mathop {\mathrm {CD}}\nolimits (K,N)\). We generalize a number of results in the case of \(N>0\) to this setting, including Bochner’s inequality, the Brunn–Minkowski inequality and the equivalence between \(\mathop {\mathrm {Ric}}\nolimits _N \ge K\) and \(\mathop {\mathrm {CD}}\nolimits (K,N)\). We also show an expansion bound for gradient flows of Lipschitz \((K,N)\)-convex functions.

14.
Classical stochastic gradient methods are well suited for minimizing expected-value objective functions. However, they do not apply to the minimization of a nonlinear function involving expected values or a composition of two expected-value functions, i.e., the problem \(\min _x \mathbf{E}_v\left[ f_v\big (\mathbf{E}_w [g_w(x)]\big ) \right] .\) In order to solve this stochastic composition problem, we propose a class of stochastic compositional gradient descent (SCGD) algorithms that can be viewed as stochastic versions of the quasi-gradient method. The SCGD algorithms update the solution based on noisy sample gradients of \(f_v,g_{w}\) and use an auxiliary variable to track the unknown quantity \(\mathbf{E}_w\left[ g_w(x)\right] \). We prove that the SCGD algorithms converge almost surely to an optimal solution for convex optimization problems, as long as such a solution exists. The convergence involves the interplay of two iterations with different time scales. For nonsmooth convex problems, SCGD achieves a convergence rate of \(\mathcal {O}(k^{-1/4})\) in the general case and \(\mathcal {O}(k^{-2/3})\) in the strongly convex case, after taking k samples. For smooth convex problems, SCGD can be accelerated to converge at a rate of \(\mathcal {O}(k^{-2/7})\) in the general case and \(\mathcal {O}(k^{-4/5})\) in the strongly convex case. For nonconvex problems, we prove that any limit point generated by SCGD is a stationary point, for which we also provide a convergence rate analysis. Indeed, the stochastic setting where one wants to optimize compositions of expected-value functions is very common in practice. The proposed SCGD methods find wide applications in learning, estimation, dynamic programming, etc.
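The basic (non-accelerated) SCGD iteration can be sketched as follows, with an auxiliary variable `y` tracking the inner expectation on a faster time scale. The step sizes `alpha`, `beta` and the problem interface are illustrative, not the paper's exact scheme:

```python
import numpy as np

def scgd(grad_f, jac_g, g_sample, x0, steps=8000, alpha=0.01, beta=0.2):
    """Stochastic compositional gradient descent for min_x f(E[g(x)]):
    y tracks the unknown expectation E[g(x)] by a running noisy average
    (fast time scale), while x takes a quasi-gradient step through the
    chain rule J_g(x)' * grad_f(y) (slow time scale)."""
    x = np.asarray(x0, dtype=float)
    y = g_sample(x)                                  # initialize the tracker
    for _ in range(steps):
        y = (1.0 - beta) * y + beta * g_sample(x)    # track E[g(x)] with noisy samples
        x = x - alpha * jac_g(x).T @ grad_f(y)       # quasi-gradient step on x
    return x
```

A plain stochastic gradient of the composition would need an unbiased sample of \(\nabla g(x)^{\top }\nabla f(\mathbf{E}_w[g_w(x)])\), which is unavailable because the inner expectation sits inside \(f\); the tracker `y` is what works around that.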

15.
A theorem of W. Derrick ensures that the volume of any Riemannian cube \(([0,1]^n,g)\) is bounded below by the product of the distances between opposite codimension-1 faces. In this paper, we establish a discrete analog of Derrick’s inequality for weighted open covers of the cube \([0,1]^n\), which is motivated by a question about lower volume bounds in metric spaces. Our main theorem generalizes a previous result of the author in Kinneberg (J Differ Geom 100(2):349–388, 2015) which gave a combinatorial version of Derrick’s inequality and was used in the analysis of boundaries of hyperbolic groups. As an application, we answer a question of Y. Burago and V. Zalgaller about length-volume inequalities for pseudometrics on the unit cube.

16.
It is well-known that if a real valued function acting on a convex set satisfies the n-variable Jensen inequality, for some natural number \(n\ge 2\), then, for all \(k\in \{1,\dots , n\}\), it fulfills the k-variable Jensen inequality as well. In other words, the arithmetic mean and the Jensen inequality (as a convexity property) are both reducible. Motivated by this phenomenon, we investigate this property concerning more general means and convexity notions. We introduce a wide class of means which generalize the well-known means for arbitrary linear spaces and enjoy a so-called reducibility property. Finally, we give a sufficient condition for the reducibility of the \((M,N)\)-convexity property of functions and also for Hölder–Minkowski type inequalities.

17.
Let \(X_1,\ldots ,X_n\) be, possibly dependent, [0, 1]-valued random variables. What is a sharp upper bound on the probability that their sum is significantly larger than their mean? In the case of independent random variables, a fundamental tool for bounding such probabilities was devised by Wassily Hoeffding. In this paper, we provide a generalisation of Hoeffding’s theorem. We obtain an estimate on the aforementioned probability that is described in terms of the expectation, with respect to convex functions, of a random variable that concentrates mass on the set \(\{0,1,\ldots ,n\}\). Our main result yields concentration inequalities for several sums of dependent random variables such as sums of martingale difference sequences, sums of k-wise independent random variables, as well as for sums of arbitrary [0, 1]-valued random variables.
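For reference, the Hoeffding inequality being generalised states that for independent \([0,1]\)-valued \(X_1,\ldots ,X_n\), \(\mathbb {P}\big (\sum X_i - \mathbb {E}\sum X_i \ge nt\big ) \le e^{-2nt^2}\). As a one-line helper (the classical statement, not the paper's refined bound):

```python
import math

def hoeffding_bound(n, t):
    """Hoeffding's tail bound exp(-2*n*t**2) on P(sum - mean >= n*t)
    for n independent [0,1]-valued random variables."""
    return math.exp(-2.0 * n * t * t)
```

At fixed deviation level t the bound decays geometrically in n, e.g. `hoeffding_bound(100, 0.1)` equals \(e^{-2}\approx 0.135\).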

18.
Motivated by a partition inequality of Bessenrodt and Ono, we obtain analogous inequalities for k-colored partition functions \(p_{-k}(n)\) for all \(k\ge 2\). This enables us to extend the k-colored partition function multiplicatively to a function on k-colored partitions and characterize when it has a unique maximum. We conclude with one conjectural inequality that strengthens our results.
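The k-colored partition function \(p_{-k}(n)\) is the coefficient of \(q^n\) in \(\prod _{j\ge 1}(1-q^j)^{-k}\). A short sketch of computing it by the standard generating-function recurrence (an illustration, not taken from the paper):

```python
def colored_partitions(k, N):
    """Return [p_{-k}(0), ..., p_{-k}(N)], the coefficients of
    prod_{j>=1} (1 - q^j)^(-k): partitions where each part comes
    in one of k colours."""
    p = [1] + [0] * N
    for j in range(1, N + 1):        # multiply in the factor (1 - q^j)^(-k)
        for _ in range(k):           # one geometric series 1/(1 - q^j) at a time
            for n in range(j, N + 1):
                p[n] += p[n - j]     # running-sum trick for division by (1 - q^j)
    return p
```

With k = 1 this reduces to the ordinary partition function, e.g. \(p(5)=7\); inequalities of the Bessenrodt–Ono type compare products \(p_{-k}(a)p_{-k}(b)\) against \(p_{-k}(a+b)\).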

19.
A family of identities primarily associated with isoperimetric inequalities for planar convex domains was discovered by Pleijel in 1956. We call these identities classical Pleijel identities. R. V. Ambartzumian gave a combinatorial proof of these identities and pointed out that they can be applied to find chord length distribution functions for convex domains. In the classical Pleijel identities, integration is with respect to the measure in the space \(\mathbb{G}\) of lines that is invariant under all Euclidean motions. In the present paper they are considered for an arbitrary locally finite measure on the space \(\mathbb{G}\). These identities are applied to find the so-called orientation-dependent chord length distribution (or density) functions for bounded convex domains.

20.
In this paper we consider a new kind of inequality related to fractional integration, motivated by a paper of Gressman. Building on it, we investigate its multilinear analogues. Combining this with Gressman’s work on multilinear integrals, we establish this new kind of geometric inequality in bilinear and multilinear form in more general settings. Moreover, in some cases we also find the best constants and optimisers for these geometric inequalities on Euclidean spaces with Lebesgue measure, together with \(L^{p}\) bounds.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号