Found 20 similar documents (search time: 312 ms)
1.
We present a new column generation algorithm for determining a classifier in the two-class LAD (Logical Analysis of Data) model. Unlike existing algorithms, which simultaneously maximize the margin of correctly classified observations and minimize the violations of incorrectly classified ones, we fix the margin at a difficult-to-achieve target and minimize a convex piecewise-linear function of the violations of the incorrectly classified observations. Moreover, a part of the training set, called the control set, is reserved to select, among all feasible classifiers found by the algorithm, the one with the highest performance on that set. One advantage of the proposed algorithm is that it essentially requires no calibration. Computational results are presented that show the effectiveness of this approach.
2.
3.
In real-time trajectory planning for unmanned vehicles, on-board sensors, radars and other instruments are used to collect information on possible obstacles to be avoided and pathways to be followed. Since, in practice, the sensors' observations carry measurement errors, the stochasticity of the data has to be incorporated into the models. In this paper, we consider using a genetic algorithm for the constrained optimization problem of finding the trajectory of minimum length between two locations while avoiding the obstacles on the way. To incorporate the variability of the sensor readings, we propose a modified genetic algorithm that addresses the stochasticity of the feasible regions. In this way, the probability that a point x in the search space is feasible can be derived from the random observations of obstacles and pathways, creating a real-time data-learning algorithm. By building a confidence region from the observed data such that its border intersects the solution point x, the level of the confidence region defines the probability that x is feasible. We propose a smooth penalty function based on the Gaussian distribution, which makes it easier for the algorithm to reach the borders of the feasible regions.
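The smooth Gaussian penalty described above can be illustrated in a few lines. This is a minimal sketch, not the authors' implementation: the disc-shaped obstacle model, the parameter names, and the penalty weight are all assumptions for illustration.

```python
import math

def gaussian_penalty(point, obstacle_center, obstacle_radius, sigma):
    """Smooth penalty: ~1 deep inside the observed obstacle, decaying
    like a Gaussian with the sensor noise scale sigma outside it."""
    d = math.dist(point, obstacle_center)
    if d <= obstacle_radius:
        return 1.0
    return math.exp(-((d - obstacle_radius) ** 2) / (2.0 * sigma ** 2))

def penalized_length(waypoints, obstacles, sigma, weight=10.0):
    """Fitness of a candidate trajectory for the genetic algorithm:
    its length plus a smooth penalty for passing close to (or through)
    observed obstacles, given as (center, radius) pairs."""
    length = sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))
    penalty = sum(gaussian_penalty(p, c, r, sigma)
                  for p in waypoints for (c, r) in obstacles)
    return length + weight * penalty
```

Because the penalty decays smoothly rather than jumping at the obstacle border, candidate trajectories that graze an uncertain boundary are only mildly discouraged, which is what lets the search approach the borders of the feasible regions.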
4.
5.
In the standard optimal stopping problems, actions are artificially restricted to the moments of observations of costs or benefits. In the standard experimentation and learning models based on two-armed Poisson bandits, it is possible to take an action between two sequential observations. The latter models do not recognize the fact that timing decisions depend not only on the rate of arrival of observations, but also on the stochastic dynamics of costs or benefits. We combine these two strands of literature and consider optimal stopping problems with random observations and updating. We formulate the dichotomy principle, an extension of the smooth pasting principle.
6.
Réal A. Carbonneau Gilles Caporossi Pierre Hansen 《European Journal of Operational Research》2011,212(1):213-222
Exact global optimization of the clusterwise regression problem is challenging and there are currently no published feasible methods for performing this clustering optimally, even though it has been over thirty years since its original proposal. This work explores global optimization of the clusterwise regression problem using mathematical programming and related issues. A mixed logical-quadratic programming formulation with implication of constraints is presented and contrasted against a quadratic formulation based on the traditional big-M, which cannot guarantee optimality because the regression line coefficients, and thus errors, may be arbitrarily large. Clusterwise regression optimization times and solution optimality for two clusters are empirically tested on twenty real datasets and three series of synthetic datasets ranging from twenty to one hundred observations and from two to ten independent variables. Additionally, a few small real datasets are clustered into three lines.
7.
Zhiqiang Tan 《Journal of computational and graphical statistics》2017,26(1):54-65
Consider the two problems of simulating observations and estimating expectations and normalizing constants for multiple distributions. First, we present a self-adjusted mixture sampling method, which accommodates both adaptive serial tempering and a generalized Wang–Landau algorithm. The set of distributions are combined into a labeled mixture, with the mixture weights depending on the initial estimates of log normalizing constants (or free energies). Then, observations are generated by Markov transitions, and free energy estimates are adjusted online by stochastic approximation. We propose two stochastic approximation schemes by Rao–Blackwellization of the scheme commonly used, and derive the optimal choice of a gain matrix, resulting in the minimum asymptotic variance for free energy estimation, in a simple and feasible form. Second, we develop an offline method, locally weighted histogram analysis, for estimating free energies and expectations, using all the simulated data from multiple distributions by either self-adjusted mixture sampling or other sampling algorithms. This method can be computationally much faster, with little sacrifice of statistical efficiency, than a global method currently used, especially when a large number of distributions are involved. We provide both theoretical results and numerical studies to demonstrate the advantages of the proposed methods.
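The online free-energy adjustment can be caricatured in a short routine. This is a simplified illustration of the idea, not Tan's algorithm: it assumes one-dimensional unnormalized densities, a symmetric Metropolis proposal, a plain 1/t gain rather than the optimal gain matrix, and uniform target mixture proportions, but it does use the Rao–Blackwellized update (label posterior probabilities instead of a sampled label indicator).

```python
import math
import random

def self_adjusted_mixture(log_q, n_dists, n_iter=20000, seed=0):
    """Estimate log normalizing constants (up to a common shift) of
    several unnormalized densities by sampling a labeled mixture and
    adjusting free-energy estimates online by stochastic approximation."""
    rng = random.Random(seed)
    zeta = [0.0] * n_dists          # running free-energy estimates
    pi = 1.0 / n_dists              # uniform target mixture proportions
    x, label = 0.0, 0
    for t in range(1, n_iter + 1):
        # Metropolis move for x under the current label's density
        y = x + rng.gauss(0.0, 1.0)
        if math.log(rng.random() + 1e-300) < log_q(label, y) - log_q(label, x):
            x = y
        # posterior label probabilities given x (Rao-Blackwellization)
        logits = [log_q(j, x) - zeta[j] for j in range(n_dists)]
        m = max(logits)
        w = [math.exp(l - m) for l in logits]
        s = sum(w)
        probs = [wi / s for wi in w]
        # resample the label from its posterior
        label = rng.choices(range(n_dists), weights=probs)[0]
        # stochastic-approximation update of the free energies
        gain = 1.0 / t
        for j in range(n_dists):
            zeta[j] += gain * (probs[j] - pi) / pi
        base = zeta[0]
        zeta = [z - base for z in zeta]  # fix the common shift
    return zeta
```

At the fixed point each label is visited with its target proportion, which forces the zeta values to track the log normalizing constants up to a common additive constant.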
8.
Chisheng Huang John Alan McDonald Werner Stuetzle 《Journal of computational and graphical statistics》2013,22(4):383-396
Scatterplots are the method of choice for displaying the distribution of points in two dimensions. They are used to discover patterns such as holes, outliers, modes, and association between the two variables. A common problem is overstriking—the overlap on the plotting surface of glyphs representing individual observations. Overstriking can create a misleading impression of the data distribution. The variable resolution bivariate plots (Varebi plots) proposed in this article deal with the problem of overstriking by mixing display of a density estimate and display of individual observations. The idea is to determine the display format by analyzing the actual amount of overstriking on the screen. Thus, the display format will depend on the sample size, the distribution of the observations, the size and shape of individual icons, and the size of the window. It may change automatically when the window is resized. Varebi plots reveal detail wherever possible, and show the overall trend when displaying detail is not feasible.
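The central quantity, the actual amount of overstriking on the screen, is simple to measure. A hypothetical sketch (not the Varebi implementation): rasterize each observation's glyph footprint onto a pixel grid and compare the pixels that would be painted, counting overlap, with the pixels actually visible.

```python
def overstrike_fraction(points, xlim, ylim, width, height, glyph=2):
    """Fraction of plotted glyph pixels hidden by overlap: each point
    paints a glyph-by-glyph square of pixels; overlapping squares
    paint the same pixel more than once."""
    painted = {}
    x0, x1 = xlim
    y0, y1 = ylim
    total = 0
    for (x, y) in points:
        px = int((x - x0) / (x1 - x0) * (width - 1))
        py = int((y - y0) / (y1 - y0) * (height - 1))
        for dx in range(glyph):
            for dy in range(glyph):
                cell = (px + dx, py + dy)
                painted[cell] = painted.get(cell, 0) + 1
                total += 1
    visible = len(painted)
    return 1.0 - visible / total if total else 0.0
```

A display could then switch from individual glyphs to a density estimate once this fraction crosses a threshold, and recompute it whenever the window (width, height) changes.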
9.
Jacques Bénasséni 《Computational Statistics》2018,33(4):1939-1955
Principal component analysis is a method of dimensionality reduction based on the eigensystem of the covariance matrix of a set of multivariate observations. Analyzing the effects of some specific observations on this eigensystem is therefore of particular importance in the sensitivity study of the results. In this framework, approximations for the perturbed eigenvalues and eigenvectors when deleting one or several observations are useful from a computational standpoint. Indeed, they allow one to evaluate the effects of these observations without having to recompute the exact perturbed eigenvalues and eigenvectors. However, it turns out that some approximations which have been suggested are based on an incorrect application of matrix perturbation theory. The aim of this short note is to provide the correct formulations which are illustrated with a numerical study.
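The flavor of such deletion approximations can be shown in the bivariate case, where the eigensystem has a closed form. This sketch uses the standard first-order perturbation formula λ' ≈ λ + vᵀ(S_del − S)v for a simple eigenvalue (a textbook result, not the specific corrected formulations of this note) and compares it with exact recomputation.

```python
import math

def cov2(data):
    """2x2 sample covariance matrix (divisor n) of 2-d observations,
    returned as its entries (a, b, c) for [[a, b], [b, c]]."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    a = sum((p[0] - mx) ** 2 for p in data) / n
    c = sum((p[1] - my) ** 2 for p in data) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in data) / n
    return a, b, c

def top_eigpair(a, b, c):
    """Largest eigenvalue and a unit eigenvector of [[a, b], [b, c]]."""
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if abs(b) > 1e-12:
        vx, vy = b, lam - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    return lam, (vx / norm, vy / norm)

def deletion_effect(data, i):
    """First-order approximation to the perturbed top eigenvalue after
    deleting observation i, versus the exactly recomputed value."""
    a, b, c = cov2(data)
    lam, (vx, vy) = top_eigpair(a, b, c)
    ad, bd, cd = cov2(data[:i] + data[i + 1:])
    da, db, dc = ad - a, bd - b, cd - c
    approx = lam + (vx * (da * vx + db * vy) + vy * (db * vx + dc * vy))
    exact, _ = top_eigpair(ad, bd, cd)
    return approx, exact
```

The gain is that the approximation reuses the eigenpair of the full covariance matrix, so screening all n candidate deletions avoids n eigendecompositions.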
10.
Melek Demirhan Linet Özdamar Levent Helvacıoğlu Şevket Ilker Birbil 《Journal of Global Optimization》1999,14(4):415-436
We propose a new metaheuristic, FRACTOP, for global optimization. FRACTOP is based on geometric partitioning of the feasible region, so that search metaheuristics such as Simulated Annealing (SA) or Genetic Algorithms (GA), which are activated in smaller subregions, have increased reliability in locating the global optimum. FRACTOP is able to incorporate any search heuristic devised for global optimization. Its main contribution is that it provides intelligent guidance (through fuzzy measures) in locating the subregion containing the global optimum for the search heuristics embedded in it. By executing the search in nonoverlapping subregions, FRACTOP eliminates repetitive visits of the search heuristics to the same local area and, furthermore, becomes amenable to parallel processing. As FRACTOP conducts the search deeper into smaller subregions, many unpromising subregions are discarded from the feasible region. Thus, the initial feasible region gains a fractal structure with many space gaps, which economizes on computation time. Computational experiments indicate that the metaheuristic significantly improves the results obtained by random search (RS), SA and GA.
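A caricature of the partition-and-discard idea, with all of FRACTOP's specifics (fuzzy measures, the embedded SA/GA heuristics) replaced by plain random sampling: score each subregion by its best sample, keep only the most promising subregions, bisect them, and recurse, so unpromising regions drop out early.

```python
import random

def partition_search(f, box, depth=5, samples=40, keep=3, seed=0):
    """Sketch of geometric-partition global search on a box given as a
    list of (lo, hi) bounds per dimension. Minimizes f."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    regions = [box]
    for _ in range(depth):
        scored = []
        for region in regions:
            local_x, local_f = None, float("inf")
            for _ in range(samples):
                x = [rng.uniform(lo, hi) for (lo, hi) in region]
                fx = f(x)
                if fx < local_f:
                    local_x, local_f = x, fx
            scored.append((local_f, local_x, region))
        scored.sort(key=lambda s: s[0])
        if scored[0][0] < best_f:
            best_f, best_x = scored[0][0], scored[0][1]
        # bisect only the `keep` best subregions; the rest are discarded
        regions = []
        for (_, _, region) in scored[:keep]:
            axis = max(range(len(region)),
                       key=lambda d: region[d][1] - region[d][0])
            lo, hi = region[axis]
            mid = (lo + hi) / 2.0
            left, right = list(region), list(region)
            left[axis] = (lo, mid)
            right[axis] = (mid, hi)
            regions += [left, right]
    return best_x, best_f
```

Since the kept subregions never overlap, the inner searches could also be run in parallel, which is the parallelism the abstract alludes to.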
11.
R. Baker Kearfott 《Mathematical Programming》1998,83(1-3):89-100
Various algorithms can compute approximate feasible points or approximate solutions to equality and bound constrained optimization problems. In exhaustive search algorithms for global optimizers and other contexts, it is of interest to construct bounds around such approximate feasible points, then to verify (computationally but rigorously) that an actual feasible point exists within these bounds. Hansen and others have proposed techniques for proving the existence of feasible points within given bounds, but practical implementations have not, to our knowledge, previously been described. Various alternatives are possible in such an implementation, and details must be carefully considered. Also, in addition to Hansen’s technique for handling the underdetermined case, it is important to handle the overdetermined case, when the approximate feasible point corresponds to a point with many active bound constraints. The basic ideas, along with experimental results from an actual implementation, are summarized here.
This work was supported in part by National Science Foundation grant CCR-9203730.
12.
Chun-xia Zhang Chang-lin Mei Jiang-she Zhang 《Acta Mathematicae Applicatae Sinica (English Series)》2007,23(4):619-628
When a real-world data set is fitted to a specific type of model, it is often encountered that one or a set of observations have undue influence on the model fitting, which may lead to misleading conclusions. Therefore, it is necessary for data analysts to identify these influential observations and assess their impact on various aspects of model fitting. In this paper, a type of modified Cook's distance is defined to gauge the influence of one or a set of observations on the estimate of the constant coefficient part in partially varying-coefficient models, and the Cook's distances are expressed as functions of the corresponding residuals and leverages. Meanwhile, a bootstrap procedure is suggested to derive reference values for the proposed Cook's distances. Some simulations are conducted, and a real-world data set is further analyzed to examine the performance of the proposed method. The experimental results are satisfactory.
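For readers unfamiliar with the baseline being modified: the classical Cook's distance, here for simple linear regression rather than the partially varying-coefficient models of the paper, is likewise a function of residuals and leverages, D_i = r_i² h_ii / (p s² (1 − h_ii)²).

```python
def cooks_distances(x, y):
    """Classical Cook's distances for simple linear regression
    y = b0 + b1*x, computed from residuals and leverages."""
    n, p = len(x), 2
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * yi for xi, yi in zip(x, y)) / sxx
    b0 = sum(y) / n - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - p)      # residual variance
    lev = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]  # leverages h_ii
    return [(r * r / (p * s2)) * (h / (1.0 - h) ** 2)
            for r, h in zip(resid, lev)]
```

An observation with a large residual at a high-leverage x-value dominates the fit, and its Cook's distance flags it; reference values for flagging are exactly what the paper's bootstrap procedure supplies in its more general setting.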
13.
Sorin Alexe 《Discrete Applied Mathematics》2006,154(7):1050-1063
Sets of “positive” and “negative” points (observations) in n-dimensional discrete space given along with their non-negative integer multiplicities are analyzed from the perspective of the Logical Analysis of Data (LAD). A set of observations satisfying upper and/or lower bounds imposed on certain components is called a positive pattern if it contains some positive observations and no negative ones. The number of variables on which such restrictions are imposed is called the degree of the pattern. A total polynomial algorithm is proposed for the enumeration of all patterns of limited degree, and special efficient variants of it for the enumeration of all patterns with certain “sign” and “coverage” requirements are presented and evaluated on a publicly available collection of benchmark datasets.
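A brute-force rendering of the definition, restricted to binary data, makes the objects concrete (this naive enumeration is exponential in the degree bound; the paper's contribution is a total polynomial algorithm, which this sketch does not reproduce): a degree-d positive pattern is a conjunction of d literals x_j = v covering at least one positive observation and no negative one.

```python
from itertools import combinations, product

def positive_patterns(pos, neg, max_degree):
    """Enumerate positive patterns of degree <= max_degree over binary
    observations, each pattern a tuple of (column, value) literals."""
    n = len(pos[0])
    patterns = []
    for d in range(1, max_degree + 1):
        for cols in combinations(range(n), d):
            for vals in product((0, 1), repeat=d):
                lits = tuple(zip(cols, vals))
                covers = lambda obs: all(obs[j] == v for j, v in lits)
                # positive pattern: covers >= 1 positive, 0 negatives
                if any(covers(o) for o in pos) and not any(covers(o) for o in neg):
                    patterns.append(lits)
    return patterns
```

The degree bound is what keeps the pattern space manageable: with n binary variables there are C(n, d)·2^d candidate conjunctions of degree d.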
14.
《Stochastics An International Journal of Probability and Stochastic Processes》2013,85(1-2):137-155
We present a method for detecting changes in the AR parameters of an ARMA process with arbitrarily time varying MA parameters. Assuming that a collection of observations and a set of nominal time invariant AR parameters are given, we test if the observations are generated by the nominal AR parameters or by a different set of time invariant AR parameters. The detection method is derived by using a local asymptotic approach and it is based on an estimation procedure which was shown to be consistent under nonstationarities.
15.
An explicit formula is given for the lifetime distribution of a consecutive-k-out-of-n:F system, expressed as a linear combination of the distributions of order statistics of the lifetimes of the n components. We assume that the lifetimes are independent and identically distributed. The results make it possible to treat parametric estimation problems based on observations of the lifetimes of the system. In fact, we take up, as examples, the cases where the lifetimes of the components follow the exponential, Weibull, and Pareto distributions, and obtain feasible estimators by the method of moments. In particular, it is shown that the moment estimator is quite good in the exponential case, in the sense that its asymptotic efficiency is close to one. This research was partially supported by the ISM Cooperative Research Program (94-ISM-CRP-5).
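The exponential moment estimator is easiest to see in the k = 1 special case, where a consecutive-1-out-of-n:F system is a series system: its lifetime is the minimum of n iid Exp(λ) lifetimes, hence Exp(nλ) with mean 1/(nλ). (This is a worked special case only; the paper's general k requires the order-statistic mixture.)

```python
import random

def moment_estimate_rate(system_lifetimes, n_components):
    """Moment estimator of the component rate lambda in the k = 1
    (series) case: E[T] = 1/(n*lambda), so lambda_hat = 1/(n*mean(T))."""
    mean_t = sum(system_lifetimes) / len(system_lifetimes)
    return 1.0 / (n_components * mean_t)

def simulate_series_lifetimes(lam, n_components, n_systems, seed=0):
    """Simulate lifetimes of independent series systems whose components
    have iid exponential(lam) lifetimes."""
    rng = random.Random(seed)
    return [min(rng.expovariate(lam) for _ in range(n_components))
            for _ in range(n_systems)]
```

Note that only system-level lifetimes are used, which is the point of the approach: component failures need not be observed individually.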
16.
17.
Seung Jun Shin Yichao Wu Hao Helen Zhang 《Journal of computational and graphical statistics》2013,22(2):383-402
The support vector machine (SVM) is a popular learning method for binary classification. Standard SVMs treat all the data points equally, but in some practical problems it is more natural to assign different weights to observations from different classes. This leads to a broader class of learning, the so-called weighted SVMs (WSVMs), and one of their important applications is to estimate class probabilities besides learning the classification boundary. There are two parameters associated with the WSVM optimization problem: one is the regularization parameter and the other is the weight parameter. In this article, we first establish that the WSVM solutions are jointly piecewise-linear with respect to both the regularization and weight parameter. We then develop a state-of-the-art algorithm that can compute the entire trajectory of the WSVM solutions for every pair of the regularization parameter and the weight parameter at a feasible computational cost. The derived two-dimensional solution surface provides theoretical insight on the behavior of the WSVM solutions. Numerically, the algorithm can greatly facilitate the implementation of the WSVM and automate the selection process of the optimal regularization parameter. We illustrate the new algorithm on various examples. This article has online supplementary materials.
18.
《Optimization》2012,61(3):213-222
We give several results, some new and some old, but apparently overlooked, that provide useful characterizations of barrier functions and their relationship to problem function properties. In particular, we show that level sets of a barrier function are bounded if and only if feasible level sets of the objective function are bounded and we obtain conditions that imply solution existence, strict convexity or a positive definite Hessian of a barrier function. Attention is focused on convex programs and the logarithmic barrier function. Such results suggest that it would seem possible to extend many recent complexity results by relaxing feasible set compactness to the feasible objective function level set boundedness assumption.
19.
K. Ahmadi F. Hasani B. Kheirfam 《Journal of Mathematical Modelling and Algorithms》2014,13(2):191-208
We present a full-Newton step primal-dual infeasible interior-point algorithm based on Darvay’s search directions. These directions are obtained by an equivalent algebraic transformation of the centering equation. The algorithm decreases the duality gap and the feasibility residuals at the same rate. During this algorithm we construct strictly feasible iterates for a sequence of perturbations of the given problem and its dual problem. Each main iteration of the algorithm consists of a feasibility step and some centering steps. The starting point in the first iteration of the algorithm depends on a positive number ξ and it is strictly feasible for a perturbed pair, and feasibility steps find strictly feasible iterates for the next perturbed pair. By using centering steps for the new perturbed pair, we obtain strictly feasible iterates close to the central path of the new perturbed pair. The algorithm finds an ε-optimal solution or detects infeasibility of the given problem. The iteration bound coincides with the best known iteration bound for linear optimization problems.
20.
《Optimization》2012,61(6):839-860
This paper introduces an efficient approach to the solution of the linear minimax approximation problem. The classical nonlinear minimax problem is cast into a linear formulation. The proposed optimization procedure consists of first specifying a feasible point on the feasible boundary surface. Next, feasible directions of decreasing values of the objective function are determined. The algorithm proceeds iteratively and terminates when the absolute minimum value of the objective function is reached. The initial point may be selected arbitrarily, or it may be optimally determined through a linear method to speed up algorithmic convergence. The algorithm was applied to a number of approximation problems and the results were compared to those derived using the revised simplex method. The new algorithm is shown to speed up the problem solution by at least one order of magnitude.