Similar literature
20 similar articles found
1.
An appropriate distance is an essential ingredient in various real-world learning tasks. Distance metric learning seeks a metric that reflects the data configuration much better than the commonly used ones. We present an algorithm that simultaneously learns a Mahalanobis-like distance and a K-means clustering, combining data rescaling and clustering so that data separability grows iteratively in the rescaled space with each successive clustering. At each step of the algorithm, a global optimization problem is solved to minimize the cluster distortions given the current cluster configuration. The obtained weight matrix can also be used as a cluster validation characteristic. Namely, the closeness of such matrices learned during a sampling process can indicate the readiness of the clusters, i.e., it helps estimate the true number of clusters. Numerical experiments performed on synthetic and real datasets verify the high reliability of the proposed method.
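
To make the idea of a Mahalanobis-like distance inside K-means concrete, here is a minimal sketch of the distance and of the assignment step only. The abstract does not say how the weight matrix is learned, so `W` is simply passed in; the names `mahalanobis_sq`, `assign_clusters`, and `first_axis_only` are illustrative assumptions, not the authors' notation.

```python
def mahalanobis_sq(x, y, W):
    """Squared Mahalanobis-like distance (x - y)^T W (x - y)."""
    diff = [a - b for a, b in zip(x, y)]
    n = len(diff)
    return sum(diff[i] * W[i][j] * diff[j] for i in range(n) for j in range(n))


def assign_clusters(points, centers, W):
    """K-means assignment step: index of the nearest center under W."""
    return [
        min(range(len(centers)), key=lambda k: mahalanobis_sq(x, centers[k], W))
        for x in points
    ]


identity = [[1.0, 0.0], [0.0, 1.0]]         # plain squared Euclidean distance
first_axis_only = [[1.0, 0.0], [0.0, 0.0]]  # hypothetical weights, not learned here
centers = [(0.0, 0.0), (5.0, 10.0)]
points = [(1.0, 10.0)]

labels_euclidean = assign_clusters(points, centers, identity)
labels_rescaled = assign_clusters(points, centers, first_axis_only)
```

With the identity matrix the point is assigned to the second center, while the hypothetical rescaling that ignores the second coordinate moves it to the first. This illustrates how a learned weight matrix can change the cluster configuration.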

2.
The variational image decomposition model decomposes an image into a structural and an oscillatory component by means of regularization and functional minimization. It is an important task in various image processing methods, such as image restoration, image segmentation, and object recognition. In this paper, we propose a non-convex and non-smooth variational decomposition model for image restoration that uses non-convex and non-smooth total variation (TV) to measure the structure component and the negative Sobolev space $H^{-1}$ to model the oscillatory component. The new model combines the advantages of non-convex regularization and weaker-norm texture modeling, and it removes noise well while preserving the valuable edges and contours of the image. The iteratively reweighted $\ell_1$ (IRL1) algorithm is employed to solve the proposed non-convex minimization problem. Each subproblem is solved with the alternating direction method of multipliers (ADMM). Numerical results validate the effectiveness of the proposed model for both synthetic and real images in terms of peak signal-to-noise ratio (PSNR) and mean structural similarity index (MSSIM).
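
The reweighting idea behind IRL1 can be sketched on a toy scalar problem. This is not the TV and $H^{-1}$ model of the abstract: it assumes a log penalty `lam * log(eps + |x|)` and a quadratic data term, chosen only because each weighted subproblem then has a closed-form soft-thresholding solution, so no ADMM is needed. The function names and default parameters are illustrative assumptions.

```python
def soft_threshold(v, t):
    """Proximal map of t * |x|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def irl1_scalar(y, lam, eps=0.1, iters=50):
    """Seek a stationary point of 0.5 * (x - y)**2 + lam * log(eps + |x|)."""
    x = y
    for _ in range(iters):
        # Weight = derivative of the concave penalty log(eps + t) at t = |x|.
        w = 1.0 / (eps + abs(x))
        # The weighted l1 subproblem 0.5 * (x - y)**2 + lam * w * |x|
        # is solved exactly by soft-thresholding.
        x = soft_threshold(y, lam * w)
    return x
```

Large inputs are shrunk only slightly because their weight is small, while small inputs are set to zero. This is the edge-preserving behaviour that motivates non-convex penalties.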

3.
We consider the numerical simulation of contact problems in elasticity with large deformations. The non-penetration condition is described by means of a signed distance function to the obstacle's boundary. Techniques from level set methods allow for an appropriate numerical approximation of the signed distance function preserving its non-smooth character. The emerging non-convex optimization problem subject to non-smooth inequality constraints is solved by a non-smooth multiscale SQP method in combination with a non-smooth multigrid method as interior solver. Several examples in three space dimensions including applications in biomechanics illustrate the capability of our methods.

4.
Clustering is a popular data analysis and data mining technique. Since clustering problems are NP-complete in nature, the larger the problem, the harder it is to find the optimal solution and the longer it takes to reach reasonable results. A popular clustering technique is based on K-means, in which the data is partitioned into K clusters. In this method, the number of clusters is predefined and the technique is highly dependent on the initial identification of elements that represent the clusters well. A large area of research in clustering has focused on improving the clustering process so that the clusters do not depend on the initial identification of cluster representatives. Another problem in clustering is the local minimum problem. Although methods such as K-Harmonic means clustering solve the initialization problem, becoming trapped in local minima remains an issue. In this paper we develop a new algorithm for solving this problem based on a tabu search technique: Tabu K-Harmonic means (TabuKHM). Experimental results on the Iris and other well-known data sets illustrate the robustness of the TabuKHM clustering algorithm.
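
As a reference point, the K-Harmonic means objective is commonly written as the sum over data points of K divided by the sum of inverse p-th powers of the distances to all K centers. The sketch below evaluates that objective only. The tabu search layer of TabuKHM is not described in the abstract and is not shown, and the function name, the default `p`, and the `eps` guard are assumptions.

```python
import math


def khm_objective(points, centers, p=2.0, eps=1e-12):
    """K-harmonic means objective: sum_i K / sum_k (1 / d(x_i, c_k)**p)."""
    k = len(centers)
    total = 0.0
    for x in points:
        inv_sum = 0.0
        for c in centers:
            # eps guards against division by zero when a point sits on a center.
            d = max(math.dist(x, c), eps)
            inv_sum += 1.0 / d ** p
        total += k / inv_sum
    return total
```

Because the harmonic average is dominated by the nearest center but still depends smoothly on all of them, the objective is less sensitive to initialization than plain K-means. It can still have several local minima, which is what a tabu search is meant to address.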

5.
Two robustness criteria are presented that are applicable to general clustering methods. Robustness and stability in cluster analysis are not only data dependent, but even cluster dependent. Robustness is in the present paper defined as a property of not only the clustering method, but also of every individual cluster in a data set. The main principles are: (a) dissimilarity measurement of an original cluster with the most similar cluster in the induced clustering obtained by adding data points, (b) the dissolution point, which is an adaptation of the breakdown point concept to single clusters, (c) isolation robustness: given a clustering method, is it possible to join, by addition of g points, arbitrarily well separated clusters? Results are derived for k-means, k-medoids (k estimated by average silhouette width), trimmed k-means, mixture models (with and without noise component, with and without estimation of the number of clusters by BIC), single and complete linkage.

6.
Many challenging problems in automatic control may be cast as optimization programs subject to matrix inequality constraints. Here we investigate an approach which converts such problems into non-convex eigenvalue optimization programs and makes them amenable to non-smooth analysis techniques like bundle or cutting plane methods. We prove global convergence of a first-order bundle method for programs with non-convex maximum eigenvalue functions. Dedicated to R. T. Rockafellar on the occasion of his 70th anniversary.

7.
We discuss a bundle method for non-smooth non-convex optimization programs. In the absence of convexity, a substitute for the cutting plane mechanism has to be found. We propose such a mechanism and prove convergence of our method in the sense that every accumulation point of the sequence of serious iterates is critical.

8.
Constrained Optimization Problems (COP) arise in many practical applications such as kinematics, chemical process optimization, and power systems. These problems are challenging in terms of identifying feasible solutions when constraints are non-linear and non-convex. Therefore, finding the location of the global optimum in a non-convex COP is more difficult than in non-convex bound-constrained global optimization problems. This paper proposes a Hybrid Simulated Annealing method (HSA) for solving the general COP. HSA has features that address both feasibility and optimality issues, and here it is supported by a local search procedure, Feasible Sequential Quadratic Programming (FSQP). We develop two versions of HSA. The first version (HSAP) incorporates penalty methods for constraint handling, and the second one (HSAD) eliminates the need for imposing penalties in the objective function by tracing feasible and infeasible solution sequences independently. Numerical experiments show that the second version is more reliable in worst-case performance.

9.
To find optimal clusters of functional objects in a lower-dimensional subspace of data, a sequential method called tandem analysis is often used, though such a method is problematic. A new procedure is developed to find optimal clusters of functional objects and, simultaneously, an optimal subspace for clustering. The method is based on the k-means criterion for functional data and seeks the subspace that is maximally informative about the clustering structure in the data. An efficient alternating least-squares algorithm is described, and the proposed method is extended to a regularized method. Analyses of artificial and real data examples demonstrate that the proposed method gives correct and interpretable results.

10.
We introduce and study the properties of Boolean autoencoder circuits. In particular, we show that the Boolean autoencoder circuit problem is equivalent to a clustering problem on the hypercube. We show that clustering m binary vectors on the n-dimensional hypercube into k clusters is NP-hard as soon as the number of clusters scales like $m^\epsilon$ ($\epsilon > 0$), and thus the general Boolean autoencoder problem is also NP-hard. We prove that the linear Boolean autoencoder circuit problem is also NP-hard, and so are several related problems such as subspace identification over finite fields, linear regression over finite fields, even/odd set intersections, and parity circuits. The emerging picture is that autoencoder optimization is NP-hard in the general case, with a few notable exceptions including the linear cases over infinite fields or the Boolean case with a fixed-size hidden layer. However, learning can be tackled by approximate algorithms, including alternate optimization, suggesting a new class of learning algorithms for deep networks, including deep networks of threshold gates or artificial neurons.

11.
We analyze nonlinear stochastic optimization problems with probabilistic constraints described by continuously differentiable non-convex functions. We describe the tangent and the normal cone to the level sets of the underlying probability function and provide new insight into their structure. Furthermore, we formulate first-order and second-order conditions of optimality for these problems based on the notion of p-efficient points. We develop an augmented Lagrangian method for the case of discrete distribution functions. The method is based on progressive inner approximation of the level set of the probability function by generation of p-efficient points. Numerical experience is provided.

12.
The uncapacitated multi-facility Weber problem is concerned with locating m facilities in the Euclidean plane and allocating the demands of n customers to these facilities with the minimum total transportation cost. This is a non-convex optimization problem and difficult to solve exactly. As a consequence, efficient and accurate heuristic solution procedures are needed. The problem comes in different types depending on the distance function used to model the distance between the facilities and customers. We concentrate on the rectilinear and Euclidean problems and propose new vector quantization and self-organizing map algorithms. They incorporate the properties of the distance function into their update rules, which makes them different from the two existing neural network methods that use a rather ad hoc squared Euclidean metric in their updates even though the problem is originally stated in terms of the rectilinear and Euclidean distances. Computational results on benchmark instances indicate that the new methods are better than the existing ones in terms of both solution quality and computation time.
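
The objective being minimized can be written down directly. The sketch below evaluates the total transportation cost for given facility locations, assuming (as holds in the uncapacitated case) that each customer is served entirely by its nearest facility. It does not implement the vector quantization or self-organizing map heuristics of the abstract, and the names `weber_cost`, `rectilinear`, and `demands` are illustrative.

```python
import math


def rectilinear(a, b):
    """Rectilinear (Manhattan) distance between two points."""
    return sum(abs(p - q) for p, q in zip(a, b))


def weber_cost(facilities, customers, demands, dist=math.dist):
    """Total demand-weighted distance, each customer using its nearest facility."""
    return sum(
        w * min(dist(f, c) for f in facilities)
        for c, w in zip(customers, demands)
    )


facilities = [(0.0, 0.0), (10.0, 0.0)]
customers = [(3.0, 4.0), (10.0, 2.0)]
demands = [2.0, 1.0]

euclidean_cost = weber_cost(facilities, customers, demands)
rectilinear_cost = weber_cost(facilities, customers, demands, dist=rectilinear)
```

Passing the distance as a parameter mirrors the abstract's point that the rectilinear and Euclidean variants are different problems and should be evaluated with their own metric.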

13.
The NP-hard nature of cardinality constrained mean-variance portfolio optimization problems has led to a number of different algorithms with varying degrees of success in reaching optimality given limited computational resources and the strict time constraints present in practice. The proposed local relaxation algorithm explores the inherent structure of the objective function. It solves a sequence of small, local quadratic programs by first projecting asset returns onto a reduced metric space, followed by clustering in this space to identify sub-groups of assets that best accentuate a suitable measure of similarity amongst different assets. The algorithm can either be cold started using a suitable heuristic method, such as the centroids of initial clusters, or be warm started based on the last output. Results using a basket of up to 3,000 stocks and different cardinality constraints indicate that the proposed algorithm can lead to significant performance gains over popular branch-and-cut methods. One key application of this algorithm is in dealing with large-scale cardinality constrained portfolio optimization under tight time constraints, such as for the purpose of index tracking or index arbitrage at high frequency.

14.
In this paper, we investigate the problem of determining the number of clusters in the k-modes based categorical data clustering process. We propose a new categorical data clustering algorithm with automatic selection of k. The new algorithm extends the k-modes clustering algorithm by introducing a penalty term into the objective function to make more clusters compete for objects. In the new objective function, we employ a regularization parameter to control the number of clusters in a clustering process. Instead of finding k directly, we choose a value of the regularization parameter such that the corresponding clustering result is the most stable one among all the generated clustering results. Experimental results on synthetic and real data sets demonstrate the effectiveness of the proposed algorithm.
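
For readers unfamiliar with k-modes, its two building blocks are the simple matching dissimilarity between categorical objects and the per-attribute mode of a cluster. The sketch below shows only these standard pieces. The penalty term and regularization parameter of the proposed algorithm are not specified in the abstract and are not reproduced, and the function names are assumptions.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Number of attributes on which two categorical objects differ."""
    return sum(1 for u, v in zip(a, b) if u != v)


def cluster_mode(objects):
    """Per-attribute most frequent value across the objects of one cluster."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*objects))


cluster = [("red", "small"), ("red", "large"), ("blue", "large")]
mode = cluster_mode(cluster)
within_cost = sum(matching_dissimilarity(obj, mode) for obj in cluster)
```

The plain k-modes objective is the sum of such within-cluster costs over all clusters. The algorithm in the abstract adds a penalty to this objective so that the number of clusters is selected automatically.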

15.
Partitioning clustering is a technique that classifies n objects into k disjoint clusters; it has been developed for years and is widely used in many applications. In this paper, a new overlapping cluster algorithm is defined. It differs from traditional clustering algorithms in three respects. First, the new clustering is overlapping, because clusters are allowed to overlap with one another. Second, the clustering is non-exhaustive, because an object is permitted to belong to no cluster. Third, the goals considered in this research are the maximization of the average number of objects contained in a cluster and the maximization of the distances among cluster centers, while the goals in previous research are the maximization of the similarities of objects in the same clusters and the minimization of the similarities of objects in different clusters. Furthermore, the new clustering also differs from traditional fuzzy clustering, because the object-cluster relationship in the new clustering is represented by a crisp value rather than by a fuzzy membership degree. Accordingly, a new overlapping partitioning cluster (OPC) algorithm is proposed to provide overlapping and non-exhaustive clustering of objects. Finally, several simulated and real-world data sets are used to evaluate the effectiveness and efficiency of the OPC algorithm, and the outcomes indicate that the algorithm can generate satisfactory clustering results.

16.
We address estimation problems where the sought-after solution is defined as the minimizer of an objective function composed of a quadratic data-fidelity term and a regularization term. We especially focus on non-convex and possibly non-smooth regularization terms because of their ability to yield good estimates. This work is dedicated to the stability of the minimizers of such piecewise $C^m$, with $m \geq 2$, non-convex objective functions. It is composed of two parts. In the previous part of this work we considered general local minimizers. In this part we derive results on global minimizers. We show that the data domain contains an open, dense subset such that for every data point therein, the objective function has a finite number of local minimizers and a unique global minimizer. It gives rise to a global minimizer function which is $C^{m-1}$ everywhere on an open and dense subset of the data domain.

17.
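Non-convex minimax problems have recently become an important research frontier and hot topic at the intersection of optimization with machine learning, signal processing, and related fields. Key scientific problems in frontier research directions such as adversarial learning, reinforcement learning, and distributed non-convex optimization all reduce to this class of problems. Research on convex-concave minimax problems has achieved good results internationally, but non-convex minimax problems differ from convex-concave ones: they are non-convex, non-smooth optimization problems with their own structure, they are more challenging both to analyze theoretically and to solve, and they are in general NP-hard. This paper focuses on recent progress in optimization algorithms and complexity analysis for non-convex minimax problems.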

18.
We consider here a multicommodity flow network optimization problem with non-convex but piecewise convex arc cost functions. We derive complete optimality conditions for local minima based on negative-cost cycles associated with each commodity. These conditions do not extend to the convex non-smooth case.

19.
The taxonomy of the N2-fixing bacteria belonging to the genus Bradyrhizobium is still poorly refined, mainly due to conflicting results obtained by the analysis of phenotypic and genotypic properties. This paper presents an application of a method aimed at identifying possible new clusters within a Brazilian collection of 119 Bradyrhizobium strains showing phenotypic characteristics of B. japonicum and B. elkanii. The stability was studied as a function of the number of restriction enzymes used in the RFLP-PCR analysis of three ribosomal regions with three restriction enzymes per region. The method proposed here uses clustering algorithms with distances calculated by average-linkage clustering. The stability analysis is carried out by introducing perturbations through sub-sampling techniques. The method proved effective in grouping the species B. japonicum and B. elkanii. Furthermore, two new clusters were clearly defined, indicating possible new species, as well as sub-clusters within each detected cluster.

20.
We propose in this paper two new competitive unsupervised clustering algorithms. The first algorithm deals with ultrametric data and has a computational cost of $O(n)$. The second algorithm has two strong features: it is fast, and it is flexible with respect to both the type of data processed and the precision. The second algorithm has a computational cost of $O(n^2)$ in the worst case and $O(n)$ in the average case. These complexities are due to the exploitation of ultrametric distance properties. In the first method, we use the order induced by an ultrametric in a given space to demonstrate how data proximity can be explored quickly. In the second method, we create an ultrametric space from a data sample, chosen uniformly at random, in order to obtain a global view of proximities in the data set according to the similarity criterion. Then, we use this proximity profile to cluster the global set. We present an example of our algorithms and compare their results with those of a classic clustering method.
