首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 125 毫秒
1.
Some problems of partitioning a finite set of points of Euclidean space into two clusters are considered. In these problems, the following criteria are minimized: (1) the sum over both clusters of the sums of squared pairwise distances between the elements of the cluster and (2) the sum of the (multiplied by the cardinalities of the clusters) sums of squared distances from the elements of the cluster to its geometric center, where the geometric center (or centroid) of a cluster is defined as the mean value of the elements in that cluster. Additionally, another problem close to (2) is considered, where the desired center of one of the clusters is given as input, while the center of the other cluster is unknown (is the variable to be optimized) as in problem (2). Two variants of the problems are analyzed, in which the cardinalities of the clusters are (1) parts of the input or (2) optimization variables. It is proved that all the considered problems are strongly NP-hard and that, in general, there is no fully polynomial-time approximation scheme for them (unless P = NP).  相似文献   

2.
The strongly NP-hard problem of partitioning a finite set of points of Euclidean space into two clusters of given sizes (cardinalities) minimizing the sum (over both clusters) of the intracluster sums of squared distances from the elements of the clusters to their centers is considered. It is assumed that the center of one of the sought clusters is specified at the desired (arbitrary) point of space (without loss of generality, at the origin), while the center of the other one is unknown and determined as the mean value over all elements of this cluster. It is shown that unless P = NP, there is no fully polynomial-time approximation scheme for this problem, and such a scheme is substantiated in the case of a fixed space dimension.  相似文献   

3.
We consider the strongly NP-hard problem of partitioning a set of Euclidean points into two clusters so as to minimize the sum (over both clusters) of the weighted sum of the squared intracluster distances from the elements of the clusters to their centers. The weights of sums are the sizes of the clusters. The center of one cluster is given as input, while the center of the other cluster is unknown and determined as the average value over all points in the cluster (as the geometric center). Two variants of the problems are analyzed in which the cluster sizes are either given or unknown. We present and prove some exact pseudopolynomial algorithms in the case of integer components of the input points and fixed space dimension.  相似文献   

4.
We consider a strongly NP-hard problem of partitioning a finite sequence of vectors in Euclidean space into two clusters using the criterion of minimum sum-of-squares of distances from the elements of clusters to their centers. We assume that the cardinalities of the clusters are fixed. The center of one cluster has to be optimized and is defined as the average value over all vectors in this cluster. The center of the other cluster lies at the origin. The partition satisfies the condition: the difference of the indices of the next and previous vectors in the first cluster is bounded above and below by two given constants. We propose a 2-approximation polynomial algorithm to solve this problem.  相似文献   

5.
We consider a strongly NP-hard problem of partitioning a finite sequence of points in Euclidean space into the two clustersminimizing the sum over both clusters of intra-cluster sums of squared distances from the clusters elements to their centers. The sizes of the clusters are fixed. The centroid of the first cluster is defined as the mean value of all vectors in the cluster, and the center of the second cluster is given in advance and equals 0. Additionally, the partition must satisfy the restriction that for all vectors in the first cluster the difference between the indices of two consequent points from this cluster is bounded from below and above by some given constants.We present a fully polynomial-time approximation scheme for the case of fixed space dimension.  相似文献   

6.
We consider the problem of partitioning a finite sequence of Euclidean points into a given number of clusters (subsequences) using the criterion of the minimal sum (over all clusters) of intercluster sums of squared distances from the elements of the clusters to their centers. It is assumed that the center of one of the desired clusters is at the origin, while the center of each of the other clusters is unknown and determined as the mean value over all elements in this cluster. Additionally, the partition obeys two structural constraints on the indices of sequence elements contained in the clusters with unknown centers: (1) the concatenation of the indices of elements in these clusters is an increasing sequence, and (2) the difference between an index and the preceding one is bounded above and below by prescribed constants. It is shown that this problem is strongly NP-hard. A 2-approximation algorithm is constructed that is polynomial-time for a fixed number of clusters.  相似文献   

7.
Problems of partitioning a finite set of Euclidean points (vectors) into clusters are considered. The criterion is to minimize the sum, over all clusters, of (1) squared norms of the sums of cluster elements normalized by the cardinality, (2) squared norms of the sums of cluster elements, and (3) norms of the sum of cluster elements. It is proved that all these problems are strongly NP-hard if the number of clusters is a part of the input and are NP-hard in the ordinary sense if the number of clusters is not a part of the input (is fixed). Moreover, the problems are NP-hard even in the case of dimension 1 (on a line).  相似文献   

8.
Problems of partitioning a finite set of Euclidean points (vectors) into clusters are considered. The criterion is to minimize the sum, over all clusters, of (1) squared norms of the sums of cluster elements normalized by the cardinality, (2) squared norms of the sums of cluster elements, and (3) norms of the sum of cluster elements. It is proved that all these problems are strongly NP-hard if the number of clusters is a part of the input and are NP-hard in the ordinary sense if the number of clusters is not a part of the input (is fixed). Moreover, the problems are NP-hard even in the case of dimension 1 (on a line).  相似文献   

9.
NP-completeness of two clustering (partition) problems is proved for a finite sequence of Euclidean vectors. In the optimization versions of both problems it is required to partition the elements of the sequence into a fixed number of clusters minimizing the sum of squares of the distances from the cluster elements to their centers. In the first problem the sizes of clusters are the part of input, while in the second they are unknown (they are the variables for optimization). Except for the center of one (special) cluster, the center of each cluster is the mean value of all vectors contained in it. The center of the special cluster is zero. Also, the partition must satisfy the following condition: The difference between the indices of two consecutive vectors in every nonspecial cluster is bounded below and above by two given constants.  相似文献   

10.
In this paper, a strongly NP-hard problem of finding a family of disjoint subsets with given cardinalities in a finite set of points from a Euclidean space is considered. Minimization of the sum over all required subsets of the sum of the squared distances from the elements of these subsets to their geometric centers is used as the search criterion. It is proved that if the coordinates of the input points are integer and the space dimension and the number of required subsets are fixed (i.e., bounded by some constants), the problem is a pseudopolynomial time solvable one.  相似文献   

11.
A 2-approximation algorithm is presented for some NP-hard data analysis problem that consists in partitioning a set of Euclidean vectors into two subsets (clusters) under the criterion of minimum sum-of-squares of distances from the elements of clusters to their centers. The center of the first cluster is the average value of vectors in the cluster, and the center of the second one is the origin.  相似文献   

12.
We introduce large families of subsets arising from Woods problem and study their cardinalities. Estimates of character sums over the subsets are given. We study the pesudorandom properties of the subsets and show that their well-distribution measures are very high.  相似文献   

13.
The graph partitioning problem is to partition the vertex set of a graph into a number of nonempty subsets so that the total weight of edges connecting distinct subsets is minimized. Previous research requires the input of cardinalities of subsets or the number of subsets for equipartition. In this paper, the problem is formulated as a zero-one quadratic programming problem without the input of cardinalities. We also present three equivalent zero-one linear integer programming reformulations. Because of its importance in data biclustering, the bipartite graph partitioning is also studied. Several new methods to determine the number of subsets and the cardinalities are presented for practical applications. In addition, hierarchy partitioning and partitioning of bipartite graphs without reordering one vertex set, are studied.  相似文献   

14.
Computational Mathematics and Mathematical Physics - We consider a strongly NP-hard problem of partitioning a finite Euclidean sequence into two clusters of given cardinalities minimizing the sum...  相似文献   

15.
The problem tackled in this paper is as follows: consider a set ofn interacting points in a two-dimensional space. The levels of interactions between the observations are given exogenously. It is required to cluster then observations intop groups, so that the sum of squared deviations from the cluster means is as small as possible. Further, assume that the cluster means are adjusted to reflect the interaction between the entities. (It is this latter consideration which makes the problem interesting.) A useful property of the problem is that the use of a squared distance term yields a linear system of equations for the coordinates of the cluster centroids. These equations are derived and solved repeatedly for a given set of cluster allocations. A sequential reallocation of the observations between the clusters is then performed. One possible application of this problem is to the planar hub location problem, where the interacting observations are a system of cities and the interaction effects represent the levels of flow or movement between the entities. The planar hub location problem has been limited so far to problems with fewer than 100 nodes. The use of the squared distance formulation, and a powerful supercomputer (Cray Y-MP) has enabled quick solution of large systems with 250 points and four groups. The paper includes both small illustrative examples and computational results using systems with up to 500 observations and 9 clusters.  相似文献   

16.
This paper proposes a scatter search-based heuristic approach to the capacitated clustering problem. In this problem, a given set of customers with known demands must be partitioned into p distinct clusters. Each cluster is specified by a customer acting as a cluster center for this cluster. The objective is to minimize the sum of distances from all cluster centers to all other customers in their cluster, such that a given capacity limit of the cluster is not exceeded and that every customer is assigned to exactly one cluster. Computational results on a set of instances from the literature indicate that the heuristic is among the best heuristics developed for this problem.  相似文献   

17.
未确知均值聚类结合未确知理论和聚类理论构造未确知测度作为集合隶属度来表示样本与各类间的隶属关系.从产品合格、柔性、可靠性等几方面对影响供应链客户满意度的因素进行分析,构建供应链环境下的客户满意度评价指标体系.在此基础上,应用未确知均值聚类理论对供应链环境下的客户满意度进行综合评价,得出聚类结果,找出各类类中心,并给出样本属于各类的隶属度,较好的解决了对供应链环境下客户满意度的分类问题,最后以实例来论证该方法的可行性和有效性.  相似文献   

18.
Summary DCT Given a finite set of points in an Euclidean space the \emph{spanning tree} is a tree of minimal length having the given points as vertices. The length of the tree is the sum of the distances of all connected point pairs of the tree. The clustering tree with a given length of a given finite set of points is the spanning tree of an appropriately chosen other set of points approximating the given set of points with minimal sum of square distances among all spanning trees with the given length. DCM A matrix of real numbers is said to be column monotone orderable if there exists an ordering of columns of the matrix such that all rows of the matrix become monotone after ordering. The {\emph{monotone sum of squares of a matrix}} is the minimum of sum of squares of differences of the elements of the matrix and a column monotone orderable matrix where the minimum is taken on the set of all column monotone orderable matrices. Decomposition clusters of monotone orderings of a matrix is a clustering ofthe rows of the matrix into given number of clusters such that thesum of monotone sum of squares of the matrices formed by the rowsof the same cluster is minimal.DCP A matrix of real numbers is said to be column partitionable if there exists a partition of the columns such that the elements belonging to the same subset of the partition are equal in each row. Given a partition of the columns of a matrix the partition sum of squares of the matrix is the minimum of the sum of square of differences of the elements of the matrix and a column partitionable matrix where the minimum is taken on the set of all column partitionable matrices. Decomposition of the rows of a matrix into clusters of partitions is the minimization of the corresponding partition sum of squares given the number of clusters and the sizes of the subsets of the partitions.  相似文献   

19.
Doklady Mathematics - We consider the problem of partitioning a finite set of points in Euclidean space into clusters so as to minimize the sum, over all clusters, of the intracluster sums of the...  相似文献   

20.
In the Capacitated Clustering Problem (CCP), a given set of n weighted points is to be partitioned into p clusters such that, the total weight of the points in each cluster does not exceed a given cluster capacity. The objective is to find a set of p centers that minimises total scatter of points allocated to them. In this paper a new constructive method, a general framework to improve the performance of greedy constructive heuristics, and a problem space search procedure for the CCP are proposed. The constructive heuristic finds patterns of natural subgrouping in the input data using concept of density of points. Elements of adaptive computation and periodic construction–deconstruction concepts are implemented within the constructive heuristic to develop a general framework for building efficient heuristics. The problem-space search procedure is based on perturbations of input data for which a controlled perturbation strategy, intensification and diversification strategies are developed. The implemented algorithms are compared with existing methods on a standard set of bench-marks and on new sets of large-sized instances. The results illustrate the strengths of our algorithms in terms of solution quality and computational efficiency.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号