Similar literature
Found 20 similar documents; search took 390 ms
1.
We consider a strongly NP-hard problem of partitioning a finite sequence of vectors in Euclidean space into two clusters using the criterion of minimum sum of squared distances from the elements of the clusters to their centers. We assume that the cardinalities of the clusters are fixed. The center of one cluster has to be optimized and is defined as the average value over all vectors in this cluster. The center of the other cluster lies at the origin. The partition satisfies the following condition: the difference between the indices of consecutive vectors in the first cluster is bounded above and below by two given constants. We propose a polynomial-time 2-approximation algorithm to solve this problem.
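
To make the objective and the index constraint in this abstract concrete, here is a minimal Python sketch. It only evaluates a given candidate partition; it is not the approximation algorithm from the paper. The function names and the parameters `t_min` and `t_max` (standing for the two given constants) are hypothetical, and the first cluster is assumed to be non-empty.

```python
import numpy as np

def two_cluster_cost(points, first_cluster_idx):
    """Sum of squared distances: first cluster to its own mean,
    all remaining points to the origin (the fixed second center)."""
    points = np.asarray(points, dtype=float)
    mask = np.zeros(len(points), dtype=bool)
    mask[list(first_cluster_idx)] = True
    first, second = points[mask], points[~mask]
    center = first.mean(axis=0)  # optimized center of the first cluster
    return float(((first - center) ** 2).sum() + (second ** 2).sum())

def gaps_within_bounds(first_cluster_idx, t_min, t_max):
    """Check that consecutive indices in the first cluster differ
    by at least t_min and at most t_max (hypothetical names)."""
    idx = sorted(first_cluster_idx)
    return all(t_min <= b - a <= t_max for a, b in zip(idx, idx[1:]))
```

For example, `two_cluster_cost([[0, 0], [2, 0], [4, 0], [0, 1]], [1, 2])` evaluates the partition whose first cluster consists of the second and third vectors of the sequence.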

2.
NP-completeness of two clustering (partition) problems is proved for a finite sequence of Euclidean vectors. In the optimization versions of both problems it is required to partition the elements of the sequence into a fixed number of clusters minimizing the sum of squares of the distances from the cluster elements to their centers. In the first problem the sizes of the clusters are part of the input, while in the second they are unknown (they are optimization variables). Except for the center of one (special) cluster, the center of each cluster is the mean value of all vectors contained in it. The center of the special cluster is zero. Also, the partition must satisfy the following condition: the difference between the indices of two consecutive vectors in every nonspecial cluster is bounded below and above by two given constants.

3.
We consider a strongly NP-hard problem of partitioning a finite sequence of points in Euclidean space into two clusters minimizing the sum over both clusters of intra-cluster sums of squared distances from the cluster elements to their centers. The sizes of the clusters are fixed. The centroid of the first cluster is defined as the mean value of all vectors in the cluster, and the center of the second cluster is given in advance and equals 0. Additionally, the partition must satisfy the restriction that the difference between the indices of two consecutive points from the first cluster is bounded from below and above by some given constants. We present a fully polynomial-time approximation scheme for the case of fixed space dimension.

4.
We consider the strongly NP-hard problem of partitioning a set of Euclidean points into two clusters so as to minimize the sum (over both clusters) of the weighted sum of the squared intracluster distances from the elements of the clusters to their centers. The weights of sums are the sizes of the clusters. The center of one cluster is given as input, while the center of the other cluster is unknown and determined as the average value over all points in the cluster (as the geometric center). Two variants of the problems are analyzed in which the cluster sizes are either given or unknown. We present and prove some exact pseudopolynomial algorithms in the case of integer components of the input points and fixed space dimension.

5.
A strongly NP-hard problem of partitioning a finite set of points of Euclidean space into two clusters is considered. The solution criterion is the minimum of the sum (over both clusters) of weighted sums of squared distances from the elements of each cluster to its geometric center. The weights of the sums are equal to the cardinalities of the desired clusters. The center of one cluster is given as input, while the center of the other is unknown and is determined as the point of space equal to the mean of the cluster elements. A version of the problem is analyzed in which the cardinalities of the clusters are given as input. A polynomial-time 2-approximation algorithm for solving the problem is constructed.

6.
Some problems of partitioning a finite set of points of Euclidean space into two clusters are considered. In these problems, the following criteria are minimized: (1) the sum over both clusters of the sums of squared pairwise distances between the elements of the cluster and (2) the sum over both clusters of the sums of squared distances from the elements of the cluster to its geometric center, each multiplied by the cardinality of the cluster, where the geometric center (or centroid) of a cluster is defined as the mean value of the elements in that cluster. Additionally, another problem close to (2) is considered, where the desired center of one of the clusters is given as input, while the center of the other cluster is unknown (is the variable to be optimized) as in problem (2). Two variants of the problems are analyzed, in which the cardinalities of the clusters are (1) parts of the input or (2) optimization variables. It is proved that all the considered problems are strongly NP-hard and that, in general, there is no fully polynomial-time approximation scheme for them (unless P = NP).

7.
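To address the shortcomings of the two main traditional approaches to determining objective expert weights in group decision making (one based on the degree of consistency of judgment matrices, the other on cluster analysis of expert ranking vectors or judgment matrix elements), an improved method that takes both factors into account is proposed. First, cluster analysis is applied to the experts' ranking vectors to obtain the weights among expert clusters; then the final weights are determined from the consistency of each expert's judgment matrix and the distance from the expert's ranking vector to the cluster core. A numerical example at the end of the paper shows that the method is feasible and effective.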

8.
Problems of partitioning a finite set of Euclidean points (vectors) into clusters are considered. The criterion is to minimize the sum, over all clusters, of (1) squared norms of the sums of cluster elements normalized by the cardinality, (2) squared norms of the sums of cluster elements, and (3) norms of the sum of cluster elements. It is proved that all these problems are strongly NP-hard if the number of clusters is a part of the input and are NP-hard in the ordinary sense if the number of clusters is not a part of the input (is fixed). Moreover, the problems are NP-hard even in the case of dimension 1 (on a line).

9.
Problems of partitioning a finite set of Euclidean points (vectors) into clusters are considered. The criterion is to minimize the sum, over all clusters, of (1) squared norms of the sums of cluster elements normalized by the cardinality, (2) squared norms of the sums of cluster elements, and (3) norms of the sum of cluster elements. It is proved that all these problems are strongly NP-hard if the number of clusters is a part of the input and are NP-hard in the ordinary sense if the number of clusters is not a part of the input (is fixed). Moreover, the problems are NP-hard even in the case of dimension 1 (on a line).

10.
For conventional fuzzy clustering-based approaches to fuzzy system identification, a fuzzy function is used for cluster formation and another fuzzy function is used for cluster validation to determine the number and location of the clusters which define IF parts of the rule base. However, the different fuzzy functions used for cluster formation and validation may not indicate the same best number and location of the clusters. This potential disparity motivates us to propose a new fuzzy clustering-based approach to fuzzy system identification based on the bi-objective fuzzy c-means (BOFCM) cluster analysis. In this approach, we use the BOFCM function for both cluster formation and validation to simultaneously determine the number and location of the clusters which we hope can efficiently and effectively define IF parts of the rule base. The proposed approach is validated by applying it to the truck backer-upper problem with an obstacle in the center of the field.

11.
Cluster analysis is used in various scientific and applied fields and is a topical subject of research. In contrast to the existing methods, the algorithms offered in this paper are intended for clustering objects described by feature vectors in a space in which the symmetry axiom is not satisfied. In this case, the clustering problem is solved using an asymmetric proximity measure. The essence of the first of the proposed clustering algorithms consists in sequential generation of clusters with simultaneous transfer of the clustered objects from previously created clusters into the current cluster if this reduces the value of the quality criterion. In comparison with the existing algorithms of non-hierarchical clustering, such an approach to cluster generation makes it possible to reduce the computational costs. The second algorithm is a modified version of the first one and makes it possible to reassign the main objects of clusters to further decrease the value of the proposed quality criterion.

12.
We consider the problem of partitioning a finite sequence of Euclidean points into a given number of clusters (subsequences) using the criterion of the minimal sum (over all clusters) of intracluster sums of squared distances from the elements of the clusters to their centers. It is assumed that the center of one of the desired clusters is at the origin, while the center of each of the other clusters is unknown and determined as the mean value over all elements in this cluster. Additionally, the partition obeys two structural constraints on the indices of sequence elements contained in the clusters with unknown centers: (1) the concatenation of the indices of elements in these clusters is an increasing sequence, and (2) the difference between an index and the preceding one is bounded above and below by prescribed constants. It is shown that this problem is strongly NP-hard. A 2-approximation algorithm is constructed that is polynomial-time for a fixed number of clusters.

13.
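Unascertained means clustering combines unascertained theory with clustering theory, constructing an unascertained measure that serves as a set membership degree to express the membership relation between samples and classes. The factors affecting customer satisfaction in the supply chain are analyzed from aspects such as product conformity, flexibility, and reliability, and an evaluation index system for customer satisfaction in a supply chain environment is built. On this basis, unascertained means clustering is applied to evaluate customer satisfaction in a supply chain environment comprehensively: clustering results are obtained, the center of each class is found, and the membership degree of each sample in each class is given, which handles the classification of customer satisfaction in a supply chain environment well. Finally, an example demonstrates the feasibility and effectiveness of the method.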

14.
The strongly NP-hard problem of partitioning a finite set of points of Euclidean space into two clusters of given sizes (cardinalities) minimizing the sum (over both clusters) of the intracluster sums of squared distances from the elements of the clusters to their centers is considered. It is assumed that the center of one of the sought clusters is specified at the desired (arbitrary) point of space (without loss of generality, at the origin), while the center of the other one is unknown and determined as the mean value over all elements of this cluster. It is shown that unless P = NP, there is no fully polynomial-time approximation scheme for this problem, and such a scheme is substantiated in the case of a fixed space dimension.

15.
In this paper we introduce COV, a novel information retrieval (IR) algorithm for massive databases based on vector space modeling and spectral analysis of the covariance matrix of the document vectors, to reduce the scale of the problem. Since the dimension of the covariance matrix depends on the attribute space and is independent of the number of documents, COV can be applied to databases that are too massive for methods based on the singular value decomposition of the document-attribute matrix, such as latent semantic indexing (LSI). In addition to improved scalability, theoretical considerations indicate that results from our algorithm tend to be more accurate than those from LSI, particularly in detecting subtle differences in document vectors. We demonstrate the power and accuracy of COV through an important topic in data mining, known as outlier cluster detection. We propose two new algorithms for detecting major and outlier clusters in databases: the first is based on LSI, and the second on COV. Our implementation studies indicate that our cluster detection algorithms outperform the basic LSI and COV algorithms in detecting outlier clusters.

16.
We prove polynomial-time solvability of a large class of clustering problems where a weighted set of items has to be partitioned into clusters with respect to some balancing constraints. The data points are weighted with respect to different features and the clusters adhere to given lower and upper bounds on the total weight of their points with respect to each of these features. Further, the weight contribution of a vector to a cluster can depend on the cluster it is assigned to. Our interest in these types of clustering problems is motivated by an application in land consolidation where the ability to perform this kind of balancing is crucial. Our framework maximizes an objective function that is convex in the summed-up utility of the items in each cluster. Despite hardness of convex maximization and many related problems, for fixed dimension and number of clusters, we are able to show that our clustering model is solvable in time polynomial in the number of items if the weight-balancing restrictions are defined using vectors from a fixed, finite domain. We conclude our discussion with a new, efficient model and algorithm for land consolidation.

17.
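This paper presents two classification methods for the setting in which both the number of classes and the number of samples in each class are fixed. They replace the aggregation toward class centers used in traditional classification methods with aggregation toward class-optimal vectors, which makes C-classification under such constraints more reasonable.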

18.
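Since it was first posed, the k-means problem has attracted wide attention in combinatorial optimization and computer science, and it is one of the classic NP-hard problems. Given an observation set of N real d-dimensional vectors, the goal is to partition these N observations into k (k ≤ N) sets so that the sum of squared distances from the points in all sets to their corresponding cluster centers is minimized, where the cluster center of a set is the mean of all observations in that set. The k-means algorithm, a heuristic for the k-means problem, is popular in practice because of its excellent convergence speed. It can be described as follows: given an initial grouping, alternate between assignment (assigning each observation to its nearest mean) and update (computing the means of the new clusters) until the procedure converges to some solution. The algorithm is generally considered to converge almost linearly. Its drawbacks are also clear: it is not guaranteed to find a global optimum, and the quality of its result depends heavily on the choice of the initial solution. Researchers have therefore proposed various initialization methods to improve the quality of the k-means algorithm. This paper selects and lists initialization methods for choosing the initial solution of the k-means algorithm for the reader's reference.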
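
The alternating assignment and update steps described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not any specific method surveyed in the paper: the function name `lloyd_kmeans`, the `max_iter` cap (assumed to be at least 1), and the choice to leave an empty cluster's center unchanged are all assumptions. The initial centers are passed in explicitly, since the result depends on them.

```python
import numpy as np

def lloyd_kmeans(points, init_centers, max_iter=100):
    """Alternate assignment and update until the labels stop changing."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    labels = None
    for _ in range(max_iter):
        # Assignment: each observation goes to its nearest current mean.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # converged: no observation changed cluster
        labels = new_labels
        # Update: recompute each mean; keep the old center if a cluster is empty.
        for j in range(len(centers)):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    # Objective: sum of squared distances to the assigned centers.
    cost = float(((points - centers[labels]) ** 2).sum())
    return centers, labels, cost
```

Running the function from several different `init_centers` and keeping the lowest `cost` is one simple way to see how strongly the outcome depends on initialization.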

19.
This paper proposes a scatter search-based heuristic approach to the capacitated clustering problem. In this problem, a given set of customers with known demands must be partitioned into p distinct clusters. Each cluster is specified by a customer acting as a cluster center for this cluster. The objective is to minimize the sum of distances from all cluster centers to all other customers in their cluster, such that a given capacity limit of the cluster is not exceeded and that every customer is assigned to exactly one cluster. Computational results on a set of instances from the literature indicate that the heuristic is among the best heuristics developed for this problem.

20.
Sequential clustering aims at determining homogeneous and/or well-separated clusters within a given set of entities, one at a time, until no more such clusters can be found. We consider a bi-criterion sequential clustering problem in which the radius of a cluster (or maximum dissimilarity between an entity chosen as center and any other entity of the cluster) is chosen as a homogeneity criterion and the split of a cluster (or minimum dissimilarity between an entity in the cluster and one outside of it) is chosen as a separation criterion. An O(N^3) algorithm is proposed for determining radii and splits of all efficient clusters, which leads to an O(N^4) algorithm for bi-criterion sequential clustering with radius and split as criteria. This algorithm is illustrated on the well-known Ruspini data set.
