1.
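To address the sensitivity of fuzzy C-means (FCM) clustering to its initial cluster centers and its poor noise immunity, an algorithm is proposed that optimizes the initial cluster centers with an improved quantum genetic algorithm. The improved double-chain-encoded quantum genetic algorithm strengthens global search and remedies the slow iteration and the tendency toward local extrema of the traditional FCM algorithm. Spatial neighborhood information is also introduced, and a fitness function built from the weighted membership matrix improves robustness to noise. Experimental results show that the algorithm achieves good segmentation and strong noise resistance.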
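For orientation, the two update rules of the standard fuzzy C-means iteration can be sketched as below. This is only the textbook FCM step, not the improved method of the abstract: the quantum genetic initialization and the spatial neighborhood weighting are not reproduced. The function names `fcm_memberships` and `fcm_centers` and the fuzzifier default `m=2.0` are assumptions made for illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: share full
            # membership equally among those centers to avoid dividing by zero.
            zeros = [d == 0.0 for d in dists]
            row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
        else:
            row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                   for d_j in dists]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Standard FCM center update: mean of the points weighted by u_ij^m."""
    n_clusters = len(memberships[0])
    dim = len(points[0])
    centers = []
    for j in range(n_clusters):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers
```

Alternating these two functions from some initial centers gives the plain FCM loop. Because the outcome depends on those initial centers, an initialization strategy such as the one the abstract describes can matter.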
2.
3.
4.
5.
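Traditional cluster analysis, which is designed for finite-dimensional discrete data, cannot be applied directly to the classification of functional data. This paper discusses the sparsity and the infinite-dimensional nature of functional data. Building on a comprehensive analysis of the strengths and weaknesses of existing functional clustering methods, it reconstructs a weighted principal component distance as the measure of functional similarity, with weights based on differences in the information content of the clustering indicators, and proposes an adaptive-weight cluster analysis for functional data. Compared with similar functional clustering algorithms, the new method has two core advantages: (1) the adaptively weighted distance function reflects differences in the classification efficiency of the clustering indicators, and a sound theoretical basis supports its necessity and objective reasonableness; (2) clustering of infinite-dimensional continuous functions is achieved through clustering of finite-dimensional discrete data, which significantly reduces computational cost. Empirical tests show that the new method clearly improves classification accuracy, effectively resolves the failure of traditional clustering algorithms in extreme cases, and is flexible and broadly applicable to complex functional data classification problems.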
6.
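Application of a genetic fuzzy clustering algorithm to image edge detection (total citations: 1; self-citations: 0; citations by others: 1)
An improved genetic fuzzy c-means clustering (GFGA) algorithm is applied to image edge detection. Each pixel of a grayscale image is treated as a data sample, and its gray value is processed by the Roberts, Sobel, and Prewitt operators to form its feature vector, yielding a data set with three-dimensional features. The genetic fuzzy clustering algorithm is then applied to classify this data set, adaptively detecting the edge points of the image and thereby extracting the edges. Experimental results show that this hybrid algorithm produces good edges that need no further thinning, which improves edge localization accuracy.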
7.
8.
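《数学的实践与认识》 (Mathematics in Practice and Theory) 2017,(19)
Because the distribution of samples strongly affects the generalization ability of the classifier when a binary-tree support vector machine is built, an improved method for constructing the hierarchical structure of a binary-tree support vector machine is proposed. The ratio of the inter-class sample distance and the weighted intra-class sample distance to its standard deviation is taken as the separability measure of a class. Classes with a large inter-class distance and a wide average spread of intra-class samples are separated first. The superiority of the improved binary-tree support vector machine is verified on standard data sets by comparison with different multi-class classification algorithms. The improved algorithm is applied to fault diagnosis of the gas-path components of a twin-spool turbojet engine and achieves a good fault recognition rate.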
9.
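《数学的实践与认识》 (Mathematics in Practice and Theory) 2015,(13)
To overcome two drawbacks of the traditional k-means clustering algorithm, namely that the number of classes must be known in advance and that the initial cluster centers are hard to determine, a bilevel programming model of the cluster centers and the number of classes k is established. The cluster centers are determined with particle swarm optimization, and the optimal number of classes k is searched for and determined by continually updating the criterion function during iteration. Based on this model, an improved k-means clustering algorithm is proposed and applied to the analysis of ice ridge surface morphology. The results show that the clusters have clear boundaries between adjacent classes and reflect well the influence of geographic location and growth environment on ice ridge formation.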
10.
11.
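Since it was first posed, the k-means problem has attracted wide attention in combinatorial optimization and computer science, and it is one of the classic NP-hard problems. Given an observation set of N real vectors of dimension d, the goal is to partition the N observations into k (k ≤ N) sets so that the sum of squared distances from the points in each set to the corresponding cluster center is minimized, where the cluster center of a set is the mean of all observations in that set. The k-means algorithm, a heuristic for the k-means problem, is popular in practice for its fast convergence. It can be described as follows: given an initial grouping, alternate between assignment (assigning each observation to its nearest mean) and update (computing the mean of each new cluster) until convergence to some solution. The algorithm is generally regarded as converging almost linearly. Its drawbacks are equally clear: the solution is not guaranteed to be globally optimal, and its quality depends heavily on the choice of initial solution. Researchers have therefore proposed a variety of initialization methods to improve the quality of the k-means algorithm. This paper selects and lists initialization methods for choosing the initial solution of the k-means algorithm, for the reader's reference.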
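The alternating assignment and update steps described in this abstract can be sketched in a few lines. This is a minimal illustration, not code from the surveyed paper. The names `k_means`, `squared_distance`, and `mean_point` are hypothetical, and the initial centers are passed in explicitly because the abstract stresses that the result depends on them.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centers, max_iter=100):
    """Alternate assignment and update until the labels stop changing.

    Assumes max_iter >= 1. Returns (centers, labels, cost), where cost is
    the sum of squared distances from each point to its assigned center.
    """
    centers = [tuple(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest current center.
        new_labels = [
            min(range(len(centers)),
                key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no assignment changed, so the centers are stable
        labels = new_labels
        # Update step: each center becomes the mean of its assigned points.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean_point(members)
    cost = sum(squared_distance(p, centers[lab])
               for p, lab in zip(points, labels))
    return centers, labels, cost
```

Running `k_means` from several different `initial_centers` and keeping the lowest-cost result is one simple way to reduce the dependence on initialization. The initialization methods surveyed in the paper are more refined alternatives.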
12.
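The K-means algorithm is a dynamic clustering method in cluster analysis, but its results are strongly affected by the initial partition or the initial points. This paper proposes a genetic algorithm (GA) to perform the initial classification, using an internal clustering criterion as the evaluation index. Experimental results show that the algorithm clearly outperforms the K-means algorithm.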
13.
KmL: k-means for longitudinal data (total citations: 2; self-citations: 0; citations by others: 2)
Cohort studies are becoming essential tools in epidemiological research. In these studies, measurements are not restricted to single variables but can be seen as trajectories. Statistical methods used to determine homogeneous patient trajectories can be separated into two families: model-based methods (like Proc Traj) and partitional clustering (non-parametric algorithms like k-means). KmL is a new implementation of k-means designed to work specifically on longitudinal data. It provides scope for dealing with missing values and runs the algorithm several times, varying the starting conditions and/or the number of clusters sought; its graphical interface helps the user to choose the appropriate number of clusters when the classic criterion is not efficient. To check KmL's efficiency, we compare its performance to that of Proc Traj on both artificial and real data. The two techniques give very similar clusterings when trajectories follow polynomial curves. KmL gives much better results on non-polynomial trajectories.
14.
Antoine Jouglet, Ceyda Oğuz, Marc Sevaux 《Journal of Mathematical Modelling and Algorithms》2009,8(3):271-292
The paper considers the hybrid flow-shop scheduling problem with multiprocessor tasks. Motivated by the computational complexity of the problem, we propose a memetic algorithm for it. We first describe the implementation details of a genetic algorithm, which is used in the memetic algorithm. We then propose a constraint-programming-based branch-and-bound algorithm to be employed as the local search engine of the memetic algorithm. Next, we present the new memetic algorithm. Lastly, we explain the computational experiments carried out to evaluate the performance of the three algorithms (genetic algorithm, constraint-programming-based branch-and-bound algorithm, and memetic algorithm) in terms of both the quality of the solutions produced and the efficiency. The results demonstrate that the memetic algorithm produces better-quality solutions and that it is very efficient.
15.
Daniel A. Coleman, David L. Woodruff 《Journal of Computational and Graphical Statistics》2013,22(4):672-688
The primary model for cluster analysis is the latent class model. This model yields the mixture likelihood. Due to numerous local maxima, the success of the EM algorithm in maximizing the mixture likelihood depends on the initial starting point of the algorithm. In this article, good starting points for the EM algorithm are obtained by applying classification methods to randomly selected subsamples of the data. The performance of the resulting two-step algorithm, classification followed by EM, is compared to, and found superior to, the baseline algorithm of EM started from a random partition of the data. Though the algorithm is not complicated, comparing it to the baseline algorithm and assessing its performance with several classification methods is nontrivial. The strategy employed for comparing the algorithms is to identify canonical forms for the easiest and most difficult datasets to cluster within a large collection of cluster datasets and then to compare the performance of the two algorithms on these datasets. This has led to the discovery that, in the case of three homogeneous clusters, the most difficult datasets to cluster are those in which the clusters are arranged on a line and the easiest are those in which the clusters are arranged on an equilateral triangle. The performance of the two-step algorithm is assessed using several classification methods and is shown to be able to cluster large, difficult datasets consisting of three highly overlapping clusters arranged on a line with 10,000 observations and 8 variables.
16.
Vidyut Dey, Dilip Kumar Pratihar, G. L. Datta 《Fuzzy Optimization and Decision Making》2011,10(2):153-166
A modified approach was developed in this study by combining two well-known clustering algorithms, namely the fuzzy c-means algorithm and the entropy-based algorithm. The fuzzy c-means algorithm is one of the most popular algorithms for fuzzy clustering. It can yield compact clusters but might not be able to generate distinct clusters. The entropy-based algorithm, on the other hand, can obtain distinct clusters, which might not be compact. However, the clusters need to be both distinct and compact. The present paper proposes a modified clustering approach that combines the above two algorithms. A genetic algorithm was used to tune all three clustering algorithms separately. The proposed approach was found to yield clusters that are both distinct and compact on two data sets.
17.
In the paper, two evolutionary approaches to the general DNA sequencing problem, assuming both negative and positive errors in the spectrum, are compared. The older of them is based on a genetic approach and is enhanced by a greedy algorithm. The newly proposed algorithm combines the tabu search and scatter search methods. Experiments with random and coding DNA sequences suggest that the tabu and scatter search algorithm finds solutions of higher quality, and finds them more reliably, than the genetic algorithm.
18.
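A stochastic-simulation-based genetic algorithm for two-stage stochastic programming problems (total citations: 1; self-citations: 0; citations by others: 1)
Genetic algorithms depend little on the properties of the objective function and are well suited to global search. Exploiting these features, a genetic algorithm based on stochastic simulation is proposed for solving two-stage stochastic programming problems. The algorithm uses stochastic simulation to approximate the expected value by the sample mean, which simplifies the computation. Numerical examples show that the algorithm is effective and feasible.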
19.
A genetic k-medoids clustering algorithm (total citations: 1; self-citations: 0; citations by others: 1)
We propose a hybrid genetic algorithm for k-medoids clustering. A novel heuristic operator is designed and integrated with the genetic algorithm to fine-tune the search. Further, variable-length individuals that encode different numbers of medoids (clusters) are used for evolution, with a modified Davies-Bouldin index as a measure of the fitness of the corresponding partitionings. As a result, the proposed algorithm can efficiently evolve appropriate partitionings while making no a priori assumption about the number of clusters present in the datasets. In the experiments, we show the effectiveness of the proposed algorithm and compare it with other related clustering methods.
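As background for the fitness measure mentioned in this abstract, the standard Davies-Bouldin index can be sketched as follows. The abstract uses a modified index whose details are not given here, so this is only the usual definition, computed with centroids as cluster representatives. The function name `davies_bouldin` is hypothetical.

```python
import math


def davies_bouldin(clusters):
    """Standard Davies-Bouldin index; lower values indicate better partitions.

    `clusters` is a list of at least two non-empty lists of points (tuples),
    and the cluster centroids are assumed to be pairwise distinct.
    """
    # Representative of each cluster: the coordinate-wise mean of its points.
    centroids = [
        tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
        for cluster in clusters
    ]
    # Scatter of each cluster: average distance of its points to the centroid.
    scatters = [
        sum(math.dist(p, c) for p in cluster) / len(cluster)
        for cluster, c in zip(clusters, centroids)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatters[i] + scatters[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i
        )
    return total / k
```

Because the index can be evaluated for partitions with different numbers of clusters, an index of this kind can act as a fitness function when the number of clusters is itself evolved, as with the variable-length individuals described above.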
20.
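This paper studies the over-list online model of integrated berth and quay crane scheduling at a container terminal with the ability to foresee service demand information. When each vessel service request is released, the decision maker knows the information of the next k (k ≥ 2) requests, and the objective is to minimize the maximum completion time over all requests. For the case of a hybrid berth consisting of 3 discrete berths and 4 quay cranes, with only two sizes of service request (large and small), a lower bound on the competitive ratio is given for foreseeing any k ≥ 2 requests. For the special case k = 2, an online strategy with the optimal competitive ratio 7/6 is given. Numerical experiments further show the good performance of the designed strategy.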