1.

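逯清玉 张晓明《数学的实践与认识》2016,(17):174-181

To address the sensitivity of the fuzzy C-means (FCM) clustering algorithm to its initial cluster centers and its poor noise resistance, this paper proposes an algorithm that optimizes the initial cluster centers with an improved quantum genetic algorithm. The improved double-chain-encoded quantum genetic algorithm strengthens global search and mitigates the slow iteration of the traditional FCM algorithm and its tendency to fall into local extrema. Spatial neighborhood information is also introduced, and a fitness function built from the weighted membership matrix improves robustness to noise. Experimental results show that the algorithm achieves good segmentation and strong noise resistance.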
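
For orientation, the baseline FCM iteration that the entry above sets out to improve can be sketched as follows. This is a minimal sketch of standard fuzzy C-means on 1-D data (for example gray levels), not the paper's quantum-genetic or spatially weighted variant; the function names `fcm_step` and `fcm`, the fuzzifier `m = 2`, and the small `eps` guard are assumptions made for illustration.

```python
def fcm_step(xs, centers, m=2.0, eps=1e-12):
    """One standard FCM iteration on 1-D data: memberships, then centers."""
    power = 2.0 / (m - 1.0)
    c = len(centers)
    memberships = []
    for x in xs:
        # eps keeps the distance positive when a point coincides with a center.
        d = [abs(x - v) + eps for v in centers]
        row = [1.0 / sum((d[i] / d[k]) ** power for k in range(c)) for i in range(c)]
        memberships.append(row)
    new_centers = []
    for i in range(c):
        # Each center is the mean of the data weighted by membership ** m.
        w = [row[i] ** m for row in memberships]
        new_centers.append(sum(wi * x for wi, x in zip(w, xs)) / sum(w))
    return memberships, new_centers


def fcm(xs, centers, m=2.0, iters=100, tol=1e-9):
    """Repeat fcm_step until the centers move by less than tol."""
    memberships = []
    for _ in range(iters):
        memberships, new_centers = fcm_step(xs, centers, m)
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return memberships, centers


if __name__ == "__main__":
    data = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
    u, v = fcm(data, centers=[0.0, 12.0])
    print(v)
```

The result depends on the `centers` argument passed in, which is the sensitivity to initialization that the entry describes; a global search over initial centers would simply call `fcm` with different starting values.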

2.

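A New Validity Index for Fuzzy Clustering

《系统科学与数学》2014,(9)

A new index is proposed for judging the validity of the number of clusters in fuzzy clustering. The data set is first clustered with the FCM algorithm, and a weighted bipartite network is built from the membership matrix and the cluster centers. A new cluster validity index is then defined by modifying the modularity function of the weighted bipartite network. To test its performance, the index is compared with three common validity indices on fifteen data sets. The experimental results show that the index performs well.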

3.

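K-means Clustering Optimization Based on Variance and an Improved Swarm Intelligence Algorithm

《系统科学与数学》2018,(10)

When K-means is used for data clustering, different processing choices lead to different statistical distances and cluster centers, which affects the clustering result; the effect becomes more pronounced as the data dimension grows. To address this, the paper proposes a multivariate statistical distance algorithm based on sample variance, and introduces an improved artificial bee colony algorithm together with an evaluation criterion function to determine the cluster centers and the optimal number of clusters, thereby optimizing the K-means algorithm. In theory, the method can overcome the original algorithm's tendency to fall into local optima and its need for a fixed number of clusters. Finally, the performance of the optimized algorithm is verified through outlier detection and tests on artificial data sets and real UCI data sets.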

4.

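Convergence of Spectral Clustering Associated with General Similarity Functions. Total citations: 1, self-citations: 0, citations by others: 1

高炜 周定轩《中国科学:数学》2012,42(10):985-994

Spectral clustering algorithms are generated by the eigenfunctions of graph Laplace operators associated with similarity functions. This paper proves the convergence of spectral clustering algorithms associated with general similarity functions, and gives quantitative estimates of the convergence by means of covering numbers. When the similarity function is Lipschitz of order $s > 0$ on a subset of Euclidean space, a convergence rate of the form $O\left(\sqrt{\log(n+1)}/\sqrt{n}\right)$ is established. We also show that the covering numbers of an associated function set can grow arbitrarily badly.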

5.

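A Clustering Method for Functional Data Based on Adaptive Weights

王德青 朱建平 王洁丹《数理统计与管理》2015,(1):84-92

Traditional cluster analysis, which is built on finite-dimensional discrete data, cannot be applied directly to classifying and mining functional data. This paper discusses the sparsity and infinite-dimensional nature of functional data. Building on a comprehensive analysis of the strengths and weaknesses of existing functional clustering methods, it reconstructs a weighted principal component distance as the measure of functional similarity, according to differences in the information content of the clustering indicators, and proposes an adaptive-weight cluster analysis for functional data. Compared with similar functional clustering algorithms, the core advantages of the new method are: (1) the adaptively weighted distance function reflects differences in the classification efficiency of the clustering indicators, and has a sound theoretical basis that guarantees its necessity and objective reasonableness; (2) clustering of infinite-dimensional continuous functions is achieved through clustering of finite-dimensional discrete data, which can significantly reduce computational cost. Empirical tests show that the new method markedly improves classification accuracy, effectively resolves the failure of traditional clustering algorithms in extreme cases, and is flexible and generally applicable to classification problems with complex functional data.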

6.

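Application of a Genetic Fuzzy Clustering Algorithm to Image Edge Detection. Total citations: 1, self-citations: 0, citations by others: 1

董云影 于东 张运杰 畅春玲《模糊系统与数学》2004,18(Z1):376-379

An improved genetic fuzzy c-means clustering (GFGA) algorithm is applied to image edge detection. Each pixel of a grayscale image is treated as a data sample, and its gray value, processed by the Roberts, Sobel, and Prewitt operators, forms its feature vector, giving a data set with three-dimensional features. The genetic fuzzy clustering algorithm is then applied to classify this data set, adaptively detecting the edge points of the image and thereby extracting the edges. Experimental results show that this hybrid algorithm produces good edges, that the results need no further thinning, and that edge localization accuracy is improved.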

7.

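A Generalized-Center Clustering Algorithm for Categorical Data

武森 张桂琼 潘静 全敏《运筹与管理》2014,(6)

Clustering algorithms based on the classical partitioning idea represent a cluster by a single point. To overcome this limitation, a clustering algorithm for categorical data based on generalized centers is proposed. The algorithm represents a cluster by a generalized center that contains multiple points, which can reflect the cluster's data distribution. New measures of the distance between generalized centers and the distance between clusters are proposed, along with a method for determining the generalized centers and a clustering strategy that assigns objects to clusters based on them; usually a single partitioning iteration is enough to obtain the final clustering. The generalized-center algorithm is applied to four benchmark data sets and compared with the well-known partitional clustering algorithm K-modes and two of its improved variants. The results show that the generalized-center algorithm achieves higher clustering accuracy with fewer iterations, and is effective and feasible.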

8.

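Application of an Improved Binary Tree Support Vector Machine to Fault Diagnosis

《数学的实践与认识》2017,(19)

Because the distribution of samples strongly affects the generalization ability of the classifier when a binary tree support vector machine is built, an improved method for constructing the hierarchical structure of a binary tree support vector machine is proposed. The separability of a class is measured by the ratio of the between-class sample distance and the weighted within-class sample distance to its standard deviation. Classes with a large between-class distance and a widely spread within-class sample distribution are separated first. On standard data sets, comparison with different multi-class classification algorithms verifies the superiority of the improved binary tree support vector machine. Applying the improved algorithm to fault diagnosis of the gas-path components of a twin-spool turbojet engine yields a good fault recognition rate.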

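9.

Improvement of the k-means Clustering Algorithm and Its Application to the Analysis of Ice Ridge Surface Morphology (Peking University Core Journal)

谭冰 王骁力 李志军 卢鹏《数学的实践与认识》2015,(13):140-145

Traditional k-means clustering requires the number of clusters to be known in advance and makes the initial cluster centers hard to choose. To address these shortcomings, a bilevel programming model over the cluster centers and the number of clusters k is established. The cluster centers are determined with a particle swarm algorithm, and the optimal number of clusters k is found by continually updating the criterion function during iteration. Based on this model, an improved k-means clustering algorithm is proposed and applied to the analysis of ice ridge surface morphology. The results show that the clusters produced by the algorithm have clear boundaries between adjacent classes and reflect well the influence of geographic location and growth environment on ice ridge formation.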

11.

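A Survey of Initialization Methods for the k-means Algorithm

徐大川 许宜诚 张冬梅《运筹学学报》2018,22(2):31-40

Since it was first posed, the k-means problem has attracted wide attention in combinatorial optimization and computer science, and it is one of the classical NP-hard problems. Given an observation set of $N$ real vectors of dimension $d$, the goal is to partition the $N$ observations into $k$ ($k \leq N$) sets so that the sum of squared distances from the points in each set to the corresponding cluster center is minimized, where the cluster center of a set is the mean of all observations in it. The k-means algorithm, a heuristic for the k-means problem, is popular in practice for its excellent convergence speed. It can be described as follows: given an initial grouping, alternate between assignment (assign each observation to its nearest mean point) and update (compute the mean point of each new cluster) until convergence to some solution. The algorithm is generally regarded as converging almost linearly. Its drawbacks are also clear: it cannot guarantee a globally optimal solution, and the quality of its result depends heavily on the choice of initial solution. Researchers have therefore proposed various initialization methods to improve the quality of the k-means algorithm. This survey selects and lists initialization methods for choosing the initial solution, for the reader's reference.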
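
The alternating assignment and update steps described in the entry above can be sketched as follows. This is a minimal sketch under stated assumptions, not any specific method from the survey: the initial solution is taken to be `k` observations drawn uniformly at random, and the names `kmeans` and `sse` are hypothetical helpers introduced here.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Alternate assignment and update until the labels stop changing."""
    rng = random.Random(seed)
    # Assumed initialization: k observations chosen uniformly at random.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment: each observation goes to its nearest mean point.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update: each center becomes the mean of the observations assigned to it.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an emptied cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


def sse(points, centers, labels):
    """The k-means objective: sum of squared distances to assigned centers."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, labels = kmeans(pts, k=2)
    print(labels, sse(pts, centers, labels))
```

Changing `seed` changes only the initial solution, so running the sketch with several seeds and comparing `sse` is a simple way to see how much the result depends on initialization.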

12.

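A Genetic Algorithm with Non-0,1 Encoding for Cluster Analysis

唐立新 杨自厚《应用数学与计算数学学报》1998,12(1):22-28

The K-means algorithm belongs to the dynamic clustering methods of cluster analysis, but its clustering result is strongly affected by the initial classification or initial points. This paper proposes a genetic algorithm (GA) to perform the initial classification, using an internal clustering criterion as the evaluation measure. Experimental results show that the algorithm clearly outperforms the K-means algorithm.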

13.

KmL: k-means for longitudinal data. Total citations: 2, self-citations: 0, citations by others: 2

Christophe Genolini Bruno Falissard 《Computational Statistics》2010,25(2):317-328

Cohort studies are becoming essential tools in epidemiological research. In these studies, measurements are not restricted to single variables but can be seen as trajectories. Statistical methods used to determine homogeneous patient trajectories can be separated into two families: model-based methods (like Proc Traj) and partitional clustering (non-parametric algorithms like k-means). KmL is a new implementation of k-means designed to work specifically on longitudinal data. It provides scope for dealing with missing values and runs the algorithm several times, varying the starting conditions and/or the number of clusters sought; its graphical interface helps the user to choose the appropriate number of clusters when the classic criterion is not efficient. To check KmL efficiency, we compare its performances to Proc Traj both on artificial and real data. The two techniques give very close clustering when trajectories follow polynomial curves. KmL gives much better results on non-polynomial trajectories.

14.

Hybrid Flow-Shop: a Memetic Algorithm Using Constraint-Based Scheduling for Efficient Search

Antoine Jouglet Ceyda Oğuz Marc Sevaux 《Journal of Mathematical Modelling and Algorithms》2009,8(3):271-292

The paper considers the hybrid flow-shop scheduling problem with multiprocessor tasks. Motivated by the computational complexity of the problem, we propose a memetic algorithm for this problem in the paper. We first describe the implementation details of a genetic algorithm, which is used in the memetic algorithm. We then propose a constraint programming based branch-and-bound algorithm to be employed as the local search engine of the memetic algorithm. Next, we present the new memetic algorithm. We lastly explain the computational experiments carried out to evaluate the performance of three algorithms (genetic algorithm, constraint programming based branch-and-bound algorithm, and memetic algorithm) in terms of both the quality of the solutions produced and the efficiency. These results demonstrate that the memetic algorithm produces better quality solutions and that it is very efficient.

15.

Cluster Analysis for Large Datasets: An Effective Algorithm for Maximizing the Mixture Likelihood

Daniel A. Coleman David L. Woodruff 《Journal of computational and graphical statistics》2013,22(4):672-688

The primary model for cluster analysis is the latent class model. This model yields the mixture likelihood. Due to numerous local maxima, the success of the EM algorithm in maximizing the mixture likelihood depends on the initial starting point of the algorithm. In this article, good starting points for the EM algorithm are obtained by applying classification methods to randomly selected subsamples of the data. The performance of the resulting two-step algorithm, classification followed by EM, is compared to, and found superior to, the baseline algorithm of EM started from a random partition of the data. Though the algorithm is not complicated, comparing it to the baseline algorithm and assessing its performance with several classification methods is nontrivial. The strategy employed for comparing the algorithms is to identify canonical forms for the easiest and most difficult datasets to cluster within a large collection of cluster datasets and then to compare the performance of the two algorithms on these datasets. This has led to the discovery that, in the case of three homogeneous clusters, the most difficult datasets to cluster are those in which the clusters are arranged on a line and the easiest are those in which the clusters are arranged on an equilateral triangle. The performance of the two-step algorithm is assessed using several classification methods and is shown to be able to cluster large, difficult datasets consisting of three highly overlapping clusters arranged on a line with 10,000 observations and 8 variables.

16.

Genetic algorithm-tuned entropy-based fuzzy C-means algorithm for obtaining distinct and compact clusters

Vidyut Dey Dilip Kumar Pratihar G. L. Datta 《Fuzzy Optimization and Decision Making》2011,10(2):153-166

A modified approach had been developed in this study by combining two well-known algorithms of clustering, namely fuzzy c-means algorithm and entropy-based algorithm. Fuzzy c-means algorithm is one of the most popular algorithms for fuzzy clustering. It could yield compact clusters but might not be able to generate distinct clusters. On the other hand, entropy-based algorithm could obtain distinct clusters, which might not be compact. However, the clusters need to be both distinct as well as compact. The present paper proposes a modified approach of clustering by combining the above two algorithms. A genetic algorithm was utilized for tuning of all three clustering algorithms separately. The proposed approach was found to yield both distinct as well as compact clusters on two data sets.

17.

Evolutionary Approaches to DNA Sequencing with Errors

Jacek Blazewicz Fred Glover Marta Kasprzak 《Annals of Operations Research》2005,138(1):67-78

In the paper, two evolutionary approaches to the general DNA sequencing problem, assuming both negative and positive errors in the spectrum, are compared. The older of them is based on the idea of genetic approach and is enhanced by a greedy algorithm. The newly proposed algorithm combines the tabu search and the scatter search methods. After conducting experiments with random and coding DNA sequences, our results suggest that the tabu and scatter search algorithm finds solutions of higher quality and more reliably than the genetic algorithm.

18.

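A Genetic Algorithm Based on Stochastic Simulation for Two-Stage Stochastic Programming Problems. Total citations: 1, self-citations: 0, citations by others: 1

何志勇 黄崇超《数学杂志》2004,24(6):690-694

Genetic algorithms depend little on the properties of the objective function and are well suited to global search. Exploiting these features, a genetic algorithm based on stochastic simulation is proposed for solving two-stage stochastic programs. The algorithm uses stochastic simulation to approximate the expected value by the sample mean, which simplifies the computation. Computational examples show that the algorithm is effective and feasible.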
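
The idea of replacing the expected second-stage cost by a sample mean can be illustrated on a toy problem. The sketch below is an assumption-laden example, not the paper's model: the first-stage decision `x` is an order quantity with unit cost 1, the random demand is uniform on [0, 100], and unmet demand costs 4 per unit in the second stage. The names `recourse_cost` and `sample_average_objective` are hypothetical. A genetic algorithm could use such an estimate as the fitness of a candidate `x`.

```python
import random


def recourse_cost(x, demand, penalty=4.0):
    """Second-stage cost once the demand is revealed: pay for the shortfall."""
    return penalty * max(demand - x, 0.0)


def sample_average_objective(x, n_samples=20000, unit_cost=1.0, seed=0):
    """First-stage cost plus the sample mean of the simulated recourse cost."""
    rng = random.Random(seed)  # fixed seed: every candidate x sees the same scenarios
    total = 0.0
    for _ in range(n_samples):
        demand = rng.uniform(0.0, 100.0)
        total += recourse_cost(x, demand)
    return unit_cost * x + total / n_samples


if __name__ == "__main__":
    for x in (50.0, 75.0):
        print(x, sample_average_objective(x))
```

For this toy model the exact objective is $x + (100 - x)^2 / 50$, so the estimates can be checked against 100 at $x = 50$ and 87.5 at $x = 75$; the sample means should land close to these values without matching them exactly.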

19.

A genetic k-medoids clustering algorithm. Total citations: 1, self-citations: 0, citations by others: 1

Weiguo Sheng Xiaohui Liu 《Journal of Heuristics》2006,12(6):447-466

We propose a hybrid genetic algorithm for k-medoids clustering. A novel heuristic operator is designed and integrated with the genetic algorithm to fine-tune the search. Further, variable length individuals that encode different number of medoids (clusters) are used for evolution with a modified Davies-Bouldin index as a measure of the fitness of the corresponding partitionings. As a result the proposed algorithm can efficiently evolve appropriate partitionings while making no a priori assumption about the number of clusters present in the datasets. In the experiments, we show the effectiveness of the proposed algorithm and compare it with other related clustering methods.
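
As background for the fitness measure mentioned above, the standard Davies-Bouldin index can be sketched as follows. This is the textbook index computed with centroids, not the modified version used in the paper, whose details the abstract does not give; the function names are hypothetical. Lower values indicate more compact, better separated clusters.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def davies_bouldin(clusters):
    """Standard Davies-Bouldin index for two or more non-empty clusters.

    Assumes distinct centroids; coinciding centroids would divide by zero.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter: average distance from each cluster's points to its centroid.
    scatter = [sum(math.dist(p, ctr) for p in c) / len(c) for c, ctr in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


if __name__ == "__main__":
    good = [[(0.0,), (2.0,)], [(10.0,), (12.0,)]]
    bad = [[(0.0,), (10.0,)], [(2.0,), (12.0,)]]
    print(davies_bouldin(good), davies_bouldin(bad))
```

Because the index does not require a fixed number of clusters, it can compare partitionings with different numbers of clusters, which is what makes this kind of measure usable as a fitness function for variable-length individuals.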

20.

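An Online Model for Integrated Berth and Quay Crane Scheduling at Container Terminals with Lookahead Information

李英 乔龙亮 郑斐峰《运筹学学报》2018,22(3):28-36

This paper studies an over-list online model for integrated berth and quay crane scheduling at a container terminal, where information about upcoming service requests is known in advance. When each vessel service request is released, the decision maker knows the information of the next $k$ ($k \geq 2$) requests, and the objective is to minimize the maximum completion time over all requests. For a hybrid berth made up of 3 discrete berths with 4 quay cranes, and only two sizes of service request (large and small), a lower bound on the competitive ratio is given for lookahead of any $k \geq 2$ requests; for the specific case $k = 2$, an online strategy with the optimal competitive ratio $7/6$ is given. Numerical experiments further demonstrate the good performance of the designed strategy.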