首页 | 本学科首页   官方微博 | 高级检索  
     检索      

稀疏数据非线性降维的CN-Isomap算法
引用本文:武森,全喜伟,陈学昌.稀疏数据非线性降维的CN-Isomap算法[J].数学的实践与认识,2010,40(17).
作者姓名:武森  全喜伟  陈学昌
摘    要:针对经典的流形学习算法Isomap在非线性数据稀疏时降维效果下降甚至失效的问题,提出改进的切近邻等距特征映射算法(Cut-Neighbors Isometric feature mapping,CN-Isomap).该算法在数据稀疏的情况下首先通过有效识别样本点的"流形邻居"来剔除近邻图上的"短路"边,然后再通过最短路径算法拟合测地线距离,使得拟合的测地线距离不会偏离流形区域,从而低维嵌入映射能够正确地反映高维输入空间样本点间的内在拓扑特征,很好地发现蕴含在高维空间里的低维流形,有效地对非线性稀疏数据进行降维.通过对Benchmark数据集的实验表明了算法的有效性.CN-Isomap算法是Isomap算法的推广,不仅能有效地对非线性稀疏数据进行降维,同样也适用于数据非稀疏的情况.

关 键 词:降维  流形学习  等距特征映射  稀疏数据

CN-Isomap Algorithm for Nonlinear Dimensionality Reduction of Sparse Data
WU Sen,QUAN Xi-wei,CHEN Xue-chang.CN-Isomap Algorithm for Nonlinear Dimensionality Reduction of Sparse Data[J].Mathematics in Practice and Theory,2010,40(17).
Authors:WU Sen  QUAN Xi-wei  CHEN Xue-chang
Abstract:An improved algorithm Cut-Neighbors Isometric feature mapping(CN-Isomap) is proposed after analyzing why Isomap,a classic manifold learning algorithm,can not reduce dimensionality effectively for sparse nonlinear data.The algorithm first identifies the 'manifold neighbors' effectively when data is sparse in order to delete the 'short circuit' edges in neighborhood graph.Then it simulates the geodesic distance by shortest path algorithm,so that the geodesic distance will not deviate from the manifold region.Thus low-dimensional embedded mapping can correctly reflect the inherent topological features of sample points in high-dimensional input space,which enables that the algorithm finds the low-dimensional manifolds implicated in high-dimensional space better and reduce dimensionality of sparse nonlinear data effectively.The effectiveness of the algorithm is verified by experiment on Benchmark data set.CN-Isomap is the extension of Isomap.It is not only effective for dimensionality reduction of sparse nonlinear data,but also applicable to non-sparse data.
Keywords:dimensionality reduction  manifold learning  isometric feature mapping  sparse data
本文献已被 万方数据 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号