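Stellar Spectral Outlier Mining Based on the Isomap Algorithm
Cite this article: Bu Yude, Pan Jingchang, Chen Fuqiang. Stellar Spectral Outlier Mining Based on the Isomap Algorithm[J]. Spectroscopy and Spectral Analysis, 2014, 34(1): 267-273.
Authors: Bu Yude  Pan Jingchang  Chen Fuqiang
Affiliations: 1. School of Mathematics and Statistics, Shandong University (Weihai), Weihai, Shandong 264209
2. School of Information Engineering, Shandong University (Weihai), Weihai, Shandong 264209
3. College of Electronics and Information Engineering, Tongji University, Shanghai 201804
Funding: Supported by the National Natural Science Foundation of China (11078013)
Abstract: How to find misclassified spectra among massive sets of already-classified spectra has long been a key research problem for astronomical data-processing experts, and the Isomap algorithm examined here performs well on it. Comparing the experimental results of Isomap with those of principal component analysis (PCA) shows that: (1) PCA projects spectra with different features into neighboring regions, whereas Isomap projects spectra with similar features into neighboring regions and spectra with different features into regions far apart; (2) most of the outliers given by Isomap are easy to judge and are binary stars of high scientific value, whereas the outliers given by PCA are hard to judge and of low scientific value. Isomap therefore has a clear advantage over PCA for mining spectral outliers. Because the data used are spectra of nine spectral subtypes of M-type stars from the latest SDSS data release, Isomap can quickly find spectra misclassified by the Sloan Digital Sky Survey data-processing pipeline (SDSS pipeline) and can help improve the accuracy of existing spectral classification algorithms. Furthermore, because most of the spectra misclassified by the SDSS pipeline are binary stars, Isomap can also help discover binary stars of high scientific value and improve the efficiency of binary-star discovery. Although the experiments show that Isomap is fairly sensitive to changes in signal-to-noise ratio and performs poorly on spectra with low signal-to-noise ratios, the spectral type of such spectra is hard to determine in any case, so this drawback does not affect the application of Isomap to spectral mining.

Keywords: Manifold learning algorithm  Isomap algorithm  Principal component analysis  Data mining
Received: 2013/3/25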

Stellar Spectral Outliers Detection Based on Isomap
BU Yu-de; PAN Jing-chang; CHEN Fu-qiang. Stellar Spectral Outliers Detection Based on Isomap[J]. Spectroscopy and Spectral Analysis, 2014, 34(1): 267-273.
Authors: BU Yu-de; PAN Jing-chang; CHEN Fu-qiang
Institution: 1. School of Mathematics and Statistics, Shandong University, Weihai, Weihai 264209, China
2. School of Information Engineering, Shandong University, Weihai, Weihai 264209, China
3. College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
Abstract: Finding spectra that have been misclassified by traditional methods is a key problem that has been widely studied by astronomical data-processing experts. We found that the Isomap algorithm performs well on this problem. Comparing the performance of Isomap with that of principal component analysis (PCA), we found that (1) Isomap projects spectra with similar features close together and spectra with different features far apart, whereas PCA may project spectra with different features into nearby regions; (2) the outliers given by Isomap are easy to identify, and most of them are binary stars of high scientific value, whereas the outliers given by PCA are difficult to identify and most of them are not binary stars. Thus, Isomap is more effective than PCA at finding outliers. Since the spectral data used in the experiment come from the ninth data release of the Sloan Digital Sky Survey (SDSS DR9), Isomap can efficiently find spectra misclassified by the SDSS pipeline and noticeably improve classification accuracy. Furthermore, since most of the spectra misclassified by the SDSS pipeline are binary stars, Isomap can improve the efficiency of finding binary stars of high scientific value. Although the experimental results show that Isomap is more sensitive to noise than PCA, this disadvantage does not affect the application of Isomap to spectral classification, since most spectra with low signal-to-noise ratios are ones whose spectral type cannot be determined manually.
Keywords: Manifold learning algorithm  Isomap algorithm  PCA  Data mining
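
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It embeds a toy family of "spectra" with both PCA and a small NumPy-only Isomap (k-nearest-neighbour graph, graph shortest paths, classical MDS), then ranks points by a simple outlier score. Everything specific here is an assumption made for illustration: the synthetic data (a single absorption line whose centre drifts, not SDSS spectra), the function names `make_spectra`, `pca_embed`, `isomap_embed` and `knn_outlier_score`, the neighbour count, the two-dimensional embedding, and the k-th-nearest-neighbour distance used as the score. The abstract does not state the parameters or the outlier criterion used in the paper.

```python
import numpy as np

def make_spectra(n_spectra=120, n_pixels=200, width=0.02):
    """Toy 'spectra' (assumption): flat continuum with one absorption line
    whose centre drifts smoothly, standing in for a subtype sequence."""
    wave = np.linspace(0.0, 1.0, n_pixels)
    centres = np.linspace(0.2, 0.8, n_spectra)
    line = np.exp(-0.5 * ((wave[None, :] - centres[:, None]) / width) ** 2)
    return 1.0 - 0.8 * line, centres

def pairwise_dist(X):
    """Euclidean distance between every pair of rows of X."""
    sq = np.sum(X ** 2, axis=1)
    D = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0))
    np.fill_diagonal(D, 0.0)
    return D

def pca_embed(X, n_components=2):
    """Linear projection onto the leading principal axes."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def isomap_embed(X, n_neighbors=8, n_components=2):
    """Isomap: kNN graph -> geodesic (shortest-path) distances -> classical MDS."""
    n = X.shape[0]
    D = pairwise_dist(X)
    # Keep only edges from each point to its k nearest neighbours, symmetrised.
    G = np.full((n, n), np.inf)
    idx = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    rows = np.repeat(np.arange(n), n_neighbors)
    G[rows, idx.ravel()] = D[rows, idx.ravel()]
    G = np.minimum(G, G.T)
    np.fill_diagonal(G, 0.0)
    # Floyd-Warshall: relax every pair through each intermediate node k.
    for k in range(n):
        G = np.minimum(G, G[:, k, None] + G[None, k, :])
    if not np.isfinite(G).all():
        raise ValueError("neighbour graph is disconnected; increase n_neighbors")
    # Classical MDS on the geodesic distances (double centring + eigenvectors).
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    w, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    w, V = w[::-1][:n_components], V[:, ::-1][:, :n_components]
    return V * np.sqrt(np.maximum(w, 0.0))

def knn_outlier_score(Y, k=5):
    """Assumed score: distance to the k-th nearest neighbour in the embedding."""
    return np.sort(pairwise_dist(Y), axis=1)[:, k]

flux, centres = make_spectra()
for name, emb in [("PCA", pca_embed(flux)), ("Isomap", isomap_embed(flux))]:
    ranked = np.argsort(knn_outlier_score(emb))[::-1]
    print(name, "highest-scoring spectra (indices):", ranked[:5])
```

On real data, the highest-scoring spectra from each embedding would be the candidates to inspect by eye, which is the step the abstract describes as easier for Isomap than for PCA. The Floyd-Warshall loop keeps the sketch dependency-free but scales cubically with the number of spectra, so a library shortest-path routine would be the more practical choice for large samples.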