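Automatic classification method for stellar spectra based on convolutional neural networks
Cite this article: SHI Chao-jun, QIU Bo, ZHOU Ya-tong, DUAN Fu-qing. Automatic classification method for stellar spectra based on convolutional neural networks[J]. Spectroscopy and Spectral Analysis, 2019, 39(4): 1312-1316.
Authors: SHI Chao-jun  QIU Bo  ZHOU Ya-tong  DUAN Fu-qing
Affiliations: School of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401; College of Information Science and Technology, Beijing Normal University, Beijing 100875
Funding: Supported by the Joint Fund of Astronomy of the National Natural Science Foundation of China and the Chinese Academy of Sciences (U1531242)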
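Abstract: Automatic classification of stellar spectra is a fundamental part of stellar spectral research. Fast and accurate automatic identification and classification of stellar spectra can speed up the search for special celestial objects and is of great significance to astronomy. LAMOST, China's large sky survey project, currently releases millions of spectra every year, and fast, accurate automatic identification and classification of massive numbers of stellar spectra has become one of the research hot spots in astronomical big-data analysis and processing. To address the automatic classification of stellar spectra, a method based on a convolutional neural network (CNN) is proposed for classifying K- and F-type stellar spectra. It is compared with a support vector machine (SVM) and the error back-propagation (BP) algorithm, and cross-validation is used to verify classifier performance. Compared with traditional methods, a CNN has the advantages of weight sharing, which reduces the number of parameters the model must learn, and of extracting features automatically and directly from the training data. The experiments use the TensorFlow deep learning framework in a Python 3.5 programming environment. The K- and F-type stellar spectral dataset uses LAMOST DR3 data provided by the National Astronomical Observatories. The 3 500~7 500 Å portion of each spectrum is cut out and sampled uniformly to generate the dataset samples, which are then normalized with the min-max method. The CNN structure consists of an input layer, convolutional layer C1, pooling layer S1, convolutional layer C2, pooling layer S2, convolutional layer C3, pooling layer S3, a fully connected layer and an output layer. The input layer takes the flux values of a batch of K- and F-type stellar spectra at the same 3 700 wavelength points. Layer C1 has 10 convolution kernels of size 1×3 with stride 1. Layer S1 uses max pooling with a 1×2 sampling window and no overlap, producing 10 feature maps, the same number as C1, each half the size of a C1 feature map. Layer C2 has 20 convolution kernels of size 1×2 with stride 1 and outputs 20 feature maps. Layer S2 downsamples the 20 feature maps of C2 and outputs 20 feature maps. Layer C3 has 30 convolution kernels of size 1×3 with stride 1 and outputs 30 feature maps. Layer S3 downsamples the 30 feature maps of C3 and outputs 30 feature maps. The fully connected layer has 50 neurons, each connected to all neurons of S3. The output layer has 2 neurons and outputs the classification result. The convolutional layers use the ReLU activation function and the output layer uses the softmax function. For the comparison algorithms, the SVM is of type C-SVC with a radial basis function kernel, and the BP network has 3 hidden layers with 20, 40 and 20 neurons respectively. The dataset is divided into training data and test data; 40%, 60%, 80% and 100% of the training data are taken as the 5 training sets, and the test data serve as the test set. Each of the 5 training sets is used to train the model for 8 000 iterations, and each trained model is validated on the test set. The comparison experiments use 100% of the training data as the training set and the test data as the test set. Four evaluation metrics (precision, recall, F-score and accuracy) are used to assess model performance, and the experimental results are analyzed in detail. The analysis shows that the CNN algorithm can classify and screen K- and F-type stellar spectra quickly and automatically, and that the larger the training set, the stronger the generalization ability of the model and the higher the classification accuracy. The comparison experiments show that the CNN algorithm classifies K- and F-type stellar spectra with higher accuracy than the traditional machine learning algorithms SVM and BP.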
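The preprocessing steps named in the abstract (cutting out a wavelength window, sampling it uniformly, and applying min-max normalization) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `resample_uniform` and `min_max_normalize` are hypothetical, linear interpolation is an assumed way of doing the uniform sampling, and normalizing each spectrum separately is also an assumption, since the abstract does not say whether scaling is per spectrum or over the whole dataset.

```python
from bisect import bisect_left


def resample_uniform(wavelengths, fluxes, lo, hi, n_points):
    """Sample flux at n_points evenly spaced wavelengths in [lo, hi].

    ASSUMPTION: wavelengths are strictly ascending and linear interpolation
    is an acceptable stand-in for the paper's "uniform sampling".
    """
    step = (hi - lo) / (n_points - 1)
    sampled = []
    for i in range(n_points):
        w = lo + i * step
        j = bisect_left(wavelengths, w)
        if j == 0:
            sampled.append(fluxes[0])       # at or below the first point
        elif j >= len(wavelengths):
            sampled.append(fluxes[-1])      # beyond the last point
        else:
            w0, w1 = wavelengths[j - 1], wavelengths[j]
            f0, f1 = fluxes[j - 1], fluxes[j]
            t = (w - w0) / (w1 - w0)
            sampled.append(f0 + t * (f1 - f0))
    return sampled


def min_max_normalize(values):
    """Rescale values to [0, 1]: (x - min) / (max - min)."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:                        # flat input: avoid division by zero
        return [0.0 for _ in values]
    return [(v - vmin) / (vmax - vmin) for v in values]
```

With the values given in the abstract, a call would look like `min_max_normalize(resample_uniform(wl, flux, 3500.0, 7500.0, 3700))`, where `wl` and `flux` are one spectrum's wavelength and flux arrays.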

Keywords: stellar spectra  automatic classification  convolutional neural network  cross-validation  evaluation metrics
Received: 2018-02-07

Automatic Classification Method of Star Spectra Data Based on Convolutional Neural Network
SHI Chao-jun, QIU Bo, ZHOU Ya-tong, DUAN Fu-qing. Automatic Classification Method of Star Spectra Data Based on Convolutional Neural Network[J]. Spectroscopy and Spectral Analysis, 2019, 39(4): 1312-1316.
Authors: SHI Chao-jun  QIU Bo  ZHOU Ya-tong  DUAN Fu-qing
Institution: 1. School of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, China; 2. College of Information Science and Technology, Beijing Normal University, Beijing 100875, China
Abstract: Automatic classification of stellar spectra is a basic task in stellar spectral research. Fast and accurate automatic identification and classification of stellar spectra can speed up the search for special celestial objects, which is of great significance to astronomy. At present LAMOST, a large-scale sky survey project in China, releases millions of spectra every year, and fast, accurate automatic identification and classification of massive numbers of stellar spectra has become one of the hot spots in astronomical big-data analysis and processing. To address the automatic classification of stellar spectra, a classification method for K- and F-type stellar spectra based on a convolutional neural network (CNN) is proposed. Support vector machine (SVM) and back-propagation (BP) neural network algorithms serve as the comparison algorithms, and cross-validation is used to verify classifier performance. Compared with traditional methods, a CNN has the advantages of weight sharing, which reduces the number of parameters the model must learn, and of extracting features automatically from the training data. The experiments use the TensorFlow deep learning framework and the Python 3.5 programming environment. The K- and F-type stellar spectral dataset uses LAMOST DR3 data provided by the National Astronomical Observatories of the Chinese Academy of Sciences. The 3 500 to 7 500 Å portion of each spectrum is sampled uniformly to generate the dataset samples, which are normalized with the min-max method. The CNN structure includes an input layer, convolutional layer C1, pooling layer S1, convolutional layer C2, pooling layer S2, convolutional layer C3, pooling layer S3, a fully connected layer and an output layer. The input layer takes the flux values of a batch of K- and F-type stellar spectra at the same 3 700 wavelength points. Layer C1 has 10 convolution kernels of size 1×3 with stride 1. Layer S1 uses max pooling with a 1×2 sampling window and no overlap; it produces 10 feature maps, the same number as C1, each half the size of a C1 feature map. Layer C2 has 20 convolution kernels of size 1×2 with stride 1 and outputs 20 feature maps, which S2 downsamples into 20 feature maps. Layer C3 has 30 convolution kernels of size 1×3 with stride 1 and outputs 30 feature maps, which S3 downsamples into 30 feature maps. The fully connected layer has 50 neurons, each connected to all neurons of S3. The output layer has 2 neurons and gives the classification result. The convolutional layers use the ReLU activation function and the output layer uses the softmax function. The comparison SVM is of type C-SVC with a radial basis function kernel. The BP network has three hidden layers with 20, 40 and 20 neurons respectively. The dataset is divided into training data and test data. Subsets containing 40%, 60%, 80% and 100% of the training data are used as training sets, and the test data are used as the test set. Each training set is used to train the model for 8 000 iterations, and each trained model is validated on the test set. The comparison experiments use 100% of the training data as the training set and the test data as the test set. Precision, recall, F-score and accuracy are used to evaluate model performance, and the experimental results are analyzed in detail. The analysis shows that the CNN algorithm can quickly and automatically classify and screen K- and F-type stellar spectra, and that the larger the training set, the stronger the generalization ability of the model and the higher the classification accuracy. The comparison experiments show that the CNN algorithm achieves higher classification accuracy on K- and F-type stellar spectra than the SVM and BP algorithms.
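To make the layer description in the abstract concrete, the sketch below traces feature-map sizes and learnable-parameter counts through C1 to the output layer. It is a back-of-the-envelope illustration, not the authors' TensorFlow model. Two points are assumptions rather than statements from the abstract: the convolutions use no padding, and S2 and S3 use the same 1×2 non-overlapping window that the abstract specifies for S1. The name `trace_shapes` is hypothetical.

```python
# (name, kind, kernel or window width, number of output feature maps),
# as listed in the abstract. ASSUMPTION: S2 and S3 reuse S1's 1x2 window.
LAYERS = [
    ("C1", "conv", 3, 10),
    ("S1", "pool", 2, 10),
    ("C2", "conv", 2, 20),
    ("S2", "pool", 2, 20),
    ("C3", "conv", 3, 30),
    ("S3", "pool", 2, 30),
]


def trace_shapes(input_len=3700, fc_units=50, n_classes=2):
    """Return (layer, feature maps, map length, learnable parameters) rows."""
    length, maps = input_len, 1
    rows = []
    for name, kind, width, out_maps in LAYERS:
        if kind == "conv":
            # ASSUMPTION: stride 1 with no padding ("valid" convolution).
            params = out_maps * (width * maps + 1)  # weights plus one bias per kernel
            length = length - width + 1
            maps = out_maps
        else:
            params = 0                 # pooling has nothing to learn
            length = length // width   # non-overlapping max pooling
        rows.append((name, maps, length, params))
    flat = maps * length               # S3 output flattened for the dense layer
    rows.append(("FC", fc_units, 1, flat * fc_units + fc_units))
    rows.append(("Out", n_classes, 1, fc_units * n_classes + n_classes))
    return rows


if __name__ == "__main__":
    for name, maps, length, params in trace_shapes():
        print(f"{name:>3}: {maps:>2} maps x {length:>4} points, {params} parameters")
```

Under these assumptions the 3 700-point input shrinks to 30 maps of 461 points after S3, and the fully connected layer holds nearly all of the learnable parameters. With zero padding instead, the intermediate lengths would differ, so the numbers should be read as illustrative only.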
Keywords:Star spectra data  Automatic classification  Convolutional neural network  Cross-validation  Evaluation index  
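The evaluation indices named in the abstract (precision, recall, F-score and accuracy) can all be derived from the confusion counts of the two-class K/F problem. The sketch below is a minimal illustration under stated assumptions: "F-score" is taken to mean the balanced F1 score, one class (here "K") is treated as the positive class, and the function name `binary_metrics` and the label strings are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive="K"):
    """Precision, recall, F1 and accuracy for one class treated as positive."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1        # correctly predicted positive
        elif pred == positive:
            fp += 1        # predicted positive, actually negative
        elif truth == positive:
            fn += 1        # missed positive
        else:
            tn += 1        # correctly predicted negative
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # ASSUMPTION: "F-score" means F1, the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f_score": f1, "accuracy": accuracy}
```

Calling the function once with `positive="K"` and once with `positive="F"` gives per-class precision and recall, while accuracy is the same for both calls.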
This article is indexed in CNKI, Wanfang Data and other databases.