A Method to Search Special Stellar Spectra from Low Signal-to-Noise Ratio Spectral Sky Survey Data
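  • English title: A Method to Search Special Stellar Spectra from Low Signal-to-Noise Ratio Spectral Sky Survey Data
  • Authors: 吴明磊; 潘景昌; 衣振萍; 韦鹏
  • Authors (English): WU Ming-lei; PAN Jing-chang; YI Zhen-ping; WEI Peng (School of Mechanical, Electrical & Information Engineering, Shandong University; Key Laboratory of Optical Astronomy, NAOC, Chinese Academy of Sciences)
  • Keywords: Galactic sky survey; outlier data mining; low signal-to-noise ratio spectra
  • English keywords: Galaxy survey; Outlier data mining; Low SNR spectra
  • Journal code: GUAN
  • Journal title (English): Spectroscopy and Spectral Analysis
  • Institutions: School of Mechanical, Electrical & Information Engineering, Shandong University (Weihai); Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences
  • Publication date: 2019-02-15
  • Publisher: 光谱学与光谱分析 (Spectroscopy and Spectral Analysis)
  • Year: 2019
  • Issue: v.39
  • Funding: Supported by the National Natural Science Foundation of China (U1431102, 11603014)
  • Language: Chinese
  • Page: GUAN201902051
  • Number of pages: 4
  • CN: 02
  • ISSN: 11-2200/O4
  • Classification number: 292-295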
Abstract
Special stars are stars with anomalous metal abundances. The information they carry is important for studying the origin of the universe, the evolution of the solar system, and the evolution of life, so the search for special stars is a major goal of sky survey projects in China and abroad. Stellar spectra contain rich information about a star's chemical composition, physical properties, and state of motion, and they are the main basis for stellar research: the identification and classification of stars and the discovery of special stars all rely primarily on stellar spectral data. With the progress of large-scale digital sky surveys such as LAMOST and SDSS, the volume of stellar spectra has reached an unprecedented level, which strongly supports the discovery of special stars. How to use these data to find special, rare, or even previously unknown types of stellar spectra quickly and accurately is therefore an important problem in astronomy.

Data mining combines pattern recognition, machine learning, statistical analysis, and expert domain knowledge to extract implicit, previously unknown, and potentially valuable information from data. It is naturally suited to large data sets, and a growing number of data mining methods are being applied to the processing and analysis of survey data. Current data mining algorithms for special-star searches mainly include random forests, cluster analysis, and outlier detection. As surveys go deeper, however, the observed targets become fainter and the signal-to-noise ratio (SNR) of the observed spectra drops. Low-SNR spectra contain a large amount of useless information, and applying these algorithms to them directly often produces strongly biased results. Effectively searching for special stellar spectra in large volumes of low-SNR survey data is therefore an important open problem, and because of the nature of low-SNR spectra, little work has been done on it.

To address this problem, and building on a careful study of spectral data processing methods, a method based on principal component analysis (PCA) and density-peak clustering is proposed for searching for special stellar spectra in low-SNR survey data. First, high-SNR stellar spectra of types O, B, A, F, G, K, and M are selected, and after wavelength unification and flux interpolation, PCA is applied to obtain the eigenspectra (characteristic spectra). Second, the low-SNR stellar spectra are reconstructed from the first few eigenspectra, those with the largest variance contribution rates, to obtain high-SNR spectra. Finally, the reconstructed high-SNR spectra are clustered, and the outliers found in the cluster analysis are the special stellar spectra being sought. In view of the characteristics of stellar spectral data, a density-peak-based clustering method is used for both the clustering and the outlier mining. Experiments show that the method can accurately find the relatively small number of special stars in low-SNR stellar survey data. It can also be applied to the analysis and mining of spectral data from Galactic surveys such as LAMOST and SDSS.
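The three steps in the abstract (eigenspectra from high-SNR spectra, reconstruction of low-SNR spectra, density-peak outlier search) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The synthetic two-line "spectra", the Gaussian density kernel, the cutoff `d_c`, the number of eigenspectra `k`, and the outlier rule (low local density `rho` together with a large distance `delta` to the nearest denser point) are all assumptions chosen for illustration. The spectra are also assumed to be on a common wavelength grid already, so the wavelength unification and flux interpolation step is skipped.

```python
import numpy as np

def pca_basis(train, k):
    """Mean spectrum and top-k eigenspectra of `train` (n_spectra x n_pixels)."""
    mean = train.mean(axis=0)
    # Rows of vt are eigenspectra, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruct(spectra, mean, basis):
    """Project spectra onto the eigenspectra and map back to flux space."""
    coeffs = (spectra - mean) @ basis.T
    return mean + coeffs @ basis

def density_peak_scores(x, d_c):
    """Local density rho and distance delta to the nearest denser point."""
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0  # Gaussian kernel, self excluded
    delta = np.empty(len(x))
    for i in range(len(x)):
        denser = rho > rho[i]
        # The densest point has no denser neighbour; use its largest distance instead.
        delta[i] = dist[i, denser].min() if denser.any() else dist[i].max()
    return rho, delta

# --- Synthetic demo: every number below is an illustrative assumption ---
rng = np.random.default_rng(0)
wave = np.linspace(0.0, 1.0, 200)  # common wavelength grid (arbitrary units)
line_a = np.exp(-0.5 * ((wave - 0.3) / 0.03) ** 2)
line_b = np.exp(-0.5 * ((wave - 0.7) / 0.03) ** 2)

def make(depth_a, depth_b):
    """Flat continuum with two absorption lines of the given depths."""
    return 1.0 - np.outer(depth_a, line_a) - np.outer(depth_b, line_b)

# High-SNR training set: two "normal" types, each showing only one line.
d = rng.uniform(0.3, 0.7, 50)
train = np.vstack([make(d, np.zeros(50)), make(np.zeros(50), d)])
train += rng.normal(0.0, 0.005, train.shape)

# Low-SNR survey set: 30 of each normal type plus one peculiar spectrum (index 60).
d2 = rng.uniform(0.3, 0.7, 30)
clean = np.vstack([make(d2, np.zeros(30)), make(np.zeros(30), d2), make([0.9], [0.9])])
noisy = clean + rng.normal(0.0, 0.2, clean.shape)

mean, basis = pca_basis(train, k=2)
recon = reconstruct(noisy, mean, basis)          # denoised spectra
rho, delta = density_peak_scores(recon, d_c=0.5)
outliers = np.where((rho < 1.0) & (delta > 1.5))[0]
print("candidate special spectra:", outliers)
```

One caveat follows from the design: reconstruction keeps only what the eigenspectra can represent, so a peculiar spectrum stands out here because its combination of coefficients is unusual. A feature entirely absent from the training spectra would be projected away. On real data, `k`, `d_c`, and the two thresholds would need tuning, for example by inspecting the `rho`-`delta` decision graph.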
