Discriminant Analysis of Millet from Different Origins Based on Hyperspectral Imaging Technology
  • English title: Discriminant Analysis of Millet from Different Origins Based on Hyperspectral Imaging Technology
  • Authors: JI Hai-yan (吉海彦); REN Zhan-qi (任占奇); RAO Zhen-hong (饶震红)
  • Affiliations: Key Laboratory of Modern Precision Agriculture System Integration Research, Ministry of Education, China Agricultural University; Key Laboratory of Agricultural Information Acquisition Technology, Ministry of Agriculture, China Agricultural University; College of Science, China Agricultural University
  • Keywords: Hyperspectral imaging; Millet; Discriminant analysis; Recursive feature elimination
  • Chinese journal name: GUAN
  • English journal name: Spectroscopy and Spectral Analysis
  • Publication date: 2019-07-15
  • Publisher: Spectroscopy and Spectral Analysis (光谱学与光谱分析)
  • Year: 2019
  • Issue: v.39
  • Funding: Supported by the National Key R&D Program of China, 13th Five-Year Plan (2016YFD0200602)
  • Language: Chinese
  • Page: GUAN201907052
  • Page count: 7
  • CN: 07
  • ISSN: 11-2200/O4
  • Classification number: 285-291
Abstract
Hyperspectral imaging is widely used for the inspection of agricultural products. This study combines hyperspectral imaging with machine learning algorithms for the non-destructive discrimination of millet samples from different regions. A total of 23 millet samples from seven provinces were grouped by geographical region into five classes: Dongbei (Northeast China), Hebei, Shaanxi, Shandong, and Shanxi, with 6 samples from Dongbei, 5 from Shanxi, and 4 each from Hebei, Shaanxi, and Shandong. Each sample was divided into 10 equal portions, and hyperspectral data were acquired in the 900 to 1 700 nm band with a hyperspectral imager. To reduce the influence of uneven illumination and dark current, the acquired data were black-and-white corrected. Regions of interest (ROIs) were selected in the hyperspectral images with the ENVI software, 9 ROIs for each sample; the total reported below corresponds to 9 ROIs for each of the 10 portions (23 × 10 × 9 = 2 070). The mean spectrum within each ROI was taken as one spectral record, giving 2 070 spectral curves in total: 540 for Dongbei, 450 for Shanxi, and 360 each for Hebei, Shandong, and Shaanxi. To reduce the scattering caused by the uneven sample surface, which distorts the true spectral information of the millet, the raw spectra were pre-processed with multivariate scatter correction (MSC). The corrected spectra were randomly split into a training set and a test set, with the test set making up 30% of the data. Linear discriminant analysis (LDA) was used to visualize the spectral data of millet from different origins; the test set was then passed to the trained LDA model and a confusion matrix of the predictions was computed.
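The two pre-processing steps named above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the formulas, so the sketch assumes the common reflectance calibration (raw - dark) / (white - dark) and the usual form of MSC (often called multiplicative scatter correction), in which each spectrum is regressed on the mean spectrum. The function names `black_white_correct` and `msc` and the synthetic spectra are hypothetical.

```python
import numpy as np

def black_white_correct(raw, white, dark):
    """Relative reflectance, assuming the usual (raw - dark) / (white - dark) calibration."""
    raw, white, dark = (np.asarray(a, dtype=float) for a in (raw, white, dark))
    denom = white - dark
    # Avoid division by zero where the white and dark references coincide.
    denom = np.where(denom == 0, np.finfo(float).eps, denom)
    return (raw - dark) / denom

def msc(spectra, reference=None):
    """Scatter-correct each row by regressing it on a reference (default: the mean spectrum)."""
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        # Least-squares fit s ~ intercept + slope * ref, then undo offset and gain.
        slope, intercept = np.polyfit(ref, s, 1)
        corrected[i] = (s - intercept) / slope
    return corrected

# Synthetic stand-in for ROI mean spectra: one base curve with a random offset
# and gain per row, mimicking additive and multiplicative scatter effects.
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0.0, 3.0, 50)) + 2.0
offsets = rng.uniform(-0.2, 0.2, size=(6, 1))
gains = rng.uniform(0.8, 1.2, size=(6, 1))
spectra = offsets + gains * base

corrected = msc(spectra)
reflectance = black_white_correct([10, 60, 110], [110, 110, 110], [10, 10, 10])
```

Because every synthetic row is an exact linear function of the base curve, MSC maps all rows onto the mean spectrum here; real spectra would keep their chemical differences and lose only the offset and gain components.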
The LDA model reached prediction accuracies of 0.84 and 0.99 for Shaanxi and Shanxi, but only 0.68, 0.68, and 0.40 for Dongbei, Hebei, and Shandong. Recursive feature elimination (RFE) was therefore applied to select informative wavelengths, remove redundant information, and improve prediction accuracy. RFE was combined with a support vector machine (SVM) and with logistic regression (LR), and the two combinations were compared for discriminating millet from different origins. The training set was passed to the SVM-RFE and LR-RFE models with 3-fold cross-validation, and for each model the feature subset with the best micro-averaged F-score (Micro_F) was selected. LR-RFE selected 74 wavelengths with a Micro_F of 0.59; SVM-RFE selected 220 wavelengths with a Micro_F of 0.66. The selected feature subsets were applied to the test set, which was then passed to the SVM and LR models; the confusion matrix of the predictions and the receiver operating characteristic (ROC) curve were used for evaluation. The prediction accuracies of SVM-RFE were 1, 0.37, 0.72, 0, and 1 for Dongbei, Hebei, Shaanxi, Shandong, and Shanxi, with areas under the ROC curve (AUC) of 0.82, 0.92, 0.93, 0.70, and 0.99, respectively. The prediction accuracies of LR-RFE were 0.92, 0, 0.97, 0, and 0.80, with AUCs of 0.72, 0.74, 0.94, 0.66, and 0.88, respectively. Overall, SVM-RFE classified better than LR-RFE, although LR-RFE was better for the Shaanxi class; neither model could effectively discriminate the Hebei and Shandong classes. Compared with LDA, the prediction accuracy of the two models improved to some extent.
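The evaluation workflow described above can be sketched with scikit-learn. This is a hedged illustration on synthetic data, not the study's code or results: the abstract names no software for the modelling, so the use of `RFECV` with `scoring="f1_micro"`, a linear-kernel `SVC`, the elimination step, and the reading of per-class "prediction accuracy" as the diagonal of the row-normalized confusion matrix are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.svm import SVC

# Synthetic stand-in for the corrected spectra: 300 "spectra", 30 "wavelengths", 5 "origins".
X, y = make_classification(n_samples=300, n_features=30, n_informative=8,
                           n_classes=5, n_clusters_per_class=1, random_state=0)
classes = np.unique(y)

# Random split with 30% of the data held out as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

def evaluate(model):
    """Per-class recall (normalized confusion-matrix diagonal) and one-vs-rest AUC."""
    cm = confusion_matrix(y_test, model.predict(X_test),
                          labels=classes, normalize="true")
    scores = model.decision_function(X_test)  # shape (n_samples, n_classes)
    y_bin = label_binarize(y_test, classes=classes)
    auc = [roc_auc_score(y_bin[:, k], scores[:, k]) for k in range(len(classes))]
    return cm.diagonal(), np.array(auc)

def rfe_cv(estimator):
    # 3-fold cross-validated RFE that keeps the subset with the best micro-averaged F1.
    return RFECV(estimator, step=2, cv=3, scoring="f1_micro")

models = {
    "LDA": LinearDiscriminantAnalysis(),                  # baseline on all features
    "SVM-RFE": rfe_cv(SVC(kernel="linear")),              # linear kernel exposes coef_
    "LR-RFE": rfe_cv(LogisticRegression(max_iter=1000)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    recall, auc = evaluate(model)
    n_kept = model.n_features_ if isinstance(model, RFECV) else X.shape[1]
    results[name] = {"n_features": n_kept, "recall": recall, "auc": auc}
    print(name, n_kept, np.round(recall, 2), np.round(auc, 2))
```

Feature selection is fitted on the training set only, so the test set plays no part in choosing the wavelengths. A class can show a reasonable AUC alongside a per-class accuracy of 0, as reported for Shandong, because AUC measures how well scores rank samples while the confusion matrix depends on the final predicted label.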
