Adaptive k-Nearest Neighbor Algorithm Based on Local Density and Purity
Abstract
【Objective】In the KNN algorithm, the value of k is usually set manually and kept fixed; this paper studies how to select k more effectively.【Methods】The concept of the credibility of k is introduced, and a method that adaptively selects k based on local density and purity is proposed and incorporated into the traditional KNN classification algorithm.【Results】The algorithm accounts for the relationship between a sample's local density and purity and the choice of k; it both solves the k-selection problem and avoids the effect of a fixed k on classification.【Conclusion】The algorithm is effective and achieves relatively high accuracy, but its time efficiency still needs improvement.
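The abstract does not give the formulas for local density, purity, or the credibility of k, so the following is only a minimal sketch of the general idea of choosing k per query point. Every definition in it is an assumption made for illustration: purity is taken as the majority-class share among the k nearest neighbours, density as k divided by the distance to the k-th neighbour, and credibility as purity times normalised density. The function name `adaptive_knn_predict` and the parameter `k_candidates` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def adaptive_knn_predict(train_X, train_y, query, k_candidates=(1, 3, 5, 7)):
    """Classify `query` using a k chosen separately for this query point.

    Assumed definitions (illustrative only, not taken from the paper):
      purity(k)      = share of the k nearest neighbours in the majority class
      density(k)     = k / (distance to the k-th nearest neighbour + eps)
      credibility(k) = purity(k) * density(k) / max density over candidates
    Returns (predicted_label, chosen_k).
    """
    # Rank training points by Euclidean distance to the query.
    ranked = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )

    # Keep only candidate k values that fit the training set size.
    ks = [k for k in k_candidates if 1 <= k <= len(ranked)]
    if not ks:
        raise ValueError("no candidate k fits the training set size")

    eps = 1e-12  # guards against division by zero for duplicate points
    density = {k: k / (ranked[k - 1][0] + eps) for k in ks}
    max_density = max(density.values())

    best = None  # (credibility, k, label)
    for k in ks:
        votes = Counter(label for _, label in ranked[:k])
        label, count = votes.most_common(1)[0]
        purity = count / k
        credibility = purity * density[k] / max_density
        # Keep the k with the highest credibility; ties favour the smaller k.
        if best is None or credibility > best[0]:
            best = (credibility, k, label)

    return best[2], best[1]
```

Scoring each candidate k requires a ranking of all training points per query, which is one plausible reason an adaptive scheme of this kind costs more time than KNN with a fixed k.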
