Development and Applications of the Covering Algorithm Based on Constructive Learning
Abstract
Machine learning uses machines to simulate human learning: it discovers regularities and acquires knowledge from known data, builds predictive models for unseen cases, and improves with experience. After years of exploration, researchers have proposed many excellent learning methods, such as support vector machines, decision trees, and neural networks, and have extended them across the fields of machine learning. Chinese scholars have done a great deal of work on covering-based learning methods, among which the covering algorithm based on constructive learning, proposed by Zhang Ling and Zhang Bo, is regarded as a representative approach.
     The covering algorithm constructs a neural network from the samples' own characteristics, overcoming drawbacks of traditional neural networks such as slow training and the difficulty of determining the network structure. The method is geometrically intuitive, handles multi-class classification and large-scale data effectively, and has performed well in a number of practical applications. Researchers have carried out extensive work on improving and applying the algorithm.
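The constructive covering idea described above can be sketched in a few lines. This is a simplified Euclidean toy, not the thesis's exact procedure: the actual algorithm first projects samples onto a sphere and works with inner products, and the function names and radius rule here are illustrative assumptions (at least two classes are assumed).

```python
import numpy as np

def build_coverings(X, y):
    # Greedily build spherical coverings until every training sample is
    # covered.  Each covering is centered on an uncovered sample; its
    # radius splits the margin between the farthest usable same-class
    # sample and the nearest sample of any other class.
    covered = np.zeros(len(X), dtype=bool)
    coverings = []                       # (center, radius, label) triples
    while not covered.all():
        i = int(np.argmin(covered))      # any still-uncovered sample
        c, lab = X[i], y[i]
        dist = np.linalg.norm(X - c, axis=1)
        d_enemy = dist[y != lab].min()   # nearest different-class sample
        same = dist[(y == lab) & (dist < d_enemy)]
        radius = (same.max() + d_enemy) / 2.0
        coverings.append((c, radius, lab))
        covered |= (y == lab) & (dist <= radius)
    return coverings

def predict(coverings, x):
    # Label x by the covering whose sphere contains it; if every covering
    # rejects x, fall back to the covering whose boundary is closest.
    best, best_gap = None, np.inf
    for c, r, lab in coverings:
        gap = np.linalg.norm(x - c) - r
        if gap < best_gap:
            best, best_gap = lab, gap
    return best
```

The fallback in `predict` is one common way to handle rejected samples; the fuzzy membership functions discussed later in the thesis refine exactly this step.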
     All existing work on the covering algorithm addresses the single-instance, single-label setting, but as machine learning develops, new learning problems keep emerging. Drawing on several of these new models, this thesis develops and applies the covering algorithm in the following respects:
     (1) A comprehensive study of the covering algorithm, applied to practical classification problems.
     This thesis surveys the basic model of the covering algorithm and the theoretical and applied results of recent years, and investigates its use in text categorization and spam filtering. For each application, improvement strategies are designed around the characteristics of the problem. In text categorization, a dimension-regulation strategy lets the features of different text categories appear equally in the feature vector, improving classification accuracy. In spam filtering, the various kinds of auxiliary information attached to an email are combined with its body text into a compound feature to improve the filter, and the problem of minimizing the risk to legitimate mail is discussed.
     (2) A detailed analysis of the kernel covering algorithm, strengthened into a fuzzy kernel covering algorithm.
     Support vector machines achieve excellent classification performance by mapping samples into a high-dimensional feature space and constructing an optimal separating hyperplane. Introducing kernel functions into the covering algorithm yields the kernel covering algorithm, which improves classification ability but still has shortcomings. This thesis analyzes the radius-selection strategy and the classification rule of the kernel covering algorithm in detail and points out the defects of the existing treatment. By changing the rule for determining the covering radius, and by introducing a new membership function that describes how strongly a rejected sample belongs to each class, the algorithm is strengthened into a fuzzy kernel covering algorithm, and the physical meaning of the membership function is made explicit. Several covering-reduction methods with different characteristics are introduced; combined with the fuzzy kernel covering algorithm, they reduce the number of coverings and improve classification efficiency while preserving recognition performance. Tests and comparisons on several data sets show the effectiveness of the approach.
     (3) A covering algorithm for multi-label learning.
     In traditional learning problems each sample belongs to exactly one class, i.e., carries a single label. In practice a sample may belong to several classes at once, as in text categorization and scene classification. This thesis studies the two standard strategies for multi-label learning, decomposing the sample set and adapting the algorithm, and, guided by the characteristics and evaluation metrics of multi-label learning, shows how the covering algorithm can solve multi-label problems. Experiments show that the multi-label covering algorithm matches the state of the art among comparable algorithms while costing less time and space. Because labelling multi-label training data takes extra human effort, labelled samples are usually scarce. To exploit the abundant unlabelled samples, the thesis adopts the self-training strategy from semi-supervised learning, training the classifier on labelled and unlabelled samples together, which improves classification performance.
     (4) Using the covering algorithm for classification in multi-instance learning.
     Multi-instance learning differs from traditional supervised, unsupervised, and reinforcement learning and is regarded as a fourth learning framework; it originated in research on predicting the activity of drug molecules. In multi-instance learning the object of learning is a bag of instances: the bag's label is known, the instances' labels are not, yet the bag's label is determined by certain instances. Multi-instance learning is harder than supervised learning with noise. This thesis studies the existing multi-instance methods, investigates how to apply the covering algorithm to them, and gives several multi-instance covering algorithms based on different solution ideas, with performance on par with most comparable methods. Multi-instance multi-label learning combines the two problems and is the most general classification setting, able to express the ambiguity in both the input and the output space. The thesis discusses how to combine the covering algorithm with other methods to attack this problem and gives a preliminary solution.
     The main contributions of this thesis are:
     (1) Applying the covering algorithm to practical classification problems such as text categorization and spam filtering, with adjustment strategies tailored to each application to improve classifier performance.
     (2) A new strategy for determining the covering radius in the kernel covering algorithm; a new membership function for rejected samples, with a clear physical interpretation; the strengthening of the kernel covering algorithm into a fuzzy kernel covering algorithm; and several covering-reduction methods built on it.
     (3) Extending the covering algorithm to multi-label learning: combining sample-set decomposition with algorithm adaptation yields a multi-label covering algorithm whose performance reaches the level of comparable state-of-the-art algorithms; guided by self-training, a semi-supervised multi-label covering algorithm is also proposed.
     (4) Extending the covering algorithm to multi-instance learning with several algorithms following different ideas, and a preliminary approach to multi-instance multi-label learning.
Machine learning studies how to acquire knowledge and rules from known data in order to build predictive models for unseen problems. It simulates human learning behavior and improves itself through continued learning. After years of research, many outstanding learning methods, such as support vector machines, decision trees, and neural networks, have been proposed and applied across machine learning. Chinese scholars have done substantial work on covering-based learning methods, among which the covering algorithm based on constructive learning, proposed by Zhang Ling and Zhang Bo, is a good representative.
     The covering algorithm constructs neural networks from the samples' own characteristics and overcomes some general drawbacks of traditional neural networks, such as slow learning and the difficulty of determining the network structure. It is straightforward, handles multi-class classification and large-scale data effectively, and performs well in many real applications. Much research has gone into improving the method and applying it to different domains. Existing work focuses on single-instance, single-label problems and cannot address several newer learning settings. This dissertation extends the covering algorithm in the following ways:
     (1) It surveys the covering algorithm comprehensively and applies it to real classification problems.
     This dissertation comprehensively reviews the basic learning model of the covering algorithm and recent progress in its theory and application, and applies the algorithm to text categorization and spam filtering, with different strategies proposed for the specifics of each task. In text categorization, dimension regulation is introduced so that different text categories are evenly represented in the feature vector, which improves precision. In spam filtering, the extra information attached to each email is combined with the body text into a compound feature to improve accuracy; how to minimize the risk of filtering out legitimate emails is also discussed.
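The abstract does not spell out the dimension-regulation rule; one plausible reading is an equal per-class quota of features, so that large categories cannot dominate the vocabulary. The sketch below is a hypothetical illustration under that assumption (names and the quota rule are mine, not the thesis's):

```python
from collections import Counter

def balanced_vocabulary(docs, labels, per_class=2):
    # Instead of taking the globally most frequent terms, take an equal
    # quota of top terms from every category, so each class contributes
    # the same number of dimensions to the feature vector.
    vocab = []
    for lab in sorted(set(labels)):
        counts = Counter()
        for doc, l in zip(docs, labels):
            if l == lab:
                counts.update(doc.split())
        added = 0
        for term, _ in counts.most_common():
            if term not in vocab:
                vocab.append(term)
                added += 1
            if added == per_class:
                break
    return vocab

def vectorize(doc, vocab):
    # Plain term-frequency vector over the balanced vocabulary.
    words = doc.split()
    return [words.count(term) for term in vocab]
```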
     (2) It analyzes the kernel covering algorithm and extends it to a fuzzy kernel covering algorithm.
     Support vector machines map samples into a high-dimensional space to construct an optimal separating hyperplane and achieve excellent performance. The kernel covering algorithm brings kernel functions into the covering algorithm and improves accuracy effectively, but drawbacks remain. This dissertation analyzes how the proximity principle used to judge rejected points affects the classifier, and proposes the fuzzy kernel covering algorithm (FKCA) to improve performance. Its main improvements are a changed radius-selection rule and the introduction of a membership function, whose physical interpretation is also discussed. Several reduction methods are introduced that keep the number of coverings down while maintaining recognition performance. Experiments show that these methods are effective.
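To make the membership idea concrete, one simple choice grades a rejected sample by how far it lies beyond each covering's boundary. This is an assumption for exposition, not necessarily the thesis's exact function:

```python
import numpy as np

def fuzzy_membership(coverings, x):
    # Each covering is a (center, radius, label) triple.  A sample inside
    # a covering has membership 1 to that covering's class; a rejected
    # sample (outside every covering) gets a membership in (0, 1) that
    # shrinks as its distance beyond the boundary grows.  Per-class
    # membership is the maximum over that class's coverings.
    mu = {}
    for center, radius, label in coverings:
        d = float(np.linalg.norm(x - center))
        m = 1.0 if d <= radius else radius / d
        mu[label] = max(mu.get(label, 0.0), m)
    return mu

def classify(coverings, x):
    # Assign the class with the largest membership degree.
    mu = fuzzy_membership(coverings, x)
    return max(mu, key=mu.get)
```

Under this choice the membership has a direct physical reading: it is the ratio of the covering radius to the sample's distance from the center, i.e. how close the sample is to being covered.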
     (3) It studies a covering algorithm for multi-label learning.
     In classic machine learning each sample belongs to a single category, i.e., carries one label. In the real world a sample may belong to several categories, as in text categorization and scene classification. This dissertation studies sample-set decomposition and algorithm adaptation, and explores applying the covering algorithm to multi-label learning. Experiments show that the multi-label covering algorithm performs on par with other multi-label algorithms and has lower time and space costs. Because labelling multi-label training data takes more effort, much of a training set is typically unlabelled. To overcome this, we adopt semi-supervised self-training, which improves accuracy and works well.
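The self-training strategy itself is generic: train on the labelled pool, label the unlabelled pool, and absorb the confident predictions. A minimal sketch with a pluggable base learner follows; the thesis uses the multi-label covering algorithm as the base, whereas this illustration substitutes a nearest-centroid stand-in, and all names are assumptions:

```python
import numpy as np

def self_train(X_lab, y_lab, X_unlab, fit, predict_conf,
               threshold=0.8, rounds=5):
    # Generic self-training loop.  Each round: train on the labelled
    # pool, predict the unlabelled pool, and move predictions whose
    # confidence reaches the threshold into the labelled pool.
    X_lab, y_lab, pool = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    for _ in range(rounds):
        if len(pool) == 0:
            break
        model = fit(X_lab, y_lab)
        labels, conf = predict_conf(model, pool)
        keep = conf >= threshold
        if not keep.any():
            break
        X_lab = np.vstack([X_lab, pool[keep]])
        y_lab = np.concatenate([y_lab, labels[keep]])
        pool = pool[~keep]
    return fit(X_lab, y_lab)

def fit_centroid(X, y):
    # Toy base learner: one centroid per class.
    return {lab: X[y == lab].mean(axis=0) for lab in np.unique(y)}

def predict_conf(model, X):
    labs = np.array(sorted(model))
    D = np.stack([np.linalg.norm(X - model[l], axis=1) for l in labs],
                 axis=1)
    srt = np.sort(D, axis=1)
    # Confidence: relative margin between the two nearest centroids.
    conf = 1.0 - srt[:, 0] / (srt[:, 1] + 1e-9)
    return labs[D.argmin(axis=1)], conf
```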
     (4) It discusses how to extend the covering algorithm to multi-instance learning.
     Multi-instance learning differs from traditional supervised, unsupervised, and reinforcement learning and is regarded as the fourth learning framework; it originated in predicting the activity of drug molecules. The object of learning is a bag of instances: the labels of bags are known, the labels of instances are unknown, and a bag's label is determined by certain of its instances. Multi-instance learning is even harder than supervised learning with noise. This dissertation explores applying the covering algorithm to multi-instance learning and proposes several algorithms with comparable performance. We also discuss combining the covering algorithm with other methods to solve multi-instance multi-label learning and present an initial solution.
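For context on what "learning over bags" means, a common multi-instance baseline classifies a query bag by its nearest training bag under the minimal Hausdorff distance. This is background illustration only, not one of the thesis's covering-based algorithms:

```python
import numpy as np

def min_hausdorff(A, B):
    # Minimal Hausdorff distance between bags A and B (arrays of instance
    # vectors): the smallest instance-to-instance Euclidean distance.
    return min(float(np.linalg.norm(a - b)) for a in A for b in B)

def mi_nearest_bag(train_bags, train_labels, bag):
    # 1-NN over bags: give the query bag the label of its nearest
    # training bag under the bag-level distance above.
    dists = [min_hausdorff(bag, B) for B in train_bags]
    return train_labels[int(np.argmin(dists))]
```

Because the distance is taken over whole bags, instance labels are never needed, which is exactly the constraint the multi-instance setting imposes.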
     The innovations of this dissertation are as follows:
     (1) The covering algorithm is applied to text categorization and spam filtering, with different strategies for each application to improve overall performance.
     (2) It presents a new method for determining the covering radius, introduces a new membership function for rejected samples with a physical explanation, extends the kernel covering algorithm to a fuzzy kernel covering algorithm, and presents several reduction methods.
     (3) It extends the covering algorithm to multi-label learning; the proposed multi-label covering algorithm performs on par with other well-known work, and a multi-label covering algorithm based on semi-supervised learning is also proposed.
     (4) It extends the covering algorithm to multi-instance learning with several methods, and provides an initial solution to the multi-instance multi-label learning problem.
