Research on Discretization Methods for Continuous Data
Abstract
With the explosive growth of data volume and the rapid development of information technology, data mining and machine learning have become active research areas. Real-world data often take continuous attribute values, yet many classification algorithms in data mining and machine learning only accept data with discrete attribute values. Data with continuous attribute values must therefore be discretized; otherwise, these classification algorithms cannot work properly. To address this problem, this dissertation systematically analyzes existing discretization methods for continuous data and studies them in depth from aspects such as the discretization criterion. The main contributions are as follows:
     (1) A bottom-up discretization method combining single-attribute and multi-attribute information is proposed. It considers the correlations among attributes while comprehensively measuring the differences between adjacent interval pairs, so as to find the best intervals to merge. First, we propose a discretization criterion that combines single-attribute and multi-attribute information, derived from the minimum description length principle and the significance of adjacent interval pairs of a continuous attribute, and analyze its advantages theoretically. Based on this criterion, we then develop a heuristic bottom-up discretization algorithm that searches for the optimal discretization result. Finally, experiments on UCI data sets show that, compared with existing discretization methods, the proposed method significantly improves the learning accuracy of the C4.5 decision tree and the support vector machine classifier.
     (2) A discretization method for high-dimensional data based on a nonlinear dimension reduction technique is proposed, which effectively handles the discretization of high-dimensional nonlinear data. First, we propose a locally linear embedding algorithm based on local neighborhood optimization, which maps high-dimensional data into a low-dimensional space while preserving the geometric structure of the original data, overcoming the tendency of this structure to be distorted during the mapping. Second, we propose an area-based chi-square discretization algorithm, which considers the probability that each pair of adjacent intervals should be merged and effectively discretizes every continuous attribute in the low-dimensional space. Experimental results show that the method produces better discretization results and more concise knowledge, and improves classifier learning accuracy. The method has also been applied to computer vision and image classification with good results.
     (3) A data discretization method based on an improved chi-square statistic is proposed, which improves the quality of discretization methods based on statistical independence. First, we analyze the shortcomings of how the degrees of freedom are selected in the chi-square function and give a corrected selection scheme. Second, based on characteristics of the data such as the class distribution, we propose an improved scheme for the expected frequencies, which overcomes the defect of assigning the same expected frequency to different data sets and improves the accuracy of the chi-square computation. Experimental results show that the improved method yields higher class-attribute interdependence redundancy values and significantly improves the learning accuracy of the C4.5 decision tree and naive Bayes classifiers.
With the explosive growth of the amount of data and the rapid development of information technology, data mining and machine learning have become hot research topics. A large amount of real-world data carries continuous attribute values, yet many classification algorithms in data mining and machine learning can only be applied to data with discrete attribute values. Data with continuous attribute values must therefore be discretized; otherwise, these classification algorithms do not work properly. To solve this problem, we systematically analyze existing discretization methods for continuous data and study them in depth from aspects such as the discretization criterion. The main contributions of this dissertation are summarized as follows:
     (1) A combined single-attribute and multi-attribute bottom-up discretization method is proposed. It not only considers the correlations among the attributes but also synthetically evaluates the differences among adjacent interval pairs, aiming to find the best intervals to merge. First, we propose a combined single-attribute and multi-attribute discretization criterion derived from the minimum description length principle and the significance of adjacent interval pairs of continuous attributes, and we analyze the advantage of this criterion. Furthermore, we develop a heuristic bottom-up discretization algorithm that finds the optimal discretization result based on the criterion. Finally, experiments on UCI data sets show that the proposed method significantly improves the learning accuracy of the C4.5 decision tree and the support vector machine classifier compared with existing discretization methods. (An illustrative sketch of the bottom-up merging loop appears after this summary.)
     (2) A discretization method for high-dimensional data based on a nonlinear dimension reduction technique is proposed. It solves the discretization problem of high-dimensional data. First, we propose a locally linear embedding algorithm based on local neighborhood optimization, which maps high-dimensional data into a low-dimensional space while preserving the geometric structure of the original data; it overcomes the deficiency that this structure is easily distorted during the mapping. Second, we propose an area-based chi-square discretization algorithm, which effectively discretizes each continuous attribute in the low-dimensional space by considering, from a probabilistic viewpoint, the possibility that each interval pair should be merged. The experimental results show that the proposed method yields better discretization results and more concise knowledge of the data, and improves the learning accuracy of classifiers. In addition, the method has been applied to computer vision and image classification and achieves good results. (A sketch of the two-step reduce-then-discretize pipeline appears after this summary.)
     (3) A data discretization method based on an improved chi-square statistic is proposed. It improves the quality of discretization methods based on statistical independence. First, we analyze the deficiency in how the degrees of freedom are selected in the chi-square function and give a modified selection scheme. Second, we propose an improved scheme for the expected frequencies based on the class distribution of the data, which overcomes the deficiency that different data sets are assigned the same expected frequencies and improves the accuracy of the chi-square calculation. The experimental results show that the improved method generates higher class-attribute interdependence redundancy values and significantly improves the learning accuracy of the C4.5 decision tree and naive Bayes classifiers. (A sketch of the baseline chi-square calculation that this contribution modifies appears after this summary.)
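To make the bottom-up framework in contribution (1) concrete, the following is a minimal, illustrative Python sketch of a generic bottom-up merging loop. It uses the plain chi-square similarity of adjacent intervals as a stand-in for the dissertation's combined single-attribute/multi-attribute MDL-based criterion, which is not reproduced here; all function names and parameter values are hypothetical.

```python
import numpy as np

def chi2_of_pair(counts_a, counts_b):
    """Chi-square statistic for the class-count vectors of two adjacent intervals."""
    observed = np.vstack([counts_a, counts_b]).astype(float)
    row_tot = observed.sum(axis=1, keepdims=True)
    col_tot = observed.sum(axis=0, keepdims=True)
    expected = row_tot @ col_tot / observed.sum()
    expected[expected == 0] = 1e-9          # guard against empty classes in the pair
    return float(((observed - expected) ** 2 / expected).sum())

def bottom_up_discretize(values, labels, n_classes, max_intervals=6):
    """Generic bottom-up discretization of one continuous attribute.

    Starts with one interval per distinct value and repeatedly merges the
    adjacent pair whose class distributions look most alike (smallest
    chi-square), until only `max_intervals` remain.  `labels` must be
    integer class indices in 0..n_classes-1.  Returns the cut points.
    """
    order = np.argsort(values)
    values, labels = np.asarray(values)[order], np.asarray(labels)[order]
    boundaries, counts = [], []
    for v in np.unique(values):
        boundaries.append(v)
        counts.append(np.bincount(labels[values == v], minlength=n_classes))
    while len(counts) > max_intervals:
        chi = [chi2_of_pair(counts[i], counts[i + 1]) for i in range(len(counts) - 1)]
        i = int(np.argmin(chi))              # most similar adjacent pair is merged first
        counts[i] = counts[i] + counts[i + 1]
        del counts[i + 1], boundaries[i + 1]
    # Lower bound of every interval except the first serves as a cut point.
    return boundaries[1:]
```

In the dissertation's method, the `argmin(chi)` step would instead rank adjacent pairs by the combined single-attribute/multi-attribute criterion, and the stopping rule would follow the MDL-based test rather than a fixed interval count.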
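For contribution (2), the sketch below only illustrates the overall two-step pipeline (nonlinear dimension reduction followed by per-attribute discretization). scikit-learn's standard LocallyLinearEmbedding and an equal-frequency KBinsDiscretizer serve as stand-ins for the dissertation's locally optimized embedding and area-based chi-square algorithm, respectively; the data set and parameter values are arbitrary.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.preprocessing import KBinsDiscretizer

X, _ = load_digits(return_X_y=True)          # 64-dimensional continuous features

# Step 1: map the high-dimensional data into a low-dimensional space while
# trying to preserve local geometric structure (standard LLE as a stand-in).
embedding = LocallyLinearEmbedding(n_neighbors=12, n_components=5)
X_low = embedding.fit_transform(X)

# Step 2: discretize each continuous attribute of the low-dimensional data
# (equal-frequency binning as a stand-in for the area-based chi-square step).
discretizer = KBinsDiscretizer(n_bins=6, encode="ordinal", strategy="quantile")
X_discrete = discretizer.fit_transform(X_low)

print(X_discrete[:3])                        # each column now holds interval indices 0..5
```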
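Contribution (3) modifies how the degrees of freedom and the expected frequencies enter the chi-square statistic. As a reference point, this sketch shows the standard textbook calculation for a two-interval contingency table, which is the baseline the dissertation's corrections apply to; the table values are made up.

```python
import numpy as np
from scipy.stats import chi2

# Contingency table: rows = two adjacent intervals, columns = class labels.
observed = np.array([[10, 2, 0],
                     [ 3, 8, 1]], dtype=float)

row_tot = observed.sum(axis=1, keepdims=True)
col_tot = observed.sum(axis=0, keepdims=True)

expected = row_tot @ col_tot / observed.sum()      # standard expected frequencies
chi_square = ((observed - expected) ** 2 / expected).sum()

dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)   # standard degrees of freedom
threshold = chi2.ppf(0.95, dof)                    # critical value at significance level 0.05

print(f"chi-square = {chi_square:.3f}, dof = {dof}, merge if below {threshold:.3f}")
```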