Applications of Natural Nearest Neighbors in Learning the Structure of High-Dimensional Data
Abstract
The analysis and study of the structure of large-scale, high-dimensional data has long been one of the central research topics in data mining, machine learning, and pattern recognition, and it remains one of the hard problems that modern science and technology must confront. The structural features of high-dimensional data consist mainly of clustering structure and manifold structure; the methods used to study them draw on several branches of mathematics, such as multivariate analysis, nonlinear functional analysis, and statistics, and the search for simple yet effective methods has been a constant pursuit. Learning the structure of high-dimensional data is therefore of great value.
     This thesis proposes a new nearest-neighbor concept, the natural nearest neighbor (Natural Nearest Neighbor, 3N). It is a scale-free notion of nearest neighbor whose core idea is that the most outlying data object has only one nearest neighbor, or equivalently the lowest energy, while objects in dense regions have more neighbors, or higher energy; moreover, the natural nearest neighbors of all objects can be computed fully automatically. This is the main contribution and innovation of the thesis. Connecting each data object to its natural nearest neighbors automatically yields an adaptive nearest-neighborhood graph (ANNG), or data network; in the language of complex networks it is likewise a scale-free network. This graph represents the relationships among data objects well and makes the density information of each object explicit. The mechanism by which natural nearest neighbors form effectively reduces the risk of crossing class boundaries while searching for neighbors, so the agglomerative state of data with clustering structure is preserved and the underlying infrastructure of the clustered data emerges naturally, providing a sparse, parameter-free base graph model for subsequent clustering and manifold analysis. Applying this adaptive neighborhood graph to manifold learning algorithms such as Isomap, LLE, HLLE, and LTSA yields corresponding adaptive variants with no free parameters, avoiding the neighborhood-size selection problem of traditional manifold learning; the reason is that the neighborhood graph formed by natural nearest neighbors approximates the low-dimensional embedded data manifold well. This offers a new perspective on dimensionality reduction, one of the two major topics in machine learning.
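The automatic computation itself is not detailed above, but the stated behavior (the search widens until even the most outlying object has been chosen as someone's neighbor) suggests a round-by-round reverse-nearest-neighbor loop. The Python sketch below is one plausible realization under that assumption; the function names and the termination rule are illustrative, not the thesis's actual procedure.

```python
# A minimal sketch of a natural-nearest-neighbor search, assuming the
# "automatic" computation widens the k-NN search one rank at a time until
# every point has at least one reverse nearest neighbor.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def natural_neighbors(X):
    """Return each point's neighbor set, its natural-neighbor count
    (how often it was chosen by others), and the terminating rank r."""
    n = X.shape[0]
    # One full neighbor ordering up front (quadratic memory, fine for a
    # sketch); round r then just reads column r.
    _, order = NearestNeighbors().fit(X).kneighbors(X, n_neighbors=n)
    neighbors = [set() for _ in range(n)]   # forward neighbor sets
    chosen = np.zeros(n, dtype=int)         # reverse-neighbor counts
    for r in range(1, n):                   # order[:, 0] is the point itself
        for i in range(n):
            j = order[i, r]                 # i's r-th nearest neighbor
            neighbors[i].add(j)
            chosen[j] += 1
        if chosen.min() > 0:                # even the farthest outlier has
            return neighbors, chosen, r     # been reached: stop searching
    return neighbors, chosen, n - 1
```

Under this reading, `chosen` plays the role of the natural-neighbor count whose histogram, per contribution (1) below, reveals the distributional state of the data independently of its dimension: outliers end up with a count of one, dense points with many.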
     This thesis also studies, from a unified clustering viewpoint, the relationship between the eigen-subspace structure of outlying data objects and the eigen-subspace structure of clusters, and it offers a neighbor-based perspective for observing the behavior of outliers.
     The main innovations and contributions of this thesis are as follows.
     (1) The concept of the natural nearest neighbor (3N) is proposed, together with a very simple method for computing it. The soundness of the concept is validated on standard distributions (uniform and normal) and on regular data sets such as grid data. Compared with the k-nearest neighbor and the ε-nearest neighbor, it carries richer information, such as density, outlyingness, and structural approximation. The distributional state of a data set can be observed from the distribution (histogram) of natural-neighbor counts, and this distribution is independent of the data's high dimensionality.
     (2) The adaptive neighborhood graph is applied to representative manifold learning algorithms, namely the global algorithm Isomap and the local algorithms LLE, HLLE, and LTSA, yielding the adaptive algorithms 3N-Isomap, 3N-LLE, and 3N-LTSA with no free parameters; this settles the decade-old question in manifold learning of how to choose the free neighborhood parameter, so that anyone can apply these algorithms to data in their own field without that burden (see the pipeline sketch after this list). The 3N-Isomap algorithm is applied to three practical problems, and an automatic multi-manifold learning algorithm, a large-scale manifold learning algorithm (realized through a generic simple random-sampling scheme), and a method for analyzing the complexity of homogeneous data are proposed.
     (3) The adaptive neighborhood graph is applied to spectral graph clustering algorithms such as MNcut, giving an improved algorithm, 3N-MNcut, that outperforms the original: the ANNG induces a well-formed similarity matrix that is sparse and preserves local manifold structure (a spectral-clustering sketch also follows this list).
     (4) It is shown that the eigen-subspace structure of outliers and that of clusters share the same eigen-dependence: both depend on the largest eigenvalues and the corresponding eigen-subspaces. A new outlier index is proposed with which the dynamic behavior of outliers can be observed from neighbors at different distances.
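To make contribution (2) concrete, the following is a hedged sketch of how an adaptive neighborhood graph can replace the fixed k in the standard Isomap pipeline (graph construction, geodesic distances, classical MDS). It reuses the hypothetical `natural_neighbors` above and is a generic reconstruction, not the thesis's 3N-Isomap implementation; it also assumes the resulting graph is connected, as Isomap requires.

```python
# Sketch: a parameter-free Isomap-style embedding on the adaptive graph.
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import shortest_path

def isomap_3n(X, d=2):
    neighbors, _, _ = natural_neighbors(X)   # adaptive neighbor sets
    n = X.shape[0]
    W = lil_matrix((n, n))
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:                       # symmetric weighted edges
            W[i, j] = W[j, i] = np.linalg.norm(X[i] - X[j])
    # Geodesic (shortest-path) distances along the graph.
    G = shortest_path(W.tocsr(), method="D", directed=False)
    # Classical MDS on the geodesic distance matrix.
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * H @ (G ** 2) @ H
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:d]         # d largest eigenvalues
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0))
```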
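Similarly for contribution (3): the ANNG is said to induce a sparse, structure-preserving similarity matrix for MNcut. The sketch below shows generic normalized spectral clustering on such a matrix (symmetric normalization, whose eigenvectors match the random-walk matrix used by multiway normalized cuts up to a rescaling); it is not the thesis's exact 3N-MNcut, and the Gaussian similarity and its scale parameter are my assumptions.

```python
# Sketch: normalized spectral clustering on an ANNG-based similarity matrix.
import numpy as np
from sklearn.cluster import KMeans

def spectral_on_anng(X, k, sigma=1.0):
    neighbors, _, _ = natural_neighbors(X)   # adaptive neighbor sets
    n = X.shape[0]
    W = np.zeros((n, n))
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:                       # Gaussian similarity on edges only,
            s = np.exp(-np.sum((X[i] - X[j]) ** 2) / (2 * sigma ** 2))
            W[i, j] = W[j, i] = s            # so W stays sparse in structure
    deg = np.maximum(W.sum(axis=1), 1e-12)
    D_half = np.diag(1.0 / np.sqrt(deg))
    M = D_half @ W @ D_half                  # symmetrically normalized affinity
    vals, vecs = np.linalg.eigh(M)
    U = vecs[:, np.argsort(vals)[::-1][:k]]  # k leading eigenvectors
    U /= np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)
```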