Research on Medical Image Clustering Based on Gaussian Mixture Density Model

English title: Research on Medical Image Clustering Based on Gaussian Mixture Density Model
Author: Wang Chunhong (王春红)
Degree level: Master's
Discipline: Pattern Recognition and Intelligent Systems
Keywords: medical images; Gaussian mixture density model; expectation-maximization (EM) algorithm; QAIC criterion; ant colony k-means algorithm; data weighting
Degree year: 2008
Supervisor: Song Yuqing (宋余庆)
Discipline code: 081104
Degree-granting institution: Jiangsu University
Thesis submission date: 2008-11-01
Chair of the defense committee: Zhan Yongzhao (詹永照)

Abstract

Image clustering has become a key technique in image recognition. Medical image recognition is an important part of medical image analysis and understanding and plays an important role in clinical diagnosis, so clustering algorithms suited to image recognition are well worth studying. Existing medical image clustering algorithms have not yet achieved satisfactory recognition results and cannot fully meet the requirements of medical image analysis and understanding. This thesis studies clustering methods and algorithms based on the Gaussian mixture density model that are suited to medical image recognition.
     The thesis studies the Gaussian mixture density model and the clustering methods built on it, constructs a Gaussian mixture density model of medical images, and proposes an EM-based algorithm for estimating the parameters of that model, together with improvements to it. The main work covers the following five aspects:
     (1) It systematically reviews the theory and methods of parametric and nonparametric estimation of probability density functions, and explains in particular that parameter estimation based on the Gaussian mixture density model is a form of semiparametric estimation. It finds that clustering algorithms based on the Gaussian mixture density model are suitable for cluster analysis of medical images.
     (2) For the model selection problem, it proposes an improved QAIC criterion function. Theory and experiments show that this function is suitable for determining the number of components in the Gaussian mixture density model of a medical image. A trial-and-error (heuristic) method was used to verify the correctness of the improved QAIC criterion function.
     (3) It studies the application of the Gaussian mixture density model to medical images and proposes a method for describing the distribution of medical image data based on the model.
     (4) Because the parameter estimates are sensitive to the k-means initialization of the Gaussian mixture density model parameters, it proposes a k-means algorithm improved with the ant colony algorithm and applies it to determine the initial values of the model. Experiments show that the improved initialization algorithm produces better clustering results on medical images.
     (5) Observing that the feature vector of each pixel in a medical image contributes to the Gaussian mixture model to a different degree, it proposes a weighted Gaussian mixture density model of medical images and a medical image clustering algorithm based on that weighted model.
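For readers unfamiliar with the techniques named in the abstract, the following is a minimal sketch of EM parameter estimation for a one-dimensional Gaussian mixture, with optional per-sample weights and a criterion-based comparison of component counts. It is not the thesis's algorithm. Plain AIC stands in for the improved QAIC criterion, whose form is not given in this record. The per-sample weights are one generic way to weight data in EM, and a simple quantile-based initialization replaces the ant colony k-means step. The names `gaussian_pdf`, `fit_gmm`, and `aic` and the synthetic gray-level data are all hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm(data, k, weights=None, iters=100, var_floor=1e-6):
    """Fit a k-component 1-D Gaussian mixture by (optionally weighted) EM.

    Returns (mixing_coeffs, means, variances, weighted_log_likelihood).
    """
    n = len(data)
    w = weights if weights is not None else [1.0] * n
    total_w = sum(w)
    # Assumed initialization: spread the means over quantiles of the data.
    ordered = sorted(data)
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    mean_all = sum(wi * x for wi, x in zip(w, data)) / total_w
    var_all = sum(wi * (x - mean_all) ** 2 for wi, x in zip(w, data)) / total_w
    variances = [max(var_all, var_floor)] * k
    pis = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [pis[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            s = sum(p) or 1e-300
            resp.append([pj / s for pj in p])
        # M-step: weighted updates of mixing coefficients, means, variances.
        for j in range(k):
            nj = sum(w[i] * resp[i][j] for i in range(n)) or 1e-300
            pis[j] = nj / total_w
            means[j] = sum(w[i] * resp[i][j] * data[i] for i in range(n)) / nj
            spread = sum(w[i] * resp[i][j] * (data[i] - means[j]) ** 2 for i in range(n))
            variances[j] = max(spread / nj, var_floor)

    # Weighted log-likelihood of the data under the fitted mixture.
    loglik = 0.0
    for i in range(n):
        mix = sum(pis[j] * gaussian_pdf(data[i], means[j], variances[j]) for j in range(k))
        loglik += w[i] * math.log(mix or 1e-300)
    return pis, means, variances, loglik


def aic(loglik, k):
    """Plain AIC; a 1-D mixture with k components has 3k - 1 free parameters."""
    return -2.0 * loglik + 2.0 * (3 * k - 1)


if __name__ == "__main__":
    # Synthetic "gray levels" drawn from two tissue-like intensity classes.
    random.seed(0)
    pixels = [random.gauss(50, 10) for _ in range(300)]
    pixels += [random.gauss(150, 10) for _ in range(300)]
    for k in (1, 2, 3):
        pis, means, variances, ll = fit_gmm(pixels, k)
        print(k, round(aic(ll, k), 1), [round(m, 1) for m in means])
```

Each pixel would then be assigned to the component with the largest responsibility. A lower criterion value indicates a better trade-off between fit and model size, although plain AIC is known to favor extra components at times, which is one reason for studying modified criteria.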

