Roughened Random Forests for Binary Classification
Details
  • Author: Xiong, Kuangnan
  • Degree: Ph.D.
  • Year: 2014
  • Keywords: Binary classification; High-dimensional microarray
  • Advisor: Yucel, Recai M.
  • Institution: State University of New York
  • Department: Biometry and Statistics
  • Subject: Biostatistics, Statistics, Computer science
  • ISBN: 9781303992742
  • CBH: 3624962
  • Country: USA
  • Language: English
  • FileSize: 4266295
  • Pages: 137
Abstract
Binary classification plays an important role in many decision-making processes. Random forests can build a strong ensemble classifier by combining weaker classification trees that are de-correlated. The strength and correlation among individual classification trees are the key factors that contribute to the ensemble performance of random forests. We propose roughened random forests, a new set of tools which show further improvement over random forests in binary classification. Roughened random forests modify the original dataset for each classification tree and further reduce the correlation among individual classification trees. This data modification process is composed of artificially imposing missing data that are missing completely at random and subsequent missing data imputation.
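
The data modification step described above (impose missing values completely at random, then impute them) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names `impose_mcar`, `impute_column_means`, and `roughen` are hypothetical, the 25% missing rate is arbitrary, and column-mean imputation is an assumption chosen only for brevity, since the abstract leaves the choice of imputation method open. Tree fitting is omitted; each tree of the ensemble would presumably be grown on its own roughened copy.

```python
import random


def impose_mcar(rows, rate, rng):
    """Return a copy of rows where each cell is independently replaced by
    None with probability `rate` (missing completely at random)."""
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def impute_column_means(rows):
    """Fill each None with the mean of the observed values in its column.

    Assumption: mean imputation is used here only as a simple stand-in.
    A column with no observed values falls back to 0.0 (arbitrary choice).
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


def roughen(rows, rate, seed):
    """Produce one roughened copy of the dataset: impose MCAR, then impute."""
    rng = random.Random(seed)
    return impute_column_means(impose_mcar(rows, rate, rng))


# Toy covariate matrix; one independently roughened copy per tree.
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
roughened_copies = [roughen(X, rate=0.25, seed=t) for t in range(3)]
```

Because every copy is perturbed with a different seed, the trees see slightly different versions of the same data, which is the mechanism the abstract credits for reducing correlation among trees.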
    
    
    Through this dissertation we aim to answer a few important questions in building roughened random forests: (1) What is the ideal rate of missing data to impose on the original dataset? (2) Should we impose missing data on both the training and testing datasets, or only on the training dataset? (3) What are the best missing data imputation methods to use in roughened random forests? (4) Do roughened random forests share the same ideal number of covariates selected at each tree node as the original random forests? (5) Can roughened random forests be used in medium- to high- dimensional datasets?
