Strategy of improving the goodness of fit of the regression model (Ⅳ): the optimal scoring transformation and the other variable transformations
  • Chinese title: 提高回归模型拟合优度的策略(Ⅳ)——优化计分变换与其他变量变换
  • Author: Hu Liangping (胡良平)
  • Affiliations: Graduate School, Academy of Military Medical Sciences, PLA, China; Specialty Committee of Clinical Scientific Research Statistics, World Federation of Chinese Medicine Societies
  • Keywords: optimal scoring transformation; monotonic transformation; spline transformation; Box-Cox transformation
  • Journal: Sichuan Mental Health (四川精神卫生; journal code WANT)
  • Publication date: 2019-02-25
  • Publisher: Sichuan Mental Health
  • Year: 2019
  • Volume: 32
  • Issue: 01
  • Pages: 30-37
  • Page count: 8
  • CN: 51-1457/R
  • Funding: National High Technology Research and Development Program of China (grant 2015AA020102)
  • Language: Chinese
  • Record ID: WANT201901007
Abstract
This paper introduces the fourth strategy for improving the goodness of fit of regression models: the optimal scoring transformation combined with other variable transformations. The approach was as follows: ① the optimal scoring transformation was applied to the multi-valued nominal independent variable; ② the monotonic transformation and the optimal scoring transformation were each applied to the multi-valued ordinal independent variable; ③ the spline transformation and the monotonic spline transformation were each applied to the quantitative independent variables; ④ the spline transformation, the monotonic spline transformation, and the Box-Cox transformation were each applied to the quantitative dependent variable. Combining these options (1 × 2 × 2 × 3) gave 12 sets of variable transformations, and 12 multiple nonlinear regression models were built accordingly. Based on the goodness-of-fit indices, one model was selected from the 12, referred to in the article as "Model 1", with root MSE = 0.30935, R² = 0.9586, and adjusted R² = 0.9527. Taken together with the results and conclusions of the companion articles in the journal's research-methods series, the strategies for improving the goodness of fit of a regression model come down to four points: ① appropriate variable transformations should be applied to the quantitative dependent variable, the quantitative independent variables, and the multi-valued ordinal independent variables; ② as many derived variables as possible should be introduced when fitting the regression model; ③ the regression model should be assumed to contain no intercept term; ④ as many variable-selection strategies as possible should be used when building the regression model, such as forward selection, backward selection, and stepwise selection.
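As a rough illustration of two ideas named in the abstract, the sketch below applies a Box-Cox transformation to a dependent variable and reports the same three goodness-of-fit indices (root MSE, R², adjusted R²). It is not the article's analysis: the data are hypothetical, the model is a single-predictor least-squares line with an intercept, λ is fixed by hand rather than estimated, and the helper names `box_cox`, `fit_line`, and `goodness_of_fit` are invented for this example.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform: (y**lam - 1) / lam, or log(y) when lam == 0."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def goodness_of_fit(observed, fitted, n_predictors):
    """Return (root MSE, R2, adjusted R2) for a model that includes an intercept."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    df_error = n - n_predictors - 1
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / df_error
    return math.sqrt(sse / df_error), r2, adj_r2


# Hypothetical data: y grows exponentially in x, so log(y) is linear in x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [math.exp(0.5 * v) for v in x]

for label, response in (("raw y", y), ("Box-Cox, lambda = 0", box_cox(y, 0))):
    intercept, slope = fit_line(x, response)
    fitted = [intercept + slope * v for v in x]
    root_mse, r2, adj_r2 = goodness_of_fit(response, fitted, n_predictors=1)
    print(f"{label}: root MSE={root_mse:.5f}, R2={r2:.4f}, adjusted R2={adj_r2:.4f}")
```

Each set of indices is computed on that model's own response scale, so root MSE values are not directly comparable across transformations. The formulas here also assume an intercept; a no-intercept model, as recommended in the abstract, is usually scored against the uncorrected total sum of squares instead.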
