Forecasting Hainan's Tourism and Golf Industries Based on RFA and Copula
Abstract
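Tourism is the core industry of Hainan Province's International Tourism Island. To plan tourism more scientifically (investment decisions in particular), improve asset utilization, and keep Hainan's tourism industry developing healthily and vigorously, the Hainan tourism market must be forecast accurately. There are two approaches: one forecasts from an analysis of the influencing factors (the various regression analyses); the other forecasts from the series' own pattern of development (time series). This thesis selects the key factors that affect the tourism market and, for the first time, applies the random forest algorithm (RFA) and the Copula method to forecast Hainan's tourism industry. Because Hainan is to shift gradually from being dominated by "sightseeing tourism" to being dominated by "leisure tourism", and golf tourism is an important factor in that shift, the thesis also applies, for the first time, a nonlinear time series method to forecast golf industry revenue. The details are as follows:
     First, the main factors affecting Hainan tourism are identified: national per capita disposable income of urban residents, fixed asset investment, the number of hotels in Hainan Province, highway mileage open to traffic, railway mileage open to traffic, civil aviation route mileage, and the number of golf holes. Their Kendall rank correlation coefficients and mutual information indices are then computed to judge how strongly each is related to tourism revenue. The random forest algorithm is used to fit the functional relationship between them, and the importance of each factor is computed. The results show that per capita disposable income has the greatest influence, followed by the number of hotels.
     Next, the most influential factors are selected and their joint probability distribution is fitted with the Copula method; the dependence structure is identified as a Frank Copula. Combined with the function fitted by RFA, the Monte Carlo method is used to forecast the probability distribution function of Hainan's tourism revenue, and the risk measures VaR and conditional VaR are computed. The results show that Hainan's 2012 tourism revenue has an 80% probability of exceeding 29.5 billion yuan and a 20% probability of exceeding 31 billion yuan.
     For the golf industry, the monthly revenue of the golf club at the Boao Forum for Asia International Conference Center in Hainan is taken as an example. Independence and nonlinearity tests are first run on the log-differenced data; the autocorrelation function, the partial autocorrelation function, and the AIC criterion are then used to select a differencing + ARMA(3,2) model; the parameters are estimated by maximum likelihood; and finally the mean and 95% confidence interval of the club's monthly revenue are forecast.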
Tourism is an important pillar industry of the International Tourism Island of Hainan. Market prediction is essential for tourism planning and design, especially for investment decisions. It also benefits the asset utilization ratio and the healthy and vigorous development of tourism. There are two ways to predict: one analyzes the influencing factors, the other finds the series' own laws. Key influencing factors were selected and, based on Random Forest and Copula, the Hainan travel industry was forecast. Golf plays an essential role in the transition from sightseeing to leisure tourism, so nonlinear time series analysis was applied to forecast the golf industry, as structured below:
     First, the national average disposable income of urban residents, fixed asset investment, number of hotels, road mileage, railway mileage, airline route mileage, and number of golf holes in Hainan were picked out. Rank correlation coefficients and mutual information indexes between Hainan tourism income and these factors were calculated to rank their dependence. The function relating them was fitted by Random Forest and variable importance was computed. The results showed that the national average disposable income has the most influence on tourism income, and the number of hotels came second.
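The rank-correlation step described above can be illustrated with a minimal sketch of Kendall's tau (the tau-a variant, which ignores tie corrections). The numbers below are made-up illustrative values, not the thesis data, and the variable names are hypothetical; the mutual information and random forest steps are not shown.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # tied pairs (sign == 0) count toward neither group
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical yearly values, for illustration only (NOT the thesis data).
disposable_income = [10.5, 11.8, 13.8, 15.8, 17.2, 19.1]
hotel_count = [200, 215, 210, 240, 260, 255]
tourism_revenue = [12.0, 14.1, 17.1, 19.2, 21.2, 25.8]

tau_income = kendall_tau(disposable_income, tourism_revenue)
tau_hotels = kendall_tau(hotel_count, tourism_revenue)
print(f"tau(income, revenue) = {tau_income:.3f}")
print(f"tau(hotels, revenue) = {tau_hotels:.3f}")
```

A factor whose tau is closer to 1 in absolute value moves more consistently with revenue in rank terms, which is one way to order the candidate factors before fitting a model.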
     Second, the Copula of the five key factors was fitted as a Frank Copula. Combined with the Random Forest function fitted between tourism income and the influencing factors, the probability distribution function of future income was simulated by Monte Carlo. VaR and conditional VaR were computed. Tourism income in Hainan in 2012 has an 80% probability of exceeding 29.5 billion yuan and a 20% probability of exceeding 31 billion yuan.
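The copula-plus-Monte-Carlo step can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's procedure: it uses two factors instead of five, hypothetical normal margins, a hypothetical dependence parameter `theta`, and a made-up linear function `fitted_response` standing in for the fitted random forest. The lower-tail convention for VaR and conditional VaR is also an assumption.

```python
import math
import random
from statistics import NormalDist, mean


def sample_frank(theta, rng):
    """Draw one (u, v) pair from a bivariate Frank copula (theta != 0)
    by inverting the conditional distribution of v given u."""
    u, w = rng.random(), rng.random()
    a = math.exp(-theta * u)
    v = -math.log(1.0 + w * (math.exp(-theta) - 1.0) / (a - w * (a - 1.0))) / theta
    return u, v


def clamp(p, eps=1e-12):
    """Keep a probability strictly inside (0, 1) so inv_cdf is defined."""
    return min(max(p, eps), 1.0 - eps)


def fitted_response(x1, x2):
    """Hypothetical stand-in for the fitted random forest (NOT the thesis model)."""
    return 100.0 + 0.01 * x1 + 0.2 * x2


def simulate(n, theta, seed=0):
    """Return n simulated revenues, sorted in ascending order."""
    rng = random.Random(seed)
    margin1 = NormalDist(mu=20000, sigma=1000)  # hypothetical margin of factor 1
    margin2 = NormalDist(mu=300, sigma=20)      # hypothetical margin of factor 2
    draws = []
    for _ in range(n):
        u, v = sample_frank(theta, rng)
        x1 = margin1.inv_cdf(clamp(u))
        x2 = margin2.inv_cdf(clamp(v))
        draws.append(fitted_response(x1, x2))
    return sorted(draws)


def lower_quantile(sorted_values, alpha):
    """Empirical alpha-quantile: the level exceeded with probability about 1 - alpha."""
    idx = max(0, math.ceil(alpha * len(sorted_values)) - 1)
    return sorted_values[idx]


def lower_tail_mean(sorted_values, alpha):
    """Mean of the worst alpha share of outcomes (a conditional-VaR style measure)."""
    k = max(1, math.ceil(alpha * len(sorted_values)))
    return mean(sorted_values[:k])


revenue = simulate(n=20000, theta=5.0)
var_20 = lower_quantile(revenue, 0.20)
cvar_20 = lower_tail_mean(revenue, 0.20)
q80 = lower_quantile(revenue, 0.80)
print(f"revenue exceeds {var_20:.1f} with about 80% probability")
print(f"revenue exceeds {q80:.1f} with about 20% probability")
print(f"mean of the worst 20% of outcomes: {cvar_20:.1f}")
```

Statements of the form "80% probability of exceeding X" correspond to reading the 20% quantile off the simulated distribution, which is what `lower_quantile` does here.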
     Third, the golf club at the BFA International Convention Center was taken as an example for forecasting the golf industry. Independence and nonlinearity tests were run on the log-differenced data. ARMA(3,2) was chosen as the final model using the autocorrelation function, the partial autocorrelation function, and the AIC criterion. Parameters were estimated by maximum likelihood. The mean and 95% confidence interval of monthly revenue at the Boao golf club were predicted.
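The log-difference transformation and the ARMA(3,2) forecasting recursion can be sketched as below. The monthly revenue figures and the coefficients `phi`, `theta`, and `const` are hypothetical values chosen for illustration; they are not estimated here (the thesis uses maximum likelihood) and are not the thesis's results. Pre-sample values are treated as zero, and the confidence interval is not computed.

```python
import math


def log_diff(series):
    """First difference of the logarithm: log(x_t) - log(x_{t-1})."""
    return [math.log(b) - math.log(a) for a, b in zip(series, series[1:])]


def arma_forecast(x, phi, theta, const, steps):
    """Point forecasts from x_t = const + sum(phi_i * x_{t-i}) + e_t + sum(theta_j * e_{t-j}).

    Residuals are rebuilt recursively with pre-sample terms set to zero,
    and all future shocks are set to their expected value of zero.
    """
    p, q = len(phi), len(theta)
    eps = []
    for t in range(len(x)):
        ar = sum(phi[i] * x[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        eps.append(x[t] - const - ar - ma)
    history, shocks = list(x), list(eps)
    for _ in range(steps):
        t = len(history)
        ar = sum(phi[i] * history[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * shocks[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        history.append(const + ar + ma)
        shocks.append(0.0)
    return history[len(x):]


def undo_log_diff(last_level, diffs):
    """Turn forecast log differences back into revenue levels."""
    levels, log_level = [], math.log(last_level)
    for d in diffs:
        log_level += d
        levels.append(math.exp(log_level))
    return levels


# Hypothetical monthly revenue, for illustration only (NOT the thesis data).
monthly_revenue = [310.0, 295.0, 330.0, 360.0, 342.0, 375.0, 401.0, 388.0, 420.0, 455.0]
growth = log_diff(monthly_revenue)

# Hypothetical ARMA(3,2) coefficients; in practice they would be estimated.
phi = [0.3, -0.2, 0.1]
theta = [0.4, 0.2]
const = 0.01

growth_forecast = arma_forecast(growth, phi, theta, const, steps=3)
revenue_forecast = undo_log_diff(monthly_revenue[-1], growth_forecast)
print([round(v, 1) for v in revenue_forecast])
```

Modelling the log differences rather than the raw series means the ARMA model describes month-to-month growth rates, so the forecasts have to be accumulated and exponentiated to return to revenue levels.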
