Research and Application of Data Mining Models in Securities Analysis
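Abstract
Securities analysis is a fundamental subject of modern financial analysis. In just a few decades China's securities market has grown rapidly: more and more people are investing in securities, and the market has been especially active in recent years. This rapid growth places higher demands on securities analysis systems, so research on such systems has become an important topic in financial analysis. Meanwhile, data mining has attracted growing research attention in recent years and is widely applied in securities analysis in particular, because of its strong ability to uncover latent information.
     In securities analysis, stock prediction is an important research direction in financial data mining. Besides the nonlinear, non-stationary, and dynamic characteristics common to time series in general, stock time series are also noisy, non-normal, and leptokurtic with heavy tails, so predicting them is more challenging and has broad application value and market prospects. Retrieval of similar stock time series is another research focus. As the securities market prospers, stock price fluctuations become more complex. Quickly finding, in a large volume of historical stock data, the stocks whose fluctuation patterns resemble those of a given stock, for use in prediction or portfolio analysis, is an indispensable function of a securities analysis system.
     To address these two key problems, this work focuses on applying a hybrid model that combines a genetic algorithm with a BP neural network to stock prediction. Traditional approaches predict well only for short-term trends. Most models fit only the single kind of data they were tested on and do not carry over to other kinds of test data, their accuracy is low, and their consecutive predicted values fluctuate too little. To address these problems we propose an improved genetic BP neural network model; experiments show that it is suitable for long-term prediction and achieves high accuracy. We also build a retrieval library of similar stock time series: we first extract features from the stock series and then cluster the series with fuzzy clustering, improving the validity index used in the fuzzy clustering analysis and thereby the clustering quality. Finally, building on these algorithms, we apply software engineering practice to construct a financial securities analysis system, covering data acquisition, database construction, implementation of the algorithmic models, and presentation of results.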
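The abstract does not describe the improved model in detail, so the following is only a minimal sketch of the generic genetic-algorithm-plus-BP idea, not the author's method. It assumes a single hidden layer, a GA that searches flat weight vectors with training MSE as fitness, and plain gradient descent (backpropagation) for fine-tuning. All function names, the synthetic noisy sine series, and every hyperparameter are hypothetical.

```python
import numpy as np

def make_windows(series, lag):
    """Turn a 1-D series into (past `lag` values -> next value) samples."""
    X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
    return X, series[lag:]

def unpack(w, n_in, n_hid):
    """Split a flat weight vector into the layers of a 1-hidden-layer net."""
    i = n_in * n_hid
    return w[:i].reshape(n_in, n_hid), w[i:i + n_hid], w[i + n_hid:i + 2 * n_hid], w[-1]

def predict(w, X, n_hid):
    W1, b1, W2, b2 = unpack(w, X.shape[1], n_hid)
    return np.tanh(X @ W1 + b1) @ W2 + b2

def mse(w, X, y, n_hid):
    return float(np.mean((predict(w, X, n_hid) - y) ** 2))

def ga_search(X, y, n_hid, pop=40, gens=30, sigma=0.1, seed=0):
    """Genetic search over weight vectors (assumed scheme: elitist selection,
    uniform crossover, Gaussian mutation)."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hid + 2 * n_hid + 1
    P = rng.normal(0.0, 1.0, size=(pop, dim))
    for _ in range(gens):
        fit = np.array([mse(w, X, y, n_hid) for w in P])
        elite = P[np.argsort(fit)[: pop // 2]]            # keep the best half
        pa = elite[rng.integers(0, len(elite), pop - len(elite))]
        pb = elite[rng.integers(0, len(elite), pop - len(elite))]
        mask = rng.random(pa.shape) < 0.5                  # uniform crossover
        children = np.where(mask, pa, pb) + rng.normal(0.0, sigma, pa.shape)
        P = np.vstack([elite, children])
    fit = np.array([mse(w, X, y, n_hid) for w in P])
    return P[np.argmin(fit)]

def bp_refine(w, X, y, n_hid, lr=0.01, epochs=500):
    """Backpropagation fine-tuning from the GA result; returns the best
    weights seen on the training data."""
    w = w.copy()
    n, n_in = X.shape
    best_w, best_e = w.copy(), mse(w, X, y, n_hid)
    for _ in range(epochs):
        W1, b1, W2, b2 = unpack(w, n_in, n_hid)
        H = np.tanh(X @ W1 + b1)
        err = H @ W2 + b2 - y                              # shape (n,)
        dH = np.outer(err, W2) * (1 - H ** 2)              # shape (n, n_hid)
        grad = np.concatenate([(X.T @ dH * 2 / n).ravel(), 2 * dH.mean(axis=0),
                               H.T @ err * 2 / n, [2 * err.mean()]])
        w -= lr * grad
        e = mse(w, X, y, n_hid)
        if e < best_e:
            best_w, best_e = w.copy(), e
    return best_w

if __name__ == "__main__":
    t = np.arange(200)
    # Hypothetical stand-in for a price series: a noisy sine wave.
    series = np.sin(0.1 * t) + 0.05 * np.random.default_rng(1).normal(size=t.size)
    X, y = make_windows(series, lag=4)
    w_ga = ga_search(X, y, n_hid=5)
    w_bp = bp_refine(w_ga, X, y, n_hid=5)
    print("MSE after GA:", mse(w_ga, X, y, 5))
    print("MSE after BP:", mse(w_bp, X, y, 5))
```

The division of labour in this sketch is the usual motivation for such hybrids: the population-based search picks a starting point that is less sensitive to a single random initialisation, and gradient descent then does the local refinement. A real evaluation would measure error on held-out data rather than on the training windows used here.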
Securities analysis is a fundamental subject of research in financial analysis. Over the last few decades the securities market in China has grown quickly; more and more people invest in it, and it has flourished in recent years. The rapid development of the market demands higher-quality securities analysis systems, so research on such systems has become an important issue in financial analysis.
     In securities analysis, stock prediction is an important research area. Besides being nonlinear, non-stationary, and dynamic, financial time series have special properties: they are noisy, non-normal, sharp-peaked, and heavy-tailed. Stock series prediction is therefore more challenging, and it has great practical value and good market prospects. Similarity search over stock time series is another important topic in securities analysis. As the securities market booms, stock prices fluctuate in increasingly complicated ways, and quickly retrieving similar stocks from a large volume of historical stock data becomes important.
     To address these two key points, we focus on the application of a forecasting model based on a GA-BP neural network. Traditional prediction models work well only for short-term prediction; they fit the data they were tested on but not other kinds of test data, and their consecutive predicted values fluctuate too little. In this work we propose a stable and efficient model based on a GA-BP neural network for long-term forecasting, and the proposed model performs well in the experiments. We also build a search database of similar stock time series: after extracting features from the stock time series, we cluster the feature values with fuzzy c-means clustering (FCM). For FCM we propose a new validity index, which improves the quality of the clustering. Finally, we integrate these models into a financial securities analysis system.
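The abstract names fuzzy c-means clustering and a new validity index but gives no formulas, so the sketch below uses standard FCM together with the well-known Xie-Beni index as a stand-in; it is not the author's proposed index. The feature matrix (one row of features per stock) is synthetic, and the function names and parameters are hypothetical.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=300, tol=1e-6, seed=0):
    """Standard FCM. Returns (centers, U), where U[k, i] is the membership
    of sample k in cluster i and every row of U sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]       # weighted means
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)        # membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def xie_beni(X, centers, U, m=2.0):
    """Xie-Beni validity index: compactness divided by separation.
    Lower values indicate a better partition."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * d2).sum()
    cd2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(cd2, np.inf)                           # ignore self-distances
    return compactness / (X.shape[0] * cd2.min())

def choose_cluster_count(X, candidates=range(2, 6)):
    """Run FCM for each candidate c and keep the c with the lowest index."""
    scores = {}
    for c in candidates:
        centers, U = fuzzy_c_means(X, c)
        scores[c] = xie_beni(X, centers, U)
    return min(scores, key=scores.get), scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical 2-D feature vectors for 90 stocks, drawn around three centres.
    X = np.vstack([rng.normal(mu, 0.5, size=(30, 2))
                   for mu in ([0, 0], [10, 0], [0, 10])])
    best_c, scores = choose_cluster_count(X)
    print("chosen number of clusters:", best_c)
    print("validity scores:", scores)
```

A validity index of this kind is what allows the number of clusters to be chosen from the data instead of being fixed in advance. Once the clusters are built, a similarity query for one stock can be restricted to the cluster or clusters in which that stock has high membership.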
