The GA-based algorithms for optimizing hiding sensitive itemsets through transaction deletion

Authors: Chun-Wei Lin (1) (2)
Tzung-Pei Hong (3) (4)
Kuo-Tung Yang (3)
Shyue-Liang Wang (5)

1. Innovative Information Industry Research Center (IIIRC), Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
2. Shenzhen Key Laboratory of Internet Information Collaboration, School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
3. Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan, Republic of China
4. Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan, Republic of China
5. Department of Information Management, National University of Kaohsiung, Kaohsiung, Taiwan, Republic of China
Keywords: Privacy preserving; Data mining; Genetic algorithm; Pre-large concept; Evolutionary computation
Journal: Applied Intelligence
Publication year: 2015
Publication date: March 2015
Volume: 42
Issue: 2
Pages: 210-230
Full-text size: 2,127 KB
Journal category: Computer Science
Journal subjects: Artificial Intelligence and Robotics
Mechanical Engineering
Manufacturing, Machines and Tools
Publisher: Springer Netherlands
ISSN: 1573-7497

Abstract

Data mining extracts useful knowledge from very large datasets, but collecting and disseminating data can pose an inherent threat to privacy. Sensitive or private information concerning individuals, businesses, and organizations has to be suppressed before the data are shared or published. Privacy-preserving data mining (PPDM) has therefore become an important issue in recent years. Many heuristic approaches have been developed to sanitize databases so that sensitive information is hidden, but data sanitization in PPDM is considered to be an NP-hard problem. It is critical to balance privacy protection (hiding sensitive information) against preserving the knowledge that can still be discovered, while also limiting the artificial knowledge introduced by the sanitization process. In this paper, a GA-based framework with two optimization algorithms is proposed for data sanitization. A novel evaluation function with three factors is designed to find appropriate transactions to delete in order to hide sensitive itemsets. Experiments are then conducted to evaluate the proposed GA-based algorithms in terms of execution time, the number of hiding failures, the number of missing itemsets, the number of artificial itemsets, and database dissimilarity.
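
The idea of using a genetic algorithm to choose which transactions to delete can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the record does not give the evaluation function, so the sketch assumes that the three factors are the counts of hiding failures, missing itemsets, and artificial itemsets, combined with illustrative weights. The toy database, the function names, the weights, and the GA settings (population size, mutation rate, elitist selection) are all hypothetical, and frequent itemsets are found by brute force, which is only practical for tiny examples.

```python
import random
from itertools import combinations

def frequent_itemsets(db, min_ratio):
    """Brute-force all itemsets whose support ratio is >= min_ratio."""
    if not db:
        return set()
    items = sorted(set().union(*db))
    found = set()
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            count = sum(1 for t in db if set(cand) <= t)
            if count / len(db) >= min_ratio:
                found.add(frozenset(cand))
    return found

def side_effects(db, deleted, sensitive, min_ratio):
    """Return (hiding failures, missing itemsets, artificial itemsets)."""
    before = frequent_itemsets(db, min_ratio)
    kept = [t for i, t in enumerate(db) if i not in deleted]
    after = frequent_itemsets(kept, min_ratio)
    hiding_failures = len(sensitive & after)         # sensitive, still frequent
    missing = len((before - sensitive) - after)      # non-sensitive, lost
    artificial = len(after - before)                 # newly frequent
    return hiding_failures, missing, artificial

def fitness(db, deleted, sensitive, min_ratio, weights=(0.5, 0.25, 0.25)):
    """Weighted sum of the three side effects (weights are an assumption)."""
    effects = side_effects(db, deleted, sensitive, min_ratio)
    return sum(w * e for w, e in zip(weights, effects))

def ga_select_deletions(db, sensitive, min_ratio, n_delete,
                        pop_size=20, generations=30, seed=0):
    """Evolve sets of n_delete transaction indices toward lower fitness."""
    rng = random.Random(seed)
    ids = range(len(db))
    score = lambda c: fitness(db, c, sensitive, min_ratio)
    pop = [frozenset(rng.sample(ids, n_delete)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)
        parents = pop[: pop_size // 2]               # elitist selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw the child's genes from the union of both parents.
            child = set(rng.sample(sorted(a | b), n_delete))
            if rng.random() < 0.2:                   # mutation: swap one index
                child.discard(rng.choice(sorted(child)))
                child.add(rng.choice([i for i in ids if i not in child]))
            children.append(frozenset(child))
        pop = parents + children
    return min(pop, key=score)

# Toy database (hypothetical): eight transactions over items a, b, c, d.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"b", "c"},
      {"c", "d"}, {"a", "c"}, {"b", "d"}, {"a", "b", "c"}]
sensitive = {frozenset({"a", "b"})}
best = ga_select_deletions(db, sensitive, min_ratio=0.5, n_delete=2)
```

In this toy database, deleting transactions 1 and 2 hides {a, b} but makes {a, c} and {b, c} frequent, because the support ratio is computed over a smaller database. Deleting transactions 0 and 2 hides {a, b} with no side effects. This is the kind of trade-off the fitness function is meant to capture.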
