The GA-based algorithms for optimizing hiding sensitive itemsets through transaction deletion
  • Authors: Chun-Wei Lin (1) (2)
    Tzung-Pei Hong (3) (4)
    Kuo-Tung Yang (3)
    Shyue-Liang Wang (5)

    1. Innovative Information Industry Research Center (IIIRC), Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
    2. Shenzhen Key Laboratory of Internet Information Collaboration, School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
    3. Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan, Republic of China
    4. Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan, Republic of China
    5. Department of Information Management, National University of Kaohsiung, Kaohsiung, Taiwan, Republic of China
  • Keywords: Privacy preserving ; Data mining ; Genetic algorithm ; Pre-large concept ; Evolutionary computation
  • Journal: Applied Intelligence
  • Publication year: 2015
  • Publication date: March 2015
  • Volume: 42
  • Issue: 2
  • Pages: 210-230
  • Full-text size: 2,127 KB
  • Journal category: Computer Science
  • Journal subjects: Artificial Intelligence and Robotics
    Mechanical Engineering
    Manufacturing, Machines and Tools
  • Publisher: Springer Netherlands
  • ISSN: 1573-7497
Abstract
Data mining extracts useful knowledge from very large datasets, but collecting and disseminating data poses an inherent threat to privacy. Sensitive or private information concerning individuals, businesses, and organizations has to be suppressed before the data is shared or published, which has made privacy-preserving data mining (PPDM) an important issue in recent years. Many heuristic approaches have been developed to sanitize databases in order to hide sensitive information, but data sanitization in PPDM is considered to be an NP-hard problem. It is critical to balance hiding sensitive information against preserving the knowledge that can still be discovered, while also limiting the artificial knowledge introduced by the sanitization process. In this paper, a GA-based framework with two optimization algorithms is proposed for data sanitization. A novel evaluation function with three factors is designed to find the appropriate transactions to delete in order to hide sensitive itemsets. Experiments are then conducted to evaluate the proposed GA-based algorithms on measures such as execution time, the number of hiding failures, the number of missing itemsets, the number of artificial itemsets, and database dissimilarity.
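
The abstract does not give the evaluation function itself, so the following Python sketch is only an illustration of the general idea: score a candidate set of deleted transactions by counting hiding failures, missing itemsets, and artificial itemsets. The function names, the brute-force itemset counter, the equal weights, and the toy database are all assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_ratio, max_size=3):
    """Brute-force all itemsets of up to max_size items whose support ratio >= min_ratio."""
    n = len(transactions)
    if n == 0:
        return set()
    counts = {}
    for t in transactions:
        items = sorted(t)
        for k in range(1, min(max_size, len(items)) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s for s, c in counts.items() if c / n >= min_ratio}


def side_effects(db, deleted, sensitive, min_ratio):
    """Return (hiding_failures, missing, artificial) after deleting the given transaction indices."""
    before = frequent_itemsets(db, min_ratio)
    kept = [t for i, t in enumerate(db) if i not in deleted]
    after = frequent_itemsets(kept, min_ratio)
    sens = {frozenset(s) for s in sensitive}
    hiding_failures = len(sens & after)       # sensitive itemsets still frequent
    missing = len((before - sens) - after)    # non-sensitive frequent itemsets that were lost
    artificial = len(after - before)          # itemsets that became frequent only after deletion
    return hiding_failures, missing, artificial


def fitness(db, deleted, sensitive, min_ratio, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three side effects (lower is better); the weights are hypothetical."""
    hf, mi, ar = side_effects(db, deleted, sensitive, min_ratio)
    return weights[0] * hf + weights[1] * mi + weights[2] * ar


# Toy database: five transactions, with {a, b} as the sensitive itemset.
db = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"c"}]
sensitive = [{"a", "b"}]

print(side_effects(db, set(), sensitive, 0.4))      # (1, 0, 0): nothing deleted, {a, b} still frequent
print(side_effects(db, {0}, sensitive, 0.4))        # (0, 0, 0): {a, b} hidden with no side effects
print(side_effects(db, {0, 1, 4}, sensitive, 0.4))  # (0, 0, 2): {a, c} and {b, c} become frequent
```

Artificial itemsets can appear under deletion because the minimum support count is a ratio of the database size, so it shrinks as transactions are removed. A genetic algorithm would search over candidate deletion sets (for example, chromosomes encoding transaction indices) to minimize a score of this kind; that search loop is omitted here.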
