Abstract
Python is widely used in web crawling, machine learning, data analysis, and other fields. However, because the underlying system may be insecure, there is no guarantee that a Python script remains trustworthy throughout its lifetime: when a system is attacked, the scripts stored on it are likely to be tampered with. The trustworthiness of Python scripts therefore needs to be checked through configurable strategies, including integrity verification and vulnerability detection. In this paper, both checks are based on two versions of a script, an original Python script and a current Python script, and the original is assumed to have no vulnerabilities. By comparing the current script with the original, we can determine whether the integrity of the current script is intact and, if it has been violated, detect whether vulnerabilities have been introduced. Integrity verification with hash functions is not suitable in some cases, because in that mode any change, including an added blank line, is considered illegal. We therefore propose loose integrity verification, which combines the UNIX diff tool with abstract syntax trees. Vulnerability detection starts from the premise that the original script has no vulnerabilities, and applies taint analysis on top of the vulnerability detection framework Bandit to find vulnerabilities. In addition, so that the way Python is used does not change, both the integrity verification and vulnerability detection modules are embedded in the Python interpreter. The experiments show that the security analysis framework performs well and that Bandit with taint analysis greatly reduces false positives without affecting performance.
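The contrast between hash-based (strict) and syntax-tree-based (loose) integrity verification can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `strict_equal` and `loose_equal` are hypothetical, the UNIX diff step is omitted, and the loose check is simplified to comparing the dumped abstract syntax trees of the two scripts using the standard `ast` module.

```python
import ast
import hashlib


def strict_equal(original: str, current: str) -> bool:
    """Hash-based check: any change to the text, even a blank line, fails."""
    def digest(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()
    return digest(original) == digest(current)


def loose_equal(original: str, current: str) -> bool:
    """Simplified loose check (assumption: AST equality stands in for the
    diff + AST combination). ast.dump omits line and column attributes by
    default, and the parser discards comments and blank lines, so only
    changes to the program structure are reported."""
    return ast.dump(ast.parse(original)) == ast.dump(ast.parse(current))


if __name__ == "__main__":
    original = "x = 1\ny = x + 2\n"
    reformatted = "x = 1\n\n# harmless comment\ny = x  +  2\n"
    tampered = "x = 1\ny = x + 3\n"

    print(strict_equal(original, reformatted))  # layout change fails the hash check
    print(loose_equal(original, reformatted))   # same syntax tree
    print(loose_equal(original, tampered))      # different constant in the tree
```

Under this simplification, a script that differs from the original only in blank lines, comments, or spacing passes the loose check, while a script whose syntax tree differs would be handed on to vulnerability detection.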