Malware detection method based on enhanced code images
SUN Bowen1, ZHANG Peng1, CHENG Mingyu2, LI Xintong2, LI Qi2
1. China Information Technology Security Evaluation Center, Beijing 100085, China; 2. School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
Abstract:Cyberspace malware is becoming more and more serious with traditional malware detection methods unable to deal with the new types of malware. This paper presents a malware detection method based on enhanced code images. The traditional malware image method is improved by using ASCII character information and PE structure information. A three-dimensional RGB image is used as the raw input into the detection algorithm with a VGG16 neural network model with spatial pyramid pooling used to train and predict the malware images. In addition, a multi-label normalized representation method is used to improve the sample label reliability. The method was evaluated against real malware datasets.
[1] AHMADI M, ULYANOV D, SEMENOV S, et al. Novel feature extraction, selection and fusion for effective malware family classification[C]//Proceedings of the 6th ACM Conference on Data and Application Security and Privacy. Orleans, USA:ACM, 2016:183-194. [2] KOLOSNJAJI B, ZARRAS A, WEBSTER G, et al. Deep learning for classification of malware system call sequences[C]//Proceedings of the 29th Australasian Joint Conference on Artificial Intelligence. Hobart, Australia:Springer, 2016:137-149. [3] HU W W, TAN Y. Generating adversarial malware examples for black-box attacks based on GAN[J]. arXiv preprint arXiv:1702.05983, 2017. [4] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[J]. arXiv:1409.1556, 2014. [5] NATARAJ L, KARTHIKEYAN S, JACOB G, et al. Malware images:Visualization and automatic classification[C]//Proceedings of the 8th International Symposium on Visualization for Cyber Security. Pittsburg, USA:ACM, 2011:4. [6] 韩晓光, 曲武, 姚宣霞, 等. 基于纹理指纹的恶意代码变种检测方法研究[J]. 通信学报, 2014, 35(8):125-136.HAN X G, QU W, YAO X X, et al. Research on malicious code variants detection based on texture fingerprint[J]. Journal on Communications, 2014, 35(8):125-136. (in Chinese) [7] 任卓君, 陈光. 熵可视化方法在恶意代码分类中的应用[J].计算机工程, 2017, 43(9):167-171.REN Z J, CHEN G. Application of entropy visualization method in malware classification[J]. Computer Engineering, 2017, 43(9):167-171. (in Chinese) [8] 张晨斌, 张云春, 郑杨, 等. 基于灰度图纹理指纹的恶意软件分类[J]. 计算机科学, 2018, 45(S1):383-386.ZHANG C B, ZHANG Y C, ZHENG Y, et al. Malware classification based on texture fingerprint of gray-scale images[J]. Computer Science, 2018, 45(S1):383-386. (in Chinese) [9] CUI Z H, XUE F, CAI X J, et al. Detection of malicious code variants based on deep learning[J]. IEEE Transactions on Industrial Informatics, 2018, 14(7):3187-3196. [10] REZENDE E, RUPPERT G, CARVALHO T, et al. Malicious software classification using transfer learning of resnet-50 deep neural network[C]//Proceedings of the 16th IEEE International Conference on Machine Learning and Applications (ICMLA). Cancun, Mexico:IEEE, 2017:1011-1014. [11] PERDISCI R, MANCHON U. VAMO:Towards a fully automated malware clustering validity analysis[C]//Proceedings of the 28th Annual Computer Security Applications Conference. New York, USA:ACM, 2012. [12] SEBASTIÁN M, RIVERA R, KOTZIAS P, et al. AVCLASS:A tool for massive malware labeling[C]//Proceedings of the 19th International Symposium on Research in Attacks, Intrusions, and Defenses. Paris, France:Springer, 2016. [13] HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9):1904-1916. [14] DAHL G E, STOKES J W, DENG L, et al. Large-scale malware classification using random projections and neural networks[C]//2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Vancouver, Canada:IEEE, 2013:3422-3426. [15] SUN B W, GUO Y H, LI Q, et al. Malware family classification method based on static feature extraction[C]//2017 3rd IEEE International Conference on Computer and Communications (ICCC). Chengdu, China:IEEE, 2017:507-513.