Malware detection method based on enhanced code images

SUN Bowen; ZHANG Peng; CHENG Mingyu; LI Xintong; LI Qi

doi:10.16511/j.cnki.qhdxxb.2020.25.008

Journal of Tsinghua University (Science and Technology), 2020, Vol. 60, Issue 5: 386-392

Special topic: Vulnerability analysis and risk assessment

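1. China Information Technology Security Evaluation Center, Beijing 100085, China;
2. School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China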

Received: 2019-06-01

Published online: 2020-04-26

LI Qi, associate professor, E-mail: liqi2001@bupt.edu.cn

Cite this article

SUN Bowen, ZHANG Peng, CHENG Mingyu, LI Xintong, LI Qi. Malware detection method based on enhanced code images[J]. Journal of Tsinghua University (Science and Technology), 2020, 60(5): 386-392. DOI: 10.16511/j.cnki.qhdxxb.2020.25.008

Abstract

Malware threats in cyberspace are growing increasingly severe, and traditional malware detection methods are showing their weaknesses in the ongoing contest between attackers and defenders. To address this, this paper proposes a malware detection method based on enhanced grayscale code images. The traditional grayscale-image representation of malware is improved by using the ASCII character information and PE structure information of the malware to build a three-dimensional RGB image, which serves as the raw input to the detection algorithm. A VGG16 neural network model with a spatial pyramid pooling structure is trained on the malware images and used for prediction. The paper also proposes a multi-label normalized representation method to improve the reliability of sample labels. The method was evaluated on real malware datasets, and the experimental results show that it copes effectively with adversarial techniques such as packing and obfuscation and detects new types of malware well.

Key words: computer virus and prevention; malware; code image; convolutional neural network; spatial pyramid pooling
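The two ideas named in the abstract, an RGB code image and spatial pyramid pooling, can be illustrated with a minimal sketch. The abstract does not say how the ASCII and PE information are encoded, so the channel layout below is purely an assumption: red holds the raw byte value (the classic grayscale code image), green marks printable ASCII bytes, and blue shades bytes by the section they fall in. The function names, the `sections` argument (a hypothetical list of byte ranges) and the pyramid levels are all illustrative. In the paper the pooling sits inside a VGG16 network; here it is applied directly to the image channels only to show that inputs of different sizes yield a vector of the same length.

```python
def bytes_to_rgb_rows(data, width, sections=()):
    """Wrap a byte string into rows of (R, G, B) pixels, `width` pixels per row.

    Assumed encoding (not taken from the paper):
      R = raw byte value,
      G = 255 for printable ASCII (0x20-0x7E), else 0,
      B = shade of the first section containing the byte offset, else 0.
    `sections` is a hypothetical list of (start, end) offsets, end exclusive.
    """
    pixels = []
    for offset, value in enumerate(data):
        green = 255 if 0x20 <= value <= 0x7E else 0
        blue = 0
        for index, (start, end) in enumerate(sections):
            if start <= offset < end:
                blue = min(255, 64 * (index + 1))
                break
        pixels.append((value, green, blue))
    # Pad the last row with black pixels so every row has the same width.
    remainder = len(pixels) % width
    if remainder:
        pixels.extend([(0, 0, 0)] * (width - remainder))
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def spatial_pyramid_pool(channel, levels=(1, 2, 4)):
    """Max-pool a 2-D map over an n x n grid for each n in `levels`.

    The output always has sum(n * n for n in levels) values,
    regardless of the height and width of `channel`.
    """
    height, width = len(channel), len(channel[0])
    features = []
    for n in levels:
        for i in range(n):
            top = (i * height) // n
            bottom = ((i + 1) * height + n - 1) // n  # integer ceiling
            for j in range(n):
                left = (j * width) // n
                right = ((j + 1) * width + n - 1) // n
                features.append(max(channel[r][c]
                                    for r in range(top, bottom)
                                    for c in range(left, right)))
    return features


def image_to_fixed_vector(rows, levels=(1, 2, 4)):
    """Pool each colour channel separately and concatenate the results."""
    vector = []
    for k in range(3):
        channel = [[pixel[k] for pixel in row] for row in rows]
        vector.extend(spatial_pyramid_pool(channel, levels))
    return vector


# Two inputs of different sizes produce feature vectors of equal length.
small = bytes_to_rgb_rows(b"MZ\x90\x00" * 10, width=8, sections=[(0, 16)])
large = bytes_to_rgb_rows(bytes(range(256)) * 4, width=32,
                          sections=[(0, 512), (512, 1024)])
print(len(small), len(large))
print(len(image_to_fixed_vector(small)), len(image_to_fixed_vector(large)))
```

With levels (1, 2, 4) each channel contributes 1 + 4 + 16 = 21 values, so both images map to 63 features even though one has 5 rows and the other 32. This fixed-length property is what lets a network with fully connected layers accept code images of varying size without cropping or rescaling.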
