Journal of Tsinghua University (Science and Technology)  2019, Vol. 59 Issue (1): 9-14    DOI: 10.16511/j.cnki.qhdxxb.2018.22.054
Information Security
Malware visualization and automatic classification with enhanced information density
LIU Yashu1,2, WANG Zhihai1, HOU Yueran3, YAN Hanbing4
1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China;
2. School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China;
3. Institute of Network Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China;
4. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China
Abstract: The development of computer and network technologies has caused the amount of malware to grow exponentially each year, posing a serious threat to network security. This study combines reverse analysis of malware with visualization and proposes a method that visualizes the simHash values of the opcode sequences of function blocks in the ".text" section of portable executable (PE) files. The method not only improves the efficiency of malware visualization but also resolves the difficulty of judging similarity between the simHash values of opcode sequences. Experimental results show that the visualization method yields classification features with enhanced effective information density; compared with traditional malware visualization methods, it is more efficient and classifies more accurately.
Key words: malware visualization; simHash; image texture
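The core idea in the abstract, computing a simHash fingerprint over a function's opcode sequence and rendering it as pixels, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 64-bit width, MD5 as the per-feature hash, opcode 3-grams as features, the one-bit-per-pixel 8x8 layout, and the helper names `ngrams`, `simhash`, `hamming`, and `to_pixels`. The sample opcode lists are hypothetical as well.

```python
import hashlib


def ngrams(seq, n=3):
    """Return overlapping n-grams of an opcode list as strings (assumed feature choice)."""
    return [" ".join(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def simhash(tokens, bits=64):
    """Charikar-style simHash of a token list, returned as a non-negative int.

    Assumes bits is a multiple of 8 and at most 128 (the MD5 digest size).
    """
    assert bits % 8 == 0 and 0 < bits <= 128
    weights = [0] * bits
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        for i in range(bits):
            # Each feature votes +1 for a set bit and -1 for a clear bit.
            weights[i] += 1 if (h >> i) & 1 else -1
    value = 0
    for i, w in enumerate(weights):
        if w > 0:
            value |= 1 << i
    return value


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def to_pixels(value, bits=64, width=8):
    """Lay out fingerprint bits row by row, most significant first: 1 -> 255, 0 -> 0."""
    flat = [255 if (value >> (bits - 1 - i)) & 1 else 0 for i in range(bits)]
    return [flat[r:r + width] for r in range(0, bits, width)]


# Hypothetical opcode sequences for two similar functions.
func_a = ["push", "mov", "sub", "call", "mov", "add", "pop", "ret"]
func_b = ["push", "mov", "sub", "call", "mov", "xor", "pop", "ret"]

fp_a = simhash(ngrams(func_a))
fp_b = simhash(ngrams(func_b))
print("Hamming distance:", hamming(fp_a, fp_b))
for row in to_pixels(fp_a):
    print(row)
```

In this sketch each function contributes one small bit pattern, so an image for a whole file could be assembled by stacking the rows of all its functions; how the paper actually arranges and reduces the fingerprints is not specified in this abstract.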
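Received: 2018-08-20      Published: 2019-01-16
Funding: Key Program of the National Natural Science Foundation of China (U1736218); General Program of the National Natural Science Foundation of China (61672086); National Key Research and Development Program of China (2018YFB0803604)
Corresponding author: WANG Zhihai, Professor, E-mail: zhhwang@bjtu.edu.cn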
Cite this article:
LIU Yashu, WANG Zhihai, HOU Yueran, YAN Hanbing. Malware visualization and automatic classification with enhanced information density. Journal of Tsinghua University (Science and Technology), 2019, 59(1): 9-14.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2018.22.054  or  http://jst.tsinghuajournals.com/CN/Y2019/V59/I1/9
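  Fig. 1 Examples of malware grayscale images
  Fig. 2 Example images of the Benign family
  Fig. 3 Flowchart of malware classification with enhanced effective information
  Fig. 4 Example contents of the function "sub_401390"
  Fig. 5 Steps of the simHash visualization algorithm
  Fig. 6 Schematic of the simHash dimension reduction process
  Table 1 Comparison of KNN (K=2) classification results for the ".text" section function opcode visualization method
  Table 2 Comparison of RF (n_estimators=25) classification results for the ".text" section function opcode visualization method
  Table 3 Comparison of RF (n_estimators=25) classification results for simply improved malware grayscale images
  Table 4 Comparison of time consumption of different methods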