Journal of Tsinghua University(Science and Technology)    2018, Vol. 58 Issue (5) : 500-508     DOI: 10.16511/j.cnki.qhdxxb.2018.25.020
COMPUTER SCIENCE AND TECHNOLOGY
API based sequence and statistical features in a combined malware detection architecture
LU Xiaofeng1, JIANG Fangshuo1, ZHOU Xiao1, CUI Baojiang1, YI Shengwei2, SHA Jing3
1. School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China;
2. China Information Technology Security Evaluation Center, Beijing 100085, China;
3. The Third Research Institute of Ministry of Public Security, Shanghai 201204, China
Abstract  This paper presents a combined machine learning framework for malware behavior analysis. One part of the framework analyzes the dependency relations in the API call sequence at the functional level and extracts features that are used to train a random forest classifier. The other part uses a recurrent neural network (RNN) to learn from the API call sequence after redundant information has been removed in preprocessing, exploiting the RNN's time series forecasting ability to identify malware. Tests on a malware dataset show that both methods detect malware effectively, but the combined framework performs better, with an AUC of 99.3%.
Keywords  computer virus and prevention; malware classification; machine learning; deep learning; call sequence
CLC number (ZTFLH): TP309.5
Issue Date: 15 May 2018
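The two-part design described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the feature set, the redundancy-removal rule, the RNN, or how the two parts are combined, so everything below is an assumption made for illustration: `collapse_repeats` treats consecutive duplicate API calls as the redundant information, `bigram_counts` stands in for the statistical features, `combine_scores` is a hypothetical weighted average of the two models' outputs, and `auc` computes the metric reported in the abstract from scratch. None of these names come from the paper.

```python
from collections import Counter
from itertools import groupby


def collapse_repeats(calls):
    """Drop consecutive duplicate API calls (an assumed form of redundancy removal)."""
    return [name for name, _ in groupby(calls)]


def bigram_counts(calls):
    """Count adjacent API-call pairs as simple statistical features (illustrative only)."""
    return Counter(zip(calls, calls[1:]))


def combine_scores(rf_score, rnn_score, weight=0.5):
    """Weighted average of two malware probabilities (hypothetical fusion rule)."""
    return weight * rf_score + (1.0 - weight) * rnn_score


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up trace of Windows API names, used only to exercise the helpers.
    trace = ["CreateFileW", "WriteFile", "WriteFile", "WriteFile", "CloseHandle"]
    cleaned = collapse_repeats(trace)
    print(cleaned)
    print(bigram_counts(cleaned))
    # Made-up scores standing in for the random forest and RNN outputs.
    print(combine_scores(0.8, 0.6))
```

In a real pipeline the two scores would come from a trained random forest and a trained RNN; the averaging step here only shows where a fusion rule would sit, not how the authors combine their models.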
Cite this article:   
LU Xiaofeng, JIANG Fangshuo, ZHOU Xiao, et al. API based sequence and statistical features in a combined malware detection architecture[J]. Journal of Tsinghua University(Science and Technology), 2018, 58(5): 500-508.
URL:  
http://jst.tsinghuajournals.com/EN/10.16511/j.cnki.qhdxxb.2018.25.020     OR     http://jst.tsinghuajournals.com/EN/Y2018/V58/I5/500
Copyright © Journal of Tsinghua University(Science and Technology), All Rights Reserved.