COMPUTER SCIENCE AND TECHNOLOGY |
K-means based feature reduction for network anomaly detection |
JIA Fan¹, YAN Yan², ZHANG Jiaqi¹ |
1. Key Laboratory of Communication & Information Systems of Beijing, Beijing Jiaotong University, Beijing 100044, China;
2. China Information Security Certification Center, Beijing 100020, China |
|
|
Abstract The basic K-means algorithm has been used for anomaly detection on the KDD 99 attack dataset, but its accuracy and efficiency in detecting rare attacks need to be improved. Rare attacks, which usually pose greater threats, are easily hidden by common attacks, so they can be identified more easily once the common attacks have been removed. Based on this finding, an improved hierarchical iterative K-means method was developed to detect all kinds of anomalies, with correlation-based feature reduction used to decrease the classification dimensions. On the KDD 99 data, the algorithm detects almost every rare attack with a 99% successful classification rate and achieves nearly real-time detection with 90% fewer computations than the basic K-means algorithm.
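The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the function names (`select_features`, `kmeans`, `peel_common_clusters`), the 0.95 correlation threshold, the farthest-first initialization, and the rule of setting aside the largest cluster in each round are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_features(rows, threshold=0.95):
    """Assumed reduction rule: keep a column only if it is not strongly
    correlated with a column that was already kept."""
    cols = list(zip(*rows))
    kept = []
    for j, col in enumerate(cols):
        if all(abs(pearson(col, cols[i])) < threshold for i in kept):
            kept.append(j)
    return kept


def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Basic Lloyd's K-means with deterministic farthest-first seeding
    (the seeding is an assumption); returns one label per point."""
    centers = [tuple(points[0])]
    while len(centers) < k:
        centers.append(tuple(max(points, key=lambda p: min(sqdist(p, c) for c in centers))))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


def peel_common_clusters(points, k=2, rounds=3):
    """Assumed iterative scheme: cluster, set aside the largest cluster as
    'common' traffic, then re-cluster what is left so that smaller groups
    are no longer hidden. Returns (indices peeled per round, leftover indices)."""
    remaining = list(range(len(points)))
    levels = []
    for _ in range(rounds):
        if len(remaining) <= k:
            break
        labels = kmeans([points[i] for i in remaining], k)
        sizes = [labels.count(c) for c in range(k)]
        biggest = sizes.index(max(sizes))
        levels.append([i for i, l in zip(remaining, labels) if l == biggest])
        remaining = [i for i, l in zip(remaining, labels) if l != biggest]
    return levels, remaining
```

In this sketch, `select_features` would be applied first to pick the columns passed to `peel_common_clusters`, and the indices left over after the last round are the candidates for rare attacks. Each round clusters fewer points in fewer dimensions, which is one plausible reading of where the reported savings in computation come from.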
|
Keywords
anomaly detection
K-means
feature reduction
U2R
R2L
|
|
Issue Date: 15 February 2018