15 May 2020, Volume 60 Issue 5
  • SPECIAL SECTION: VULNERABILITY ANALYSIS AND RISK ASSESSMENT
  • SONG Yubo, QI Xinyu, HUANG Qiang, HU Aiqun, YANG Junjie
    Journal of Tsinghua University(Science and Technology). 2020, 60(5): 365-370. https://doi.org/10.16511/j.cnki.qhdxxb.2019.22.050
    The Internet of Things interconnects a large number of devices, so effective network access control is needed to protect the system from malicious devices. At present, the most effective approach is to extract network traffic characteristics as a device fingerprint for device identification, since this requires relatively few network resources. However, existing device identification algorithms are not very accurate, especially for similar devices, because classification overlap is unavoidable. This paper presents a two-stage multi-classification algorithm that identifies devices according to their network traffic characteristics. When classification overlap occurs, a maximum similarity comparison algorithm is used for secondary classification. Tests show that the average recognition accuracy of this algorithm is 93.2%.
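The two-stage idea in this abstract can be illustrated with a minimal sketch. The abstract does not state which classifiers or similarity measure the paper uses, so everything here is an assumption for illustration: stage one is a simple per-device range check on the feature vector, stage two is cosine similarity to each candidate's centroid, and the names `identify`, `in_range`, and the sample features are hypothetical.

```python
from math import sqrt

def centroid(vectors):
    """Mean feature vector of one device type's training samples."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def in_range(sample, vectors):
    """Stage-1 stand-in classifier (assumption): accept the sample if every
    feature lies within the min/max seen for this device type."""
    lows = [min(col) for col in zip(*vectors)]
    highs = [max(col) for col in zip(*vectors)]
    return all(lo <= x <= hi for x, lo, hi in zip(sample, lows, highs))

def cosine(a, b):
    """Cosine similarity, used here as the assumed similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def identify(sample, training):
    """Return the device type for `sample`, or None if no type accepts it."""
    # Stage 1: every device type votes independently.
    candidates = [name for name, vecs in training.items() if in_range(sample, vecs)]
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    # Stage 2: classification overlap, so keep the most similar candidate.
    return max(candidates, key=lambda name: cosine(sample, centroid(training[name])))

# Hypothetical normalized traffic features for two device types.
training = {
    "camera": [[0.2, 0.9], [0.4, 0.7]],
    "plug": [[0.3, 0.2], [0.9, 0.8]],
}
print(identify([0.35, 0.75], training))  # both types accept; similarity decides
```

The second stage only runs when more than one device type accepts the sample, which is the overlap case the abstract describes.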
  • ZHANG Mingyuan, WU Wei, SONG Yubo, HU Aiqun
    Journal of Tsinghua University(Science and Technology). 2020, 60(5): 371-379. https://doi.org/10.16511/j.cnki.qhdxxb.2020.25.007
    Wireless local area network (WLAN) access devices are critical parts of a network topology and require comprehensive security performance analyses. Security assessment methods for WLAN devices are affected by network environmental factors, which limits security performance evaluations of the access devices themselves. This paper presents a security level assessment system for WLAN access devices that integrates security function assessments with vulnerability assessments. The resulting device security level assessment is independent of the security of the application environment and combines semi-quantitative and quantitative analysis methods. Tests with several mainstream brand devices show that the evaluation system can automatically evaluate the security level of WLAN access devices.
  • ZHAO Xiaolin, JIANG Xiaoyi, ZHAO Jingjing, XU Hao, GUO Jiong
    Journal of Tsinghua University(Science and Technology). 2020, 60(5): 380-385. https://doi.org/10.16511/j.cnki.qhdxxb.2020.26.002
    Network security methods lack effective metrics for measuring attack risks and defense capabilities in dynamic networks, whose many indicators make them high-dimensional and difficult to analyze. This paper presents a method to quantify network attack and defense capabilities. Clustering and principal component analysis are used to reduce the dimensionality and allocate weights to the indicator groups. These indexes are embedded in differential manifolds that change with time, and the network risk is evaluated from the attack risks and defense capabilities to quantify the network security effectiveness. The CIC2017 dataset is used as an example to show that this method can indicate the attack and defense risks for dynamic networks. The results show that this method provides a dynamic approach to network security measurement.
  • SUN Bowen, ZHANG Peng, CHENG Mingyu, LI Xintong, LI Qi
    Journal of Tsinghua University(Science and Technology). 2020, 60(5): 386-392. https://doi.org/10.16511/j.cnki.qhdxxb.2020.25.008
    Malware in cyberspace is becoming an increasingly serious problem, and traditional malware detection methods cannot deal with new types of malware. This paper presents a malware detection method based on enhanced code images. The traditional malware image method is improved by using ASCII character information and PE structure information. A three-dimensional RGB image is used as the raw input to the detection algorithm, and a VGG16 neural network model with spatial pyramid pooling is used to train on and predict the malware images. In addition, a multi-label normalized representation method is used to improve the sample label reliability. The method was evaluated against real malware datasets.
  • YANG Hongyu, ZHANG Xugao, LU Weili
    Journal of Tsinghua University(Science and Technology). 2020, 60(5): 393-401. https://doi.org/10.16511/j.cnki.qhdxxb.2020.25.009
    The accuracy of existing information system security assessments is affected by the evaluation preferences of experts. This paper presents a matrix correction method (MCM) based on an information system security situation assessment model (ISSSAM). The system uses a modified interval number judgment matrix to reflect the relative importance of various indicators and so improve the objectivity of the indicator layer weight vector. Then, an entropy weight based cloud is used to quantify the security situation index of the criterion layer and the target layer to grade the system security level. Tests on a departure control system (DCS) verify the model validity and demonstrate that the evaluation stability of this model is better than that of the entropy weight coefficient method and the traditional analytic hierarchy process (AHP).
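For readers unfamiliar with entropy weighting, the sketch below shows the standard entropy weight calculation for a matrix of indicator scores. It illustrates the general technique only, not the paper's entropy weight based cloud or its matrix correction step; the function name `entropy_weights` and the sample scores are hypothetical, and the inputs are assumed to be non-negative with a positive sum in every column.

```python
from math import log

def entropy_weights(matrix):
    """Standard entropy weight method.

    matrix[i][j] is the score of sample i on indicator j (assumed
    non-negative, at least two samples, each column sum positive).
    Indicators whose scores vary more across samples get larger weights.
    """
    m = len(matrix)
    n = len(matrix[0])
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [x / total for x in column]
        # Normalized Shannon entropy in [0, 1]; 0 * log(0) is treated as 0.
        entropy = -sum(p * log(p) for p in shares if p > 0) / log(m)
        divergences.append(1.0 - entropy)
    spread = sum(divergences)
    if spread == 0:
        # Every indicator is uniform across samples: fall back to equal weights.
        return [1.0 / n] * n
    return [d / spread for d in divergences]

# Hypothetical scores: indicator 0 is identical for both samples,
# indicator 1 differs, so nearly all weight goes to indicator 1.
weights = entropy_weights([[1.0, 1.0], [1.0, 3.0]])
print(weights)
```

An indicator that takes the same value for every sample carries no discriminating information, which is why its weight collapses toward zero.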
  • ZHANG Yu, LIU Qingzhong, SHI Yuanquan, CAO Junkuo
    Journal of Tsinghua University(Science and Technology). 2020, 60(5): 402-407. https://doi.org/10.16511/j.cnki.qhdxxb.2020.25.010
    Ransomware is a type of malware from cryptovirology that threatens to publish the victim's data or permanently block access to it unless a ransom is paid. Stealthy ransomware is a new type of ransomware that tries to evade detection by deleting all hard copies of its files and residing only in a process running in memory. This study uses danger theory from the biological immune system to design a digital vaccine-based dynamic defense model for stealthy ransomware attacks. Formal definitions are given for immune concepts such as digital vaccine, antigen, antibody and antibody concentration. Vaccinations with digital vaccines (creating bait files and folders) give the system immature antibodies against stealthy ransomware attacks. The system quickly detects stealthy ransomware attacks by dynamically monitoring the attack antigens in both the core and application layers and by monitoring the dynamic evolution of the antibodies and changes in the antibody concentration. Analyses and tests show that the model provides real-time detection of stealthy ransomware attacks that is more effective than traditional methods.
  • SPECIAL SECTION: BIG DATA
  • JIANG Wenbin, WANG Hongbin, LIU Pai, CHEN Yuhao
    Journal of Tsinghua University(Science and Technology). 2020, 60(5): 408-414. https://doi.org/10.16511/j.cnki.qhdxxb.2020.21.001
    Small GPU memories usually restrict the scale of deep learning network models that can be handled in the GPU processors. To address this problem, a hybrid strategy for deep learning was developed which also uses the potential of the CPU by means of the new Intel SIMD instruction set AVX2. The neural network layers which need much memory for the intermediate data are migrated to the CPU to reduce the GPU memory usage. AVX2 is then used to improve the CPU efficiency. The key points include coordinating the network partitioning scheme and the code vectorization based on AVX2. The hybrid strategy is implemented on Caffe. Tests on some typical datasets, such as CIFAR-10 and ImageNet, show that the hybrid computation strategy enables training of larger neural network models on the GPU with acceptable performance.
  • JIA Xudong, WANG Li
    Journal of Tsinghua University(Science and Technology). 2020, 60(5): 415-421. https://doi.org/10.16511/j.cnki.qhdxxb.2020.26.006
    The importance of each word in a text sequence and the dependencies between words have a significant impact on identifying text categories. Capsule networks cannot selectively focus on important words in texts. Moreover, they cannot encode long-distance dependencies, so they are significantly limited in identifying texts with semantic transitions. To solve these problems, this paper proposes a capsule network based on multi-head attention, which can encode the dependencies between words, capture important words in texts, and encode the semantics of texts, thus effectively improving text classification. The experimental results show that the proposed model outperforms the convolutional neural network and the capsule network in the text classification task and is more effective in the multi-label text classification task. In addition, the results show that this model benefits more from the attention.
  • SPECIAL SECTION: COMPUTATIONAL LINGUISTICS
  • WU Jipeng, BAO Jianzhu, LAN Gongqiang, XU Ruifeng
    Journal of Tsinghua University(Science and Technology). 2020, 60(5): 422-429. https://doi.org/10.16511/j.cnki.qhdxxb.2020.21.002
    Most existing deep learning emotion cause extraction methods are unable to model latent semantic relationships between clauses. In addition, these methods are not easily controlled, are difficult to interpret and need high-quality annotations. This paper presents an emotion cause extraction method that incorporates rule distillation with a hierarchical attention network. The hierarchical attention network uses position encoding and the residual structure to capture the latent semantic relationships within the clauses and between the clauses and the emotional expression. A knowledge distillation architecture based on adversarial learning then introduces linguistic rules related to the emotion cause expression into the deep neural network. Tests on a Chinese emotion cause extraction dataset show that this method outperforms the state-of-the-art method by 0.02 in F1, the best known result.
  • YU Chuanming, YUAN Sai, HU Shasha, AN Lu
    Journal of Tsinghua University(Science and Technology). 2020, 60(5): 430-439. https://doi.org/10.16511/j.cnki.qhdxxb.2020.21.003
    Deep representation learning of domain topics was used to build a topic alignment model (TAM) with integrated bilingual word embedding. The semantic alignment lexicon was extended to include bilingual word embedding. A traditional bilingual topic model was used to develop an auxiliary distribution that improves the semantic sharing of the word distributions and thereby the topic alignments in cross-lingual and cross-domain contexts. A bilingual topic similarity (BTS) indicator and a bilingual alignment similarity (BAS) indicator were developed to evaluate the supplementary alignment. The bilingual alignment similarity improved the cross-language topic matching by about 1.5% compared to a traditional multi-language common cultural theme analysis and improved F1 by about 10% for cross-domain topic alignment. These results can improve cross-language and cross-domain information processing.
  • HYDRAULIC ENGINEERING
  • WANG Bingjie, LI Erhui, WANG Yanjun, ZHANG Shiyan, FU Xudong
    Journal of Tsinghua University(Science and Technology). 2020, 60(5): 440-448. https://doi.org/10.16511/j.cnki.qhdxxb.2020.22.006
    The middle Yellow River basin is known for its extremely high rates of erosion that account for most of the sediment yield in the Yellow River. A suspended sediment rating curve (SRC) describes the relationship between the water discharge and the suspended sediment discharge in a river. Accurate modeling of the SRC at different timescales is important for improving sediment yield prediction in the middle Yellow River basin. The purpose of this study is to examine how various parameters characterizing the daily and annual SRC are related to the river basin characteristics and climate factors. The results show that the daily SRC from 72 hydrologic gauging stations and the annual SRC from 58 hydrologic gauging stations in the middle Yellow River basin can be fitted by a power function. The annual coefficient is larger than the daily coefficient, while the annual exponent is smaller than the daily exponent. Both the daily and annual coefficients and exponents are larger in the loess regions than in the rocky mountainous regions. Correlation and stepwise multiple regression analyses show that the key factors controlling the daily and annual SRC fitting parameters differ with landform type. In the soft sandstone and aeolian area, the annual exponent is affected by the flow duration and the temperature, and the controlling factor for the annual coefficient is the peak flow anomaly. In the loess area, the main factor controlling the coefficient is the gauging area, while the exponents for both timescales are significantly affected by the flow duration and the terrain. The daily and annual coefficients in the rocky mountainous regions are strongly affected by the vegetation coverage, while the coefficients for both timescales have threshold values related to the vegetation coverage in the middle Yellow River basin.
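
As an illustration of the power-function fit mentioned in the last abstract, a rating curve of the form Q_s = a * Q^b can be estimated by ordinary least squares on log-transformed data. This is a minimal sketch of that common approach and not necessarily the fitting procedure used in the paper; the function name `fit_power_law` and the synthetic discharge values are assumptions for illustration.

```python
from math import exp, log

def fit_power_law(discharge, sediment):
    """Fit sediment = a * discharge**b by least squares in log-log space.

    Both inputs must contain only positive values. Returns (a, b),
    where a is the coefficient and b is the exponent of the rating curve.
    """
    x = [log(q) for q in discharge]
    y = [log(s) for s in sediment]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                   # slope in log-log space is the exponent
    a = exp(mean_y - b * mean_x)    # intercept maps back to the coefficient
    return a, b

# Synthetic, noise-free data generated from a = 0.5 and b = 1.8.
q = [1.0, 10.0, 100.0, 1000.0]
qs = [0.5 * v ** 1.8 for v in q]
a, b = fit_power_law(q, qs)
print(round(a, 6), round(b, 6))
```

Fitting in log space weights relative rather than absolute errors, so with noisy field data the back-transformed coefficient is often bias-corrected; that step is omitted here.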