SPECIAL SECTION: BIG DATA
Attack-guided diffusion model for Chinese adversarial samples generation |
WU Houyue1, LI Xianwei1,2, ZHANG Shunxiang3,4, ZHU Honghao1, WANG Ting5 |
1. School of Computer and Information Engineering, Bengbu University, Bengbu 233030, China; 2. Anhui Engineering Research Center for Intelligent Applications and Security of Industrial Internet, Anhui University of Technology, Ma'anshan 243032, China; 3. School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232000, China; 4. Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei 240088, China; 5. School of Information Engineering, Huainan Union University, Huainan 232000, China |
Abstract

[Objective] The generation of adversarial text samples is an important research area in natural language processing: such samples are used to test the robustness of machine learning models and have attracted wide attention from scholars. Owing to the complexity of Chinese semantics, generating Chinese adversarial samples remains a major challenge. Traditional methods rely mainly on word replacement, deletion/insertion, and word-order adjustment; the resulting samples are often easy to detect, achieve low attack success rates, and struggle to balance attack effectiveness with semantic coherence. To address these limitations, this study introduces DiffuAdv, a novel method for generating Chinese adversarial samples. DiffuAdv models the data distribution observed during the adversarial attack phase and uses the gradient changes between adversarial and original samples as the guiding condition in the reverse diffusion phase of pre-training, yielding more natural and effective adversarial samples.

[Methods] DiffuAdv introduces diffusion models into adversarial sample generation to raise attack success rates while preserving the naturalness of the generated text. The method relies on a gradient-guided diffusion process in which the gradient information between original and adversarial samples serves as the guiding condition, and it consists of two stages: forward diffusion and reverse diffusion. In the forward diffusion stage, noise is progressively added to the original data until a noise-dominated state is reached. In the reverse diffusion stage, samples are reconstructed, and the gradient changes between adversarial and original samples are exploited to maximize the adversarial objective. During pre-training, data capture and feature learning are performed under gradient guidance so that the model learns the distribution of the original samples and analyzes their deviations from adversarial samples. During reverse-diffusion generation, adversarial perturbations are constructed from the gradients and injected into the reverse diffusion process, so that at each denoising step the sample evolves toward greater adversarial effectiveness. To validate the proposed method, extensive experiments are conducted on multiple datasets and several natural language processing tasks, and its performance is compared with that of seven existing state-of-the-art methods.

[Results] Compared with existing methods for generating Chinese adversarial samples, DiffuAdv achieves higher attack success rates on three tasks: text sentiment classification, causal relation extraction, and emotion-cause pair extraction. Ablation experiments confirm that guiding generation with the gradient changes between original and adversarial samples is effective and improves sample quality. Perplexity (PPL) measurements show that the adversarial samples generated by DiffuAdv have an average PPL of only 0.518, indicating better rationality and readability than the samples produced by other methods.

[Conclusions] DiffuAdv effectively generates high-quality adversarial samples that closely resemble real text in fluency and naturalness. The adversarial samples produced by this method not only achieve high attack success rates but also exhibit strong robustness. DiffuAdv enriches the research perspective on adversarial text sample generation and broadens the available approaches for tasks such as text sentiment classification, causal relation extraction, and emotion-cause pair extraction.
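The gradient-guided reverse diffusion summarized in the [Methods] paragraph can be pictured with a short sketch. The PyTorch code below is only an illustration under stated assumptions, not the authors' implementation: it assumes an embedding-space (continuous) text diffusion setup, and the names denoiser, victim, true_label, betas, alphas_cumprod, and guidance_scale are all hypothetical. At each denoising step, the standard DDPM mean is shifted along the gradient of the victim model's loss, so the sample drifts toward greater adversarial effectiveness while the diffusion prior keeps it close to natural text.

```python
# Minimal sketch of one gradient-guided reverse diffusion step (illustrative only).
# Assumptions: x_t is a batch of text embeddings; denoiser predicts the added noise;
# victim is the attacked classifier; betas and alphas_cumprod are 1-D schedule tensors;
# true_label is a LongTensor of gold labels.
import torch

@torch.no_grad()
def guided_reverse_step(x_t, t, denoiser, victim, true_label,
                        betas, alphas_cumprod, guidance_scale=1.0):
    """One reverse-diffusion step x_t -> x_{t-1} with adversarial gradient guidance."""
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t
    alpha_bar_t = alphas_cumprod[t]

    # Standard DDPM posterior mean computed from the predicted noise.
    eps = denoiser(x_t, t)
    mean = (x_t - beta_t / torch.sqrt(1.0 - alpha_bar_t) * eps) / torch.sqrt(alpha_t)

    # Adversarial guidance: gradient of the victim's loss w.r.t. the current sample
    # (gradients are re-enabled locally inside the outer no_grad context).
    with torch.enable_grad():
        x_in = x_t.detach().requires_grad_(True)
        logits = victim(x_in)
        adv_loss = torch.nn.functional.cross_entropy(logits, true_label)
        grad = torch.autograd.grad(adv_loss, x_in)[0]

    # Gradient ascent on the adversarial loss: each denoising step nudges the
    # sample toward misclassification while staying on the learned data manifold.
    mean = mean + guidance_scale * beta_t * grad

    if t == 0:
        return mean  # final step: return the mean without injecting fresh noise
    return mean + torch.sqrt(beta_t) * torch.randn_like(x_t)
```

The shape of this update mirrors classifier-guided diffusion, except that the guidance term ascends the victim's loss rather than steering toward a chosen target class.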
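The [Results] paragraph evaluates naturalness with perplexity (PPL). The paper may use a Chinese-specific perplexity definition, so the exact metric could differ, but a common language-model-based estimate of sentence fluency is sketched below; the helper sentence_ppl and the checkpoint name are illustrative assumptions only.

```python
# Rough LM-based fluency check: PPL = exp(mean negative log-likelihood).
# The checkpoint below is just an example of a pretrained Chinese causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "uer/gpt2-chinese-cluecorpussmall"  # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

def sentence_ppl(text: str) -> float:
    """Perplexity of a single sentence under the language model."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(input_ids=enc["input_ids"],
                    attention_mask=enc["attention_mask"],
                    labels=enc["input_ids"])
    return torch.exp(out.loss).item()

# Lower PPL suggests the adversarial sentence reads more like natural text.
print(sentence_ppl("这部电影的剧情紧凑，演员表现也很出色。"))
```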
Keywords: adversarial sample generation; guided diffusion; conditional diffusion; diffusion model; text generation
Issue Date: 22 November 2024