Cross-lingual social event detection based on federated knowledge distillation
ZHOU Shuaishuai, ZHU Enchang, GAO Shengxiang, YU Zhengtao, XIAN Yantuan, ZHAO Zixiao, CHEN Lin
Journal of Tsinghua University (Science and Technology), 2025, Vol. 65, Issue (5): 854-866.
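Social media event detection refers to mining trending events from the content of various social media platforms. In practice, because data are scarce, social media event detection performs poorly in low-resource settings. Existing methods mainly alleviate the low-resource problem through cross-lingual knowledge transfer and similar means, but they overlook data privacy. This paper proposes FedEvent, a cross-lingual social media event detection method based on federated knowledge distillation, which aims to distill knowledge from high-resource clients to low-resource clients. The framework combines parameter-efficient fine-tuning with three sets of contrastive losses to effectively map non-English semantic spaces to the English semantic space, and adopts a federated distillation strategy to transfer knowledge while preserving data privacy. In addition, a four-stage lifecycle mechanism is designed to accommodate incremental scenarios. Experiments on real-world datasets demonstrate the effectiveness of the framework.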
Objective: Social event detection identifies trending events in social media content and has attracted widespread attention in recent years. In practice, however, its performance degrades when data are scarce. Moreover, privacy regulations prevent organizations from sharing data without explicit user consent, which makes centralized training impractical. Current approaches address data scarcity mainly through cross-lingual knowledge transfer but often ignore data privacy. This study therefore proposes FedEvent, a framework for cross-lingual social event detection based on federated knowledge distillation, with the goal of distilling knowledge from high-resource clients to low-resource ones. Methods: The framework uses parameter-efficient fine-tuning and triple contrastive losses to map non-English semantic spaces to the English one, and uses a federated distillation strategy to preserve data privacy. In addition, a four-stage lifecycle mechanism is designed to adapt to incremental scenarios. The mechanism comprises four stages: pre-training of high-resource clients, pre-training of low-resource clients, client detection, and maintenance. In the first stage, a specific algorithm is used to train the initial model of the high-resource client. In the second stage, the model trained in the first stage transfers knowledge through the central server to help low-resource clients train their initial models. In the third stage, the trained model directly detects each incoming message. In the fourth stage, the model is continuously trained on the latest message blocks, and a federated codistillation mechanism enables online learning for each client, allowing the model to learn new knowledge.
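The distillation step that moves knowledge from a high-resource client to a low-resource one can be illustrated with a generic soft-label distillation loss. The sketch below is an assumption for illustration, not FedEvent's actual objective: it uses the standard temperature-softened KL divergence, and the names `softmax`, `distillation_loss`, and `temperature` are hypothetical. What it shows is that only model outputs, never raw messages, would need to pass through the central server.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Assumption: a generic soft-label distillation loss, used here only
    to illustrate the idea; the paper's own loss may differ.
    """
    p = softmax(teacher_logits, temperature)  # high-resource client output
    q = softmax(student_logits, temperature)  # low-resource client output
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical outputs for one shared message: the high-resource client
# uploads its logits to the server, and the low-resource client pulls
# them down and minimises the loss against its own logits.
teacher_logits = [2.0, 0.5, -1.0]
student_logits = [0.3, 0.2, 0.1]
loss = distillation_loss(teacher_logits, student_logits)
```

A higher `temperature` flattens both distributions, so the student also learns from the relative ordering of the less likely classes.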
Starting from the large-scale public English dataset Events2012, this study supplements event samples in Chinese and Vietnamese according to its topic descriptions, constructs private datasets for the low-resource clients (a Chinese-language client and a Vietnamese-language client), and thereby establishes a multi-client, cross-lingual experimental environment. The framework is evaluated on these datasets with two widely used clustering metrics: Normalized Mutual Information (NMI) and Adjusted Mutual Information (AMI). Results: 1) Compared with existing methods, the proposed framework achieves effective knowledge transfer while preserving data privacy. 2) Compared with the single-node setup, the framework in the multi-node environment improves NMI by 1.6% to 204.3% and AMI by 2.0% to 342.9% on the Chinese-language client. On the Vietnamese-language client, the corresponding improvements range from 2.3% to 6 200% in NMI and from 0 to 4 400% in AMI. 3) The framework performs very similarly to the state-of-the-art centralized method Cross-Lingual Knowledge Distillation (CLKD). 4) Case analysis shows that FedEvent's strong performance on clusters at high-resource clients significantly influences its effectiveness on analogous clusters at low-resource clients, which indicates that FedEvent can transfer knowledge from high-resource clients to low-resource ones. Furthermore, visual analysis illustrates the better clustering results achieved by FedEvent. Conclusions: The framework employs a lifecycle mechanism to meet the needs of event detection in both online and offline scenarios, transferring knowledge through process and outcome supervision.
Through knowledge distillation, it mitigates the challenges posed by low-resource languages, and it uses a cross-lingual word embedding module to map between the semantic spaces of different languages. The proposed framework achieved the expected results and notably improved model performance.
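As a rough illustration of the NMI metric used in the evaluation, the sketch below scores a predicted clustering against ground-truth event labels using only the Python standard library. It assumes the arithmetic-mean normalization (the abstract does not state which variant is used), and the function names are hypothetical. AMI additionally subtracts the mutual information expected by chance and is not implemented here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(true_labels, pred_labels):
    """Mutual information between two label assignments of equal length."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, pred_labels))
    true_counts = Counter(true_labels)
    pred_counts = Counter(pred_labels)
    mi = 0.0
    for (t, p), c in joint.items():
        mi += (c / n) * math.log(c * n / (true_counts[t] * pred_counts[p]))
    return mi


def nmi(true_labels, pred_labels):
    """NMI with arithmetic-mean normalization (an assumed variant)."""
    h_true = entropy(true_labels)
    h_pred = entropy(pred_labels)
    if h_true == 0.0 and h_pred == 0.0:
        return 1.0  # both assignments are a single cluster
    return mutual_information(true_labels, pred_labels) / ((h_true + h_pred) / 2)


# Toy example: cluster ids differ from the event labels, but the
# grouping of messages is identical, so the score is maximal.
events = [0, 0, 1, 1]
clusters = [1, 1, 0, 0]
score = nmi(events, clusters)
```

NMI depends only on how messages are grouped, not on the cluster ids themselves, which is why it suits event detection where predicted clusters carry no fixed labels.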
social event detection / low-resource / federated knowledge distillation / cross-lingual knowledge transfer