Journal of Tsinghua University (Science and Technology)  2023, Vol. 63 Issue (9): 1380-1389    DOI: 10.16511/j.cnki.qhdxxb.2023.21.012
Computer Science and Technology
Cross-domain sentiment classification based on syntactic structure transfer and domain fusion
ZHAO Chuanjun1,2, WU Meiling1, SHEN Lihua3, SHANGGUAN Xuekui3, WANG Yanjie3, LI Jie1, WANG Suge4, LI Deyu4
1. School of Information, Shanxi University of Finance and Economics, Taiyuan 030006, China;
2. Economic Big Data Shanxi Province Key Laboratory, Shanxi University of Finance and Economics, Taiyuan 030006, China;
3. Shanxi Information Technology Application Innovation Engineering Research Center, Taiyuan 030006, China;
4. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
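Abstract (translated from the Chinese): Deep learning models for text sentiment analysis, such as recurrent neural networks, have many parameters and therefore need large amounts of high-quality labeled data for training and optimization. In practice, high-quality sentiment-labeled review data are hard to obtain for a specific domain. To handle the differences in data distribution between domains in cross-domain text sentiment classification, this paper proposes a method based on syntactic structure transfer and domain fusion, which reduces the dependence on labeled data in a specific domain. For syntactic structure transfer, dependency grammar features are added to a recurrent neural network to build a transferable dependency-syntax recurrent neural network model; transferring syntactic structure carries structural information across domains and supports sentiment transfer. For domain fusion, the conventional maximum mean discrepancy domain metric is refined with cross-domain same-category distance information. Constraining the distributions of the source and target domains keeps the distance between the two domains as small as possible during learning, so that domain-general features are extracted effectively. Experimental results show that the method improves cross-domain sentiment classification accuracy over existing methods.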
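The refinement of the maximum mean discrepancy (MMD) metric with same-category distance information can be written out as a rough sketch. The notation here ($\phi$, $\mathcal{H}$, $n_s^c$, $n_t^c$, $C$) is assumed for illustration and is not taken from the paper. The first line is the standard empirical MMD between a source sample $D_s$ and a target sample $D_t$; the second is one common class-conditional form that compares only same-category samples across the two domains.

```latex
\begin{align}
\mathrm{MMD}^2(D_s, D_t) &= \Bigl\| \frac{1}{n_s}\sum_{i=1}^{n_s}\phi(x_i^s)
  - \frac{1}{n_t}\sum_{j=1}^{n_t}\phi(x_j^t) \Bigr\|_{\mathcal{H}}^2 \\
\mathrm{CMMD}^2(D_s, D_t) &= \frac{1}{C}\sum_{c=1}^{C} \Bigl\|
  \frac{1}{n_s^c}\sum_{x_i^s \in D_s^c}\phi(x_i^s)
  - \frac{1}{n_t^c}\sum_{x_j^t \in D_t^c}\phi(x_j^t) \Bigr\|_{\mathcal{H}}^2
\end{align}
```

Here $D_s^c$ and $D_t^c$ denote the source and target samples of category $c$, and $\phi$ maps a representation into a reproducing kernel Hilbert space $\mathcal{H}$. Driving the second quantity toward zero pulls same-category representations of the two domains together, which is the sense in which the domain distance is kept small during learning.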
Abstract: [Objective] Deep learning models for text sentiment analysis, such as recurrent neural networks, have many parameters and therefore require a large amount of high-quality labeled data for effective training and optimization. However, high-quality domain-specific sentiment-labeled data are difficult to obtain in practical applications. This study proposes a cross-domain text sentiment classification method based on syntactic structure transfer and domain fusion (SSTDF) to address two problems: learning domain-invariant features and measuring the distribution distance between domains. By handling the differences in data distribution among domains, the method can effectively alleviate the dependence on domain-specific annotated data.
[Methods] SSTDF combines syntactic structure transfer with domain fusion. For syntactic structure transfer, dependency syntactic features are introduced into a recurrent neural network to design a transferable dependency-syntax recurrent neural network model, and a parameter transfer strategy is used to transfer syntactic structure information across domains efficiently in support of sentiment transfer. For domain fusion, the conditional maximum mean discrepancy distance metric quantifies the distribution differences between the source and target domains and refines the cross-domain same-category distance information. By constraining the distributions of the source and target domains, domain-invariant features are effectively extracted to maximize the sharing of sentiment information between the two domains. The model is trained by joint optimization: the sentiment classification losses of the source and target domains are minimized together with their fusion loss. As a result, the generalization performance of the model and its classification accuracy on the cross-domain sentiment classification task are considerably improved.
[Results] The experiments use the Amazon English online review sentiment classification dataset, which is widely used in cross-domain sentiment classification studies. It contains four domains, B (Books), D (DVD), E (Electronics), and K (Kitchen), each with 1 000 positive and 1 000 negative reviews. The experimental results show that SSTDF is more accurate than the baseline methods, achieving average accuracy, recall, and F1 values of 0.844, 0.830, and 0.837, respectively. Fine-tuning allows the network to converge quickly, thereby improving transfer efficiency.
[Conclusions] This study uses deep transfer learning to address cross-domain text sentiment classification from the perspective of cross-domain syntactic structure consistency learning. A recurrent neural network model that integrates syntactic structure information is used, and a domain minimum distance constraint is added to the syntactic structure transfer process to keep the distance between the source and target domains as small as possible during learning. The experimental results verify the effectiveness of the proposed method. The next step is to increase the number of experimental and neutral samples so that the method can be validated on a larger dataset. A more fine-grained, aspect-level cross-domain sentiment analysis will also be attempted in the future.
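The domain-fusion term and the joint objective described in the abstract can be illustrated with a minimal numerical sketch. It assumes a Gaussian kernel, a biased MMD estimate, and a hypothetical trade-off weight `lam`; the function names are illustrative, the features are synthetic, and none of this is taken from the paper, whose kernel, network, and loss weighting are not given on this page.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel matrix with entries exp(-gamma * ||a_i - b_j||^2)."""
    sq_dist = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))  # clip tiny negatives from rounding


def mmd2(xs, xt, gamma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return float(rbf_kernel(xs, xs, gamma).mean()
                 + rbf_kernel(xt, xt, gamma).mean()
                 - 2.0 * rbf_kernel(xs, xt, gamma).mean())


def conditional_mmd2(xs, ys, xt, yt, gamma=1.0):
    """Mean squared MMD over the classes that occur in both domains."""
    shared = np.intersect1d(ys, yt)
    if shared.size == 0:
        return 0.0
    return float(np.mean([mmd2(xs[ys == c], xt[yt == c], gamma) for c in shared]))


def joint_loss(cls_loss_src, cls_loss_tgt, fusion_loss, lam=1.0):
    """Classification losses of both domains plus a weighted fusion term (lam is assumed)."""
    return cls_loss_src + cls_loss_tgt + lam * fusion_loss


# Toy features standing in for sentence representations (synthetic, not from the paper).
rng = np.random.default_rng(0)
src = rng.normal(size=(40, 4))
src_labels = np.repeat([0, 1], 20)
tgt_near = src + 0.1  # target domain almost aligned with the source
tgt_far = src + 3.0   # target domain strongly shifted away from the source
near = conditional_mmd2(src, src_labels, tgt_near, src_labels)
far = conditional_mmd2(src, src_labels, tgt_far, src_labels)
print(f"near={near:.4f} far={far:.4f}")
```

In this toy setting the shifted target yields the larger class-conditional discrepancy, so adding the term to the classification losses penalizes representations whose same-category distributions drift apart across domains.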
Key words: cross-domain sentiment classification; syntactic structure transfer; minimum distance constraint; deep transfer learning
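Received: 2022-12-21      Published: 2023-08-19
Fund: National Natural Science Foundation of China (61906110, 62076158, 62072294); Humanities and Social Sciences Project of the Ministry of Education (22YJAZH092); Philosophy and Social Sciences Research Project of Higher Education Institutions of Shanxi Province (2021W058); Shanxi Province Graduate Excellent Innovation Project (2022Y535)
About the author: ZHAO Chuanjun (1986-), male, associate professor, E-mail: zhaochj@sxufe.edu.cn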
Cite this article:
ZHAO Chuanjun, WU Meiling, SHEN Lihua, SHANGGUAN Xuekui, WANG Yanjie, LI Jie, WANG Suge, LI Deyu. Cross-domain sentiment classification based on syntactic structure transfer and domain fusion. Journal of Tsinghua University (Science and Technology), 2023, 63(9): 1380-1389.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2023.21.012  or  http://jst.tsinghuajournals.com/CN/Y2023/V63/I9/1380