Journal of Tsinghua University(Science and Technology)    2024, Vol. 64 Issue (5) : 780-788     DOI: 10.16511/j.cnki.qhdxxb.2023.22.052
SPECIAL SECTION: SOCIAL MEDIA PROCESSING |
Chinese positive sentiment style transfer based on dialogues
HU Yuting, ZUO Jiali, LIU Jiangsheng, WAN Jianyi, WANG Mingwen
School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China
Abstract  [Objective] Several studies have shown that negative sentiment dialogues within the family markedly harm individuals' mental and physical well-being. Conversely, positive sentiment dialogues offer constructive feedback that motivates learning and personal growth; such dialogues help build self-confidence and a positive attitude, enabling people to cope better with life's challenges. Text style transfer is an effective tool for shifting negative sentiment dialogues toward positive ones: the goal is to retain the content of a text while imbuing the generated text with a specified attribute. Sentiment style transfer is an important research direction in natural language processing, and in the context of family dialogues it holds practical value. However, the current literature on sentiment style transfer has mainly focused on English datasets, with relatively little research in the Chinese domain. In this study, we constructed a dialogue-based Chinese sentiment text dataset. The initial data were extracted from dialogues in the TV series "Home with Kids", where considerable sentiment differences were observed between the dialogues of Liu Mei with Liu Xing and those of Liu Mei with Xia Xue: interactions between Liu Mei and Liu Xing were primarily critical, whereas interactions between Liu Mei and Xia Xue were characterized by encouragement and respect. The dataset was preprocessed in three steps. (1) Data cleaning, filtering, and format conversion were performed to ensure data quality and consistency. (2) A recurrent-model annotation method was employed, using suitable algorithms and models to annotate the data and identify key information and features; six iterations were performed, with the classifier fine-tuned each time on the data updated in the previous iteration. (3) Manual annotation was conducted, with the data meticulously reviewed and labeled by hand to further enhance accuracy and reliability. The final dataset comprises 30 836 sentences: 11 562 with positive sentiment content and 19 274 with negative sentiment content. In this dialogue dataset, most texts explicitly contain sentiment-related words.
Based on these characteristics, we studied dialogue-based Chinese positive sentiment style transfer using the editing-based delete-retrieve-generate (DRG), tagger and generator (TAG), conditional BERT (CondBERT), and tagging without rewriting (TWR) models, and introduced an improved TWR (TWR*) Transformer model. The original TWR model trained its style classifier with a multilayer perceptron; to identify specific styles more accurately, we instead trained a style classifier based on the RoBERTa-Large-Chinese model to distinguish text styles. Our experiments demonstrated that the pretrained RoBERTa-Large-Chinese model produced better classification results, which we attribute to the close relationship between the attention weights of the penultimate layer of the Transformer model and words commonly associated with positive and negative sentiment; the model recognizes textual sentiment style attribute words with higher accuracy. Experimental results confirm that the style classifier trained on our dataset can effectively identify negative content within text. In both automated and manual evaluations, the TWR* model outperforms the baseline models in identifying textual sentiment attributes and achieving positive sentiment style transfer, verifying the effectiveness of the model enhancements and the validity of the dataset.
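The iterative ("recurrent") annotation procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: a toy lexicon classifier stands in for the fine-tuned RoBERTa-Large-Chinese classifier, and the names `ToyClassifier`, `iterative_annotate`, and the confidence threshold are our assumptions.

```python
# Sketch of iterative annotation: a classifier labels unlabeled dialogue lines,
# confident predictions are promoted into the training pool, and the classifier
# is re-fit on the updated data in each round (the paper runs six iterations).
from collections import Counter

class ToyClassifier:
    """Toy word-count scorer standing in for a fine-tuned RoBERTa classifier."""
    def __init__(self):
        self.pos = Counter()
        self.neg = Counter()

    def fit(self, pairs):
        for text, label in pairs:
            (self.pos if label == "pos" else self.neg).update(text.split())

    def predict(self, text):
        words = text.split()
        score = sum(self.pos[w] for w in words) - sum(self.neg[w] for w in words)
        return ("pos" if score >= 0 else "neg"), abs(score)

def iterative_annotate(seed, unlabeled, rounds=6, threshold=1):
    """Re-fit on the labels produced in the previous round, six times."""
    labeled = list(seed)
    for _ in range(rounds):
        clf = ToyClassifier()
        clf.fit(labeled)
        remaining = []
        for text in unlabeled:
            label, conf = clf.predict(text)
            if conf >= threshold:
                labeled.append((text, label))  # promote confident predictions
            else:
                remaining.append(text)         # retry in a later round
        unlabeled = remaining
    return labeled, unlabeled
```

Sentences whose words never gain evidence in either lexicon stay unlabeled, which is where the paper's third step, manual annotation, takes over.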
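The abstract's observation that penultimate-layer attention weights align with sentiment attribute words suggests a simple extraction heuristic, sketched below. The ranking assumes a generic `(heads, query, key)` attention tensor such as one layer of a Transformer's attention output; it is an illustration of the idea, not the paper's extraction code.

```python
# Rank tokens by how much attention they *receive*, averaged over heads and
# query positions; tokens with the highest scores are taken as candidate
# sentiment style attribute words.
import numpy as np

def attribute_words(tokens, attn, k=2):
    """attn has shape (heads, query, key); return the k most-attended tokens."""
    received = attn.mean(axis=(0, 1))      # per-key average attention score
    order = np.argsort(received)[::-1]     # highest-scoring tokens first
    return [tokens[i] for i in order[:k]]
```

In a real pipeline the tensor would come from the penultimate layer of a model run with attention outputs enabled; here any array of the right shape works.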
Keywords natural language processing      text generation      sentiment style transfer      recurrent model      editing-based model      family dialogue     
Issue Date: 22 April 2024
Cite this article:   
HU Yuting, ZUO Jiali, LIU Jiangsheng, WAN Jianyi, WANG Mingwen. Chinese positive sentiment style transfer based on dialogues[J]. Journal of Tsinghua University(Science and Technology),2024, 64(5): 780-788.
URL:  
http://jst.tsinghuajournals.com/EN/10.16511/j.cnki.qhdxxb.2023.22.052     OR     http://jst.tsinghuajournals.com/EN/Y2024/V64/I5/780
[1] JIN D, JIN Z J, HU Z T, et al. Deep learning for text style transfer:A survey[J]. Computational Linguistics, 2022, 48(1):155-205.
[2] PRYZANT R, MARTINEZ R D, DASS N, et al. Automatically neutralizing subjective bias in text[C]//Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, USA, 2020:480-489.
[3] VAN DEN BERCKEN L, SIPS R J, LOFI C, et al. Evaluating neural text simplification in the medical domain[C]//The World Wide Web Conference. San Francisco, USA, 2019:3286-3292.
[4] PRABHUMOYE S, TSVETKOV Y, SALAKHUTDINOV R, et al. Style transfer through back-translation[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia, 2018:866-876.
[5] MIR R, FELBO B, OBRADOVICH N, et al. Evaluating style transfer for text[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies. Minneapolis, USA, 2019:495-504.
[6] JHAMTANI H, GANGAL V, HOVY E, et al. Shakespearizing modern language using copy-enriched sequence to sequence models[C]//Proceedings of the Workshop on Stylistic Variation. Copenhagen, Denmark, 2017:10-19.
[7] RAO S, TETREAULT J. Dear sir or madam, may I introduce the GYAFC dataset:Corpus, benchmarks and metrics for formality style transfer[C]//Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies. New Orleans, USA, 2018:129-140.
[8] LOGACHEVA V, DEMENTIEVA D, USTYANTSEV S, et al. ParaDetox:Detoxification with parallel data[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Dublin, Ireland, 2022:6804-6818.
[9] HU Z Q, LEE R K W, AGGARWAL C C, et al. Text style transfer:A review and experimental evaluation[J]. ACM SIGKDD Explorations Newsletter, 2022, 24(1):14-45.
[10] LI J C, JIA R B, HE H, et al. Delete, retrieve, generate:A simple approach to sentiment and style transfer[C]//Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies, Volume 1(Long Papers). New Orleans, USA, 2018:1865-1874.
[11] MADAAN A, SETLUR A, PAREKH T, et al. Politeness transfer:A tag and generate approach[C/OL]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Online, 2020:1869-1881.
[12] DALE D, VORONOV A, DEMENTIEVA D, et al. Text detoxification using large pre-trained neural models[C]//Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Punta Cana, Dominican Republic, 2021:7979-7996.
[13] YANG S. Tagging without rewriting:A probabilistic model for unpaired sentiment and style transfer[C]//Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. Dublin, Ireland, 2022:293-303.
[14] FU Z X, TAN X Y, PENG N Y, et al. Style transfer in text:Exploration and evaluation[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto, USA, 2018.
[15] YI X Y, LIU Z H, LI W H, et al. Text style transfer via learning style instance supported latent space[C]//Proceedings of the 29th International Joint Conference on Artificial Intelligence. Yokohama, Japan, 2020:3801-3807.
[16] LIU D Y H, FU J, ZHANG Y D, et al. Revision in continuous space:Unsupervised text style transfer without adversarial learning[C]//Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA, 2020:8376-8383.
[17] DEVLIN J, CHANG M W, LEE K, et al. BERT:Pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies. Minneapolis, USA, 2019:4171-4186.
[18] AREFYEV N, SHELUDKO B, PODOLSKIY A, et al. Always keep your target in mind:Studying semantics and improving performance of neural lexical substitution[C]//Proceedings of the 28th International Conference on Computational Linguistics. Barcelona, Spain, 2020:1242-1255.
[19] HEAFIELD K. KenLM:Faster and smaller language model queries[C]//Proceedings of the Sixth Workshop on Statistical Machine Translation. Edinburgh, Scotland, 2011:187-197.
[20] LAFFERTY J D, MCCALLUM A, PEREIRA F C, et al. Conditional random fields:Probabilistic models for segmenting and labeling sequence data[C]//Proceedings of the 18th International Conference on Machine Learning. San Francisco, USA, 2001:282-289.
[21] VITERBI A. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm[J]. IEEE Transactions on Information Theory, 1967, 13(2):260-269.
[22] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA, 2017:6000-6010.
[23] HOOVER B, STROBELT H, GEHRMANN S. exBERT:A visual analysis tool to explore learned representations in transformer models[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics:System Demonstrations. Online, 2020:187-196.
[24] PAPINENI K, ROUKOS S, WARD T, et al. BLEU:A method for automatic evaluation of machine translation[C]//Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. Philadelphia, USA, 2002:311-318.
[25] ZHANG T Y, KISHORE V, WU F, et al. BERTScore:Evaluating text generation with BERT[C]//Proceedings of the 8th International Conference on Learning Representations. Online, 2020.
[26] LAN Z Z, CHEN M D, GOODMAN S, et al. ALBERT:A lite BERT for self-supervised learning of language representations[J/OL]. arXiv:1909.11942, 2021.
[27] LIU Y H, OTT M, GOYAL N, et al. RoBERTa:A robustly optimized BERT pretraining approach[J/OL]. arXiv:1907.11692, 2019.