Journal of Tsinghua University (Science and Technology), 2019, Vol. 59, Issue 7: 530-536    DOI: 10.16511/j.cnki.qhdxxb.2018.26.024
Computer Science and Technology
Identification of prosodic phrases based on syntax dependency and conditional random fields
QIAN Yili¹,², ZHANG Ermeng¹
1. School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China;
2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
Abstract: Correctly dividing a sentence into its prosodic structure is important for improving the quality of synthesized speech, and feature selection is one of the key factors in prosodic structure prediction. In Chinese information processing, text features can be divided into shallow features, such as words, parts of speech, and word length, and deep features, such as syntactic and semantic information. This paper analyzes the relationships between syntactic structure, dependency structure, and prosodic structure, extracts the relevant shallow and deep features from the text, and uses a conditional random field (CRF) model to predict prosodic phrases. Prosodic phrases are first identified using shallow text features only; syntactic dependency features are then added on top of these to build the model. Experimental results show that adding the syntactic dependency features increases precision by 13.3%, recall by 14.69%, and the F-score by 14.1%.
Keywords: prosodic phrase prediction; syntax dependency; text features; conditional random fields (CRFs)
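The split between shallow and deep text features described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' feature templates (the paper's Table 6 is not reproduced on this page): the function names, the feature keys, the toy parse, and the particular dependency features chosen here (relation label, distance to the head word, direct link to the next word) are all hypothetical. The per-token dictionaries it produces are the kind of input a CRF sequence labeler typically accepts.

```python
from typing import Dict, List, Sequence, Tuple

# One token: (word, POS tag, index of dependency head or -1 for the root,
# dependency relation label). The layout is an assumption for this sketch.
Token = Tuple[str, str, int, str]


def token_features(sent: Sequence[Token], i: int,
                   use_deep: bool = True) -> Dict[str, str]:
    """Build a feature dict for the token at position i."""
    word, pos, head, rel = sent[i]

    # Shallow features named in the abstract: word, part of speech, word length.
    feats = {"word": word, "pos": pos, "word_len": str(len(word))}
    # Neighbouring POS tags as context (hypothetical window of one token).
    feats["pos-1"] = sent[i - 1][1] if i > 0 else "BOS"
    feats["pos+1"] = sent[i + 1][1] if i + 1 < len(sent) else "EOS"

    if use_deep:
        # Deep features derived from the dependency parse (illustrative only).
        feats["dep_rel"] = rel
        # Distance between the token and its head; "0" for the root.
        feats["arc_span"] = str(abs(head - i)) if head >= 0 else "0"
        if i + 1 < len(sent):
            # Does a dependency arc directly connect this token and the next?
            linked = head == i + 1 or sent[i + 1][2] == i
            feats["linked_to_next"] = "1" if linked else "0"
    return feats


def sentence_features(sent: Sequence[Token],
                      use_deep: bool = True) -> List[Dict[str, str]]:
    """Feature dicts for every token in the sentence."""
    return [token_features(sent, i, use_deep) for i in range(len(sent))]


# Hand-written toy parse (not from the paper's data); the relation labels
# are assumed, LTP-style tags.
toy_sentence: List[Token] = [
    ("我", "r", 1, "SBV"),
    ("喜欢", "v", -1, "HED"),
    ("音乐", "n", 1, "VOB"),
]

rows = sentence_features(toy_sentence)
shallow_rows = sentence_features(toy_sentence, use_deep=False)
```

Training a sequence labeler once on `shallow_rows`-style input and once on `rows`-style input would mirror the two-stage comparison the abstract reports; the CRF itself is omitted here.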
Received: 2018-10-20      Published: 2019-06-21
Funding: National Natural Science Foundation of China (61573231, 61673248); Natural Science Foundation of Shanxi Province (201601D102030)
Cite this article:
QIAN Yili, ZHANG Ermeng. Identification of prosodic phrases based on syntax dependency and conditional random fields[J]. Journal of Tsinghua University (Science and Technology), 2019, 59(7): 530-536.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2018.26.024  or  http://jst.tsinghuajournals.com/CN/Y2019/V59/I7/530
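  Fig. 1 Prosodic structure
  Fig. 2 Syntactic structure
  Fig. 3 Binary tree converted from the syntactic structure
  Table 1 Prosodic distribution for each syntactic level difference
  Fig. 4 Dependency syntactic structure
  Table 2 Dependency parsing results
  Table 3 Prosodic distribution for each dependency type
  Table 4 Prosodic distribution for each connection point
  Table 5 Prosodic distribution for each inner-arc span
  Table 6 Feature templates
  Table 7 Experimental results of the syntactic dependency + CRFs model
  Fig. 5 Comparison of F-scores
  Table 8 Comparison of results for different models before and after adding syntactic dependency features
  Table 9 Comparison with other methods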