Journal of Tsinghua University (Science and Technology), 2019, Vol. 59, Issue 7: 505-511    DOI: 10.16511/j.cnki.qhdxxb.2019.21.015
Special Topic: Legal Intelligence
Judicial document intellectual processing using hybrid deep neural networks
WANG Wenguan, CHEN Yunwen, CAI Hua, ZENG Yanneng, YANG Huiyu
DataGrand Inc., Shanghai 201203, China
Abstract: For the intelligent processing of legal documents, this paper proposes HAC (hybrid attention and CNN model), a hybrid deep neural network for long-text classification that targets charge prediction, law article recommendation, and prison term prediction. The model fuses an improved hierarchical attention network (iHAN) with a deep pyramid convolutional neural network (DPCNN) through residual connections. On the test set of the CAIL-2018 judicial artificial intelligence challenge ("中国法研杯"), the model achieves F1-Scores (the mean of Micro-F1 and Macro-F1) of 85% for charge prediction and 87% for law article recommendation. Prison term prediction is harder because terms vary with region, period, court, judge, and the defendant's attitude; the model's strong predictive performance and generalization ability allow it to accommodate these differences well. Feeding the charge prediction and law article recommendation outputs into the input of the prison term task, and treating term prediction as classification, further improves performance: the final F1-Score on prison term prediction exceeds 77%, an excellent result in CAIL-2018.
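The abstract describes HAC as a fusion of the improved hierarchical attention network (iHAN) and the deep pyramid CNN (DPCNN) through residual connections, feeding a multi-label classifier for charges and law articles. The paper's code is not reproduced on this page; the block below is only a minimal PyTorch sketch of that fusion idea, with both encoders heavily simplified and every class name (SimpleAttentionEncoder, SimpleDPCNNEncoder, HACClassifier), layer size, and hyperparameter chosen for illustration rather than taken from the paper.

```python
# Minimal sketch of the HAC idea: residual fusion of an attention encoder and a
# pyramid-style CNN encoder for multi-label long-text classification.
# The real iHAN/DPCNN are far more elaborate; this only illustrates the wiring.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleAttentionEncoder(nn.Module):
    """Heavily reduced stand-in for iHAN: BiGRU + word-level attention pooling."""
    def __init__(self, emb_dim: int, hidden: int):
        super().__init__()
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.att = nn.Linear(2 * hidden, 1)

    def forward(self, x):                        # x: (batch, seq, emb_dim)
        h, _ = self.gru(x)                       # (batch, seq, 2*hidden)
        w = torch.softmax(self.att(h), dim=1)    # attention weights over tokens
        return (w * h).sum(dim=1)                # (batch, 2*hidden)

class SimpleDPCNNEncoder(nn.Module):
    """Heavily reduced stand-in for DPCNN: residual conv blocks with stride-2 pooling."""
    def __init__(self, emb_dim: int, channels: int, blocks: int = 3):
        super().__init__()
        self.proj = nn.Conv1d(emb_dim, channels, kernel_size=3, padding=1)
        self.convs = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1)
            for _ in range(blocks))

    def forward(self, x):                        # x: (batch, seq, emb_dim)
        h = self.proj(x.transpose(1, 2))         # (batch, channels, seq)
        for conv in self.convs:
            h = h + F.relu(conv(h))              # residual convolution block
            h = F.max_pool1d(h, kernel_size=3, stride=2, padding=1)  # "pyramid" downsampling
        return F.adaptive_max_pool1d(h, 1).squeeze(-1)               # (batch, channels)

class HACClassifier(nn.Module):
    """Fuses the two encoders with a residual-style addition, then a multi-label head."""
    def __init__(self, vocab: int, emb_dim: int, dim: int, n_labels: int):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb_dim, padding_idx=0)
        self.han = SimpleAttentionEncoder(emb_dim, dim // 2)   # outputs dim
        self.dpcnn = SimpleDPCNNEncoder(emb_dim, dim)          # outputs dim
        self.head = nn.Linear(dim, n_labels)

    def forward(self, token_ids):                # token_ids: (batch, seq)
        x = self.emb(token_ids)
        fused = self.han(x) + self.dpcnn(x)      # fuse the two document views
        return self.head(fused)                  # logits; apply sigmoid for multi-label tasks

# Toy usage; n_labels would be the number of charge (or law article) classes,
# the value 200 here is purely illustrative.
model = HACClassifier(vocab=50000, emb_dim=128, dim=256, n_labels=200)
logits = model(torch.randint(1, 50000, (2, 400)))  # two documents of 400 tokens
probs = torch.sigmoid(logits)                      # independent probability per label
```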
Key words: judicial document processing; natural language understanding; verdict prediction; deep neural networks; attention model
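The score quoted in the abstract is the mean of Micro-F1 and Macro-F1 for each task. Below is a small sketch of how such a score could be computed, assuming scikit-learn and 0/1 indicator matrices for a multi-label task; the function name task_score and the toy arrays are illustrative, not from the paper.

```python
# Score used in the abstract: the mean of Micro-F1 and Macro-F1 for one task.
# Assumes multi-label predictions as 0/1 indicator arrays of shape (n_samples, n_labels).
import numpy as np
from sklearn.metrics import f1_score

def task_score(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    micro = f1_score(y_true, y_pred, average="micro", zero_division=0)
    macro = f1_score(y_true, y_pred, average="macro", zero_division=0)
    return (micro + macro) / 2.0

# Toy example with 3 labels.
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0]])
print(f"task F1-Score: {task_score(y_true, y_pred):.3f}")
```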
Received: 2018-12-18      Published: 2019-06-21
Cite this article:
WANG Wenguan, CHEN Yunwen, CAI Hua, ZENG Yanneng, YANG Huiyu. Judicial document intellectual processing using hybrid deep neural networks[J]. Journal of Tsinghua University (Science and Technology), 2019, 59(7): 505-511.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2019.21.015  or  http://jst.tsinghuajournals.com/CN/Y2019/V59/I7/505
Figure 1 Hierarchical attention network
Figure 2 Deep pyramid convolutional neural network
Figure 3 Distribution of charges, law articles, and prison terms
Figure 4 Word-level iHAN network
Figure 5 Schematic of the HAC architecture
Figure 6 Application of HAC to the three CAIL-2018 tasks
Table 1 F1-Scores of the models on the Stage 1 prediction tasks of the CAIL ("中国法研杯") challenge
Table 2 F1-Scores of HAC on the CAIL prediction tasks
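The abstract also states that the charge-prediction and law-article outputs are added to the input of the prison term task, which is itself treated as classification. The sketch below illustrates one way such a cascade could be wired, assuming a document vector from an HAC-style encoder and a discretized set of term buckets; the class TermClassifier and all sizes are hypothetical, not the paper's implementation.

```python
# Sketch of the cascade described in the abstract: predicted charge and
# law-article probabilities are concatenated to the document representation
# before the prison-term classifier (term prediction framed as classification).
import torch
import torch.nn as nn

class TermClassifier(nn.Module):
    def __init__(self, doc_dim: int, n_charges: int, n_articles: int, n_term_buckets: int):
        super().__init__()
        in_dim = doc_dim + n_charges + n_articles      # document features + task-1/2 outputs
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, n_term_buckets))            # one bucket per discretized term range

    def forward(self, doc_vec, charge_probs, article_probs):
        x = torch.cat([doc_vec, charge_probs, article_probs], dim=-1)
        return self.mlp(x)                             # logits over term buckets

# Toy usage with made-up sizes.
clf = TermClassifier(doc_dim=256, n_charges=200, n_articles=180, n_term_buckets=12)
logits = clf(torch.randn(2, 256), torch.rand(2, 200), torch.rand(2, 180))
term_bucket = logits.argmax(dim=-1)                    # predicted term class per document
```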