Chinese-English-Burmese neural machine translation based on multilingual joint training
MAN Zhibo, MAO Cunli, YU Zhengtao, LI Xunyu, GAO Shengxiang, ZHU Junguo
Yunnan Key Laboratory of Artificial Intelligence, Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Abstract: Multilingual neural machine translation is an effective approach for low-resource languages, for which only small amounts of parallel data are available to train translation models. Existing methods usually rely on a shared vocabulary for multilingual translation between closely related languages such as English, French, and German. Burmese, however, is a typical low-resource language, and the linguistic structures of Chinese, English, and Burmese differ considerably. This paper presents a multilingual joint training method for Chinese-English-Burmese neural machine translation that alleviates the problem of the limited shared vocabulary among these languages. The rich Chinese-English parallel corpus and the scarce Chinese-Burmese and English-Burmese corpora are trained jointly within the Transformer framework. The model maps the Chinese-Burmese, Chinese-English, and English-Burmese vocabularies into the same semantic space on both the encoder and decoder sides, reducing the differences among the Chinese, English, and Burmese language structures. Sharing the parameters trained on the Chinese-English corpus, together with the shared vocabulary, compensates for the scarcity of Chinese-Burmese and English-Burmese data. Experiments show that in one-to-many and many-to-many translation scenarios, this method achieves significantly higher BLEU scores than the baseline models for Chinese-English, English-Burmese, and Chinese-Burmese translation.
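The joint training setup summarized in the abstract can be illustrated with a small data-preparation sketch: all three language pairs are mixed into one training set, a target-language token is prepended to each source sentence so a single model can serve one-to-many and many-to-many directions, and one shared vocabulary covers Chinese, English, and Burmese. This is a minimal sketch under assumed conventions (the tag format, function names, and whitespace tokenization are illustrative only); the paper's actual preprocessing and subword segmentation may differ.

```python
# Minimal sketch of multilingual joint-training data preparation.
# Assumptions: a Johnson et al.-style target-language tag and a single
# shared (sub)word vocabulary; tag format and tokenization are hypothetical.
from collections import Counter
from typing import Iterable

def tag_pairs(pairs: Iterable[tuple[str, str]], tgt_lang: str):
    """Prepend a target-language token to every source sentence so one
    model can handle several translation directions."""
    tag = f"<2{tgt_lang}>"  # e.g. <2en>, <2my>, <2zh> (illustrative format)
    return [(f"{tag} {src}", tgt) for src, tgt in pairs]

def build_shared_vocab(corpora, size: int = 32000):
    """Count tokens over all language pairs and keep the most frequent,
    mapping Chinese, English, and Burmese into one shared vocabulary."""
    counter = Counter()
    for pairs in corpora:
        for src, tgt in pairs:
            counter.update(src.split())
            counter.update(tgt.split())
    return {tok: i for i, (tok, _) in enumerate(counter.most_common(size))}

# Toy corpora standing in for the rich zh-en corpus and the scarce
# zh-my / en-my corpora (real data would be segmented with BPE/SentencePiece).
zh_en = [("今天 天气 很 好", "the weather is nice today")]
zh_my = [("今天 天气 很 好", "ဒီနေ့ ရာသီဥတု ကောင်း တယ်")]
en_my = [("the weather is nice today", "ဒီနေ့ ရာသီဥတု ကောင်း တယ်")]

# Mix all directions into one training set for a single Transformer model.
train = tag_pairs(zh_en, "en") + tag_pairs(zh_my, "my") + tag_pairs(en_my, "my")
vocab = build_shared_vocab([zh_en, zh_my, en_my])
print(len(train), "training pairs,", len(vocab), "shared vocabulary entries")
```

In this setup, the scarce Chinese-Burmese and English-Burmese pairs are trained through the same encoder, decoder, and embedding parameters as the rich Chinese-English pairs, which is how the parameter sharing described in the abstract compensates for the lack of Burmese data.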