Journal of Tsinghua University(Science and Technology)    2020, Vol. 60 Issue (2) : 171-180     DOI: 10.16511/j.cnki.qhdxxb.2019.21.038
ELECTRONIC ENGINEERING
Research progress on drug representation learning
CHEN Xin, LIU Xien, WU Ji
Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Abstract  The drug development process is characterized by high capital intensity, high risk, and long cycles, so drug development consumes large amounts of capital, manpower, and other resources. Traditional machine learning methods can assist drug development to some extent, but they require molecular descriptors as inputs, and the choice of descriptors greatly affects model performance. As a result, most traditional machine learning methods require complex and time-consuming feature engineering. Emerging deep learning methods can instead learn features directly from raw representations of drugs, which bypasses feature engineering and shortens the drug development cycle. This paper divides drug representation learning methods into those based on the simplified molecular input line entry specification (SMILES) and those based on molecular graphs, surveys the innovations and limitations of representative methods in each category, and identifies the major challenges facing current drug representation learning along with possible solutions.
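As an illustrative sketch (not taken from the paper), the two raw drug representations that the survey contrasts can be shown for a tiny molecule, ethanol. The toy tokenizer and graph encoding below are simplifying assumptions for illustration only:

```python
import re

# 1) SMILES-based view: the molecule as a linear string, tokenized so a
# sequence model (e.g. an RNN) can consume it. This toy tokenizer only
# handles two-letter elements, bracket atoms, and single characters.
smiles = "CCO"  # ethanol
tokens = re.findall(r"Cl|Br|\[.*?\]|.", smiles)
print(tokens)  # ['C', 'C', 'O']

# 2) Molecular-graph view: atoms as nodes and bonds as edges, the input
# form used by graph neural networks.
atoms = ["C", "C", "O"]   # node features (atom types)
bonds = [(0, 1), (1, 2)]  # undirected edges (single bonds)
adjacency = [[0] * len(atoms) for _ in atoms]
for i, j in bonds:
    adjacency[i][j] = adjacency[j][i] = 1
print(adjacency)  # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
```

SMILES-based methods treat the string as a sentence and reuse sequence models from natural language processing, while graph-based methods operate on the adjacency structure directly, which is why the survey organizes the literature along these two lines.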
Keywords: drug; representation learning; simplified molecular input line entry specification (SMILES); molecular graph
Issue Date: 15 January 2020
Cite this article:   
CHEN Xin, LIU Xien, WU Ji. Research progress on drug representation learning[J]. Journal of Tsinghua University(Science and Technology), 2020, 60(2): 171-180.
URL:  
http://jst.tsinghuajournals.com/EN/10.16511/j.cnki.qhdxxb.2019.21.038     OR     http://jst.tsinghuajournals.com/EN/Y2020/V60/I2/171
[1] MERKWIRTH C, LENGAUER T. Automatic generation of complementary descriptors with molecular graph networks[J]. Journal of Chemical Information and Modeling, 2005, 45(5):1159-1168.
[2] WEININGER D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules[J]. Journal of Chemical Information and Computer Sciences, 1988, 28(1):31-36.
[3] SUTSKEVER I, VINYALS O, LE Q V. Sequence to sequence learning with neural networks[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge, USA:MIT Press, 2014:3104-3112.
[4] KINGMA D P, WELLING M. Auto-encoding variational Bayes[Z/OL]. (2014-05-01)[2019-06-23]. https://arxiv.org/abs/1312.6114.
[5] KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[Z/OL]. (2016-09-09)[2019-06-23]. https://arxiv.org/abs/1609.02907.
[6] GÓMEZ-BOMBARELLI R, WEI J N, DUVENAUD D, et al. Automatic chemical design using a data-driven continuous representation of molecules[J]. ACS Central Science, 2018, 4(2):268-276.
[7] BENGIO Y, DUCHARME R, VINCENT P, et al. A neural probabilistic language model[J]. Journal of Machine Learning Research, 2003, 3(6):1137-1155.
[8] MIKOLOV T, KARAFIÁT M, BURGET L, et al. Recurrent neural network based language model[C]//Eleventh Annual Conference of the International Speech Communication Association. Makuhari, Chiba, Japan:IEEE, 2010:1045-1048.
[9] SEGLER M H S, KOGEJ T, TYRCHAN C, et al. Generating focused molecule libraries for drug discovery with recurrent neural networks[J]. ACS Central Science, 2017, 4(1):120-131.
[10] OLIVECRONA M, BLASCHKE T, ENGKVIST O, et al. Molecular de-novo design through deep reinforcement learning[J]. Journal of Cheminformatics, 2017, 9:48.
[11] POPOVA M, ISAYEV O, TROPSHA A. Deep reinforcement learning for de novo drug design[J]. Science Advances, 2018, 4(7):eaap7885.
[12] ZHENG S J, YAN X, YANG Y D, et al. Identifying structure-property relationships through SMILES syntax analysis with self-attention mechanism[J]. Journal of Chemical Information and Modeling, 2019, 59(2):914-923.
[13] SCHUSTER M, PALIWAL K K. Bidirectional recurrent neural networks[J]. IEEE Transactions on Signal Processing, 1997, 45(11):2673-2681.
[14] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[Z/OL]. (2017-12-06). https://arxiv.org/abs/1706.03762.
[15] JAEGER S, FULLE S, TURK S. Mol2vec:Unsupervised machine learning approach with chemical intuition[J]. Journal of Chemical Information and Modeling, 2018, 58(1):27-35.
[16] MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[Z/OL]. (2013-09-07)[2019-06-23]. https://arxiv.org/abs/1301.3781.
[17] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[Z/OL]. (2016-05-19)[2019-06-23]. https://arxiv.org/abs/1409.0473v2.
[18] XU Z, WANG S, ZHU F Y, et al. Seq2seq fingerprint:An unsupervised deep molecular embedding for drug discovery[C]//Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. Boston, USA:ACM, 2017:285-294.
[19] KIM S, THIESSEN P A, BOLTON E E, et al. PubChem substance and compound databases[J]. Nucleic Acids Research, 2015, 44(D1):D1202-D1213.
[20] GAULTON A, HERSEY A, NOWOTKA M, et al. The ChEMBL database in 2017[J]. Nucleic Acids Research, 2016, 45(D1):D945-D954.
[21] WINTER R, MONTANARI F, NOÉ F, et al. Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations[J]. Chemical Science, 2019, 10(6):1692-1701.
[22] HELLER S, MCNAUGHT A, STEIN S, et al. InChI:The worldwide chemical structure identifier standard[J]. Journal of Cheminformatics, 2013, 5:7.
[23] LIU B W, RAMSUNDAR B, KAWTHEKAR P, et al. Retrosynthetic reaction prediction using neural sequence-to-sequence models[J]. ACS Central Science, 2017, 3(10):1103-1113.
[24] LIM J, RYU S, KIM J W, et al. Molecular generative model based on conditional variational autoencoder for de novo molecular design[J]. Journal of Cheminformatics, 2018, 10:31.
[25] KANG S, CHO K. Conditional molecular design with deep generative models[J]. Journal of Chemical Information and Modeling, 2018, 59(1):43-52.
[26] BLASCHKE T, OLIVECRONA M, ENGKVIST O, et al. Application of generative autoencoder in de novo molecular design[J]. Molecular Informatics, 2018, 37(1-2):1700123.
[27] IOVANAC N, SAVOIE B M. Improved chemical prediction from scarce data sets via latent space enrichment[J]. The Journal of Physical Chemistry A, 2019, 123(19):4295-4305.
[28] SHUMAN D I, NARANG S K, FROSSARD P, et al. The emerging field of signal processing on graphs:Extending high-dimensional data analysis to networks and other irregular domains[J]. IEEE Signal Processing Magazine, 2013, 30(3):83-98.
[29] FIGUEIREDO D R, RIBEIRO L F R, SAVERESE P H P. Struc2vec:Learning node representations from structural identity[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Halifax, Canada:ACM, 2017:385-394.
[30] BRUNA J, ZAREMBA W, SZLAM A, et al. Spectral networks and locally connected networks on graphs[Z/OL]. (2014-05-21)[2019-06-23]. https://arxiv.org/abs/1312.6203.
[31] HENAFF M, BRUNA J, LECUN Y. Deep convolutional networks on graph-structured data[Z/OL]. (2015-06-16)[2019-06-07]. https://arxiv.org/abs/1506.05163.
[32] DUVENAUD D K, MACLAURIN D, AGUILERA-IPARRAGUIRRE J, et al. Convolutional networks on graphs for learning molecular fingerprints[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge, USA:ACM, 2015:2224-2232.
[33] DEFFERRARD M, BRESSON X, VANDERGHEYNST P. Convolutional neural networks on graphs with fast localized spectral filtering[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain:ACM, 2016:3844-3852.
[34] KEARNES S, MCCLOSKEY K, BERNDL M, et al. Molecular graph convolutions:Moving beyond fingerprints[J]. Journal of Computer-Aided Molecular Design, 2016, 30(8):595-608.
[35] SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA:IEEE, 2015:1-9.
[36] SCHÜTT K T, ARBABZADAH F, CHMIELA S, et al. Quantum-chemical insights from deep tensor neural networks[J]. Nature Communications, 2017, 8:13890.
[37] GILMER J, SCHOENHOLZ S S, RILEY P F, et al. Neural message passing for quantum chemistry[C]//Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia:JMLR, 2017:1263-1272.
[38] SCHÜTT K, KINDERMANS P J, SAUCEDA H E, et al. SchNet:A continuous-filter convolutional neural network for modeling quantum interactions[C]//Advances in Neural Information Processing Systems. Long Beach, USA:ACM, 2017:991-1001.
[39] LI J Y, CAI D, HE X F. Learning graph-level representation for drug discovery[Z/OL]. (2017-09-12)[2019-06-07]. https://arxiv.org/abs/1709.03741.
[40] COLEY C W, BARZILAY R, GREEN W H, et al. Convolutional embedding of attributed molecular graphs for physical property prediction[J]. Journal of Chemical Information and Modeling, 2017, 57(8):1757-1772.
[41] GAO K Y, FOKOUE A, LUO H, et al. Interpretable drug target prediction using deep neural representation[C]//Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence. Stockholm, Sweden:IJCAI, 2018:3371-3377.
[42] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8):1735-1780.
[43] TSUBAKI M, TOMII K, SESE J. Compound-protein interaction prediction with end-to-end learning of neural networks for graphs and sequences[J]. Bioinformatics, 2019, 35(2):309-318.
[44] WEISFEILER B, LEHMAN A A. A reduction of a graph to a canonical form and an algebra arising during this reduction[J]. Nauchno-Technicheskaya Informatsia, 1968, 2(9):12-16.
[45] XU K L, HU W H, LESKOVEC J, et al. How powerful are graph neural networks?[Z/OL]. (2019-02-22)[2019-06-23]. https://arxiv.org/abs/1810.00826.
[46] ZITNIK M, AGRAWAL M, LESKOVEC J. Modeling polypharmacy side effects with graph convolutional networks[J]. Bioinformatics, 2018, 34(13):i457-i466.
[47] MA T F, XIAO C, ZHOU J Y, et al. Drug similarity integration through attentive multi-view graph auto-encoders[C]//Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm, Sweden:IJCAI, 2018:3371-3377.
[48] DEAC A, HUANG Y H, VELIČKOVIĆ P, et al. Drug-drug adverse effect prediction with graph co-attention[Z/OL]. (2019)[2019-06-22]. https://arxiv.org/abs/1905.00534.
[49] XU N, WANG P H, CHEN L, et al. MR-GNN:Multi-resolution and dual graph neural network for predicting structured entity interactions[C]//Proceedings of the 28th International Joint Conference on Artificial Intelligence. Macau, China:IJCAI, 2019:3968-3974.
[50] YOU J, LIU B, YING Z, et al. Graph convolutional policy network for goal-directed molecular graph generation[C]//Advances in Neural Information Processing Systems. Montréal, Canada:ACM, 2018:6410-6421.
[51] DE CAO N, KIPF T. MolGAN:An implicit generative model for small molecular graphs[Z/OL]. (2018-05-30)[2019-06-07]. https://arxiv.org/abs/1805.11973.
[52] YOU J X, YING R, REN X, et al. GraphRNN:Generating realistic graphs with deep auto-regressive models[Z/OL]. (2018-02-24)[2019-06-07]. https://arxiv.org/abs/1802.08773.
[53] KUZMINYKH D, POLYKOVSKIY D, KADURIN A, et al. 3D molecular representations based on the wave transform for convolutional neural networks[J]. Molecular Pharmaceutics, 2018, 15(10):4378-4385.
[54] VERMA N, BOYER E, VERBEEK J. Feastnet:Feature-steered graph convolutions for 3D shape analysis[C]//Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA:IEEE, 2018:2598-2606.
[55] TORNG W, ALTMAN R B. 3D deep convolutional neural networks for amino acid environment similarity analysis[J]. BMC Bioinformatics, 2017, 18:302.
[56] ZHANG Z W, CUI P, ZHU W W. Deep learning on graphs:A survey[Z/OL]. (2018-12-11)[2019-06-07]. https://arxiv.org/abs/1812.04202.
[57] DENG J, DONG W, SOCHER R, et al. ImageNet:A large-scale hierarchical image database[C]//2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA:IEEE, 2009:248-255.
[58] DEVLIN J, CHANG M W, LEE K, et al. BERT:Pre-training of deep bidirectional transformers for language understanding[Z/OL]. (2019-05-24). https://arxiv.org/abs/1810.04805.
[59] NAVARIN N, TRAN D V, SPERDUTI A. Pre-training graph neural networks with kernels[Z/OL]. (2018-11-16)[2019-06-07]. https://arxiv.org/abs/1811.06930.
[60] HU W H, LIU B W, GOMES J, et al. Pre-training graph neural networks[Z/OL]. (2019-05-29)[2019-06-22]. https://arxiv.org/abs/1905.12265.