Journal of Tsinghua University(Science and Technology)    2020, Vol. 60 Issue (4) : 312-320     DOI: 10.16511/j.cnki.qhdxxb.2020.25.005
ELECTRONIC ENGINEERING
Parallel algorithm for solving sparse linear equations based on variable correlation decomposition
LIU Shanxia, JIN Song, WANG Wenchen, WANG Yuhang, WANG Yu
Department of Electronics and Communication Engineering, North China Electric Power University, Baoding 071001, China
Abstract  Sparse linear equations lie at the core of many large-scale scientific computing tasks. Although parallel algorithms provide powerful new tools for solving sparse linear equations, existing parallel algorithms suffer from drawbacks such as the difficulty of obtaining an optimal sub-matrix partition and the large overhead required to synchronize parallel tasks. This paper presents a parallel algorithm for solving sparse linear equations based on variable correlation decomposition. The algorithm first performs an incomplete LU decomposition of the coefficient matrix to obtain lower- and upper-triangular systems. The correlation between the intermediate vector y and the solution vector x is then used to compute the independent partial values of each variable in parallel from the two triangular systems, and these partial values are summed to obtain the final value of the variable. Because solving for a variable does not require waiting for all preceding variables to be computed, the partial-value calculations can proceed in parallel as the computation advances, which significantly reduces execution time and improves both solution speed and parallelism. Tests show that this sparse linear equation solver is more than 50% faster than the parallel solution method in the cuSPARSE library.
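To make the partial-value idea concrete, the sketch below walks through the lower-triangular half of the problem in plain C. It is an illustrative assumption of how the decomposition can be expressed, not the authors' GPU implementation: the 4x4 matrix, the function name forward_subst, and the array names are hypothetical. The point it demonstrates is that once a variable y[j] is fixed, all products L[i][j]*y[j] are mutually independent partial values that can be subtracted from the remaining right-hand sides concurrently.

/* Minimal sketch (illustrative only, not the paper's code): forward substitution
 * for a lower-triangular system L y = b, written column by column. Once y[j] is
 * final, each product L[i][j] * y[j] is an independent "partial value" that can
 * be applied to the remaining rows in parallel. */
#include <stdio.h>

#define N 4

void forward_subst(const double L[N][N], const double b[N], double y[N]) {
    double partial[N];                      /* running right-hand side of each row */
    for (int i = 0; i < N; ++i)
        partial[i] = b[i];

    for (int j = 0; j < N; ++j) {
        y[j] = partial[j] / L[j][j];        /* y[j] no longer depends on anything  */
        for (int i = j + 1; i < N; ++i)     /* these updates touch distinct rows,  */
            partial[i] -= L[i][j] * y[j];   /* so they could run as parallel tasks */
    }
}

int main(void) {
    const double L[N][N] = { {2, 0, 0, 0},
                             {1, 3, 0, 0},
                             {0, 2, 4, 0},
                             {1, 0, 1, 5} };
    const double b[N] = { 2.0, 5.0, 10.0, 9.0 };
    double y[N];

    forward_subst(L, b, y);
    for (int i = 0; i < N; ++i)
        printf("y[%d] = %g\n", i, y[i]);
    return 0;
}

Back substitution with the upper-triangular factor follows the same pattern. As the abstract describes, the paper's contribution is to organize these independent partial-value updates across GPU threads so that a variable need not wait for all preceding variables before its contributions are accumulated.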
Keywords: graphics processing unit (GPU); sparse linear equations; parallel computation; variable partial value; thread
Issue Date: 03 April 2020
Cite this article:   
LIU Shanxia, JIN Song, WANG Wenchen, et al. Parallel algorithm for solving sparse linear equations based on variable correlation decomposition[J]. Journal of Tsinghua University (Science and Technology), 2020, 60(4): 312-320.
URL: http://jst.tsinghuajournals.com/EN/10.16511/j.cnki.qhdxxb.2020.25.005 or http://jst.tsinghuajournals.com/EN/Y2020/V60/I4/312