Solving sparse linear systems arises in many scientific applications, and sparse direct solvers are a key, time-consuming kernel both for those applications and for more advanced solvers such as hybrid direct-iterative solvers. Optimizing their performance on modern architectures is therefore critical. The two preprocessing steps of sparse direct solvers, ordering and block-symbolic factorization, reduce the amount of computation and memory and yield a task granularity suited to reaching good performance with BLAS kernels. With the advent of GPUs, the granularity of the block computations has become more important than ever. In this paper, we present a reordering strategy that increases this block granularity. The strategy relies on the block-symbolic factorization to refine the ordering produced by tools such as Metis or Scotch, without changing the number of operations required to solve the problem. We integrate this algorithm into the PaStiX solver and show a significant reduction in the number of off-diagonal blocks on a wide range of matrices. This improvement leads to an increase in efficiency of up to 20% on GPUs.
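
To make the notion of an off-diagonal block count concrete, the following is a minimal Python sketch under a toy model, not the algorithm of the paper and not PaStiX code. It assumes that each contributing column block touches a set of rows inside one supernode, and that every maximal run of consecutive row indices forms one off-diagonal block. The function names `count_blocks` and `total_blocks`, the example sets, and the permutation are all hypothetical.

```python
def count_blocks(rows):
    """Count maximal runs of consecutive indices in an iterable of row indices."""
    ordered = sorted(set(rows))
    return sum(1 for i, r in enumerate(ordered) if i == 0 or r != ordered[i - 1] + 1)


def total_blocks(contributions, perm):
    """Total off-diagonal blocks once each row index r is renumbered to perm[r]."""
    return sum(count_blocks(perm[r] for r in rows) for rows in contributions)


# Toy supernode with 4 unknowns; each set lists the rows one column block touches.
contributions = [{0, 2}, {1, 3}, {0, 2}]

identity = {0: 0, 1: 1, 2: 2, 3: 3}
# Hypothetical permutation that places rows appearing together next to each other.
reordered = {0: 0, 2: 1, 1: 2, 3: 3}

blocks_before = total_blocks(contributions, identity)
blocks_after = total_blocks(contributions, reordered)

# The number of touched rows is the same before and after renumbering.
rows_before = sum(len(rows) for rows in contributions)
rows_after = sum(len({reordered[r] for r in rows}) for rows in contributions)

print(blocks_before, blocks_after, rows_before, rows_after)  # prints "6 3 6 6"
```

In this toy case the renumbering halves the number of blocks while every contribution still touches the same number of rows, which is the kind of effect the abstract describes: fewer, larger blocks at an unchanged operation count.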


Keywords

  1. sparse block linear solver
  2. nested dissection
  3. sparse matrix ordering
  4. heterogeneous architectures

MSC codes

  1. 05C50
  2. 65F05
  3. 65F50
  4. 68Q25


Supplementary Material

PLEASE NOTE: These supplementary files have not been peer-reviewed.

Index of Supplementary Materials

Title of paper: Reordering Strategy for Blocking Optimization in Sparse Linear Solvers

Authors: Gregoire Pichon, Mathieu Faverge, Pierre Ramet, and Jean Roman

File: setofmatrices.pdf

Type: PDF file

Contents: List of the characteristics of all the matrices used in the experiments.



Information & Authors


Published In

SIAM Journal on Matrix Analysis and Applications
Pages: 226 - 248
ISSN (online): 1095-7162


Submitted: 22 February 2016
Accepted: 22 December 2016
Published online: 23 March 2017





Funding Information

Direction Générale de l'Armement http://dx.doi.org/10.13039/501100006021
Université de Bordeaux http://dx.doi.org/10.13039/501100006251
