Analyzing the Effect of Local Rounding Error Propagation on the Maximal Attainable Accuracy of the Pipelined Conjugate Gradient Method

Abstract

Pipelined Krylov subspace methods typically offer improved strong scaling on parallel HPC hardware compared to standard Krylov subspace methods for large and sparse linear systems. In pipelined methods the traditional synchronization bottleneck is mitigated by overlapping time-consuming global communications with useful computations. To achieve this communication-hiding strategy, however, pipelined methods introduce additional recurrence relations for a number of auxiliary variables that are required to update the approximate solution. This paper studies the influence of the local rounding errors introduced by these additional recurrences in the pipelined Conjugate Gradient (CG) method. Specifically, we analyze the impact of local round-off effects on the attainable accuracy of the pipelined CG algorithm and compare it to the traditional CG method. Furthermore, we estimate the gap between the true residual and the recursively computed residual used in the algorithm. Based on this estimate we suggest an automated residual replacement strategy to reduce the loss of attainable accuracy of the final iterative solution. The resulting pipelined CG method with residual replacement improves the maximal attainable accuracy of pipelined CG while maintaining the efficient parallel performance of the pipelined method. This conclusion is substantiated by numerical results for a variety of benchmark problems.
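The core phenomenon the abstract describes can be illustrated on standard CG: the recursively updated residual drifts away from the true residual b - Ax due to accumulated rounding, and replacing it with the explicitly recomputed residual restores agreement. The sketch below is a generic illustration of this idea only; the check interval and replacement threshold used here are placeholder heuristics, not the automated criterion derived in the paper, and the pipelined recurrences themselves are not shown.

```python
import numpy as np

def cg_with_residual_replacement(A, b, tol=1e-12, max_iter=1000,
                                 check_every=5, replace_tol=1e-10):
    """CG with a simple residual replacement step (illustrative only).

    The recurrence r <- r - alpha*A*p lets the stored residual drift
    from the true residual b - A*x. Every `check_every` iterations we
    measure that gap explicitly and, if it is too large relative to the
    true residual, overwrite r with b - A*x. The criterion is a crude
    placeholder for the automated strategy analyzed in the paper.
    """
    x = np.zeros_like(b)
    r = b.copy()                     # recursively updated residual
    p = r.copy()
    rs_old = r @ r
    bnorm = np.linalg.norm(b)
    for k in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap              # recurrence: may drift from b - A @ x
        if k % check_every == 0:
            true_r = b - A @ x       # explicit (true) residual
            gap = np.linalg.norm(true_r - r)
            if gap > replace_tol * np.linalg.norm(true_r):
                r = true_r           # residual replacement
        rs_new = r @ r
        if np.sqrt(rs_new) < tol * bnorm:
            return x, k + 1
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x, max_iter
```

Note that the extra matrix-vector product in the replacement check is what the paper's estimate of the residual gap avoids: a cheap running bound on the gap allows replacement to be triggered without recomputing b - Ax at every check.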

Keywords

  1. conjugate gradients
  2. parallelization
  3. latency hiding
  4. global communication
  5. pipelining
  6. rounding errors
  7. maximal attainable accuracy

MSC codes

  1. 65F10
  2. 65F50
  3. 65G50
  4. 65Y05
  5. 65Y20



Published In

SIAM Journal on Matrix Analysis and Applications
Pages: 426 - 450
ISSN (online): 1095-7162

History

Submitted: 22 February 2017
Accepted: 27 November 2017
Published online: 13 March 2018


Funding Information

FP7 Ideas: European Research Council https://doi.org/10.13039/100011199 : 610741
Fonds Wetenschappelijk Onderzoek https://doi.org/10.13039/501100003130
