Abstract.

Matrix-free techniques play an increasingly important role in large-scale simulations. Schur complement techniques and massively parallel multigrid solvers for second-order elliptic partial differential equations benefit significantly from reduced memory traffic and consumption. The matrix-free approach often restricts solver components to purely local operations, for instance, to the most basic schemes such as Jacobi or Gauss–Seidel smoothers in multigrid methods. An incomplete LU(0) decomposition (ILU) cannot be calculated from local information and is therefore not amenable to the on-the-fly computation typically needed for matrix-free calculations. It requires storing and factorizing a sparse matrix, which contradicts the low memory requirements of large-scale scenarios. Here, we propose a matrix-free ILU realization. More precisely, we introduce a memory-efficient matrix-free ILU-Smoother component for low-order conforming finite elements on tetrahedral hybrid grids. Hybrid grids consist of an unstructured macro-mesh that is subdivided into structured micro-meshes. The ILU operates on the degrees of freedom assigned to the interior of macro-tetrahedra. This ILU-Smoother can be applied to the efficient matrix-free evaluation of the Steklov–Poincaré operator from domain-decomposition methods, as well as to the finite element tearing and interconnecting dual-primal (FETI-DP) and balancing domain decomposition by constraints (BDDC) methods. After introducing and formally defining our smoother, we investigate its performance on refined macro-tetrahedra. On the macro-tetrahedra, the ILU-Smoother is implemented via surrogate matrix polynomials, which we combine with a fast on-the-fly evaluation scheme, resulting in an efficient matrix-free algorithm. To remain memory efficient, we obtain the polynomial coefficients by solving a least-squares problem on a small part of the factorized ILU matrices. The convergence rates of this smoother in relation to the polynomial order are thoroughly studied.
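
The surrogate idea from the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a two-dimensional finite-difference model problem on a structured grid instead of tetrahedral hybrid grids, a dense ILU(0) for readability, and a surrogate for the diagonal of U only. All names (`coeff`, `assemble`, `ilu0`, `monomials`, `fit_surrogate`) and the sampling pattern are hypothetical choices made for this illustration.

```python
import numpy as np

def coeff(x, y):
    """Hypothetical smooth, strictly positive diffusion coefficient."""
    return 1.0 + 0.5 * np.sin(np.pi * x) * np.cos(np.pi * y)

def assemble(n):
    """Dense 5-point matrix on an n-by-n interior grid (scaled by h^2)."""
    h = 1.0 / (n + 1)
    A = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            row = i * n + j
            x, y = (i + 1) * h, (j + 1) * h
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                # coefficient evaluated at the midpoint of the grid edge
                k = coeff(x + 0.5 * di * h, y + 0.5 * dj * h)
                A[row, row] += k
                ii, jj = i + di, j + dj
                if 0 <= ii < n and 0 <= jj < n:
                    A[row, ii * n + jj] = -k
    return A

def ilu0(A):
    """ILU(0) in IKJ order. L (unit diagonal) and U share one array;
    entries outside the sparsity pattern of A are never filled in."""
    LU = A.copy()
    pattern = A != 0.0
    for i in range(1, A.shape[0]):
        for k in np.flatnonzero(pattern[i, :i]):
            LU[i, k] /= LU[k, k]
            update = LU[i, k] * LU[k, k + 1:]
            LU[i, k + 1:] -= np.where(pattern[i, k + 1:], update, 0.0)
    return LU

def monomials(x, y, q):
    """Columns x^a * y^b with a + b <= q, evaluated at the given points."""
    cols = [x**a * y**b for a in range(q + 1) for b in range(q + 1 - a)]
    return np.stack(cols, axis=1)

def fit_surrogate(x, y, values, q):
    """Least-squares polynomial coefficients from sampled factor entries."""
    coeffs, *_ = np.linalg.lstsq(monomials(x, y, q), values, rcond=None)
    return coeffs

n = 12
A = assemble(n)
LU = ilu0(A)
diag_u = np.diag(LU).reshape(n, n)        # diagonal of U as a grid function
g = (np.arange(n) + 1) / (n + 1)
X, Y = np.meshgrid(g, g, indexing="ij")
sample = (slice(2, n, 2), slice(2, n, 2))  # small subset used for the fit
interior = (slice(2, n), slice(2, n))      # points where the fit is checked

for q in (1, 2, 3):
    c = fit_surrogate(X[sample].ravel(), Y[sample].ravel(),
                      diag_u[sample].ravel(), q)
    approx = monomials(X[interior].ravel(), Y[interior].ravel(), q) @ c
    exact = diag_u[interior].ravel()
    err = np.max(np.abs(approx - exact) / np.abs(exact))
    print(f"degree {q}: {c.size} coefficients, max relative error {err:.2e}")
```

In this sketch only the sampled entries and a handful of polynomial coefficients enter the fit. Once the coefficients are known, the factor entries can be re-evaluated on the fly from the grid position, which is the memory-saving mechanism the abstract describes. How well a given polynomial degree works depends on the problem, and the script only reports the error for this toy setting.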

Keywords

  1. hybrid ILU-Smoother
  2. multigrid
  3. hybrid grids
  4. polynomial surrogates
  5. matrix-free

MSC codes

  1. 65F55
  2. 65N55

Acknowledgment.

We want to thank both of our anonymous reviewers for their detailed and insightful comments, which significantly contributed to improving the manuscript.

Supplementary Materials

PLEASE NOTE: These supplementary files have not been peer-reviewed.
Index of Supplementary Materials
Title of paper: A Matrix-Free ILU Realization Based on Surrogates
Authors: Daniel Drzisga, Andreas Wagner, and Barbara Wohlmuth
File: 123595_2_supp_533627_ry3dnn_sc.pdf
Type: PDF
Contents: Second example application for our surrogate ILU, simulation data, and elementary proofs.

Information & Authors

Information

Published In

SIAM Journal on Scientific Computing
Pages: C304 - C329
ISSN (online): 1095-7197

History

Submitted: 27 October 2022
Accepted: 26 July 2023
Published online: 4 December 2023

Authors

Affiliations

Daniel Drzisga
Lehrstuhl für Numerische Mathematik, Fakultät für Mathematik (M2), Technische Universität München, 85748 Garching bei München, Germany.
Andreas Wagner
Corresponding author. Lehrstuhl für Numerische Mathematik, Fakultät für Mathematik (M2), Technische Universität München, 85748 Garching bei München, Germany.
Barbara Wohlmuth
Lehrstuhl für Numerische Mathematik, Fakultät für Mathematik (M2), Technische Universität München, 85748 Garching bei München, Germany.

Funding Information

Funding: The work of the second author was supported by Deutsche Forschungsgemeinschaft grant WO671/11-1. The work of the third author was supported by the Federal Ministry of Education and Research (BMBF) as part of the “Multi-physics simulations for Geodynamics on heterogeneous Exascale Systems” (CoMPS) project (FKZ 16ME0651) inside the federal research program on “High-Performance and Supercomputing for the Digital Age 2021–2024 – Research and Investments in High-Performance Computing.”
