Abstract

The QZ algorithm for computing eigenvalues and eigenvectors of a matrix pencil $A - \lambda B$ requires that the matrices first be reduced to Hessenberg-triangular (HT) form. The current method of choice for HT reduction relies entirely on Givens rotations, which are regrouped and accumulated into small dense matrices and subsequently applied using matrix multiplication routines. A nonvanishing fraction of the total flop count must nevertheless still be performed as sequences of overlapping Givens rotations applied alternately from the left and from the right. The many data dependencies associated with this computational pattern lead to inefficient use of the processor and poor scalability. In this paper, we therefore introduce a fundamentally different approach that relies entirely on (large) Householder reflectors partially accumulated into block reflectors by using (compact) WY representations. Even though the new algorithm requires more floating-point operations than the state-of-the-art algorithm, extensive experiments on both real and synthetic data indicate that it is still competitive, even in a sequential setting. The new algorithm is conjectured to have better parallel scalability, an idea that is partially supported by early small-scale experiments using multithreaded BLAS. The design and evaluation of a parallel formulation are left as future work.
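As a rough illustration of the compact WY representation mentioned above, the following sketch accumulates a product of Householder reflectors $H_1 H_2 \cdots H_k$, with $H_j = I - \tau_j v_j v_j^T$, into the form $I - V T V^T$ with $T$ upper triangular. It is not the paper's HT reduction algorithm. The function names (`reflector`, `compact_wy`, `wy_matrix`) are hypothetical, and plain Python lists are used only to keep the example self-contained; a practical code would apply the block reflector with level 3 BLAS.

```python
def matmul(A, B):
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def reflector(v):
    """Explicit Householder matrix H = I - tau v v^T with tau = 2 / (v^T v)."""
    tau = 2.0 / sum(t * t for t in v)
    n = len(v)
    return [[(1.0 if i == j else 0.0) - tau * v[i] * v[j] for j in range(n)]
            for i in range(n)]


def compact_wy(vs):
    """Return (V, T) such that H_1 H_2 ... H_k = I - V T V^T.

    vs is a list of k Householder vectors of length n. V is n x k with the
    vectors as columns, and T is k x k upper triangular, built column by
    column from T[:j, j] = -tau_j * T[:j, :j] * (V[:, :j]^T v_j).
    """
    k = len(vs)
    T = [[0.0] * k for _ in range(k)]
    for j, v in enumerate(vs):
        tau = 2.0 / sum(t * t for t in v)
        # w = V[:, :j]^T v_j (inner products with the earlier vectors)
        w = [sum(a * b for a, b in zip(vs[i], v)) for i in range(j)]
        for i in range(j):
            # T is upper triangular, so only entries l >= i contribute
            T[i][j] = -tau * sum(T[i][l] * w[l] for l in range(i, j))
        T[j][j] = tau
    V = [list(row) for row in zip(*vs)]
    return V, T


def wy_matrix(V, T):
    """Form Q = I - V T V^T explicitly (for checking only)."""
    n = len(V)
    Vt = [list(col) for col in zip(*V)]
    VTVt = matmul(matmul(V, T), Vt)
    eye = identity(n)
    return [[eye[i][j] - VTVt[i][j] for j in range(n)] for i in range(n)]


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


if __name__ == "__main__":
    # Arbitrary example vectors, chosen only for illustration.
    vs = [[1.0, 2.0, 0.0, -1.0], [0.0, 1.0, 3.0, 1.0], [0.0, 0.0, 1.0, -2.0]]
    V, T = compact_wy(vs)
    P = identity(4)
    for v in vs:
        P = matmul(P, reflector(v))
    print(max_abs_diff(wy_matrix(V, T), P) < 1e-12)
```

The point of the representation is that applying $I - V T V^T$ to a matrix costs a few matrix-matrix products instead of $k$ separate rank-one updates, which is what makes blocked Householder-based reductions amenable to efficient BLAS.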

Keywords

  1. Hessenberg-triangular reduction
  2. Householder reflectors
  3. iterative refinement

MSC codes

  1. 15A22
  2. 15A23
  3. 65Y20

Information & Authors

Information

Published In

SIAM Journal on Matrix Analysis and Applications
Pages: 1270 - 1294
ISSN (online): 1095-7162

History

Submitted: 24 October 2017
Accepted: 8 June 2018
Published online: 14 August 2018

Funding Information

Croatian Science Foundation : HRZZ-9345
Swiss National Science Foundation
High Performance Computing Center North
H2020 European Research Council https://doi.org/10.13039/100010663 : 671633
