Computing the Wave-Kernel Matrix Functions

We derive an algorithm for computing the wave-kernel functions cosh √A and sinhc √A for an arbitrary square matrix A, where sinhc(z) = sinh(z)/z. The algorithm is based on Padé approximation and the use of double angle formulas. We show that the backward error of any approximation to cosh √A can be explicitly expressed in terms of a hypergeometric function. To bound the backward error we derive and exploit a new bound for ‖A^k‖^{1/k} that is sharper than one previously obtained by Al-Mohy and Higham (SIAM J. Matrix Anal. Appl., 31(3):970–989, 2009). The amount of scaling and the degree of the Padé approximant are chosen to minimize the computational cost subject to achieving backward stability for cosh √A in exact arithmetic. Numerical experiments show that the algorithm behaves in a forward stable manner in floating-point arithmetic and is superior in this respect to the general-purpose Schur–Parlett algorithm applied to these functions.

The two fundamental solutions to (1.1) are obtained by applying the operators cosh t√∆ and t sinhc t√∆ to the Dirac delta function. These solutions are the kernels of the linear (integral) transformation that maps the external input b(x, t) and the initial data f(x) and g(x) to the general solution of (1.1). Greiner et al. [14] have explicitly computed the wave kernels for several subelliptic second-order operators.
We will focus on the algebraic second-order Cauchy problem
(1.3) u″(t) = Au(t) + b(t), u(0) = f, u′(0) = g,
where f, g, b, and u are vectors in C^n independent of x and A ∈ C^{n×n} is an arbitrary square matrix. Such linear second-order ODE systems are obtained (for instance) from a semidiscretization of (1.1) by the finite element method. For this algebraic system the wave kernels are the matrix functions cosh t√A and t sinhc t√A.
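The kernel representation can be checked numerically. The following sketch (our illustration, not the paper's method) evaluates the kernels for a small symmetric positive definite A via an eigendecomposition, an approach the algorithm developed below deliberately avoids, and confirms that u(t) = cosh(t√A)f + t sinhc(t√A)g satisfies u″(t) = Au(t):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])       # example SPD matrix (our choice)
f = np.array([1.0, -1.0])                    # initial data u(0)
g = np.array([0.5, 0.25])                    # initial velocity u'(0)

w, V = np.linalg.eigh(A)                     # A = V diag(w) V^T, w > 0 here

def u(t):
    s = np.sqrt(w)
    cosh_k = V @ np.diag(np.cosh(t * s)) @ V.T       # cosh(t sqrt(A))
    sinhc_k = V @ np.diag(np.sinh(t * s) / s) @ V.T  # t sinhc(t sqrt(A))
    return cosh_k @ f + sinhc_k @ g

t, h = 0.7, 1e-4
u_dd = (u(t + h) - 2 * u(t) + u(t - h)) / h**2       # finite-difference u''(t)
```

Note that t sinhc(t√A) = sinh(t√A)(√A)⁻¹ is well defined even at t = 0, where it vanishes, so u(0) = f is recovered exactly.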
In view of these connections we will call the functions cosh √z and sinhc √z the wave-kernel functions. In this paper we derive a new algorithm for computing the wave-kernel matrix functions cosh √A and sinhc √A for an arbitrary square matrix A. We emphasize that A in (1.3) is the given matrix. Typically (for example, in [6], [19]), the matrix in (1.3) is assumed to be given in the form A². Treating general A presents new challenges for the backward error analysis, as we will see.
The wave-kernel functions have the power series representations
(1.4) cosh √z = Σ_{n=0}^∞ z^n/(2n)!, sinhc √z = Σ_{n=0}^∞ z^n/(2n + 1)!.
Both series have an infinite radius of convergence and hence both are entire functions (analytic in the whole complex plane). Since cosh and sinhc are even functions there are no square root terms in (1.4), and so in the matrix case there are no questions about the existence of the matrix square root or of which square root to take. In many applications A is symmetric, but nonsymmetric A arise in stability and position feedback control of circulatory systems [33, chap. 5], constrained external damping in rotatory shafts [23, p. 43], frictional contact stability and control of robot grasping arrangements [29, chap. 4], [30], and semi-Lagrangian formulation of flows [25].
The stability analysis of second-order ODE systems is done in the frequency domain assuming a time-periodic external input b(t). In applications where b(t) is a non-periodic function of time we have to work in the time domain. The time integration of stiff ODE systems is a challenging task. The wave-kernel matrix functions are useful for deriving accurate time integrators suitable for stiff second-order ODE systems.
When A is a large, possibly sparse matrix there are various approaches to computing the action of a matrix function f(A) on a vector b [18, chap. 13]. One is to generate approximations to f(A)b from a Krylov subspace K(A, b) [16]. Krylov subspace methods reduce the approximation of f(A)b to the computation of f(H)e_1 for a much smaller upper Hessenberg matrix H, where e_1 is the first unit vector. Another approach is to apply a series approximation along with a suitable scaling strategy [4].
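A bare-bones version of this Krylov reduction can be sketched as follows (our illustration, not the method of [16]): f = cosh √· is applied to the small matrix H through an eigendecomposition, and a symmetric test matrix is used so that a reference value is available.

```python
import numpy as np

def arnoldi(A, b, m):
    # Arnoldi iteration: orthonormal basis Q of the Krylov subspace K_m(A, b)
    # and the small Hessenberg matrix H = Q^T A Q.
    n = b.size
    Q = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(m):
        v = A @ Q[:, j]
        for i in range(j + 1):               # modified Gram-Schmidt
            H[i, j] = Q[:, i] @ v
            v = v - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(v)
        Q[:, j + 1] = v / H[j + 1, j]
    return Q[:, :m], H[:m, :m]

rng = np.random.default_rng(0)
n, m = 100, 30
R = rng.standard_normal((n, n))
A = np.diag(np.linspace(0.5, 4.0, n)) + 0.005 * (R + R.T)  # symmetric test matrix
b = rng.standard_normal(n)

Q, H = arnoldi(A, b, m)
wH, VH = np.linalg.eigh(H)                   # H is symmetric tridiagonal here
fH_e1 = VH @ (np.cosh(np.sqrt(wH)) * VH[0, :])   # f(H) e_1 via eigendecomposition
approx = np.linalg.norm(b) * (Q @ fH_e1)     # f(A) b ~ ||b|| Q f(H) e_1

wA, VA = np.linalg.eigh(A)                   # reference value for this test matrix
exact = VA @ (np.cosh(np.sqrt(wA)) * (VA.T @ b))
```

Because cosh √z is entire, the Krylov approximation converges rapidly with m for this spectrum.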
In this work we develop algorithms for computing the wave-kernel matrix functions based on Padé approximation. The algorithms scale the matrix (A ← 4^{−s}A), evaluate a Padé approximant, then undo the effect of the scaling via recurrences. The amount of scaling and the Padé degree are based on the backward error of the Padé approximant to cosh √(4^{−s}A). We obtain an explicit expression for the backward error, valid for any rational approximation, involving a hypergeometric function. For Padé approximants we expand this expression in a power series and bound it in terms of the quantities ‖A^k‖^{1/k}. Our technique for exploiting these quantities is a refinement of that introduced by Al-Mohy and Higham [3] and yields bounds never larger and possibly much smaller. The resulting algorithm is backward stable for computing cosh √A and mixed forward-backward stable for computing sinhc √A, where stability is with respect to truncation error in exact arithmetic.
Prior work on computing the wave-kernel matrix functions and their action on vectors has mainly been restricted to the case where the matrix A is symmetric positive definite [11], [12], [15], [31]. An exception is Al-Mohy's recent work [2], wherein algorithms to compute the action of trigonometric and hyperbolic matrix functions are derived for any square matrix A; the wave-kernel matrix functions are included as a special case. The approach taken therein is to bound the absolute forward error of approximations based on truncated Taylor series of the matrix functions evaluated at a scaled value of the matrix A. Extending Al-Mohy's analysis to bound the relative forward error is desirable but appears difficult, because it would require a tight lower bound on the norm of the matrix function. Our approach of bounding the (relative) backward error provides a scale-independent measure and avoids any need to consider condition numbers when assessing bounds.
To obtain the backward error result needed to derive our algorithm we need to understand the behavior of the inverse of the function cosh √z. The necessary results are given in section 2.
In section 3 we derive a new bound for the norm of a general matrix power series in terms of the quantities max_{k≥2m} ‖A^k‖^{1/k}. The backward error analysis, and its application to Padé approximants, is given in section 4. Our algorithm for computing the wave-kernel matrix functions is presented in section 5, where careful attention is given to the choice of the parameters s (the amount of scaling) and m (the Padé degree).
The Schur–Parlett algorithm [10], [18, chap. 9], designed for general matrix functions, can also be used to compute the wave-kernel matrix functions. This algorithm requires the ability to compute the derivatives of the wave-kernel functions at scalar arguments, which are given by
d^n/dz^n cosh √z = 0F1(; n + 1/2; z/4) / (4^n (1/2)_n),  d^n/dz^n sinhc √z = 0F1(; n + 3/2; z/4) / (4^n (3/2)_n),
where (a)_n is the Pochhammer symbol and 0F1(; a; z) is a hypergeometric function, both of which are defined in appendix A.2. Numerical experiments are given in section 6 to test the practical behavior of our algorithm and to compare it with the Schur–Parlett algorithm.
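The wave-kernel functions are themselves hypergeometric: cosh √z = 0F1(; 1/2; z/4) and sinhc √z = 0F1(; 3/2; z/4). These identities, which underlie the derivative formulas above, can be confirmed by direct summation of the 0F1 series (a stdlib-only sketch):

```python
import math

def hyp0f1(b, x, terms=40):
    # 0F1(; b; x) = sum_k x^k / ((b)_k k!), summed term by term using the
    # recurrence term_{k+1} = term_k * x / ((b + k)(k + 1)).
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / ((b + k) * (k + 1))
    return total

z = 2.37
lhs_cosh = hyp0f1(0.5, z / 4)                        # 0F1(; 1/2; z/4)
lhs_sinhc = hyp0f1(1.5, z / 4)                       # 0F1(; 3/2; z/4)
```

The identities follow from (1/2)_k k! = (2k)!/4^k and (3/2)_k k! = (2k + 1)!/4^k, which turn the 0F1 terms into exactly the terms of (1.4).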
2. Fundamental regions and principal inverse of cosh √z. We begin by developing an understanding of the inverse of cosh √z that will be needed for the backward error analysis.
A region that is mapped by a function in a one-to-one manner onto the whole complex plane, except for one or more cuts, is called a fundamental region of that function [1, p. 98].

Lemma 2.1 (Fundamental regions of cosh √z). As n runs over the positive integers, the parametric curves Γ_n(t) = (t + inπ)², t ∈ R, divide the complex plane into fundamental regions of cosh √z.
Proof. The parametric curves Γ_n are motivated by the identity
(2.1) cosh(λ + iρ) = cosh λ cos ρ + i sinh λ sin ρ, λ, ρ ∈ R.
The segments of the parametric curve
(2.2) S_ρ(t) = (ρ + it)²
that lie in the strict interior of the region bounded by Γ_n(t) and Γ_{n+1}(t) are { S_ρ(t) : nπ < t < (n + 1)π } and { S_ρ(t) : −(n + 1)π < t < −nπ }. To fix ideas we show Γ_1(t), Γ_2(t), and S_ρ(t) with ρ = π/√2 in Figure 2.1a. Using (2.1), we find that when ρ > 0, cosh √z maps the S_ρ curve segments to the strict upper and strict lower segments of the elliptic curve cosh ρ cos t + i sinh ρ sin t in a bijective manner. As ρ → 0, the S_ρ curve segments converge to the line segment S_0, and cosh √z maps the corresponding S_0 line segment to the line −1 < w < 1 in a bijective manner. By varying ρ from 0 to ∞ the S_ρ curve segments sweep out the strict interior of the region bounded by Γ_n(t) and Γ_{n+1}(t). The images of the S_ρ curve segments sweep out the entire w-plane except for the two cuts (−∞, −1) and (1, ∞) along the real axis. Here w = −1 and w = 1 are branch points.

[Fig. 2.1: The fundamental regions Ω_0 (pink solid fill) and Ω_1 (green dotted pattern) are shaded in the z-plane. The function cosh √z maps regions labelled Q_i in the domain (z-plane) to regions labelled Q_i in the range (w-plane). The curve S_ρ(t) defined in (2.2), with ρ = π/√2, is shown as a dashed line in the z-plane; points on S_ρ(t) map to the elliptic curve (cosh ρ cos t) + i(sinh ρ sin t), shown as a dashed line in the w-plane. Salient points in the w-plane are shown by distinct markers and their pre-images are shown in the z-plane; with the aid of these marker points every curve shown in the z-plane can be associated with a corresponding curve in the w-plane.]
The proof that the convex region of Γ_1(t) is a fundamental region of cosh √z proceeds in a similar fashion by considering the curve segment { S_ρ(t) : −π < t < π }. The image of this S_ρ curve segment sweeps out the entire w-plane except for a cut (−∞, −1). Here only w = −1 is a branch point.
The curve Γ_n(t) corresponds to both edges of the positive cut if n is even, and to the edges of the negative cut if n is odd. To maintain a bijective mapping, we include in each fundamental region only the curve segments Γ_n(t < 0) and Γ_{n+1}(t ≥ 0) of its boundary.

Definition 2.2 (Principal domain of cosh √z). We call the fundamental region Ω_0 the principal domain. It contains the origin (marked in Figure 2.1a), whose image in the w-plane is not a branch point.
The fundamental regions of cosh √z correspond to the branches of its compositional inverse.

Definition 2.3 (Principal inverse of cosh √z). Let w belong to the complex plane with a cut along the real axis from −∞ to −1 and let z belong to the principal domain Ω_0. The principal inverse (cosh √)^{−1} is the mapping w → z defined by w = cosh √z.

Lemma 2.4. The principal inverse (cosh √)^{−1} is analytic at all points other than those on the branch cut along the real axis from −∞ to −1.

Proof. Since cosh √z is entire and its derivative is nonzero for any interior point a ∈ Ω_0, we can use the Lagrange inversion theorem (see appendix A.1) to express (cosh √)^{−1} as a power series that converges in some neighbourhood of cosh √a. Thus (cosh √)^{−1} is analytic at cosh √a. Hence (cosh √)^{−1} is analytic at all points other than those on the branch cut.
A consequence of Lemma 2.4 is that the radius of convergence of the power series of (cosh √)^{−1} about a point is the distance from that point to the branch cut. The sum of a convergent power series of a multi-valued function might fall on a branch different from the principal branch. Should this be the case, the equality of the function with its power series will not hold. For the equality to hold, the disc of convergence must not touch or cross the specified branch cut. We will use the power series of (cosh √)^{−1} about the point w = 1; the largest disc centred at this point touches the branch cut at the branch point w = −1. Hence the equality of (cosh √)^{−1} with its power series about w = 1 holds inside the disc |w − 1| < 2.

The power series of (cosh √)^{−1} about the point w = cosh √0 = 1 can be shown, using the Lagrange inversion theorem, to be
(2.3) (cosh √)^{−1}(w) = 2(w − 1) − (1/3)(w − 1)² + (4/45)(w − 1)³ − · · · .
The preimage of the disc |w − 1| ≤ 2 in the principal domain is shown in Figure 2.2a. Note that it includes the origin and contains the disc |z| ≤ 3 (dashed line).

In the next lemma we show that the power series (2.3) has a succinct hypergeometric representation. This representation is invaluable because for some rational function h(z) we will later want to evaluate partial sums of the power series of (cosh √)^{−1}(h(z)) about z = 0, and we can delegate the change in expansion point to the computer algebra package Maple, which has knowledge of the hypergeometric function.

Lemma 2.5. The principal inverse (cosh √)^{−1} has the hypergeometric representation
(2.4) (cosh √)^{−1}(w) = 2(w − 1) 3F2(1, 1, 1; 3/2, 2; (1 − w)/2), |w − 1| ≤ 2.

Proof. A series expansion of cosh⁻¹ w [26, eq. (4.38.4)] about the point w = 1 is
(2.5) cosh⁻¹ w = √(2(w − 1)) 2F1(1/2, 1/2; 3/2; (1 − w)/2), Re w > 0.
Let us now write the power series of (cosh √)^{−1} as
(2.6) (cosh √)^{−1}(w) = (cosh⁻¹ w)².
Equations (2.5) and (2.6), and Clausen's identity [8]
2F1(a, b; a + b + 1/2; z)² = 3F2(2a, 2b, a + b; 2a + 2b, a + b + 1/2; z)
with a = b = 1/2, give
(2.7) (cosh √)^{−1}(w) = 2(w − 1) 3F2(1, 1, 1; 3/2, 2; (1 − w)/2), Re w > 0.
As we have only nonnegative integer powers of 1 − w in (2.7) and 3F2(1, 1, 1; 3/2, 2; (1 − w)/2) converges for |w − 1| = 2 (see appendix A.2), the equality holds in the disc |w − 1| ≤ 2 without the restriction Re w > 0. Thus we obtain (2.4).
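The leading coefficients of the series (2.3) can be reproduced in exact rational arithmetic by inverting the series w − 1 = Σ_{n≥1} z^n/(2n)! term by term, a stdlib-only sketch of the computation behind the Lagrange inversion:

```python
from fractions import Fraction
from math import factorial

N = 4
# w - 1 = u(z) = z/2 + z^2/24 + z^3/720 + ...  (from cosh(sqrt(z)) = sum z^n/(2n)!)
u = [Fraction(0)] + [Fraction(1, factorial(2 * n)) for n in range(1, N + 1)]

def mul(p, q):
    # product of two truncated power series (coefficient lists up to z^N)
    r = [Fraction(0)] * (N + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j <= N:
                r[i + j] += pi * qj
    return r

pows = [None, u[:]]                 # pows[k] = coefficients of u(z)^k
for k in range(2, N + 1):
    pows.append(mul(pows[-1], u))

# match coefficients of z^n in  sum_k d[k] * u(z)^k = z  to get the inverse series
d = [Fraction(0)] * (N + 1)
for n in range(1, N + 1):
    s = sum(d[k] * pows[k][n] for k in range(1, n))
    d[n] = ((Fraction(1) if n == 1 else Fraction(0)) - s) / pows[n][n]
```

The computed coefficients 2, −1/3, 4/45, −1/35 agree with the alternating pattern of (2.3).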
3. Bounding a matrix power series. In the design of our algorithm we will need to bound the norm of a matrix power series that represents the error in an approximation. This is a standard requirement in algorithms based on Padé approximants [3], [5], [6], [7], [20]. In this section we derive a new bound for the norm of an arbitrary matrix power series f(A) = Σ_{i≥ℓ} c_i A^i. We denote by ‖·‖ any consistent matrix norm with ‖I‖ = 1.
Al-Mohy and Higham [3, Thm. 1.1] note that ‖f(A)‖ ≤ Σ_{i≥ℓ} |c_i| β^i, where β = max_{i≥ℓ} ‖A^i‖^{1/i}. The motivation for this bound is that ‖A^i‖^{1/i} satisfies ρ(A) ≤ ‖A^i‖^{1/i} ≤ ‖A‖ and can be much smaller than ‖A‖ for a nonnormal matrix, so the bound can be much smaller than Σ_{i≥ℓ} |c_i| ‖A‖^i. In seeking a more easily computed quantity than β, Al-Mohy and Higham [3, Lem. 4.1] show that if a, b, i, j are nonnegative integers such that ai + bj ≥ 1 then
(3.1) ‖A^{ai+bj}‖^{1/(ai+bj)} ≤ max(‖A^a‖^{1/a}, ‖A^b‖^{1/b}).
We specialize this result as follows. Denote by gcd(a, b) the greatest common divisor of a and b.

Theorem 3.1. Let a, b, k, and m be positive integers, and define
(3.2) α_m(A) = min{ max(‖A^a‖^{1/a}, ‖A^b‖^{1/b}) : gcd(a, b) = 1 and ab − a − b < 2m }.
Then
(3.3) max_{k≥2m} ‖A^k‖^{1/k} ≤ α_m(A) ≤ ‖A‖.
Furthermore, α_m(A) is nonincreasing in m.
Proof. Observe that as i and j run independently over the nonnegative integers, the values of ai + bj run over a certain subset of the nonnegative integers. If a and b are co-prime, that is, gcd(a, b) = 1, it is well known that this subset includes all positive integers greater than ab − a − b; the number ab − a − b is called the Frobenius number [28] of the set {a, b}.

In the notation of Example 3.2, the max in (3.2) operates along the columns of the matrix of co-primes and produces a row vector. Choosing A to be any involutory matrix with ‖A‖ > 1 we get α_5(A) = ‖A‖^{1/11}, which is smaller than the bound min_a max(‖A^a‖^{1/a}, ‖A^{a+1}‖^{1/(a+1)}) = ‖A‖^{1/3} obtained by Al-Mohy and Higham [3, Thm. 4.2] by a factor of ‖A‖^{8/33}, which can be arbitrarily large because an involutory matrix can have arbitrarily large norm (for example, the matrix [1−b, b; 2−b, b−1] is involutory for any b). This improvement can lead to a saving of several matrix multiplications in our algorithm, and indeed in other algorithms with a similar derivation, such as the scaling and squaring algorithm for the matrix exponential [3].
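The improvement for the involutory family is easy to reproduce numerically; the sketch below (our illustration, using the 1-norm and b = 50) compares the co-prime bound with the b = a + 1 bound for m = 5:

```python
import numpy as np

b = 50.0
A = np.array([[1 - b, b], [2 - b, b - 1]])     # involutory: A @ A = I

def pow_norm(A, k):
    # ||A^k||_1^{1/k}
    return np.linalg.norm(np.linalg.matrix_power(A, k), 1) ** (1.0 / k)

# m = 5: the co-prime pair (2, 11) has ab - a - b = 9 < 10, so
# alpha_5(A) <= max(||A^2||^{1/2}, ||A^11||^{1/11}) = ||A||^{1/11},
# while the b = a + 1 pairs with (a - 1)a <= 10 give at best ||A||^{1/3}.
new_bound = max(pow_norm(A, 2), pow_norm(A, 11))
old_bound = min(max(pow_norm(A, a), pow_norm(A, a + 1)) for a in (1, 2, 3))
```

Since even powers of an involutory matrix equal the identity, the pair (2, 11) reduces the bound to ‖A‖^{1/11}, which for this A is about 1.5 against roughly 4.6 for the b = a + 1 choice.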
A drawback of the α_m, compared with the quantities used by Al-Mohy and Higham with b = a + 1, is that they involve norms of higher powers of A, so in principle they are more expensive to compute. Two factors mitigate the expense. First, we will estimate the norms without computing the matrix powers explicitly, making the overall cost O(n²) flops, compared with the O(n³) flops cost of the whole algorithm when A is dense. Second, we will exploit matrix powers that are explicitly computed within the algorithm in order to reduce the cost further.
4. Error analysis for the wave kernels.

4.1. Approximation error. Let h(z) denote an approximation to cosh √z for z in a disc centered at the origin such that |h(z) − cosh √z| → 0 as z → 0. The forward error e(z) of the approximation h(z) to cosh √z is defined by
(4.1) h(z) = cosh √z + e(z).
For z in the principal domain Ω_0, the backward error E(z) of the approximation h(z) to cosh √z is defined using the principal inverse as
(4.2) E(z) = (cosh √)^{−1}(h(z)) − z,
so that
(4.3) h(z) = cosh √(z + E(z)).
As (cosh √)^{−1} is analytic everywhere except on its branch cut, E is analytic if h is analytic and does not take values on this branch cut.
For a given tolerance ε we wish to identify a disc centered at the origin such that |E(z)| ≤ ε|z| inside that disc. In order to do this we need a representation with which to quantify the backward error.
Proof. For any z such that |1 − h(z)| ≤ 2, the hypergeometric series converges. If z belongs to the intersection of this region with the principal domain Ω_0, then the backward error result follows from (2.4) and (4.2).
Our attempts to identify the fundamental regions of sinhc √z were not fruitful. Without this knowledge the backward error in approximations to sinhc √z cannot be uniquely defined. So we construct approximations to sinhc √z using h(z) and derive mixed forward-backward error bounds.
Proof. Clearly h(z) has no singularities in the region { z : |1 − h(z)| ≤ 2 } and by definition h(z) does not take values on the branch cut (−∞, −1). So from Lemma 2.4 we see that E(z) is analytic in this region, which leads to the identity (4.6). The mixed error result (a) follows by taking derivatives in (4.3) and using (4.6). The result (b) follows by taking derivatives in (4.4) and using the identity (d/dz) cosh √z = (1/2) sinhc √z.

A matrix function is completely determined by the values of the function and its derivatives on the spectrum of the matrix [18]. Since the functions cosh √z and sinhc √z are entire, the matrix functions cosh √A and sinhc √A are defined for all A. The approximation h(A) is defined if the set of eigenvalues of A does not contain the singularities of h(z). Let ρ(A) denote the spectral radius of A.

Lemma 4.4. Let p(z) and q(z) be polynomials such that q(0) = 1 and the coefficients of both p(z) and q(−z) are positive real numbers. If R is a positive real number such that q(−R) < 2 then for |z| ≤ R,
(4.7) max{0, 2 − q(−R)} ≤ |q(z)| ≤ q(−R),
(4.8) |p(z)/q(z)| ≤ p(R)/(2 − q(−R)).

Proof. Given that q(0) = 1 and q(−z) has real positive coefficients, the term q(−|z|) − 1 is positive and the inequality |q(z) − 1| ≤ q(−|z|) − 1 ≤ q(−R) − 1 holds in the region |z| ≤ R. In other words, q(z) is contained in a disc with centre (1, 0) and radius q(−R) − 1. It follows that max{0, 2 − q(−R)} ≤ |q(z)| ≤ q(−R). If q(−R) < 2 and p(z) has real positive coefficients, then the inequality in (4.8) follows from |p(z)| ≤ p(|z|) ≤ p(R).

Proof (of Theorem 4.5). We will first prove that r_m(z) is analytic in the disc |z| ≤ 3 for m ≤ 20. Let p_m(z) and q_m(z) denote the numerator and denominator polynomials of r_m. Using Maple we have obtained symbolically the coefficients of p_m(z) and q_m(z) for m ∈ {1, 2, . . ., 20} and found that p_m(z), q_m(−z), and p_m(z) − q_m(z) have positive real coefficients. Choosing R = 3, we find that the first element of the sequence {2 − q_m(−R)} is 3/4 and the next 19 elements are, to 4 significant digits,
{0.8613, 0.9079, 0.9313, 0.9453, 0.9546, 0.9612, 0.9661, 0.9699, 0.9730, 0.9754, 0.9775, 0.9793, 0.9807, 0.9820, 0.9832, 0.9842, 0.9851, 0.9858, 0.9866}.
Hence 2 − q_m(−R) ≥ 3/4 for m ≤ 20, and it follows from the lower bound in (4.7) that q_m(z) has no zeros in the disc |z| ≤ 3 for m ≤ 20. Therefore r_m(z) is analytic in the disc |z| ≤ 3 for m ≤ 20.
Likewise, the elements of the sequence {(p_m(3) − q_m(3))/(2 − q_m(−3))} can be evaluated for m ≤ 20; the first element is 2 and the first 20 elements are all less than or equal to 2. Substituting p(z) with p_m(z) − q_m(z) in Lemma 4.4, it follows from (4.8) that
(4.10) |1 − r_m(z)| ≤ (p_m(3) − q_m(3))/(2 − q_m(−3)) ≤ 2, |z| ≤ 3, m ≤ 20.
The result (4.9) follows by applying (4.10) to the spectrum of A.
For m ≤ 20 we can therefore replace the condition |1 − r_m(z)| ≤ 2 with the condition |z| ≤ 3 in Lemmas 4.1 and 4.2. Likewise, we can replace the condition ρ(I − r_m(A)) ≤ 2 in Theorem 4.3 with the more readily verifiable condition ρ(A) ≤ 3 for m ≤ 20.
We make the following conjecture based on similar observations for m > 20.
Conjecture 4.6. For all m, the [m/m] Padé approximant r_m(z) to cosh √z satisfies |1 − r_m(z)| ≤ 2 in the disc |z| ≤ 3.

The relative backward error Ẽ_m(z) := E_m(z)/z of r_m has a power series expansion
(4.12) Ẽ_m(z) = Σ_{k≥2m} c_{m,k} z^k
for some coefficients c_{m,k}. For a matrix argument A it follows from (4.12), using Theorem 3.1, that
(4.13) ‖Ẽ_m(A)‖ ≤ Σ_{k≥2m} |c_{m,k}| α_m(A)^k,
where α_m is defined in (3.2). Taking derivatives in (4.12) and replacing z by A, Theorem 3.1 similarly gives
(4.14) ‖Ẽ′_m(A)‖ ≤ Σ_{k≥2m} k |c_{m,k}| α_m(A)^{k−1}.
(Recall that E′_m(z) occurs in the error expansion for sinhc √A in Theorem 4.3 (b).)

Using Maple we have obtained symbolically the first 600 terms in the power series of Ẽ_m(z) for m ≤ 20 and found that, except for m = 2, the coefficients of the series have alternating signs. For m = 2, the coefficients have a structured pattern of alternating signs: the first 30 coefficients alternate. The first term is the coefficient of z^5 and it is of order 10^{−7}; its product with 3^5 is of order 10^{−4}. The 30th term is the coefficient of z^{34} and it is of order 10^{−37}; its product with 3^{34} is of order 10^{−21}. Effectively, then, in the context of double-precision arithmetic with |z| ≤ 3, we can regard Ẽ_m(z), and also Ẽ′_m(z), as having power series with alternating coefficients. Then |Ẽ_m(z)| ≤ |Ẽ_m(−|z|)|, and we define
(4.15) θ_m = max{ θ ∈ (0, 3] : |Ẽ_m(−θ)| ≤ u },
where u is the unit roundoff.

The double-angle formulas
(4.17) cosh 2√A = 2(cosh √A)² − I, sinhc 2√A = (cosh √A)(sinhc √A),
hold for all A. When α_m(A) > min(3, θ_m), we scale down A by a factor 4^s such that α_m(4^{−s}A) ≤ min(3, θ_m), compute approximations C_0 and S_0 to cosh √(4^{−s}A) and sinhc √(4^{−s}A), and then scale up using the double-angle recurrence
(4.18) C_{i+1} = 2C_i² − I, S_{i+1} = C_i S_i, i = 0: s − 1.
If the scaling up phase is done in exact arithmetic, then from Theorem 4.3 we find that the backward and mixed errors are not magnified by the recurrence. Thus in exact arithmetic the approximation C_s(A) to cosh √A is backward stable and the approximation S_s(A) to sinhc √A is mixed forward-backward stable.
5. Algorithm for computing the wave kernels. The matrices r_m(A) and r′_m(A) are obtained by solving q_m(A)r_m(A) = p_m(A) and q_m(A)² r′_m(A) = w_m(A), where w_m(A) := p′_m(A)q_m(A) − p_m(A)q′_m(A). Using Maple we have obtained symbolically the coefficients of the polynomials p_m and q_m, evaluated them using variable precision arithmetic, and stored them as IEEE double precision floating point numbers. To reduce cost and to avoid bringing any finite precision cancellation errors to prominence, we also evaluated symbolically and stored numerically the coefficients of the degree 2m − 2 polynomial w_m(A). All these are off-line calculations, done in advance.
We will use the Paterson-Stockmeyer (PS) algorithm [18, p. 73], [27] to compute the polynomials p_m(A), q_m(A), and w_m(A). Let σ ≤ m be a positive integer and suppose we compute and store A², A³, . . ., A^σ, which requires µ_σ = σ − 1 matrix multiplications. The total number of matrix multiplications in the PS algorithm, µ_t, then depends on σ through ⌊m/σ⌋, the largest integer less than or equal to m/σ, and the indicator σ | m, which is 1 if σ divides m and 0 otherwise.
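A minimal sketch of the PS evaluation scheme for a single matrix polynomial (ours; a real implementation reuses the stored powers across p_m, q_m, and w_m, which this sketch does not):

```python
import math
import numpy as np

def ps_polyvalm(coeffs, A, sigma):
    # Paterson-Stockmeyer: store A^2..A^sigma, split the polynomial into blocks
    # of length sigma, and run Horner's method in the variable A^sigma:
    #   p(A) = sum_j ( sum_{i<sigma} c_{j*sigma+i} A^i ) (A^sigma)^j.
    P = [np.eye(A.shape[0]), A]
    for _ in range(sigma - 1):
        P.append(P[-1] @ A)                      # now P[k] = A^k for k <= sigma
    blocks = [coeffs[j:j + sigma] for j in range(0, len(coeffs), sigma)]
    result = np.zeros_like(A)
    for blk in reversed(blocks):
        result = result @ P[sigma]
        for i, c in enumerate(blk):
            result = result + c * P[i]
    return result

coeffs = [1 / math.factorial(2 * k) for k in range(8)]   # truncated cosh sqrt series
A = np.array([[0.2, 0.1], [0.0, 0.3]])
approx = ps_polyvalm(coeffs, A, 3)
```

Only σ − 1 multiplications build the stored powers; the block Horner loop adds roughly ⌊m/σ⌋ more, which is the trade-off that the optimal σ in Table 5.1 balances.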
For each m, the σ that minimizes the number of matrix multiplications in the PS algorithm to compute p_m(A), q_m(A), and w_m(A) is shown in Table 5.1. For m ≤ 20, this cost jumps between degree m and m + 1 only for m ∈ {1–8, 10, 12, 14, 16, 18, 20}. Hence we will consider only these m in our algorithm. The matrices whose columns are the co-primes required to compute α_m(A) for these m are shown in Table 5.2.
The definition (3.2) of α_m(A) involves norms of various powers of A. We will compute only those powers needed for the evaluation of the polynomials and will use those powers to estimate the norms of the others. We use the 1-norm and estimate norms using the block algorithm of Higham and Tisseur [22], which estimates ‖B‖₁ using a few matrix-vector products with B and B^T. We denote a call to the estimator by normest1(A^{n₁}, A^{n₂}, . . ., A^{n_k}), which means that the algorithm estimates ‖A^{n₁+n₂+···+n_k}‖₁ by forming matrix-vector products A^{n₁+n₂+···+n_k}x as A^{n₁}(A^{n₂}(. . .(A^{n_k}x))) (and similarly for the transpose).
In using normest1 we want to do as few matrix-vector products as possible, and we use the powers of A stored in the PS algorithm to this end. For instance, consider m = 1, for which only A is stored. To compute α_1(A) we estimate ‖A²‖₁^{1/2} and ‖A³‖₁^{1/3} (see Table 5.2) by calling normest1(A, A) and normest1(A, A, A), respectively. If we proceed to m = 2, we compute and store A² (see Table 5.1). To compute α_2(A) we compute ‖A²‖₁^{1/2} directly and estimate ‖A⁵‖₁^{1/5} with the call normest1(A, A², A²). Note that it makes no difference to the quality of the estimate how the matrix-vector products are factored; our aim is purely to minimize the cost.
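The factored matrix-vector products can be illustrated as follows. This sketch (ours) computes ‖A⁵‖₁ exactly, column by column, through products of the stored factors; normest1 applies the same kind of products to only a few probe vectors, which is what brings the cost down to O(n²):

```python
import numpy as np

def norm1_of_product(factors):
    # Exact 1-norm of the product factors[0] @ factors[1] @ ... computed one
    # column at a time using only matrix-vector products, never forming the
    # product matrix itself. (An O(n^3) sweep; normest1 replaces the n unit
    # vectors with a handful of probe vectors to get an O(n^2) estimate.)
    n = factors[0].shape[0]
    best = 0.0
    for j in range(n):
        x = np.zeros(n)
        x[j] = 1.0
        for F in reversed(factors):
            x = F @ x
        best = max(best, np.abs(x).sum())
    return best

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6)) / 3
A2 = A @ A
est = norm1_of_product([A, A2, A2])   # ||A^5||_1, as in the call normest1(A, A2, A2)
```

As the text notes, the factorization A(A²(A²x)) versus any other grouping changes only the cost of each product, not the quantity being estimated.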
We note that Higham and Smith [21, p. 20] analysed the stability with respect to rounding errors of the double angle recurrence (4.18) for their algorithm to compute the matrix cosine. They found that the relative forward error bound is a sum of terms comprising two factors. The first factor is a power up to the sth of an O(1) scalar independent of A; these factors are innocuous if s is small and are likely to be pessimistic otherwise. The second factor is a product of terms that depend on the norms of the intermediate C_i(A), and is difficult to bound a priori; the number of such terms grows with s. So to mitigate the potential deterioration of accuracy in the recurrence, our priority is to minimize s in the scaling stage. The sharper error bounds given in (4.13) and (4.14), and choosing the pair with the larger m and smaller s when there is a choice, contribute to this objective. Recall that {A : ρ(A) ≤ 3} is the set of admissible A for which the error terms Ẽ_m(A) and Ẽ′_m(A) are well-defined. As ρ(A) ≤ α_m(A), we note that A_{m,u} := {A : α_m(A) ≤ min(θ_m, 3)} is a subset of the admissible set, and in this subset the error terms Ẽ_m(A) and Ẽ′_m(A) are bounded by the unit roundoff. Further, as α_m(A) is nonincreasing in m by Theorem 3.1, the subset A_{m,u} will grow with m if θ_m does. Observe in Tables 4.1 and 4.2 that for m ≤ 5, θ_m increases with m and θ_m < 3. Hence for all m ≤ 5 the subset A_{m,u} will certainly grow larger with m. Therefore, to avoid scaling in our algorithm we will compute α_m(A) sequentially and check whether A ∈ A_{m,u}.
Taking all these aspects into account, we now present our algorithm to compute the scaling s and the Padé degree m, using the values θ_m from Tables 4.1 and 4.2 and the co-primes listed in Table 5.3.
If ab − a − b < 2m then from (3.1) we get the bound max_{k≥2m} ‖A^k‖^{1/k} ≤ max(‖A^a‖^{1/a}, ‖A^b‖^{1/b}). Taking the minimum of these bounds over all co-prime a and b we obtain the lower bound in (3.3). That α_m is nonincreasing in m follows because the set of a and b in the minimum defining α_m grows with m. Al-Mohy and Higham [3, Thm. 4.2] chose b = a + 1 in (3.1), for which the co-prime condition is automatically satisfied and the condition ab − a − b < 2m simplifies to (a − 1)a ≤ 2m. However, a stronger bound is obtained by not limiting the co-primes in Theorem 3.1, as the following example confirms.

Example 3.2. Consider all possible co-prime pairs (a, b) satisfying ab − a − b < 2m for m = 5, excluding pairs with a or b equal to 1, as that case simply gives ‖A^k‖ ≤ ‖A‖^k. Using the inequality max(‖A^a‖^{1/a}, ‖A^{a+b}‖^{1/(a+b)}) ≤ max(‖A^a‖^{1/a}, ‖A^b‖^{1/b}), the set of co-primes needed to compute α_5(A) can be reduced.
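The admissible co-prime pairs are easy to enumerate; a short sketch (ours) for the set used in Example 3.2, i.e. pairs with a, b ≥ 2 and Frobenius number below 2m:

```python
from math import gcd

def coprime_pairs(m):
    # Co-prime pairs (a, b), 2 <= a < b, with Frobenius number ab - a - b < 2m.
    # The search ranges are generous; the condition itself prunes the rest.
    pairs = []
    for a in range(2, 2 * m + 2):
        for b in range(a + 1, 2 * m + 2 + a):
            if gcd(a, b) == 1 and a * b - a - b < 2 * m:
                pairs.append((a, b))
    return pairs

print(coprime_pairs(5))
```

For m = 5 this yields the pairs (2, 3), (2, 5), (2, 7), (2, 9), (2, 11), (3, 4), and (3, 5); the b = a + 1 restriction would keep only (2, 3) and (3, 4) of these.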

Lemma 4.1. For all z in the principal domain of cosh √z, if h(z) is any approximation to cosh √z such that |1 − h(z)| ≤ 2, then the backward error E(z) has the hypergeometric representation
(4.4) E(z) = 2(h(z) − 1) 3F2(1, 1, 1; 3/2, 2; (1 − h(z))/2) − z.

Theorem 4.3. If A has eigenvalues in the principal domain of cosh √z and h(z) is any approximation to cosh √z such that ρ(I − h(A)) ≤ 2, then (a) cosh √A ≈ h(A) = cosh √(A + E(A)), where E is given by (4.4), and (b) sinhc √A ≈ 2h′(A) = (I + E′(A)) sinhc √(A + E(A)), where E′ is given by (4.5).

Proof. (a) and (b) follow by applying Lemmas 4.1 and 4.2 on the spectrum of A.

4.2. Padé approximants. Let r_m(z) = p_m(z)/q_m(z) be the [m/m] (diagonal) Padé approximant to cosh √z. Thus p_m and q_m are polynomials of degree at most m, q_m(0) = 1, and r_m(z) − cosh √z = O(z^{2m+1}). We are not aware of a proof of the existence of r_m for all m. Nevertheless, r_m exists for a particular m if the m × m Toeplitz matrix with (i, j) entry 1/(2(i − j + m))! is nonsingular [24, p. 362]. Using Maple we have verified the existence of the first 100 diagonal Padé approximants. The contours of |1 − r_m(z)|/2 are shown in Figure 4.1 for m ≤ 4. Observe that in all the sub-figures the disc |z| ≤ 3 (dashed line) is contained inside the contour |1 − r_m(z)|/2 = 1; we will prove that this is the case for all m ≤ 20.
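The order conditions defining r_m can be solved in exact arithmetic; the following sketch (ours, not the paper's Maple computation) builds the [m/m] Padé approximant to cosh √z from the Taylor coefficients 1/(2n)!:

```python
from fractions import Fraction
from math import factorial

def pade_cosh_sqrt(m):
    # [m/m] Pade approximant p/q to cosh(sqrt(z)) = sum_n z^n/(2n)!, computed
    # exactly over the rationals from the order conditions: with q(0) = 1,
    # matching the coefficients of z^{m+1}..z^{2m} in q*series - p = 0 gives a
    # linear system for q; p then follows by truncated convolution.
    c = [Fraction(1, factorial(2 * n)) for n in range(2 * m + 1)]
    M = [[c[m + i - j] for j in range(1, m + 1)] for i in range(1, m + 1)]
    rhs = [-c[m + i] for i in range(1, m + 1)]
    for k in range(m):                      # Gaussian elimination, exact arithmetic
        for i in range(k + 1, m):
            fac = M[i][k] / M[k][k]
            for j in range(k, m):
                M[i][j] -= fac * M[k][j]
            rhs[i] -= fac * rhs[k]
    q = [Fraction(0)] * m
    for i in range(m - 1, -1, -1):          # back substitution
        s = sum(M[i][j] * q[j] for j in range(i + 1, m))
        q[i] = (rhs[i] - s) / M[i][i]
    q = [Fraction(1)] + q
    p = [sum(q[j] * c[i - j] for j in range(0, i + 1)) for i in range(m + 1)]
    return p, q

p1, q1 = pade_cosh_sqrt(1)   # r_1(z) = (1 + 5z/12) / (1 - z/12)
```

The coefficient matrix of the linear system is exactly the Toeplitz matrix with entries 1/(2(i − j + m))! mentioned above, so solvability of this system is the existence condition for r_m.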

Table 5.1: Parameter σ that minimizes the number of matrix multiplications µ_t for each degree m in the Paterson-Stockmeyer algorithm to evaluate p_m(A), q_m(A), and w_m(A), along with µ* = µ_t − (σ − 1).

Table 4.1: Relative backward error bound |Ẽ_m(−3)| and values of θ_m in (4.15) for the [m/m] Padé approximants to cosh √z in IEEE double precision arithmetic.

Table 5.2: Matrices whose columns are the co-primes required to compute α_m(A).

Table 5.3: Matrices whose columns are the co-primes required to compute α_m(A) sequentially; that is, for each m we update α_m(A) ← min(α_*(A), α_m(A)), where α_*(A) was computed in the previous step.