Nonlinearizing two-parameter eigenvalue problems

We investigate a technique to transform a linear two-parameter eigenvalue problem, into a nonlinear eigenvalue problem (NEP). The transformation stems from an elimination of one of the equations in the two-parameter eigenvalue problem, by considering it as a (standard) generalized eigenvalue problem. We characterize the equivalence between the original and the nonlinearized problem theoretically and show how to use the transformation computationally. Special cases of the transformation can be interpreted as a reversed companion linearization for polynomial eigenvalue problems, as well as a reversed (less known) linearization technique for certain algebraic eigenvalue problems with square-root terms. Moreover, by exploiting the structure of the NEP we present algorithm specializations for NEP methods, although the technique also allows general solution methods for NEPs to be directly applied. The nonlinearization is illustrated in examples and simulations, with focus on problems where the eliminated equation is of much smaller size than the other two-parameter eigenvalue equation. This situation arises naturally in domain decomposition techniques. A general error analysis is also carried out under the assumption that a backward stable eigenvalue solver method is used to solve the eliminated problem, leading to the conclusion that the error is benign in this situation.

The main idea of our approach can be described as follows. We view (1.1b) as a parameterized generalized linear eigenvalue problem, where λ is the parameter. Due to perturbation theory for eigenvalue problems, there is a family of continuous functions {g i (λ)}, defined by the eigenvalues of (1.1b), where µ is the eigenvalue, of a generalized eigenvalue problem (GEP). More formally, for a fixed value of λ the functions g i (λ) and y i (λ) can be defined, as the solution to for a given vector c ∈ C m . We explicitly introduced the normalization condition (1.2b), to uniquely define a corresponding eigenvector. The condition (1.2b) is not a restriction of generality except for the rare situation that the eigenvector is orthogonal * Department of Mathematics, KTH Royal Institute of Technology, SeRC Swedish e-science research center, Lindstedtsvägen 25, SE-100 44 Stockholm, Sweden, email: {eringh,eliasj}@kth.se to c. We prefer this condition over the standard Euclidean normalization, since the right-hand side of (1.2b) is an analytic function.
By insertion of µ = g i (λ) into (1.1a), we see that a solution to (1.1) will satisfy Note that we have now eliminated µ and (1.1b), at the cost of the introduction of a nonlinear function into the eigenvalue problem. The problem M (λ)x = 0 is called a nonlinear eigenvalue problem (NEP). In our setting it is rather a family of NEPs, since we have a different nonlinearity for each function g 1 , . . . , g m . The study of NEPs is a mature field within numerical linear algebra, and there are considerable theoretical results, as well as algorithms and software for NEPs.
The main contributions of this paper consist of a theoretical characterization of the elimination procedure (Section 2), analysis of structured perturbations corresponding to the elimination (Section 4), as well as new algorithms for (1.1) based on NEP-algorithms (Section 3). We provide software for the simulations, both for MATLAB and for Julia [5]. The Julia software is implemented using the data structures of the NEP-PACK software package [18], including adaption of theory for how to compute derivatives and projections. This provides new ways to solve (1.1), using the large number of NEP-solvers available in NEP-PACK. Some contributions are also converse, i.e., we provide insight to NEPs based on the equivalence with twoparameter eigenvalue problems. For instance, in Sections 2.2-2.3 we show how to transform certain NEPs with square-root nonlinearities to two-parameter eigenvalue problems. This in turn (using the operator determinants described below) allows us to transform the problem to a standard generalized eigenvalue problem, similar to companion linearization techniques for polynomial and rational eigenvalue problems.
We now summarize the NEP-results relevant for our approach. For a broad overview see the summary papers [34,26,46,9], as well as the benchmark collection [3] and software packages with NEP-solvers [33,11,12,18]). There is considerable theoretical works available for the NEP, in particular for polynomial eigenvalue problems. Techniques to transform polynomial NEPs to standard eigenvalue problems (known as linearization) have been completely characterized in a number of works, e.g., [24,25] and [29]. We relate our approach to this type of linearization in Section 2.2. In our derivation, we make explicit use of the implicit function theorem applied to the NEP. This has been done in the context of sensitivity analysis, leading to eigenvector free formulas for conditioning [1]. There are a number of algorithms available for NEPs, of which many seem to be applicable to (1.3). More specifically, we characterize the specialization of residual inverse iteration [30], which forms the basis of more recent methods such as the nonlinear Arnoldi method [45]. We also show how the infinite Arnoldi method [20] can be adapted to (1.3).
In Section 5.2 we illustrate how two-parameter eigenvalue problems of this type can arise by the separation of domains of a partial-differential equation (PDE). The domains are decoupling in a way that the discretization leads to a two-parameter eigenvalue problem. In this context, the elimination corresponds to an elimination of one of the domains. The elimination of an outer domain, in a way that directly leads to NEPs, by introduction of artificial boundary conditions is the origin of several standard NEPs in the literature, e.g., [40] and the electromagnetic cavity model in [44].
Relevant results for two-parameter eigenvalues can be summarized as follows. Many results for two-parameter eigenvalue problems are phrased in the more general setting of multiparameter eigenvalue problems. There are a number of recent efficient algorithms available, e.g., based on the Jacobi-Davidson approach [15,16], including subspace methods in [14]. A number of generalizations of inverse iteration are derived in [32]. Our approach is based on an eigenvalue parameterization viewpoint. Eigenvalue parameterization and continuation techniques (but with an additional parameter) have been studied, e.g., in [31].
One of the most fundamental properties of two-parameter eigenvalue problems is the fact that solutions are given by the solution to a larger linear (generalized) eigenvalue problem. This is also often used in the numerical algorithms mentioned above, and to our knowledge first proposed as a numerical method in [36]. More precisely, we associate with (1.1) the operator determinants where ⊗ denotes the Kronecker product. The solutions to (1.1) are (under certain assumptions) equivalent to the solutions to the two generalized eigenvalue problems where z = y ⊗ x. In practice, the application of a general purpose eigenvalue solver on one of the GEPs in (1.7) yields an accurate solution for small systems. The equivalence between (1.7) and (1.1) holds under non-singularity assumption; in particular the problem is singular if A 3 and B 3 both are singular; or A 2 and B 2 both are singular. See [2] for a precise characterization, and [21,17] for more recent formulations.
The following matrix is often used in theory for eigenvalue multiplicity and eigenvalue conditioning, and will be needed throughout the paper. We denote where v and w are left eigenvectors associated with (1.1a) and (1.1b) respectively. In particular, for an (algebraically) simple eigenvalue of the two-parameter eigenvalue problem (1.1), the matrix C 0 is nonsingular; see [21,Lemma 3], [15,Lemma 1.1], and [17, Lemma 1]. For a simple eigenvalue, the normwise condition number for the two-parameter eigenvalue problem is expressed as a special induced matrix norm of C −1 0 , see [17,Section 4]. 2. Nonlinearization.
2.1. Existence and equivalence. The elimination of the B-equation in the two-parameter eigenvalue problem can be explicitly characterized as we describe next. We show how the existence of a nonlinearization can be explicitly related to the Jordan structure of the (parametrized) GEP which we can also use in practice for the computation of µ = g i (λ) for a given λ. The existence of analytic functions is formalized in the following lemma. The invertibility assumption in the lemma will be further characterized in Theorem 2.3. Lemma 2.1 (Existence of implicit functions). Let λ ∈ C be given and assume that (µ, y) is such that (1.1b) is satisfied with y normalized as c T y = 1. Moreover, assume that is nonsingular. Then, there exist functions g i : C → C and y i : C → C m such that • g i and y i are analytic in λ, • g i and y i satisfy (1.2) in a neighborhood of λ, and • µ = g i (λ) and y = y i (λ). Proof. The proof is based on the complex implicit function theorem. Consider the analytic function f : C m+2 → C m+1 given by Then J as in (2.2) is the partial Jacobian of f with respect to the variables y and µ, i.e., J = ∂f /∂(y, µ). Since f (λ, µ i , y i ) = 0 and J(λ, µ i , y i ) is nonsingular, the existence of the desired functions, analytic in the point λ, follows from the complex implicit function theorem [8,Theorem I.7.6].
Under the same conditions that the implicit functions exist we have the following equivalence between the solutions to the NEP (1.3) and the solutions to the twoparameter eigenvalue problem (1.1).
The situation that the Jacobian matrix in (2.2) is singular is a non-generic situation. It turns out, as we show in the following theorem, that it is singular (essentially) if and only if the GEP (2.1) has a Jordan chain of length two or more. Therefore, our technique in general works if a solution to the two parameter eigenvalue problem (1.1) corresponds to a simple eigenvalue µ of the GEP (2.1). Theorem 2.3 (Singularity and Jordan structure). Let λ ∈ C be given and assume that c is not orthogonal to any eigenvector of the GEP (2.1). Moreover, assume that (µ i , y i ) is a solution to the GEP with y i normalized as c T y i = 1. Then J(λ, µ i , y i ) is singular if and only if there exists a vector u ∈ C m such that i.e., there exists a Jordan chain of length at least two corresponding to the GEP (2.1) and eigenpair (µ i , y i ).
Proof. We start by proving that singularity implies the existence of a Jordan chain. Assume that J(λ, µ i , y i ) is singular. Then there exists a non-trivial vector z α T ∈ C m+1 such that J(λ, µ i , y i ) z α T = 0. The first row gives The cases α = 0 and α = 0 are investigated separately. Assume that α = 0, then z = 0 and thus (2.4) implies that z is an eigenvector to the GEP. However, (2.5) gives a contradiction. If α = 0 then (2.4) gives that there exists a Jordan chain of length at least two, with u = z/α. Hence, singularity implies the existence of a Jordan chain.
To prove the converse we assume that there exists a Jordan chain of length at least two. Let z := u − (c T u)y i . Note that from construction of z (2.5) holds. Moreover, from the Jordan chain we know that (2.4) holds for the constructed z with α = 1. Hence, the vector z 1 T is a non-trivial vector in the null-space of J(λ, µ i , y i ). Thus the existence of a Jordan chain implies singularity. From a practical point of view we know that if we compute simple eigenvalues of the GEP (2.1) such that c is not orthogonal to the corresponding eigenvector, then the Jacobian is nonsingular. Hence, the nonlinearization exists. The result is formalized in the following corollary to Theorem 2.3.
Corollary 2.4. Let λ ∈ C be given. Assume that µ i ∈ C is a simple eigenvalue of the GEP (2.1), and that y i is a corresponding right eigenvector normalized as Many algorithms for NEPs depend on analyticity in a target domain. From the above theory we directly conclude the following result for the convergence radius of the implicitly constructed analytic function.
Corollary 2.5 (Convergence radius). Let λ ∈ C be given and assume that The functions g i and y i in Theorem 2.1 can be chosen such that they are analytic in the open disk with radius r centered in λ, where r is defined by r = min |λ − s| : such that g i (s) is a double eigenvalue of the GEP (2.1) .
Proof. By the definition of r, Theorem 2.3 ensures that J(λ, g i (λ), y i ) is nonsingular in all points in the open disk. Application of Theorem 2.1 to all those points establishes the result.
As discussed above, the choice of solution to the GEP (2.1) corresponds to the choice of function g i (λ). We note that from Corollary 2.4 it is clear that the existence of the nonlinearization only relies on that the chosen eigenvalue, of the GEP, is simple. Similarly from Corollary 2.5 it is clear that the convergence radius is dependent only on the specific function g i in consideration. Hence, the existence, and the convergence radius, of the NEP (1.3) only depends on the behavior of g i and not the complete eigenstructure of the GEP.

Nonlinearizations leading to quadratic eigenvalue problems.
We first illustrate the theory in the previous section with an implicitly defined function which can be derived explicitly. Consider the two-parameter eigenvalue problem for general matrices A 1 , A 2 and A 3 . The second row in (2.6b) implies that the elements in the vector y T = y 1 y 2 are related by y 2 = λy 1 . The first row in (2.6b) becomes λ 2 y 1 − µy 1 = 0. Hence, since y 1 = 0, we have µ = λ 2 and (2.6a) becomes This problem is commonly known as the quadratic eigenvalue problem, which has been extensively studied in the literature [41]. The example shows that the twoparameter eigenvalue problem (2.6) can be nonlinearized to a quadratic eigenvalue problem. Moreover, the determinant operator equation (1.7a) leads to the equation which is a particular companion linearization of (2.7). (It is in fact a symmetry preserving linearization [41,Section 3.4].) Many of the linearizations of polynomial eigenvalue problems given in [25] can be obtained in a similar fashion. Since, the second equation (1.1b) can be expressed as det(B(λ, µ)) = 0, which is a bi-variate polynomial, this example is consistent with the bivariate viewpoint of companion linearizations in [29]. Some higher-degree polynomials can be constructed analogously to above, e.g., the polynomial eigenvalue problem A 1 + λA 2 + λ m A 3 . However, the general higher-degree polynomial eigenvalue problem does not seem to fit into the class of two-parameter eigenvalue problems.

Nonlinearization leading to algebraic functions.
The previous example can be modified in a way that it leads to algebraic functions, which is also the generic situation. Nontrivial solutions to (1.1b) satisfy det(B(λ, µ)) = 0, which is a bivariate polynomial. Therefore, the functions g i (λ) are roots of a polynomial, where the coefficients are polynomials in λ, i.e., g i are algebraic functions. The generic situation can be seen from the case where m = 2: We obtain that µ is the root of a polynomial, where the coefficients depend on λ, i.e., The explicit solutions to this quadratic equation are given by We see by insertion of µ = g ± into (2.8a) that the nonlinearization of (2.8) is a NEP with an algebraic nonlinearity. The function g + is illustrated in Figure 2.1. Several general conclusions can be made from this example. Note that the variables a, b, c, d, e, f can be used for fitting of any function p(λ) where p is a polynomial of degree two. Therefore, we can now reverse the nonlinearization, and for the trivial case a = d = 0 we directly obtain the following characterization.
then (λ, x, µ, y) satisfies the two-parameter eigenvalue problem equation (2.8) with µ := p(λ) and A further consequence of the lemma is that problems of the type (2.9) can be linearized to a standard GEP using the determinant operators (1.7). More precisely, the combination of Lemma 2.6 and (1.7) shows that (2.9) can be solved by computing solutions to The fact that algebraic NEPs can be linearized was already pointed out in the conference presentation [35], for a specific case using techniques not involving two-parameter eigenvalue problems. Also note that the functions g i (λ) have branch-point singularities. This is the generic situation and we can therefore never expect that the nonlinearizations are entire functions in general. The singularities restrict the performance of many methods, as we will see in the simulations. The implications of singularities in practice is well-known in quantum chemistry, where parameterized eigenvalue problems is a fundamental tool and the singularities are referred to as intruder states [10,Chapter 14]. In that context, methods for computing the closest singularity (which limits the performance of the method) are given in [19,22].
3.1. Derivative based algorithms. Many NEP-algorithms are based on derivatives of M . We will now illustrate how to efficiently and reliably access the derivatives of the NEP stemming from a nonlinearization of a two-parameter eigenvalue problem. As a representative first situation we consider the augmented Newton method, see [34,43]. It can be derived by an elimination of the correction equation in Newtons method, and leads to separate eigenvalue and eigenvector update formulas expressed as In an implementation, one takes advantage of the fact that the same linear system appears twice, and only needs to be computed once. The iteration has appeared in many variations with different names, e.g., inverse iteration [37] and Newtons method [42].
In order to apply (3.1) we clearly need the derivative of M defined in (1.3), which can be obtained directly if we can compute the derivative of the implicitly defined function g i . Note that the functions g i (λ) (as well as the auxiliary vector y i (λ)) can be evaluated by solving the GEP (2.1), and normalizing according to c T y i = 1. Since the functions are analytic in general, their respective derivatives exist. They can be computed according to the following result, which gives a recursion that can compute the kth derivative by solving k linear systems of dimension (m + 1) × (m + 1). The adaption of the theorem and (3.1) into an algorithm results in Algorithm 1.
Theorem 3.1 (Explicit recursive form for derivatives). Let λ ∈ C be given and assume that . Let g i and y i be the functions defined in Lemma 2.1, then the kth derivative, k = 1, 2, . . . , of g i and y i are given by Proof. We again consider the analytic function f given by (2.3). By Lemma 2.1 we know that g i and y i are analytic around λ, and that f (λ, g i (λ), y i (λ)) = 0 in a neighborhood of λ. Taking the kth implicit derivative with respect to λ gives The two first terms are found directly as The third term can be calculated, by using Leibniz derivation rule for products, to be We emphasize the recursion: All derivatives up to order k−1 can be considered known since these do not depend on the higher derivatives. Collecting the known terms in the right-hand-side gives the result. Specifically, the closed form of g i (λ) means that the derivative of the NEP (1.3) can be written in closed form, as For methods only requiring the first derivative of M (λ), the above expression can be used instead of (3.2). However, that requires the computations of the left eigenvector of the GEP. We will need the expression for theoretical purposes in Section 4.
The family of methods in [20,6,28] (flavors of the infinite Arnoldi method) also requires derivative information. These methods require computation of quantities such as where x 1 , . . . , x p are given vectors. The computation requires higher derivatives of g i . However, σ is unchanged throughout the iteration and therefore the matrix in the linear system for derivative computation (3.2) is unchanged. Hence, all needed derivatives can be computed by solving an additional linear system. If m n, this will in general not be computationally demanding. We also note that these fixed-shift methods choose a branch g i in the initial solution of the GEP (2.1), and then stay on that branch. Hence, convergence properties will depend on the convergence radius of that function g i , as mentioned in the end of Section 2.1. Compute Projection methods. Many NEP-algorithms require the computation of a projected problem where V, W ∈ C n×p are orthogonal matrices. The problem (3.3) is again a NEP, but of smaller size. This can be viewed as a Petrov-Galerkin projection of the spaces spanned by the columns of V and W . The projection is sometimes called subspace acceleration (or the nonlinear Rayleigh-Ritz procedure), since it is often used to improve properties of a more basic algorithm, e.g., the nonlinear Arnoldi method [45], Jacobi-Davidson methods [7,4], block preconditioned harmonic projection methods [47], the infinite Lanczos method [27], and many more. In order to give access to these methods, we need to provide a way to solve (3.3) for our nonlinearized problem. Fortunately, the projected problem stemming from the nonlinearized two-parameter eigenvalue problem, i.e., has a structure which suggests straightforward methods for the projected problem. This is because the projected NEP has the same structure as the nonlinearized twoparameter eigenvalue problem, and can therefore be lifted back to a two-parameter eigenvalue problem, but now of much smaller size. We can then use general methods for two-parameter eigenvalue problems. This is directly observed from the fact that (3.4) is the nonlinearization of a two-parameter eigenvalue problem with projected A-matrices. It is made more precise in the following result. Corollary 3.3 (Projected nonlinearized problem). Suppose the quadruplet (λ, z, µ, y) ∈ C × C p × ×C × C m is such that c T y = 1 and suppose J(λ, µ, y) with J as defined in (2.2) is nonsingular. Then, (λ, z, µ, y) is a solution to the the twoparameter eigenvalue problem if and only if (λ, z) is a solution to (3.4) for one pair of functions (g(λ), y(λ)) = (µ, y) which satisfies (1.2).
Proof. This follows directly from the application of Theorem 2.2 on the projected problem (3.5) and the NEP (3.4).
If the projection space is small p n, and m n, we may even solve the twoparameter eigenvalue problem using the operator determinant eigenvalue equations (1.7) or [15, Algorithm 2.3].
The situation p = 1 implies that the projected problem is a scalar problem, and reduces to the so-called Rayleigh functional. There are several methods based on the Rayleigh functional, e.g., residual inverse iteration [30], and variational principle based approaches such as [39] and references therein. The fact that the projected problem is scalar and linear allows us to eliminate µ, and we find that λ is a solution to the generalized eigenvalue problem. The following corollary specifies the formulas more precisely, and the adaption of the result into the residual inverse iteration is given in Algorithm 2.
Corollary 3.4. The solution to the projected NEP (3.4) with p = 1 is given by λ, µ ∈ C and y ∈ C m , where (λ, y) is a solution to the GEP and µ is given by Proof. This is derived from a special case of Corollary 3.3 where p = 1. The relation (3.5a) with W = w and V = v can be solved for µ resulting in the relation (3.7). By inserting this relation into (3.5b) we obtain the GEP (3.6).
Compute correction u k+1 = x k − M (σ) −1 z using the factorization computed in Step 1 6 Normalize x k+1 = u k+1 / u k+1 end 4. Conditioning and accuracy. In order to characterize when the elimination procedure works well, we now analyze how the technique behaves subject to perturbations. As a consequence of this we can directly conclude how backward stable computation of g influences the accuracy (Section 4.2). 1

Conditioning as a nonlinear eigenvalue problem.
Standard results for the condition number of NEPs can be used to analyze perturbations with respect to the A-matrices. More precisely, if we define where α j are scalars for j = 1, 2, 3, and ∆λ is such that 0 = (A 1 + ∆A 1 + (λ + ∆λ)(A 2 + ∆A 2 ) + g(λ + ∆λ)(A 3 + ∆A 3 ))(x + ∆x), (4.1) then we know (see, e.g., [1]) that where v, x are the corresponding left and right eigenvectors. In the following we will establish how this formula is modified when we also consider perturbations in the Bmatrices. Note that this implies that the function g is also perturbed and we cannot directly use the standard result. We therefore define the condition number κ(λ) := lim sup ε→0 |∆λ| ε : ∆A j ≤ εα j , j = 1, 2, 3 and ∆B j ≤ εβ j , j = 1, 2, 3 , where β j are scalars for j = 1, 2, 3, and ∆λ fulfills (4.1) but with a perturbed g, i.e., µ + ∆µ = g(λ + ∆λ), such that The definitions can be used both for absolute and relative condition numbers by setting α j = β j = 1 or α j = A j , β j = B j for j = 1, 2, 3 respectively.
As an intermediate step we first consider the perturbation of µ subject to perturbations in the B-matrices and fixed perturbations in λ by analyzing κ g (λ) := lim sup ε→0 |∆µ| ε : |∆λ| ≤ εγ and ∆B j ≤ εβ j , j = 1, 2, 3 , where γ is a scalar, and ∆µ satisfies (4.3) for a given λ . The following result shows that κ g can be expressed as a sum of perturbations associated with the B-matrices and perturbations associated with λ. Lemma 4.1. Let λ ∈ C be given and suppose µ = g(λ) is a simple eigenvalue of the GEP (2.1) with w and y being corresponding left and right eigenvectors. Then, Proof. Since µ is a simple eigenvalue of the GEP (2.1), the eigenvalue and eigenvector are analytic, and therefore ∆y = O(ε) when all the perturbations are O(ε). By collecting all the higher order terms the perturbed GEP (4.3a) can thus be written as Multiplying with w H from the left, solving for ∆µ, and dividing with ε gives that An upper bound is thus found as It remains to show that the bound can be attained. This follows from considerinĝ B = wy H / w y , and inserting Using the intermediate result we can now show that the condition number κ(λ) is the sum of the standard condition number of NEPs and a term representing perturbations in g generated by perturbations in the B-matrices, i.e., κ g,B (λ).
Proof. Recall the assumptions that the NEP (1.3), i.e., M , is analytic, that λ is a simple eigenvalue of the NEP, and that µ is a simple eigenvalue of the GEP (2.1). Hence, the eigenvalues and eigenvectors are analytic, and therefore ∆x = O(ε) when all the perturbations are O(ε). By using that g(λ + ∆λ) = g(λ) + ∆µ and collecting all the higher order terms, the perturbed NEP (4.1) can therefore be written as Multiplying with v H from the left, expanding ∆µ according to (4.4), solving for ∆λ, and dividing with ε, gives that where θ g,B (λ) := −(w H ∆B 1 y + λw H ∆B 2 y + g(λ)w H ∆B 3 y)/(w H B 3 y). Based on Remark 3.2 we observe that the denominator of (4.5) is equal to εv H M (λ)x. An upper bound is is therefore It remains to show that the bound can be attained. Similar to the proof of Lemma 4.1, this follows from consideringB = wy H / w y andÂ = vx H / v x , and inserting into (4.5).

Backward stable computation of g.
The nonlinearization is based on solving a GEP to evaluate the function g(λ). We analyze the effects on the accuracy in the computed λ when the GEP is solved numerically with a backward stable method. The analysis assumes the two triplets (λ, x, v) ∈ C×C n ×C n and (µ, y, w) ∈ C×C m × C m are such that λ is a simple eigenvalue of the NEP (1.3), µ is a simple eigenvalue of the GEP (2.1), and v, w and x, y are corresponding left and right eigenvectors respectively.
From the assumption that the GEP (2.1) is solved by a backward stable method we know that µ can be characterized as the exact solution to a nearby problem. More precisely, µ solves where C 1 = −(B 1 + λB 2 ), C 2 = B 3 , with perturbations, ∆C 1 and ∆C 2 , that are proportional to the errors in our GEP solver. Specifically, there are non-negative β 1 , β 3 ∈ R such that ∆C 1 = β 1 ε and ∆C 2 = β 3 ε. Thus, the perturbation in g is precisely captured by κ g,B (λ) from Lemma 4.1, with β 2 = 0 and β 1 and β 3 given above, i.e., by the specific choice of GEP solver. Hence, by application of Theorem 4.2 with α j = 0 for j = 1, 2, 3 we can conclude that the forward error in λ, induced by the inexact but backward stable computation of g(λ) is bounded by Without loss of generality we now assume that x = v = y = w = 1.
The upper bound (4.6) is related to the condition number for multiparameter eigenvalue problems as follows. As mentioned in the introduction, the condition number for the two-parameter eigenvalue problem can be directly expressed with the inverse of C 0 defined in (1.8). First note that our assumptions imply that C 0 is invertible. Proof. By using the expression for M (λ) from Remark 3.2 we thus have Since the eigenvalues λ and µ are simple we know that w H B 3 y = 0, and that v H M (λ)x = 0. Hence, det(C 0 ) = 0.
From (4.7) we can conclude that the bound (4.6) on |∆λ| can be written as Moreover, for a nonsingular C 0 it is shown in [17,Theorem 6] that the condition number of the two-parameter eigenvalue is where the θ-norm, i.e., · θ , is an induced norm defined in [17,Equation (5)]. In our case we can explicitly bound the condition number by using bounds following directly from the definition of the θ-norm: The parameter θ 2 is the second component of the θ-vector used in the definition of the θ-norm. Hence, the bound in (4.8) can be further bounded by The typical choices of θ corresponding to the absolute respectively relative condition number of the two-parameter eigenvalue problem are |θ 2 | = 1 + |λ| + |g(λ)| and |θ 2 | = B 1 + |λ| B 2 + |g(λ)| B 3 . From the bounds in (4.9) we therefore conclude: The error generated by a backward stable method is benign for well conditioned twoparameter eigenvalue problems.

Simulations.
5.1. Random example. We generate an example similar to the example in [15], but with m n. More precisely, we let where n = 5000 and m = 20. The matrices V Ai , U Ai , V Bi , U Bi have randomly normal distributed elements and F i , G i are diagonal matrices with randomly normal distributed diagonal elements. The scalars α i and β i were selected such that the eigenvalues closest to the origin were of order of magnitude one in modulus (α 1 = β 1 = 1, α 2 = β 2 = 1/500, α 3 = β 3 = 1/50). The simulations were carried out using the Julia language    Note that this problem is of such size that the naive approach with operator determinants (1.7) is not feasible, since we cannot even store them in memory on the computers we use for the simulations 3 .
We illustrate our algorithms and compare with several other single-vector stateof-the-art algorithms in [32]. As starting values we use λ 0 = 0.15 + 0.1i and µ 0 = 35 + 0.25i, and a starting vector with an elementwise absolute error (from a nearby solution) less than 0.05. The iteration history of Algorithm 1 is given in Figure 5.3. We observe an asymptotic fast convergence for Algorithm 1, which is expected since the solution point is analytic and simple. The error is measured at Step 3 in Algorithm 1 which implies that by construction, the error in the B-equation is (numerically) zero. This is a property of the elimination in our approach. We compare (with the same starting values) with the inverse iteration Newton approach proposed in [32]. Note that this method is designed for more general problems, and not specifically our situation where m n and also multiparameter nonlinear problems. Since [32,InvIter] requires several linear solves per iteration, our algorithm is faster in this case. The comparison between the two algorithms as a function of iteration is inconclusive, as can be seen in Figure 5.3a. The convergence of our adaption of residual inverse iteration (Algorithm 2) initiated in the same way (except the starting vector is chosen as a vector of ones) is illustrated in Figure 5.4. We clearly see the expected linear convergence, since it is equivalent to residual inverse iteration for NEPs and the convergence theory in [30, is directly applicable. We compare with a proposed generalization of residual inverse iteration [32,InvIter], again noting that it has a much wider applicability domain than our approach. In this case, our method has a smaller convergence factor, intuitively motivated by the fact that we solve the B-equation exactly.
The problem can also be solved with the tensor infinite Arnoldi method [6]. More specifically, we use the implementation of the method available in the Julia package NEP-PACK [18] (version 0.2.7). By directly using Theorem 3.1 we can compute the 60 first derivatives. The convergence of the first ten eigenvalues are visualized in Figure 5.5, for two branches. The solutions are visualized in Figure 5.2.

Domain decomposition example.
We consider a PDE-eigenvalue problem, which we separate into two domains in a way that it leads to a two-parameter eigenvalue problem. Although domain decomposition (and coupling via boundary conditions) is not new for standard eigenvalue problems, the fact that this type of  domain decomposition can be phrased as a two-parameter eigenvalue problem has, to our knowledge, not previously been observed. Although the technique seems applicable in several physical dimensions, we consider a one-dimensional problem for reproducibility.

Consider the Helmholtz eigenvalue problem defined in the domain
with a wavenumber κ which is discontinuous in one part of the domain and smooth in another, as in Figure 5.6. We take a point x 1 such that κ is smooth for x > x 1 , assume that the solution is non-zero in the interface point x = x 1 , and define This means we have two separate PDEs for the two domains: These are standard linear PDEs (with robin boundary conditions) and the uniqueness of these PDEs implies an equivalence with the original PDE (5.1). See [23] and references therein for domain decomposition methods for PDE eigenvalue problems. The wavenumber is given as in Figure 5.6, i.e., it is discontinuous at several points in [x 0 , x 1 ] and with a high frequency decaying sine-curve in [x 1 , x 2 ], representing a inhomogeneous periodic medium. We invoke different discretizations in the two domains, for the following reasons. Since κ is discontinuous in [x 0 , x 1 ] spectral discretization in that domain will not be considerably faster than a finite difference approximation. We therefore use a uniform second order finite difference for (5.2) to obtain sparse matrices and one sided second order finite different scheme for the boundary condition. A spectral discretization is used for [x 1 , x 2 ] where the wavenumber is smooth. Since µ appears linearly in the boundary condition, the discretization leads to a two-parameter eigenvalue problem of the type (1.1). In our setting A 1 , A 2 , A 3 are large and sparse, and B 1 , B 2 , B 3 are full matrices of smaller size. We use the discretization parameters such that n = 10 6 and m = 30, and x 0 = 0, x 1 = 4 and x 2 = 5. In order to make the measurement of error easier, we use left diagonal scaling of the problem such that the diagonal elements of A(1.0, 1.0) and B(1.0, 1.0) are equal to one.
The eigenvalues and some corresponding eigenfunctions are plotted in Figure 5.7 and Figure 5.8. The nonlinear function g 1 of this problem is given in Figure 5.9. Clearly the function has singularities for some real λ-values. The convergence of Algorithm 1 and Algorithm 2 are again compared to [32] in Figure 5.11. We again conclude that both our approaches are competitive, although not always faster in terms of iterations, but our approach is generally faster in terms of CPU-time. The algorithms are initiated with approximate rounded eigenvectors and eigenvalues close to the eigenvalue λ ≈ 18. We note that our methods do not require a starting value for µ (in contrast to [32]) which is an attractive feature from an application point of view, since the value µ = u (x 1 )/u(x 1 ) is artificially introduced parameter and may not be easy to estimate.   We apply the tensor infinite Arnoldi method also for this problem. Since this family of methods is based on a power series expansion, one can only expect to be able to compute eigenvalues on the same side of the singularities as the shift. We therefore run the algorithm several times for different shifts, and select the shifts far away from the singularities, as described in Figure 5 runs are illustrated in Figure 5.10. Note that the convergence corresponding to one eigenvalue for the shift σ = 12 stagnates. This is because the eigenvalue is close to the singularity, and therefore difficult to compute, as can be seen in Figure 5.7. 6. Conclusions and outlook. We have presented a general framework to approach two-parameter eigenvalue problems, by nonlinearization to NEPs. Several steps in this technique seem to be generalizable (but beyond the scope of the paper), e.g., to general multiparameter eigenvalue problems essentially by successive application of the elimination. One such elimination leads to a nonlinear two-parameter eigenvalue problem as considered, e.g., in [32].
Our paper uses the assumption m n and that A 1 , A 2 and A 3 are large and sparse. We made this assumption mostly for convenience, since this allows us to apply a general purpose method for the parameterized eigenvalue problem (2.1). If, on the other hand, we wish to solve two-parameter eigenvalue problems where these assumptions are not satisfied, the ideas may still be useful. The GEP (2.1) may for instance be approached with structured algorithms (exploiting sparsity, low-rank  properties and symmetry), or iterative solution methods for the GEP, where early termination is coupled with the NEP-solver.
The generated nonlinear functions g i are algebraic functions, and can therefore contain singularities (e.g. branch point singularities as characterized in Section 2). These can be problematic in the numerical method, and therefore it would useful with transformations that remove singularities. Linearization which do not lead to singularities have been established for rational eigenvalue problems [38].
The problem in Section 5.2 is such that we obtain one large and sparse parameterized matrix A(λ, µ) which is coupled with a small and dense system. The setting matches the assumptions of the paper and is a representative example of cases where the behavior is different in the two physical domains. The example may be generalizable, to other coupled physical systems where the modeling in one domain leads to a much smaller matrix, e.g., using domain decomposition with more physical dimensions. Note however that the presented methods seem mostly computationally attractive if the discretization of one domain is much smaller. If we apply the same technique to domains of equal size, other generic two-parameter eigenvalue methods (such as those in [32]) may be more effective.