Spectral methods for multiscale stochastic differential equations

This paper presents a new method for the solution of multiscale stochastic differential equations at the diffusive time scale. In contrast to averaging-based methods, e.g., the heterogeneous multiscale method (HMM) or the equation-free method, which rely on Monte Carlo simulations, in this paper we introduce a new numerical methodology that is based on a spectral method. In particular, we use an expansion in Hermite functions to approximate the solution of an appropriate Poisson equation, which is used in order to calculate the coefficients of the homogenized equation. Spectral convergence is proved under suitable assumptions. Numerical experiments corroborate the theory and illustrate the performance of the method. A comparison with the HMM and an application to singularly perturbed stochastic PDEs are also presented.

1. Introduction. Multiscale stochastic systems arise frequently in applications. Examples include atmosphere/ocean science [45] and materials science [19]. For systems with a clear scale separation it is possible, in principle, to obtain a closed (averaged or homogenized) equation for the slow variables [56]. The calculation of the drift and diffusion coefficients that appear in this effective (coarse-grained) equation requires appropriate averaging over the fast scales. Several numerical methods for multiscale stochastic systems that are based on scale separation and on the existence of a coarse-grained equation for the slow variables have been proposed in the literature. Examples include the heterogeneous multiscale method (HMM) [62,21,1] and the equation-free approach [36]. These techniques are based on evolving the coarse-grained dynamics while calculating the drift and diffusion coefficients "on the fly" using short simulation bursts of the fast dynamics.
A prototype fast/slow system of stochastic differential equations (SDEs) for which the aforementioned techniques can be applied is the system (1), where X^ε_t ∈ R^m, Y^ε_t ∈ R^n, ε ≪ 1 is the parameter measuring scale separation, σ_x ∈ R^{m×d_1}, σ_y ∈ R^{n×d_2} are constant matrices, and W_x, W_y are independent d_1- and d_2-dimensional Brownian motions, respectively. For fast/slow systems of this form, a direct numerical approximation of the full dynamics would be prohibitively expensive, because resolving the fine scales would require a time step δt that scales as O(ε^2). Under appropriate assumptions on the coefficients and on the ergodic properties of the fast process Y^ε_t, it is well known that the slow process converges, in the limit as ε tends to 0, to a homogenized equation that is independent of the fast process and of ε [56, Ch. 11]: (2) dX_t = F(X_t) dt + A(X_t) dW_t.
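To make the cost of direct simulation concrete, the following toy fast/slow pair (our own illustrative example, not necessarily the system (1); the coefficients are assumptions) shows how resolving a fast Ornstein-Uhlenbeck variable forces a time step of order ε^2:

```python
import numpy as np

# Illustrative toy fast/slow pair (our own example, not necessarily system (1)):
#   dX = (1/eps) Y dt,                          (slow)
#   dY = -(1/eps^2) Y dt + (sqrt(2)/eps) dW.    (fast OU process)
# Resolving the fast scale forces dt = O(eps^2); the homogenized limit here is
# dX = sqrt(2) dW, i.e. Brownian motion with diffusion coefficient 2.
rng = np.random.default_rng(1)
eps, T = 0.05, 1.0
dt = 0.1 * eps**2               # time step must resolve the O(eps^2) fast scale
n_steps = round(T / dt)         # 4000 explicit steps for a single path to T = 1
X, Y = 0.0, 0.0
for _ in range(n_steps):
    X += (Y / eps) * dt
    Y += -(Y / eps**2) * dt + (np.sqrt(2.0) / eps) * np.sqrt(dt) * rng.normal()
```

Even this one-dimensional example needs thousands of micro-steps per unit of macroscopic time, which is the cost the homogenization-based approach avoids.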
The drift and diffusion coefficients in (2) can be calculated by solving a Poisson equation (3) involving the generator of the fast process, L_y = h(x, y) · ∇_y + σ_y σ_y^T : ∇_y ∇_y, together with appropriate boundary conditions, and calculating averages with respect to the invariant measure µ^x(dy) of Y^ε_t; these averages are given in (4). Once the drift and diffusion coefficients have been calculated, it becomes computationally advantageous to solve the homogenized equation, in particular since we are usually interested in the evolution of observables of the slow process alone. The main computational task, therefore, is to calculate the drift and diffusion coefficients that appear in the homogenized equation (2). When the state space of the fast process is high dimensional, the numerical solution of the Poisson equation and the calculation of the integrals in (3) using deterministic methods become prohibitively expensive, and Monte Carlo-based approaches have to be employed. In recent years different methodologies that are based on the strategy outlined above have been proposed for the numerical solution of the fast/slow system (1), for example the heterogeneous multiscale method (HMM) [62,21,1] and the equation-free approach [36]. In particular, the PDE-based formulas (4) are replaced by Green-Kubo-type formulas [21, Sec. 1] that involve time averages and numerically calculated autocorrelation functions. The equivalence between the homogenization and Green-Kubo formalisms has been shown for a quite general class of fast/slow systems of SDEs [54]; see also [39,41]. While offering several advantages, the time and ensemble averages on which these methods are based imply that accurate solutions are computationally very expensive to obtain. Based on the analysis of [21], one deduces that the computational cost needed to obtain an error of order 2^{−p} scales as O(2^{p(2+1/l)}), where l is the weak order of accuracy of the micro-solver used.
When the dimension of the state space of the fast process is relatively low, numerical approaches that are based on the accurate and efficient numerical solution of the Poisson equation (3) using "deterministic" techniques become preferable.
Such an approach was taken in [9] for the study of the diffusion approximation of a kinetic model for swarming [12]. The computational methodology that was introduced and analyzed there is based on the numerical calculation of the eigenvalues and eigenfunctions of an appropriate Schrödinger operator using a high-order finite element method. The authors showed, rigorously and by means of numerical experiments, that for sufficiently smooth interaction potentials the proposed numerical scheme performs extremely well; in particular, the numerical calculation of the first few eigenvalues and eigenfunctions is sufficient for a very accurate calculation of the drift and diffusion coefficients.
Methods that are based on the calculation of the eigenvalues and eigenfunctions of the transfer (Koopman) operator have been introduced in [24,25]. In addition, diffusion maps have also been applied to multiscale stochastic systems [14,13]. Techniques that are based on the transfer operator and diffusion maps are related to our approach since, unlike the HMM and the equation-free approach, they do not require the solution of the fast dynamics. On the other hand, it should be noted that these techniques are, in principle, more powerful than the methodology developed in this paper, since they enable us to identify the coarse-grained variables.
In this paper we develop further the methodology introduced in [9] and apply it to the numerical solution of fast/slow systems of SDEs, including singularly perturbed stochastic partial differential equations (SPDEs) in bounded domains. Thus, we complement the work presented in [2], in which a hybrid HMM/spectral method for the numerical solution of singularly perturbed SPDEs with quadratic nonlinearities [7] at the diffusive time scale was developed. The main difference between the methodology presented in [9] and the approach we take in this paper is that, rather than obtaining the orthonormal basis by solving the eigenvalue problem for an appropriate Schrödinger operator, we fix the orthonormal basis (Hermite functions) and expand the solution of the Poisson equation (3) (after the unitary transformation that maps it to an equation for a Schrödinger operator) in this basis. We show rigorously and by means of numerical experiments that our proposed methodology achieves spectral convergence for a wide class of fast processes in (1). Consequently, our method outperforms Monte Carlo-based methodologies such as the HMM and the equation-free method, at least for problems with low-dimensional fast processes. We discuss how our method can be modified so that it becomes efficient when the fast process has a high-dimensional state space in the conclusions section, Section 7.
In this paper we will consider fast/slow systems of SDEs for which the fast process is reversible, i.e. it has a gradient structure [55, Sec. 4.8]; this is the system (5), where X^ε_t ∈ R^m, Y^ε_t ∈ R^n, α(·, ·) ∈ R^{m×p}, W_x and W_y are standard p- and n-dimensional Brownian motions, and V(·) is a smooth confining potential. Remark 1. Our algorithm applies to the more general situation in which the fast dynamics depends on the slow one. The analysis presented in this paper also generalises, with minor changes, to such a fast process; the effective drift (4a) then acquires an additional term, which involves the solution of a Poisson problem that has already been computed.
SDEs of this form appear in several applications, e.g. in molecular dynamics [18,40]. Furthermore, several interesting semilinear singularly perturbed SPDEs can be written in this form; see Section 6. It is well known [55, Sec. 4.9] that the generator of a reversible SDE is unitarily equivalent to an appropriate Schrödinger operator. Our approach is first to solve the transformed Poisson equation for the Schrödinger operator via a spectral method using Hermite functions, and then to use this solution to calculate the integrals in (4). For smooth potentials that increase sufficiently fast at infinity our method has spectral accuracy, i.e. the error decreases faster than any negative power of the number of floating point operations performed. This, in turn, via a comparison argument for SDEs, implies that we can approximate very accurately the evolution of observables of the slow variable X^ε_t in (5) by solving an approximate homogenized equation in which the drift and diffusion coefficients are calculated using our spectral method. For relatively low-dimensional fast processes, this leads to a much more accurate and computationally efficient numerical method than any Monte Carlo-based methodology. We remark that our proposed numerical methodology becomes (analytically) exact when the fast process is, to leading order, an Ornstein-Uhlenbeck process, since in this case, for a suitable choice of the mean and the covariance matrix, the Hermite functions are the eigenfunctions of the corresponding Schrödinger operator.
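For the reader's convenience, the standard computation behind this unitary equivalence can be recorded as follows (a sketch assuming the fast dynamics dY_t = −∇V(Y_t) dt + √2 dW_t, consistent with an invariant density proportional to e^{−V}; cf. [55, Sec. 4.9]):

```latex
% Generator of the reversible dynamics dY_t = -\nabla V(Y_t)\,dt + \sqrt{2}\,dW_t:
%   \mathcal{L} u = \Delta u - \nabla V \cdot \nabla u
%     on L^2(\mathbb{R}^n, e^{-V}).
% Conjugating by the unitary multiplication map u \mapsto e^{V/2} u gives a
% Schroedinger operator:
\mathcal{H} := -\,e^{-V/2}\,\mathcal{L}\,\bigl(e^{V/2}\,\cdot\bigr)
             = -\Delta + W,
\qquad
W(y) = \tfrac{1}{4}\,\lvert \nabla V(y) \rvert^{2} - \tfrac{1}{2}\,\Delta V(y),
% so that, with \psi = e^{-V/2}\varphi, the Poisson equation
% -\mathcal{L}\varphi = f becomes \mathcal{H}\psi = e^{-V/2} f .
```

The potential W here is the one that appears in the assumptions of Section 2 and in the operator (23).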
The rest of the paper is organized as follows. In Section 2, we summarize the results from homogenization theory for the fast/slow system (5) that we will need in this work. In Section 3 we present our numerical method in an algorithmic manner. In Section 4, we summarize the main theoretical results of this paper; in particular we show that our method, under appropriate assumptions on the coefficients of the fast/slow system, is spectrally accurate. The proofs of our main results are given in Section 5. In Section 6 we present details on the implementation of our numerical method, discuss the computational efficiency and present several numerical examples, including an example of the numerical solution of a singularly perturbed SPDE; for this example, we also present a brief qualitative comparison of our method with the HMM method. Section 7 is reserved for conclusions and discussion of further work. Finally in the appendices we present some results related to approximation theory in weighted Sobolev spaces that are needed in the proof of the main convergence theorem.

2. Diffusion Approximation and Homogenization.
In this section, we summarize some of our working hypotheses and the results from the theory of homogenization used to derive the effective SDE for the system (5). Throughout this paper, the notation |·| denotes the Euclidean norm when applied to vectors and the 2-norm when applied to matrices. In addition, we denote the components of a vector v ∈ R^d by v_1, v_2, …, v_d. We start by assuming that V(·) is a smooth confining potential [55, Definition 4.2]; see (8). These hypotheses guarantee that the fast process has a well-defined solution for all positive times, with a unique invariant measure whose density is given by (1/Z) e^{−V(·)}, where Z is the normalization constant. Without loss of generality, we may assume that Z = 1. To these assumptions we add (9), which guarantees that the law of Y^ε_t converges to its invariant distribution e^{−V} exponentially fast (e.g. in relative entropy); see [47]. We assume furthermore that the drift coefficient in the slow equation of system (5) satisfies (10): f(x, y) ∈ (C^∞(R^m × R^n))^m, ∫_{R^n} f(x, y) e^{−V(y)} dy = 0, and |f(x, y)| ≤ K(1 + |x|)(1 + |y|^β) for all x ∈ R^m and all y ∈ R^n, with β a positive integer and K a positive constant. Under Assumptions (8) and (9), we can use Proposition 17 and the Lax-Milgram theorem to show that, for all x ∈ R^m, there exists a unique φ(x, ·) ∈ (H^1(R^n, e^{−V}))^m, where H^1(R^n, e^{−V}) is the weighted Sobolev space defined in Definition 15, satisfying (11). By elliptic regularity theory, the uniform ellipticity of L implies that φ is smooth in y, and by [51, Theorem 1], φ grows at most polynomially in y.
We also make an assumption, (13), on the Lipschitz continuity with respect to x of the coefficients, with a Lipschitz constant C(y) bounded from above by a polynomial, as well as growth assumptions (14) on the coefficients, for positive integers m_1, m_2 and a positive constant K. It follows from this that φ(·, y) belongs to (C^2(R^m))^m for all values of y. This can be shown by using the Feynman-Kac representation (15) of the solution of (11) that was studied in [51], where z^y_t denotes the associated fast dynamics started at y. Using the Feynman-Kac formula (15), one can show [51, p. 1073] that there exist L, q > 0 such that (16): |φ(x, y)| + |∇_y φ(x, y)| ≤ L(1 + |x|)(1 + |y|^q). Using the previous assumptions we can prove the following homogenization/diffusion approximation result [51, Theorem 3].
Theorem 2. Let (8)-(10), (13) and (14) be satisfied. Then, for any T > 0, the family of processes {X^ε_t, 0 ≤ t ≤ T} solving (5) is weakly relatively compact in (C([0, T]))^m. Any accumulation point X_t is a solution of the martingale problem associated to the operator G defined through (17) and (18), where φ(x, y) is the centered solution of the Poisson equation (11). If, moreover, the martingale problem associated to G is well posed, then X^ε_t ⇒ X_t (convergence in law), where X_t is the unique (in law) diffusion process with generator G.
The matrix D(x) is clearly symmetric, and one can show that it is also positive semi-definite; see [56, Section 11.5]. In view of Theorem 2, writing D(x) = A(x)A(x)^T, we obtain the functions F(x) and A(x) that appear in the homogenized SDE (2). Though the choice of A(x) is not unique, all choices lead to the same associated Fokker-Planck and backward Kolmogorov equations, and hence the probability measures on path space are the same.
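Since Section 5.3 later fixes the factorization to the symmetric square root, it may help to see how that particular choice is computed from an eigendecomposition; the following is a minimal numpy sketch (illustrative only, not the paper's implementation):

```python
import numpy as np

def symmetric_sqrt(D):
    """Symmetric positive semi-definite square root A of D, so A @ A.T == D.
    Eigenvalues are clipped at zero to guard against round-off."""
    w, Q = np.linalg.eigh(D)
    return (Q * np.sqrt(np.clip(w, 0.0, None))) @ Q.T

# Example: D has eigenvalues 3 and 1; A is symmetric and A @ A.T recovers D.
D = np.array([[2.0, 1.0], [1.0, 2.0]])
A = symmetric_sqrt(D)
```

Any other factor B with B B^T = D differs from A by an orthogonal transformation on the right, which leaves the law of the diffusion unchanged.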
3. Numerical Method. In this section, we describe our method for the approximation of the effective dynamics; its analysis is postponed to Section 5. We start by introducing the necessary notation. We will denote by L^2(R^n) the space of square integrable functions on R^n. The notation L^2(R^n, ρ), for a probability density ρ, will be used to denote the space of functions f such that √ρ f ∈ L^2(R^n). Weighted Sobolev spaces associated to a probability density are defined in Definition 15, whereas scales of Sobolev spaces associated to an operator are defined in Definition 16.
In addition to these function spaces, we will denote by P_d(R^n) the space of polynomials in n variables of degree less than or equal to d, and by H_α(y; µ, Σ) the Hermite polynomials on R^n defined in Appendix B, see (19), where α ∈ N^n. Here H_{α_k}(·) denotes the one-dimensional Hermite polynomial of degree α_k, Σ ∈ R^{n×n} is a symmetric positive definite matrix, D and Q are diagonal and orthogonal matrices, respectively, such that Σ = QDQ^T, S = QD^{1/2}, and µ ∈ R^n. We recall from Appendix B that these polynomials form a complete orthonormal basis of L^2(R^n, G_{(µ,Σ)}), where G_{(µ,Σ)} denotes the Gaussian density on R^n with mean µ and covariance matrix Σ. Finally, we will use the notation h_α(y; µ, Σ) for the Hermite functions corresponding to the Hermite polynomials (19); see Definition 21.
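As an illustration of this construction, the following Python sketch evaluates orthonormal (probabilists') Hermite polynomials by the three-term recurrence and tensorizes them with the decorrelating map S^{−1}(y − µ); the function names are ours and the normalization is assumed to match (19):

```python
import numpy as np

def hermite_1d(n, x):
    """Orthonormal probabilists' Hermite polynomials H_0, ..., H_n at points x
    (orthonormal for the standard Gaussian weight), via three-term recurrence."""
    x = np.asarray(x, dtype=float)
    H = np.zeros((n + 1,) + x.shape)
    H[0] = 1.0
    if n >= 1:
        H[1] = x
    for k in range(1, n):
        H[k + 1] = (x * H[k] - np.sqrt(k) * H[k - 1]) / np.sqrt(k + 1)
    return H

def hermite_nd(alpha, y, mu, Sigma):
    """Tensor-product Hermite polynomial H_alpha(y; mu, Sigma): one-dimensional
    factors evaluated in the decorrelated variable S^{-1}(y - mu), Sigma = S S^T."""
    d, Q = np.linalg.eigh(Sigma)
    S = Q @ np.diag(np.sqrt(d))          # S = Q D^{1/2}, as in the text
    z = np.linalg.solve(S, np.asarray(y, dtype=float) - np.asarray(mu, dtype=float))
    return np.prod([hermite_1d(a, z[k])[a] for k, a in enumerate(alpha)])
```

For Σ = I and µ = 0 this reduces to the usual product of one-dimensional normalized Hermite polynomials.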
We recall from Section 2 that obtaining the drift and diffusion coefficients F(X) and A(X) of the homogenized equation requires the solution of the Poisson equation (11). To emphasize the fact that x appears as a parameter in (11), we will use the notations φ^x(·) := φ(x, ·) and f^x(·) := f(x, ·). The weak formulation of the Poisson equation (11) is to find φ^x ∈ (H^1(R^n, e^{−V}))^m such that (21) holds for i = 1, . . . , m, with the centering condition (22). We recall that, for this problem to be well posed, the condition M(f^x) = 0 must be satisfied.
We begin by performing the standard unitary transformation that maps the generator of a reversible Markov process to a Schrödinger operator, namely multiplication by e^{−V/2} : L^2(R^n, e^{−V}) → L^2(R^n). Introducing the operator H defined in (23) and ψ^x = e^{−V/2} φ^x, the Poisson equation (11) can be rewritten in terms of (23) as (24). Note that, because the superscript x already indicates that x plays the role of a parameter, we have not appended the subscript y to H. The weak formulation of this mapped problem reads: find ψ^x ∈ H^1(R^n, H) such that (25) holds for i = 1, . . . , m, where H^1(R^n, H) = {u ∈ H^1(R^n) : ∫_{R^n} |W| u^2 dy < ∞}, and such that the centering condition (26) is satisfied. The formulas for the effective drift and diffusion coefficients can then be written as in (27). The advantage of using the unitary transformation is that the solution of the transformed problem and its derivative lie in L^2(R^n), rather than in a weighted space.
To approximate numerically the coefficients of the effective SDE, we choose a finite-dimensional subspace Ŝ_d of H^1(R^n, H), specified below, and consider the finite-dimensional approximation problem (29). Existence and uniqueness (up to a function in the kernel of H, if that space is contained in Ŝ_d) of the solution to the finite-dimensional problem are inherited from the infinite-dimensional problem (25).
For a given basis {e_α}_{|α|≤d} of Ŝ_d, the finite-dimensional approximation of ψ^x can be expanded as ψ^x_d = Σ_{|α|≤d} ψ^x_α e_α, and from the variational formulation (29) we obtain the linear systems (30). We will use the notation A_{αβ} = a(e_α, e_β) for the stiffness matrix. In view of formula (27), we see that we also need an approximation of the gradient of the solution, which we denote by (∇_x ψ^x)_d. This approximation can be obtained by solving (30) with a right-hand side built from the x-derivative of the data. In practice, we solved (30) by using the function solve of the C++ linear algebra library Armadillo. As d → ∞, the lowest eigenvalue of A_{αβ} tends to zero. This could in principle induce numerical instabilities, because the system becomes ill-conditioned, but it did not cause any problems in the numerical experiments we present in Section 6. Then, by substituting the approximations of ψ^x and ∇_x ψ^x in (27), we calculate the approximate drift and diffusion coefficients as in (31). For A_d(x) to be well defined, the symmetric matrix D_d must be positive semi-definite. Using the notations ψ^x_ξ = ψ^x_d · ξ and f^x_ξ = (f^x e^{−V/2}) · ξ, this can be shown for any direction ξ by appealing to the variational formulation (29). Although ψ^x_d and (∇_x ψ^x)_d are not uniquely defined when Ŝ_d contains the kernel of H, the coefficients defined in (31a) and (31b) are unique, because by the centering condition f is orthogonal in L^2(R^n) to the kernel of H. The matrix A_d(x) is not, however, uniquely determined by (31c). Using these coefficients, we obtain the approximate homogenized SDE (32). This equation can now be solved using a standard numerical method, e.g. Euler-Maruyama.
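A one-dimensional sketch of this Galerkin step may be helpful. In the Ornstein-Uhlenbeck case V(y) = y^2/2 + log √(2π) with f(y) = y, the Hermite functions diagonalize H, so the assembled system can be checked against the exact answer ∫ f φ e^{−V} dy = 1 (a Python sketch under our normalizations, not the paper's Armadillo code):

```python
import numpy as np

# One-dimensional sketch of the Galerkin step (30) in the Ornstein-Uhlenbeck
# case V(y) = y^2/2 + log(sqrt(2*pi)) (so e^{-V} is the standard Gaussian) with
# f(y) = y. Then W(y) = y^2/4 - 1/2 and the Hermite functions h_n diagonalize H,
# so the computed value of  int f * phi * e^{-V} dy  should be exactly 1.
d = 10
nodes, wts = np.polynomial.hermite_e.hermegauss(4 * d)  # weight e^{-y^2/2}
wts = wts / np.sqrt(2.0 * np.pi)                        # Gaussian expectations

def hermite_all(n, x):
    """Orthonormal probabilists' Hermite polynomials H_0..H_n on the nodes."""
    H = np.zeros((n + 1, x.size))
    H[0], H[1] = 1.0, x
    for k in range(1, n):
        H[k + 1] = (x * H[k] - np.sqrt(k) * H[k - 1]) / np.sqrt(k + 1)
    return H

Hk = hermite_all(d + 1, nodes)
W = nodes**2 / 4.0 - 0.5
# h_n' = (sqrt(n) h_{n-1} - sqrt(n+1) h_{n+1}) / 2, written without the Gaussian
# half-weight so that all integrals become expectations under N(0, 1):
dH = np.zeros((d + 1, nodes.size))
for n in range(d + 1):
    below = np.sqrt(n) * Hk[n - 1] if n > 0 else 0.0
    dH[n] = (below - np.sqrt(n + 1) * Hk[n + 1]) / 2.0
A = dH @ (dH * wts).T + Hk[: d + 1] @ ((W * wts) * Hk[: d + 1]).T  # stiffness
b = Hk[: d + 1] @ (nodes * wts)               # right-hand side <f e^{-V/2}, h_n>
psi = np.zeros(d + 1)
psi[1:] = np.linalg.solve(A[1:, 1:], b[1:])   # drop the kernel mode h_0
approx = b @ psi                              # = int f * phi * e^{-V} dy
```

In this special case the stiffness matrix is diagonal with entries 0, 1, ..., d, which illustrates both the exactness claimed for the OU case and the vanishing lowest eigenvalue mentioned above.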
Our numerical methodology is based on the expansion of the solution to (24) in Hermite functions; the approximation space Ŝ_d is given in (33). A good choice of the mean µ and covariance Σ is important for the efficiency of the algorithm. In our implementation we choose them according to (34), where λ > 0 is a free parameter independent of the first two moments of e^{−V}. This choice for the mean and covariance guarantees that our method is invariant under rescalings of the problem. An example illustrating why this is desirable is when the mass of the probability density e^{−V} is concentrated far away from the origin: using Hermite functions centered at 0 would provide a very poor approximation in this case, whereas Hermite functions centered at the center of mass of e^{−V} lead to a much better approximation. Note that this is not the only choice that guarantees invariance under rescaling, but it is the most natural one.
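As a concrete illustration, the sketch below computes a mean and a scaled covariance of e^{−V} by grid quadrature in one dimension; taking µ and Σ in this way is our reading of (34) and should be treated as an assumption:

```python
import numpy as np

def scaling_parameters(V, lam=1.0, grid=None):
    """Mean of e^{-V} and lam^2 times its covariance, by grid quadrature (1D).
    This encodes our reading of the choice (34); treat it as an assumption."""
    y = np.linspace(-10.0, 10.0, 4001) if grid is None else grid
    rho = np.exp(-V(y))
    rho = rho / rho.sum()                 # normalized quadrature weights
    mu = (y * rho).sum()
    var = ((y - mu) ** 2 * rho).sum()
    return mu, lam**2 * var

# Gaussian centered at 3 with unit variance: recovers mu = 3 and, for lam = 1,
# a unit scaling covariance, so the Hermite basis is centered on the mass of e^{-V}.
mu, sig2 = scaling_parameters(lambda y: (y - 3.0) ** 2 / 2.0)
```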
When the potential is quadratic, i.e. when e^{−V} is a Gaussian density, the eigenfunctions of the operator H (defined in (23)) are precisely the Hermite functions h_α(y; µ, Σ). Hence choosing these as a basis, i.e. e_α = h_α(y; µ, Σ), leads to a diagonal matrix A in the linear systems (30), because a(e_α, e_β) = λ_α δ_{αβ}, with λ_α defined in (87). This choice corresponds to λ = 1 in (34). The optimal choice of the parameters µ and Σ for a general density e^{−V} and function f has been only partially studied. In particular, it was shown in [30] that O(p^2) Hermite polynomials are necessary to resolve p wavelengths of a sine function when the scaling parameter is kept fixed. This result carries over to the case of normalized Hermite functions, where the associated covariance matrix plays the role of the scaling parameter. In [60], it was shown that much better results can be obtained by choosing the scaling parameter as a function of the degree of approximation. In particular, the author shows that by choosing this parameter inversely proportional to the number of Hermite functions, only O(p) functions are needed to resolve p wavelengths in one spatial dimension. More recently, the question of the optimal choice of the scaling parameter has also been studied in the framework of spectral methods for the Fokker-Planck equation; see [23].
Summary of the Method. In short, the method can be summarized as follows. For a given initial condition X^ε(0) = X_0, a given stochastic integrator, and a chosen time step ∆t, set X^0_d = X_0 and, for n = 0, 1, 2, . . .:
1. Compute the solution ψ^{X^n_d}_d of the linear system (30), together with the approximate gradient (∇_x ψ^{X^n_d})_d.
2. Compute the approximate drift and diffusion coefficients F_d(X^n_d) and A_d(X^n_d) from (31).
3. Advance the approximate homogenized SDE (32) by one step of the chosen integrator to obtain X^{n+1}_d, increment n, and go back to 1.

4. Main Results.
In this section we present the main results of the analysis of our numerical method, the proofs of which will be presented in Section 5. We first need to introduce some new notation. We will denote by ⟨·, ·⟩_{e^{−V}} the inner product of L^2(R^n, e^{−V}), defined by ⟨u, v⟩_{e^{−V}} = ∫_{R^n} u v e^{−V} dy, and by ‖·‖_{e^{−V}} the associated norm. We will also use the notation ‖·‖_{k,e^{−V}} for the norm of H^k(R^n, e^{−V}), and ‖·‖_{k,O}, where O is a negative self-adjoint operator, for the norm of H^k(R^n, O); see Appendix A. We will denote by π(·) the projection onto mean-zero functions of L^2(R^n, e^{−V}), defined by π(u) = u − ∫_{R^n} u e^{−V} dy. We will work mostly with the Schrödinger formulation (24) of the Poisson equation. In that context, we will employ the L^2(R^n) projection operator onto {v ∈ L^2(R^n) : M̂(v) = 0}, see (26), which we denote by π̂(·), where ⟨·, ·⟩_0 denotes the L^2(R^n) inner product. Finally, we will say that a function g ∈ L^2(R^n) ∩ C^∞(R^n) decreases faster than any exponential function in the L^2(R^n) sense if (37) holds, and we denote by E(R^n) the space of all such functions.
In addition to the hypotheses presented in Section 2, we will use the following assumptions.
Assumption 4.1. The potential W(y), introduced in (9), is bounded from above by a polynomial of degree 4k, for some k ∈ N. Furthermore, for every multi-index α, there exist constants such that the corresponding growth bound on the derivatives holds, where V is the potential that appears in (5b).
Assumption 4.2 requires, in addition, that the corresponding quantities belong to spaces of R^{m×m}-valued functions for all α ∈ N^n and x ∈ R^m. For the proof of our main theorem we will need to control higher-order derivatives of the solution to the Poisson equation (11). To obtain such bounds we need to strengthen our assumptions on f(x, y) in (5a). In particular, in addition to (14), we assume (Assumption 4.3) that the derivatives of the drift with respect to y are bounded uniformly in x, and that the diffusion coefficient in the right-hand side of (5a) satisfies a polynomial growth bound with constants K and m_2 independent of x.
From the Pardoux-Veretennikov bounds (16), a bootstrapping argument, Assumptions 4.1 and 4.3, and the integrability of monomials with respect to Gaussian weights, we obtain the bounds (40), for s ∈ N and a constant C(s) independent of x, where a ∨ b denotes the maximum of a and b.
Remark 4. In Assumption 4.3 we assumed that the derivatives of the drift vector in (5a) with respect to y are bounded uniformly in x. This is a very strong assumption and it can be replaced by a linear growth bound as in (14). Under such an assumption the proof of Theorem 7 has to be modified using a localization argument that is based on the introduction of appropriate stopping times. Although tedious, this is a standard argument, see e.g. [32], and we will not present it in this paper. Details can be found in [61].
Theorem 5 (Spectral convergence of the Hermite-Galerkin method). Under Assumptions 4.1 to 4.3, for any s ∈ N there exists C(s), independent of x, such that the approximate solutions ψ x d and (∇ x ψ x ) d satisfy the following error estimate: where · 0 is the usual L 2 (R n ) norm.
Using this result, we can prove spectral convergence for the calculation of the drift and diffusion coefficients.
Theorem 6 (Convergence of the drift and diffusion coefficients F_d and A_d). Suppose that Assumptions 4.1, 4.2 and 4.3 hold. Then the error on the approximate drift and diffusion coefficients decreases faster than any negative power of d, uniformly in x; that is, for all s ∈ N there exists a constant D(s) such that the corresponding bound holds. Using the spectral convergence of the approximate drift and diffusion coefficients, we can now control the distance between the solution of the homogenized SDE and the solution of its approximation (32).
As we have already mentioned, homogenization/diffusion approximation theorems are generally of the weak convergence type. Furthermore, the effective diffusion coefficient of the simplified equation is not uniquely defined; see Equation (18) and the fact that D(x) = A(x)A(x)^T. Consequently, it is not possible to prove strong convergence of the solution of the approximate SDE (32) to the solution of the homogenized SDE (20) without constraining the choice of the factorization. To establish the next result, we will therefore assume that A(x) and A_d(x) are the unique symmetric positive semi-definite square roots of D(x) and D_d(x), respectively.
Denoting by X(t) the exact solution of the homogenized equation and by X_d(t) the approximate solution, we use the following norm to measure the error: Theorem 7. Let Assumptions 4.1 to 4.3 hold. Then, for any s ∈ N and T > 0, the error between the approximate and exact solutions of the simplified equation satisfies a bound of spectral order in d, where β(s) is a constant depending only on s.
Now we consider the fully discrete scheme. We need an appropriate discretization of the approximate homogenized equation (32). For simplicity we present the convergence results for the case in which the homogenized SDE is discretized using the Euler-Maruyama method (43), but we emphasize that any higher-order integrator, e.g. the Milstein scheme, could be used [37,49]. The following is a classical result on the convergence of X^n_d, for which we refer to [37,49,32] for a proof.
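A minimal Euler-Maruyama integrator of the form (43) can be sketched as follows (Python; F and A stand in for whatever approximate coefficients F_d, A_d the spectral step returns, and the OU example at the end is our own):

```python
import numpy as np

def euler_maruyama(F, A, X0, dt, n_steps, rng):
    """Euler-Maruyama for dX = F(X) dt + A(X) dW (cf. the scheme (43))."""
    X = np.array(X0, dtype=float)
    path = [X.copy()]
    for _ in range(n_steps):
        dW = rng.normal(scale=np.sqrt(dt), size=X.shape)
        X = X + F(X) * dt + A(X) @ dW
        path.append(X.copy())
    return np.array(path)

# Toy homogenized coefficients (ours): F(x) = -x, A(x) = sqrt(2) * I in 1D.
rng = np.random.default_rng(0)
path = euler_maruyama(lambda x: -x, lambda x: np.sqrt(2.0) * np.eye(1),
                      [1.0], 1.0e-3, 1000, rng)
```

The key point of the method is that the time step here is dictated by the homogenized dynamics only, not by the O(ε^2) fast scale.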
Theorem 8 (Convergence of the SDE solver). Assume that X_0 is a random variable with E|X_0|^2 < ∞ and that Assumptions 4.1 to 4.3 hold. Then the standard strong convergence estimate holds for any choice of T, where X^n_d denotes the solution of (43). Combined, Theorem 7 and Theorem 8 imply the weak convergence of the solution of (43) to the solution of the homogenized equation (20).

5.1. Convergence of the Spectral Method for the Poisson Equation.
In this section we present the proof of Theorem 5, establishing the convergence of the spectral method for the solution of the Poisson equation (21). Since the variable x appears only as a parameter in the Poisson equation, throughout this section we fix it at an arbitrary value and omit it from the notation. Additionally, to avoid cumbersome vector notation, we consider an arbitrary direction, defined through a unit vector e, and denote by f the projection f · e.
We recall from [51,52] that there exists a unique smooth mean-zero function φ ∈ H^1(R^n, e^{−V}) satisfying the variational formulation (45). Let S_d be the finite-dimensional subset of H^1(R^n, e^{−V}) defined by S_d = e^{V/2} Ŝ_d, where Ŝ_d is the approximation space defined in (33), and consider the following problem: find φ_d ∈ S_d satisfying (46). Note that, by definition of f, φ = φ · e and φ_d = φ_d · e. The convergence of φ_d to φ can be obtained using techniques from the theory of finite elements, in particular Céa's lemma and an approximation argument. We will use the notation introduced at the beginning of Section 4.
Lemma 9 (Céa's lemma). Let φ be the solution of (45) satisfying M(φ) = 0 and let φ_d be a solution of (46). Then the corresponding quasi-optimality bound holds. Proof. The proof is standard; it uses the Poincaré inequality for the measure e^{−V} dy, recalled in Proposition 17 of Appendix A.
Since we will be working mostly with the Schrödinger formulation of the Poisson equation, we need an analogue of Lemma 9 for the transformed PDE. We recall from Appendix A that the space H^1(R^n, H) is equipped with the norm ‖·‖_{1,H}. Lemma 10. Let ψ be the unique solution of (25) satisfying M̂(ψ) = 0 and let ψ_d be a solution of (29). Then the projections ψ = ψ · e and ψ_d = ψ_d · e satisfy the quasi-optimality bound (47). Proof. The result follows directly from the fact that multiplication by e^{−V/2} is also a unitary transformation from H^1(R^n, e^{−V}) to H^1(R^n, H).
Next, we focus on establishing a result that will allow us to control the right-hand side of (47). In [26, Lemma 2.3] the authors show that any smooth square-integrable function v such that (−∆ + W)v = g lies in the space E(R^n) introduced in (37), provided that g ∈ E(R^n) and that Assumption (9) holds. Differentiating the equation with respect to y_i and using Assumption 4.1, it follows that ∂^α ψ ∈ E(R^n) for all α ∈ N^n. This implies that ψ belongs to the Schwartz space S(R^n). We now generalize slightly [26, Lemma 3.1]. This result will enable us to control the norm ‖·‖_{1,H} on the right-hand side of (47) by a norm ‖·‖_{k,H_{µ,Σ}}, where H_{µ,Σ} is an operator defined in Appendix A. From this appendix, we recall that the operator H_{µ,Σ}, with µ ∈ R^n and Σ a symmetric positive definite matrix, is defined by H_{µ,Σ} = −∆ + W_{µ,Σ}(y), where W_{µ,Σ} denotes the quadratic function (y − µ)^T Σ^{−2} (y − µ)/4 − tr(Σ^{−1})/2. Lemma 11. For every k ∈ N and v ∈ S(R^n), the corresponding estimate holds with a constant C(k, µ, Σ) independent of v.
A finer version of the previous inequality could be obtained by following the argumentation in [26,Theorem 3.2], but this will not be necessary for our purposes. Lemma 11 can be used to show the following result.
Lemma 12. If W(y) is bounded above by a polynomial of degree 4k, there exists a constant C, depending on k, µ, Σ, and W, such that any v ∈ S(R^n) satisfies the corresponding bound. Proof. This follows from the considerations of Appendix A. First we split the norm into two terms. To bound the second term, we use Assumption 4.1 on W together with Lemma 11. Upon combining the results presented so far in this section, we can complete the proof of Theorem 5.
Proof of Theorem 5. In this proof, the constant C is independent of d and changes from line to line. By Lemmas 10 and 11, and the fact that the exact solution ψ and its derivatives are smooth and decrease faster than any exponential, we obtain a bound in terms of the best-approximation error. Using Corollary 23 on approximation by Hermite functions, we then have, for any s > 2k, a bound that decays spectrally in d, where we used the first estimate of (40) and the fact that ‖ψ‖_{s,H_{µ,Σ}} = ‖φ‖_{s,L_{µ,Σ}}. The same reasoning can be applied to ∇_x ψ. Since s was arbitrary, this proves the statement.

5.2. Convergence of the Drift and Diffusion Coefficients.
In this section we prove the convergence of the drift and diffusion coefficients obtained from the approximate solution of the Poisson equation.
Proof of Theorem 6. From the expressions for F and F_d we obtain a bound on the drift error, where we used the fact that f is centered and thus orthogonal to (∇_x ψ^x)_d − π̂((∇_x ψ^x)_d). Using Theorem 5 and the Cauchy-Schwarz inequality, we deduce that for any s ∈ N there exists a constant C(s) such that the drift error decays faster than any negative power of d. The error on the diffusion term can be bounded similarly. The proof is then concluded using the last bound from (40).

5.3. Convergence of the Solution to the SDE. The proof of Theorem 7 relies on two results. First, it relies on the existence of a constant C_L, proved in Appendix C, such that the effective drift and diffusion coefficients, obtained using (17) and (18) and calculating A(x) as the unique symmetric positive semi-definite square root of D(x), satisfy the Lipschitz bound (48) for all a, b ∈ R^m. Next, by Theorem 6 and Lemma 24, there exists for every s ∈ N a constant β(s), independent of d and x, such that (49) holds for any x ∈ R^m. Upon combining (48) and (49), Theorem 7 can be proved.
Proof of Theorem 7. The error e d (t) = X(t) − X d (t) satisfies

Using the inequality (a + b) 2 ≤ 2a 2 + 2b 2 and the Cauchy–Schwarz inequality, we have

The first term on the right-hand side can be bounded by using the triangle inequality with the decomposition F(X(τ )) − F d (X d (τ )) = (F(X(τ )) − F(X d (τ ))) + (F(X d (τ )) − F d (X d (τ ))), the Lipschitz continuity of F(·), and the convergence of F d to F:

The second term can be bounded in a similar manner by using the Burkholder–Davis–Gundy inequality, see for example [35, Theorem 3.28], and the Itô isometry:

Using (51) and (52) in (50), we obtain:

By Gronwall's inequality, this implies:

which finishes the proof.
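In schematic form (a sketch in the notation of the proof above, with β(s) the constant from (49), not the displayed estimates themselves), the final Gronwall step reads:

```latex
\mathbb{E}\Bigl[\sup_{0\le t\le T}|e_d(t)|^2\Bigr]
  \le C\,\beta(s)^2\,d^{-2s}
     + C\int_0^T \mathbb{E}\Bigl[\sup_{0\le \tau\le t}|e_d(\tau)|^2\Bigr]\,\mathrm{d}t
\quad\Longrightarrow\quad
\mathbb{E}\Bigl[\sup_{0\le t\le T}|e_d(t)|^2\Bigr]
  \le C\,\beta(s)^2\,d^{-2s}\,\mathrm{e}^{CT}.
```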
Remark 13. Note that, as mentioned in Section 4, the convergence of the solution can still be proved if we assume only local Lipschitz continuity and local convergence of the coefficients, provided there exist p > 2 and a constant K independent of d such that the solutions of the equations

With these alternative assumptions, we can show that for any δ > 0 and R > |X 0 |,

where C R and D R are the local constants in the assumptions. The proof of this estimate is very similar to the proof of strong convergence of the Euler–Maruyama scheme in [32, Theorem 2.2], and will thus not be presented here. From this estimate, we deduce that the solution of the approximate homogenized equation converges to the exact solution as d → ∞.
6. Implementation of the Algorithm and Numerical Experiments. In this section, we discuss the implementation of the algorithm and present some numerical experiments to validate the method and illustrate our theoretical findings.
6.1. Implementation details. Below we discuss the quadrature rules used and the approach taken for the calculation of the matrix and right-hand side of the linear system of equations (30).
The algorithm requires the calculation of Gaussian integrals of the type:

Several approaches, either Monte Carlo-based or deterministic, can be used for the calculation of such integrals. Probabilistic methods offer an advantage when the dimension n of the state space of the fast process is large, but since the HMM is more efficient than our approach in that regime, we do not use them in practice. Instead, we use a multi-dimensional quadrature rule obtained by tensorization of one-dimensional Gauss–Hermite quadrature rules.

For the calculation of the stiffness matrix, we can take advantage of the diagonality of A when the potential is equal to V µ,Σ := (1/2)(y − µ) T Σ −1 (y − µ) + log √((2π) n det Σ). 6 Using the notation H µ,Σ to denote the same operator as in Lemma 11, and the shorthand notations H α and h α , for α ∈ N n , in place of H α (y; µ, Σ) and h α (y; µ, Σ), respectively, we have:

where D is a diagonal matrix whose entries can be computed explicitly and where W µ,Σ is the potential obtained from V µ,Σ according to (9). To simplify the calculation of these coefficients, we expand the Hermite polynomials in terms of monomials:

(57) H α (y; µ, Σ) = ∑ |β|≤d c αβ y β .

6 The constant log √((2π) n det Σ) in V µ,Σ is chosen so that ∫ R n e −V dy = 1.
With this notation, we write:

The integrals I α are computed by numerical quadrature. Denoting by w i and q i the weights and nodes of the Gauss–Hermite quadrature, respectively, I α is approximated as

where N q denotes the number of points in the quadrature. Only the last factor of the previous expression depends on α, so the numerical calculation of these integrals can be performed by evaluating, at each grid point, the value of w i (W (q i ) − W µ,Σ (q i )) G (µ,Σ) (q i ) and the values of q α i for |α| ≤ 2d. For the numerical experiments presented in this paper, we used a quadrature rule with 100 points in each dimension, so N q = 100 n . This quadrature rule was also used to approximate the exact effective coefficients from (27).
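A minimal sketch of the tensorized Gauss–Hermite approach described above (the function name and test integrand are ours; the change of variables y = µ + √2 σ x maps the weight e^{−x²} to a Gaussian density with diagonal covariance):

```python
import itertools

import numpy as np
from numpy.polynomial.hermite import hermgauss

def gauss_hermite_gaussian(f, mu, sigma, nq=20):
    """Tensorized Gauss-Hermite estimate of E[f(Y)], Y ~ N(mu, diag(sigma^2)).

    One-dimensional nodes/weights for the weight exp(-x^2) are tensorized, and
    the change of variables y = mu + sqrt(2)*sigma*x absorbs the Gaussian
    density (hence the overall factor pi^(-n/2)).
    """
    nodes, weights = hermgauss(nq)
    n = len(mu)
    total = 0.0
    for idx in itertools.product(range(nq), repeat=n):
        y = mu + np.sqrt(2.0) * sigma * nodes[list(idx)]
        total += np.prod(weights[list(idx)]) * f(y)
    return total / np.pi ** (n / 2)

# Second moment of a standard 2-D Gaussian: E[y1^2 + y2^2] = 2.
val = gauss_hermite_gaussian(lambda y: y[0] ** 2 + y[1] ** 2,
                             np.array([0.0, 0.0]), np.array([1.0, 1.0]))
```

With nq points per dimension, the rule integrates polynomials of degree up to 2 nq − 1 exactly in each variable, which is why the result above is exact to rounding error.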
A similar method can be applied for the calculation of the right-hand side, whose elements are expressed as:

By expanding the Hermite functions in terms of Hermite polynomials multiplying G 1/2 (µ,Σ) , the previous equation can be rewritten as

which is a Gaussian integral that can also be calculated using a multi-dimensional Gauss–Hermite quadrature.

6.2. Numerical experiments. We now present the results of some numerical experiments.
The Euler–Maruyama scheme is used to approximate both X(t) and X d (t) with a time step of 0.01 up to T = 1, and N r = 100 replicas of the driving Brownian motion are used for the numerical computation of expectations. The ith replicas of the discretized approximations of X(t) and X d (t) are denoted by X n,i and X n,i d , respectively. In most of the numerical experiments below, the error is measured by:

which is an approximation of the norm ||| · ||| used in Theorem 7.
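A minimal sketch of this error measure (an Ornstein–Uhlenbeck toy model stands in for the exact and approximate homogenized dynamics; the drifts and names here are illustrative, not the paper's coefficients):

```python
import numpy as np

def euler_maruyama(drift, sigma, x0, dt, n_steps, dW):
    """Euler-Maruyama path driven by prescribed Brownian increments dW."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for n in range(n_steps):
        x[n + 1] = x[n] + drift(x[n]) * dt + sigma * dW[n]
    return x

rng = np.random.default_rng(0)
dt, T, n_rep = 0.01, 1.0, 100
n_steps = round(T / dt)
num, den = 0.0, 0.0
for _ in range(n_rep):
    # the same replica of the driving Brownian motion is used for both paths
    dW = rng.normal(0.0, np.sqrt(dt), n_steps)
    x = euler_maruyama(lambda v: -v, 1.0, 1.0, dt, n_steps, dW)
    xd = euler_maruyama(lambda v: -1.05 * v, 1.0, 1.0, dt, n_steps, dW)
    num += np.max(np.abs(x - xd)) ** 2
    den += np.max(np.abs(x)) ** 2
relative_error = np.sqrt(num / den)  # small, since the two drifts are close
```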
In the numerical experiments presented in this paper, we have chosen the scaling parameter λ in (34) by trial and error. A natural extension of the work presented in this paper is to develop a systematic methodology for identifying the optimal scaling parameter; see also the discussion in Remark 3.

6.2.1. Test of the method for single-well potentials. For the two problems in this section, the scaling parameter is chosen as λ = 0.5 for all degrees of approximation. We start by considering the following problem.
and where L y = −∇V · ∇ + ∆. We have written the right-hand sides of the equations for the slow processes x 0t and x 1t in this form to ensure that the centering condition is satisfied. The convergence of the approximate solution of the effective equation for this problem is illustrated in Figure 1. Here the potential is localized, so Hermite functions are well suited to approximating the solution, which is reflected in the very good convergence observed.
In the next example, the state space of the fast process has dimension 3:

Because computing the effective coefficients is much more expensive computationally than in the previous case, we measure the error for a given value of the slow variables by

The value we chose for the comparison is x = (0.2, 0.2), for which the denominators in the previous equation are non-zero. The relative error on the homogenized coefficients is illustrated in Figure 2. The method also performs very well in this case, although it is slightly less accurate than in the previous example, and the convergence is again super-algebraic.

6.2.2. Test of the method for potentials with multiple wells. We now consider multiple-well potentials that lead to multi-modal distributions. Potentials with multiple wells pose challenges to our method because of the localized nature of Hermite functions. In the examples below, we consider potentials with wells that are close to each other, so the solutions can still be approximated well by Hermite functions. The first potential that we analyze is the standard bistable potential,

We consider the fast/slow SDE system:

We choose the parameter λ in (34) to be λ = 0.5. The convergence of the method is illustrated in Figure 3. Although the method is less accurate than in the previous cases, super-algebraic convergence can still be observed, and good accuracy can be reached by choosing a high enough degree of approximation. Note that the computational cost in this case is low: the numerical solution can be calculated in a matter of seconds on a personal computer.

Next we consider the tilted bistable potential

(70) V (y) = y 4 /4 − y 2 /2 + 10y,

which corresponds to the case γ = 1, δ = 10 in the examples considered in [9], and the fast/slow SDE

The convergence of the solution in this case is presented in Figure 5, for the scaling parameter λ = 1. Due to the presence of a strong linear term, the potential is localized, see Figure 4. As a result, the convergence of the spectral method is good, though it does not appear to be super-algebraic.

Finally, we consider a three-well potential in R 2 , and the following fast/slow SDE:

For this fast/slow SDE, we choose λ = 0.35. A contour plot of the potential is shown in Figure 6, and the convergence graph is presented in Figure 7. In this case the error is very large for degrees of approximation lower than 10, beyond which the convergence is clear and super-algebraic.
The accuracy reached with a degree of approximation equal to 30 is of the order of 10 −4 , which compares favorably with the accuracy that can be achieved using Monte Carlo-based methods.
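The multimodality can be seen directly in the invariant density e −V /Z of the standard bistable potential; a quick sketch (grid and quantities are illustrative):

```python
import numpy as np

# Invariant density rho = exp(-V)/Z for the bistable potential V(y) = y^4/4 - y^2/2.
y = np.linspace(-5.0, 5.0, 4001)
dy = y[1] - y[0]
V = y ** 4 / 4 - y ** 2 / 2
rho = np.exp(-V)
rho /= rho.sum() * dy                      # normalize so the density integrates to 1
mean = (y * rho).sum() * dy                # ~ 0 by symmetry
second_moment = (y ** 2 * rho).sum() * dy  # mass concentrates around the wells at y = +-1
```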
6.2.3. Discretization of a multiscale stochastic PDE. As mentioned in the introduction, our numerical method is particularly well suited for the solution of singularly perturbed stochastic PDEs (SPDEs), and constitutes a very good complement to the method proposed in [2]. Let us recall how the method introduced in [2] works for a singularly perturbed SPDE of the following form,

posed in a bounded domain of R m with suitable boundary conditions. We consider the set-up from [7]. In particular, the operator A in (74) is a differential operator, assumed to be negative and selfadjoint in a Hilbert space H, and with compact resolvent. It is furthermore assumed that A has a finite-dimensional kernel, denoted by M. The term W denotes a cylindrical Wiener process on H, and Q denotes the covariance operator of the noise, which is assumed to be positive, selfadjoint, and bounded. It is assumed that Q and A commute, and that the noise acts only on the orthogonal complement of M, denoted by M ⊥ ; see [7, Assumption 2.4].
The function F (·) is a polynomial nonlinearity that must be such that the above scaling makes sense. 7 Since A is selfadjoint with compact resolvent, there exists an orthonormal basis of H consisting of eigenfunctions of A. We denote by {λ k , e k } the eigenvalues and corresponding eigenfunctions of A. We arrange the eigenpairs by increasing absolute value of the eigenvalues, so that the first m eigenfunctions are in the kernel of the differential operator, M = span{e 1 , . . . , e m }. Formally, the cylindrical Brownian motion can be expanded in this basis as

The assumption that the covariance operator Q commutes with the differential operator A means that Q satisfies Q e i = q i e i , while the assumption that the noise acts only on M ⊥ implies that q i = 0 for i = 1, 2, . . . , m.
We now summarize how the dynamics of the slow modes in (74) can be approximated by solving a multiscale system of SDEs using the methodology developed in [2].
First, we write the solution of (74) as u = x + y, where x = ∑ m k=1 x k e k and y = ∑ ∞ k=m+1 y k e k .
Note that x = Pu and y = (I − P)u, where P is the projection operator from H onto M. By assumption, the noise term can be expanded in the same way, as ∑ ∞ k=1 q k e k ẇ k (t). Substituting these expansions in the SPDE gives:

The equations that govern the evolution of the coefficients x k and y k can be obtained by taking the inner product (of H) of both sides of the above equation with each of the eigenfunctions of the operator, and using orthonormality:

i.e., the centering condition is satisfied.

Equation (75) can be written in the form
where a(x, y) and b(x, y) are the projections of F (u) on M and M ⊥ , respectively:

The scale separation now appears clearly. We now truncate the fast process in (76) as y ≈ ∑ m+n i=m+1 y i e i to derive the following finite-dimensional system:

In [43], the authors investigate the use of the heterogeneous multiscale method (HMM) for solving the problem (77), and show that a good approximation can be obtained using this method. However, when the nonlinearity is a polynomial function of u, the function a in the system above, which also appears on the right-hand side of the Poisson equation, is polynomial in x and y. In addition, the generator of this system of stochastic differential equations is of Ornstein–Uhlenbeck type to leading order, and so its eigenfunctions are Hermite polynomials. This means that the right-hand side can be expanded exactly in Hermite polynomials, and so the exact effective coefficients can be computed. Note that, although equivalent, applying the unitary transformation is not necessary in this case, as we can work directly with Hermite polynomials in the appropriate weighted L 2 space.

We consider the SPDE (74), with A = ∂ 2 /∂x 2 + 1 and F (u) = u 2 ∂u 2 /∂x, posed on [−π, π] with periodic boundary conditions:

The eigenfunctions of A on [−π, π] with periodic boundary conditions are cos((i + 1)x/2)/√π for i odd and sin(ix/2)/√π for i even, and the corresponding eigenvalues are λ i = 1 − (i + 1) 2 /4 if i is odd and λ i = 1 − i 2 /4 if i is even. For the numerical experiments, we chose q i = 1/i if 3 ≤ i ≤ 5 and q i = 0 otherwise, and we used 7 modes for the truncation of the Fourier series. In this case the null space of A is two-dimensional. We consider a noise process of the form:

Following the methodology outlined above, we approximate the solution by a truncated Fourier series:

Substituting into the nonlinearity and taking the inner product with each of the eigenfunctions, a system of equations of the type (77) is obtained. The operator A and the nonlinearity were chosen so that the centering condition is satisfied.
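For this example the eigenvalues can be tabulated directly; a small sketch consistent with the indexing above (cos and sin modes come in pairs of wavenumber k, with eigenvalue 1 − k²):

```python
# Eigenvalues of A = d^2/dx^2 + 1 on [-pi, pi] with periodic boundary conditions,
# indexed so that the modes come in (cos, sin) pairs of wavenumber k = 1, 2, ...
def eigenvalue(i):
    k = (i + 1) // 2  # wavenumber of the i-th eigenfunction, i >= 1
    return 1.0 - k ** 2

lams = [eigenvalue(i) for i in range(1, 8)]  # the 7 modes used in the experiment
kernel = [i for i, lam in enumerate(lams, start=1) if lam == 0.0]
# kernel == [1, 2]: the null space M = span{cos x, sin x} is two-dimensional
```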
The homogenized equation for the slow variables (x 1 , x 2 ) reads

where F(·) and A(·) are given by equations (4a) and (4b), respectively, and W is a standard Wiener process in R 2 . The Euler–Maruyama scheme was used for both the macro- and micro-solvers, and the parameters of the HMM were chosen as

Here δt is the time step of the micro-solver, N T is the number of steps that are omitted in the time-averaging process to reduce transient effects, M is the number of samples used for ensemble averages, and N , N are the numbers of time steps employed for the calculation of time averages and for the discretization of integrals originating from the Feynman–Kac representation formula (15), respectively. See [21, 62] for a more detailed description of the method and a detailed explanation of the parameters in (82).

In Figures 8 and 9, we compare the solutions obtained using the HMM with the one obtained using our approach, using the same macro-solver and the same replica of the driving Brownian motion for both, and with the initial condition x i0 = 1.2 for i = 1, . . . , m. The former is denoted by X̄ n and the latter by X n . Notice that as the value of the parameter p increases, the solution obtained using the HMM converges to the exact solution obtained using the Hermite spectral method.

We now investigate the dependence of the error in the homogenized coefficients on the precision parameter p. The same error measure as in [21] is used to compare the two methods:

Here F p HM M and A p HM M are the drift and diffusion coefficients obtained using the HMM with precision parameter equal to p, while F Sp and A Sp are the coefficients given by the Hermite spectral method developed in this paper. Given the choice of parameters (82), the theory developed in [21] predicts that the error should decrease as O(2 −p ). This error is presented in Figure 10 as a function of the precision parameter p, showing good agreement with the theory developed in [2, 21].
For the SPDE described above, solving the Poisson equation associated with (77) using Hermite polynomials recovers the corresponding effective coefficients exactly, and the only source of error is the macroscopic discretization scheme. This is in sharp contrast with the HMM-based method developed in [1], for which the micro-averaging process used to recover the effective coefficients represents a non-negligible computational cost.
Comparison of (X̄ n ) 1 and (X n ) 1 in (81).

7. Conclusion and Further Work. In this paper, we proposed a new approach for the numerical approximation of the slow dynamics of fast/slow SDEs for which a homogenized equation exists. Starting from the appropriate Poisson equation, the same unitary transformation as in [9] was utilized to obtain formulas for the drift and diffusion coefficients in terms of the solution to a Schrödinger equation. This equation was solved at each discrete time by means of a spectral method based on Hermite functions, from which approximations of the homogenized drift and diffusion coefficients were calculated. A stochastic integrator was then used to evolve the slow variables by one time step, and the procedure was repeated.
Building on the work of [26], spectral convergence of the homogenized coefficients was rigorously established, from which weak convergence of the discrete approximation in time to the exact homogenized solution was derived. In the final section, the accuracy and efficiency of the proposed methodology were examined through numerical experiments.
The method presented, although not as general as the HMM, has proven to be more accurate and more efficient for a broad class of problems. It performs particularly well for singularly perturbed SPDEs, and constitutes in this case a good complement to the HMM-based method presented in [2]. It also works comparatively well when the fast dynamics is of relatively low dimension, typically less than or equal to 3, and especially so when the potential is localized, since fewer Hermite functions are required to resolve the Poisson equation accurately in this situation. Our method also has several advantages over the approach taken in [9]: it does not require truncation of the domain, does not require the calculation of the eigenvalues and eigenfunctions of the Schrödinger operator, and has better asymptotic convergence properties.
The limitations of the method are twofold: its generality is limited by the requirement of a gradient structure for the fast dynamics, and its efficiency is limited by the curse of dimensionality, which causes the computation time to become prohibitive as the dimension of the state space of the fast process increases.
The extent to which some of these constraints can be lifted constitutes an interesting topic for future work. We believe that it is possible to generalize our method to a broader class of problems while retaining its efficiency and accuracy. When the fast process is not reversible, tools from the spectral theory of non-selfadjoint problems could be used [31]. In the case of underdamped Langevin dynamics, an expansion of the solution in Hermite polynomials in both position and momentum can be performed [38]. High-dimensional integrals could also be computed more efficiently: an alternative to the tensorized quadrature approach taken in this work is to use a sparse-grid method, which can in principle offer the same degree of polynomial exactness with a significantly lower number of nodes; see, e.g., [27, 34]. Furthermore, efficient preconditioning techniques can be used to reduce the computational cost of the numerical solution. Preconditioning can be implemented naturally for nonreversible fast processes that have a well-defined structure, such as underdamped Langevin dynamics [57] or the Markovian approximation of the generalized Langevin equation [50].
Appendix A. Weighted Sobolev Spaces. In this section, we recall a few results about weighted Sobolev spaces needed for the analysis presented in Section 5. For more details on this topic, see [26,64,8,44]. Throughout the appendix, V denotes a smooth confining potential whose derivatives are all bounded above by a polynomial and such that ρ := e −V is normalized.
Definition 14. The weighted space L 2 (R n , ρ) is defined as

It is a Hilbert space for the inner product given by:

Definition 15. The weighted Sobolev space H s (R n , ρ), with s ∈ N, is defined as

It is a Hilbert space for the inner product given by:

We also define the following spaces.
Definition 16. Given s ∈ N and a positive selfadjoint operator −L on a Hilbert space H of functions on R n , we define H s (R n , L) as the space obtained by completion of C ∞ c (R n ) for the inner product: We will denote the associated norm by · s,L .
It can be shown that C ∞ c (R n ) is dense in H 1 (R n , ρ); see [63]. By integration by parts, this implies that H 1 (R n , ρ) = H 1 (R n , L), where −L is the positive selfadjoint operator on L 2 (R n , ρ) defined by L = ∆ − ∇V · ∇. For the rest of this appendix, we make the additional assumption that the potential V satisfies

With this assumption, the following compactness result holds.
Remark 18. Alternative conditions on the potential that ensure that the corresponding Gibbs measure satisfies a Poincaré inequality are presented in [42,Theorem 2.5].
Now we consider the unitary transformation e −V /2 : L 2 (R n , ρ) → L 2 (R n ), and characterize the spaces obtained by applying this mapping to the weighted Sobolev spaces.
Proposition 19. The multiplication operator e −V /2 is a unitary transformation from H s (R n , L) to H s (R n , H), where −H is the positive selfadjoint operator on L 2 (R n ) defined by

Proof. It suffices to verify the identity for functions in C ∞ c (R n ) and any exponent i ∈ N, from which the result follows by density.

The space H 1 (R n , H), for H defined as above, is of particular relevance to this paper. This space can be equivalently defined by

and, for u ∈ H 1 (R n , H),

Appendix B. Hermite Polynomials and Hermite Functions. In this appendix, we recall the results about Hermite polynomials and Hermite functions that are essential for the analysis presented in this paper.
Hermite polynomials. In one dimension, it is well known that the polynomials

form a complete orthonormal basis of L 2 (R, G (0,1) ), where G (0,1) is the Gaussian density with mean 0 and variance 1. These polynomials extend naturally to the multidimensional case. For µ ∈ R n and a symmetric positive definite matrix Σ ∈ R n×n , consider the Gaussian density G (µ,Σ) with mean µ and covariance matrix Σ. Let D and Q be diagonal and orthogonal matrices such that Σ = QDQ T , and set S = QD 1/2 , so that Σ = SS T . The polynomials defined for α ∈ N n by

form a complete orthonormal basis of L 2 (R n , G (µ,Σ) ). Note that the Hermite polynomial corresponding to a multi-index α depends on the orthogonal matrix Q chosen. When µ and Σ are clear from the context, we will sometimes omit them to simplify the notation.
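A small Monte Carlo sanity check of this construction (function names and the test covariance are ours; He denotes the probabilists' Hermite polynomials, normalized by √(α!)):

```python
from math import factorial

import numpy as np
from numpy.polynomial.hermite_e import hermeval

def hermite_poly(alpha, y, mu, S):
    """Normalized multivariate Hermite polynomial H_alpha(y; mu, Sigma), Sigma = S S^T."""
    z = np.linalg.solve(S, (y - mu).T).T  # z = S^{-1}(y - mu) has identity covariance
    out = np.ones(len(y))
    for zi, ai in zip(z.T, alpha):
        c = np.zeros(ai + 1)
        c[ai] = 1.0                        # select the probabilists' polynomial He_ai
        out *= hermeval(zi, c) / np.sqrt(factorial(ai))
    return out

rng = np.random.default_rng(2)
mu = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
w, Q = np.linalg.eigh(Sigma)
S = Q @ np.diag(np.sqrt(w))                # Sigma = S S^T
y = rng.multivariate_normal(mu, Sigma, 200_000)
h10 = hermite_poly((1, 0), y, mu, S)
h01 = hermite_poly((0, 1), y, mu, S)
norm10 = np.mean(h10 * h10)   # ~ 1 (orthonormality)
cross = np.mean(h10 * h01)    # ~ 0 (orthogonality)
```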
In addition to forming a complete orthonormal basis, the Hermite polynomials are the eigenfunctions of the Ornstein–Uhlenbeck operator −L µ,Σ = Σ −1 (y − µ) · ∇ − ∆, which is the operator L from the previous appendix for ρ = G (µ,Σ) . The eigenvalue associated with H α (y; µ, Σ) is given by

where {λ i } n i=1 are the diagonal elements of D −1 .

Proof. Let us introduce, for an orthogonal matrix P ∈ R n×n and a vector u ∈ R n , the rotation operator R P and the translation operator T u defined respectively by R P f (·) = f (P T ·) and T u f (·) = f (· − u). With these notations, note that L µ,Σ = T µ • R Q • L 0,D • R Q T • T −µ and H α (·; µ, Σ) = T µ • R Q H α (·; 0, D), so it is sufficient to show that H α (·; 0, D) is an eigenfunction of L 0,D . Writing L 0,D as a sum of one-dimensional operators, we see that it is enough to prove the claim in one dimension, which follows from a simple recursion.
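The one-dimensional eigenvalue relation (for µ = 0, Σ = 1: y He_k′ − He_k″ = k He_k) can be verified symbolically; a quick sketch:

```python
import numpy as np
from numpy.polynomial.hermite_e import HermiteE

y = HermiteE.basis(1)  # the polynomial He_1(y) = y
max_residual = 0.0
for k in range(8):
    He = HermiteE.basis(k)
    # one-dimensional Ornstein-Uhlenbeck relation: y He_k' - He_k'' = k He_k
    residual = y * He.deriv() - He.deriv(2) - k * He
    max_residual = max(max_residual, np.max(np.abs(residual.coef)))
# max_residual is zero to rounding error: each He_k is an exact eigenfunction
```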
These functions form a complete orthonormal basis of L 2 (R n ). Since they are obtained from Hermite polynomials by multiplication by G 1/2 (µ,Σ) , they satisfy:

Proposition 22. Given µ ∈ R n and Σ ∈ R n×n positive definite, the Hermite functions h α (y; µ, Σ) are the eigenfunctions of the operator:

with the same eigenvalues as in (87).
Hermite functions inherit the good approximation properties of Hermite polynomials expressed in Proposition 20. In the following result, π refers to the L 2 (R n ) projection operator, i.e.
Appendix C. Proof of (48). In this appendix, we show the Lipschitz continuity of the effective coefficients. The proof relies on the following technical lemma.
Lemma 24. For any m ∈ N and any symmetric positive semi-definite matrices M 1 , M 2 ∈ R m×m , there exists c(m) > 0 such that

where |·| is the matrix 2-norm.
Proof. To lighten the notation, let M = M 1 and ∆ = M 2 − M 1 . The inequality is trivially satisfied when ∆ = 0; we assume from now on that ∆ ≠ 0. Without loss of generality, we assume that |∆| = 1, that ∆ has an eigenvalue equal to +1, and that M is diagonal with its eigenvalues occurring in descending order on the diagonal. For arbitrary x ∈ R m ,

By the Cauchy–Schwarz inequality, the expression in the second brackets is non-negative, so, assuming that x is such that the expression in the first brackets is positive,

To conclude the proof, we show that for any m, M, ∆ as above, there exists x ∈ R m such that R M,∆ (x) is bounded from below by a constant depending only on m. Assume, for contradiction, that there exist sequences M i and ∆ i as above for which sup x R M i ,∆ i (x) → 0, and let x i denote an eigenvector corresponding to the eigenvalue 1 of ∆ i . Passing to a subsequence if necessary, we can assume that x i → x ∞ . By assumption,

The proof of the Lipschitz continuity of A(·) relies on Lemma 24. By virtue of this result, it is sufficient to show that, for all ξ ∈ R m with ξ T ξ = 1, (ξ T D(x)ξ) 1/2 is Lipschitz continuous with a Lipschitz constant independent of ξ. Let ξ ∈ R m be given, and let us introduce the notations f ξ = f T ξ, φ ξ = φ T ξ, and α ξ = α T ξ. Clearly, φ ξ is the centered solution of −Lφ ξ = f ξ , so

ξ T D(x)ξ = ∫ R n (|α ξ | 2 − Lφ ξ φ ξ ) e −V dy =: D 1 (x) + D 2 (x).
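Lemma 24 can be illustrated numerically (a sketch with our own helper; the observed ratio is consistent with the known operator-norm bound |√M 1 − √M 2 | ≤ |M 1 − M 2 | 1/2 for positive semi-definite matrices):

```python
import numpy as np

def psd_sqrt(M):
    """Symmetric positive semi-definite square root via eigendecomposition."""
    w, Q = np.linalg.eigh(M)
    return (Q * np.sqrt(np.clip(w, 0.0, None))) @ Q.T

rng = np.random.default_rng(1)
worst_ratio = 0.0
for _ in range(200):
    B1 = rng.normal(size=(3, 3))
    B2 = rng.normal(size=(3, 3))
    M1, M2 = B1 @ B1.T, B2 @ B2.T  # random symmetric PSD matrices
    num = np.linalg.norm(psd_sqrt(M1) - psd_sqrt(M2), 2)
    den = np.sqrt(np.linalg.norm(M1 - M2, 2))
    worst_ratio = max(worst_ratio, num / den)
# worst_ratio stays bounded, consistent with the lemma
```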
From the triangle inequality and (13),

Likewise, using (14) and (16),

Thus ξ T D(x)ξ is Lipschitz continuous with a constant independent of ξ.