Power law decay for systems of randomly coupled differential equations

We consider large random matrices $X$ with centered, independent entries but possibly different variances. We compute the normalized trace of $f(X) g(X^*)$ for $f,g$ functions analytic on the spectrum of $X$. We use these results to compute the long time asymptotics for systems of coupled differential equations with random coefficients. We show that when the coupling is critical the norm squared of the solution decays like $t^{-1/2}$.


Introduction.
We consider the linear differential equation
$$\partial_t u_t = -u_t + gXu_t, \qquad (1.1)$$
with $u_t$ an $N$-dimensional vector, $X$ a random matrix with independent entries, and $g$ a positive coupling constant. We treat centered random matrices whose entries may have different variances. This differential equation is solved by exponentiating the random matrix $X$, so properties of the solution can be computed in terms of functions of $X$. Existing theoretical analysis often assumes matrices with iid entries, as many results are more readily available in this special case. Equation (1.1) has been used for modeling in theoretical neuroscience and mathematical ecology, see for instance [27,26,5,6,7,24]. In both situations, however, the assumption that the entries have identical distributions is not realistic, as it does not allow for spatial information or for different types of species/cells. For $g$ large or small the solution exponentially grows or decays, respectively. But there is a critical $g$ at which the system exhibits power law decay. We show that the decay rate of the solution is universal; it does not depend on the details of the distribution of $X$. In fact, the norm squared of the solution decays like $t^{-1/2}$. In some applications, it is natural instead to consider $X$ with Hermitian symmetry. In this case, the order of decay is instead $t^{-3/2}$; the slower decay in the non-Hermitian case is a signal of non-normal amplification of transients [20]. In the neuroscience literature there are many proposals for how certain systems become critically tuned, see for instance [19,21,22].
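As a quick numerical illustration of this power law (a minimal sketch, not the paper's proof: iid real Gaussian entries with variance $1/N$ and the sizes $N$ and times $t$ are our own choices), one can note that the squared norm averaged over initial conditions $u_0$ uniform on the unit sphere equals the normalized trace $\operatorname{Tr}_N\big[e^{t(gX-1)^*}e^{t(gX-1)}\big]$; at $g = 1$ it tracks the iid-case limit $e^{-2t} I_0(2t) \sim 1/(2\sqrt{\pi t})$, where $I_0$ is the modified Bessel function:

```python
import numpy as np
from scipy.linalg import expm
from scipy.special import iv  # modified Bessel function I_0

rng = np.random.default_rng(0)
N = 500
# iid real Gaussian entries with variance 1/N; spectral radius tends to 1
X = rng.standard_normal((N, N)) / np.sqrt(N)

def mean_sq_norm(t, g=1.0):
    # E_{u0} ||u_t||^2 over u0 uniform on the unit sphere equals
    # Tr_N [ e^{t(gX-1)^*} e^{t(gX-1)} ], i.e. a squared Frobenius norm over N
    M = expm(t * (g * X - np.eye(N)))
    return np.linalg.norm(M, "fro") ** 2 / N

# at the critical coupling g = 1 the decay is ~ t^{-1/2}
v3, v10 = mean_sq_norm(3.0), mean_sq_norm(10.0)
print(v3, np.exp(-6.0) * iv(0, 6.0))
print(v10, np.exp(-20.0) * iv(0, 20.0))
```

The printed pairs should be of comparable size already at moderate $N$; the second component is the deterministic comparison curve.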
In the theory of random matrices one is often led to consider the empirical spectral distribution (ESD) of an $N \times N$ matrix $X$, defined by
$$\mu_X := \frac{1}{N}\sum_{i=1}^{N} \delta_{\lambda_i},$$
where $\lambda_i$ are the eigenvalues of $X$. For random matrices with independent, identically distributed (iid) entries and no Hermitian symmetry, it is well known that the ESD converges to a deterministic limit known as the circular law. We refer to [12] and references within for an overview. For functions $f$ which are analytic on the spectrum of $X$, $f(X)$ can be defined by Taylor expansion. Then knowledge of the ESD of $X$ allows one to compute traces of functions of $X$ via the formula
$$\frac{1}{N}\operatorname{Tr} f(X) = \int f(x)\, \mu_X(dx). \qquad (1.2)$$
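A short numerical sketch of (1.2) and the circular law (our own illustration; the matrix size, complex Gaussian entries of variance $1/N$, and the test function $f(z) = z^2$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000
# iid complex entries, mean 0, variance 1/N
X = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)

# circular law: the ESD fills the unit disk as N grows
lam = np.linalg.eigvals(X)
rad = np.abs(lam).max()

# formula (1.2) with f(z) = z^2: the circular-law integral of z^2 is 0
tr2 = abs(np.trace(X @ X) / N)
print(rad, tr2)
```

Both printed numbers reflect the limit: the largest eigenvalue modulus is close to 1 and the normalized trace of $X^2$ is close to 0.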
In this non-Hermitian setting, it is of practical and theoretical interest to study functions that depend not just on X but also on X * . Due to the lack of the spectral theorem, no analogue to (1.2) depending just on the eigenvalues exists for computing traces of general functions of X and X * . In this note, we consider functions f and g which are analytic on a neighborhood of the spectrum of X and X * , respectively, and compute the normalized trace of the product f (X)g(X * ).
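The obstruction is already visible for a $2 \times 2$ Jordan block (a standalone toy example, not from the paper): both eigenvalues vanish, yet $\operatorname{Tr} XX^* = 1$, so no analogue of (1.2) depending on the eigenvalues alone can compute such mixed traces:

```python
import numpy as np

# nilpotent Jordan block: the spectrum is {0, 0}
X = np.array([[0.0, 1.0], [0.0, 0.0]])

# with f(z) = g(z) = z, an eigenvalue-only formula would predict 0,
# but the true mixed trace is the sum of squared entry moduli
mixed = np.trace(X @ X.conj().T)
print(mixed)  # 1.0
```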
In [13,25] random matrices with iid Gaussian entries are considered, and the expectation of $N^{-1}\operatorname{Tr} f(X)g(X^*)$ is computed by diagrammatic techniques with uncontrolled error terms. The computation relies on expressing the expectation of $\operatorname{Tr} f(X)g(X^*)$ in terms of the overlap function of the left and right eigenvectors. This overlap carries delicate information on correlations between eigenvalues. Some of the questions addressed in this paper were considered in the special case of rank one variance profiles in [1], where techniques similar to those of [13,25] were used.
In our analysis, we give mathematically rigorous proofs with high probability error bounds for the random variable N −1 Tr f (X)g(X * ), not just its expectation. Rather than treating the eigenvalue overlaps directly, we develop a method for expressing such quantities in terms of a resolvent of the linearization matrix, evaluated outside the spectrum, where the expressions are simpler and much more stable.
To these matrices we associate a variance profile matrix $S$, whose entry $s_{ij} = \mathbb{E}[|x_{ij}|^2]$ is the variance of the $(i,j)$ entry of $X$. Matrices with general variance profiles were first studied by Girko with the canonical equation of type $K_{25}$ in [15], but the argument was considered incomplete [11]. In the iid case, a rigorous proof of the convergence of the ESD was given by Bai in [11], under certain technical assumptions. This work was recently extended in [8], [14] to include local spectral scales and very general variance profiles, respectively. The ESD of such matrices converges to a deterministic measure which is radially symmetric around the origin and supported on a disk whose radius is the square root of the spectral radius of the variance profile.
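This support statement is easy to probe numerically (an illustration under our own choices: a rank one profile $s_{ij} = f_i f_j / N$ as in [1], with real Gaussian entries):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 800
# rank-one variance profile s_ij = f_i f_j / N with a smooth, positive f
f = 1.0 + 0.9 * np.cos(2 * np.pi * np.arange(N) / N)
S = np.outer(f, f) / N
rho = np.max(np.linalg.eigvalsh(S))        # spectral radius of the profile

# sample X with independent entries of variance s_ij
X = np.sqrt(S) * rng.standard_normal((N, N))
rmax = np.abs(np.linalg.eigvals(X)).max()
print(rmax, np.sqrt(rho))                  # spectrum fills a disk of radius sqrt(rho)
```

The largest eigenvalue modulus lands near $\sqrt{\varrho(S)}$ rather than near 1, as the disk radius depends on the profile only through its spectral radius.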
The computation of $\operatorname{Tr} f(X)g(X^*)$ uses Cauchy's Theorem to reduce to studying traces of products of resolvents, $(X-\zeta_1)^{-1}(X^*-\zeta_2)^{-1}$. We introduce a block matrix with an additional parameter whose blocks are linear in $(X-\zeta_i)$. By differentiating in this additional parameter, the product of resolvents is expressed in terms of the entries of the resolvent of the linearized matrix. The resolvent of the linearized matrix is then studied with the matrix Dyson equation [4]. Our formulas for functions of $X$ and $X^*$ are then applied to compute the long time asymptotics of the norm and autocorrelation of solutions to the differential equations introduced at the beginning of this paper.
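The reduction to resolvent products can be checked directly in finite dimensions (a numerical sketch with our own choices of $N$, contour radius, and test functions $f(z)=z^2$, $g(z)=z$): discretizing
$\operatorname{Tr}_N f(X)g(X^*) = \frac{1}{(2\pi i)^2}\oint\oint f(\zeta_1)\,g(\zeta_2)\,\operatorname{Tr}_N\big[(\zeta_1-X)^{-1}(\zeta_2-X^*)^{-1}\big]\,d\zeta_2\,d\zeta_1$
over positively oriented circles enclosing the spectrum reproduces the trace to high accuracy:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 200
X = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
I = np.eye(N)

def f(z): return z ** 2
def g(z): return z

# trapezoidal rule on a circle of radius R larger than the spectral radius;
# it converges spectrally fast for periodic analytic integrands
R, M = 1.6, 64
zs = R * np.exp(2j * np.pi * np.arange(M) / M)
R1 = [np.linalg.inv(z * I - X) for z in zs]
R2 = [np.linalg.inv(z * I - X.conj().T) for z in zs]

acc = 0.0 + 0.0j
for a, z1 in enumerate(zs):
    for b, z2 in enumerate(zs):
        tr = np.sum(R1[a] * R2[b].T) / N   # Tr_N (z1-X)^{-1}(z2-X^*)^{-1}
        acc += f(z1) * g(z2) * tr * (1j * z1) * (1j * z2)
acc *= (2 * np.pi / M) ** 2 / (2j * np.pi) ** 2

direct = np.trace(X @ X @ X.conj().T) / N  # f(X) g(X^*) = X^2 X^*
err = abs(acc - direct)
print(err)
```

Precomputing the resolvents at the quadrature nodes keeps the cost at $M$ matrix inversions plus $M^2$ cheap trace evaluations.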

Assumptions and Main Results.
Let $X$ be a random $N \times N$ matrix with independent complex entries such that the distribution of the $(i,j)$ entry has zero mean and variance $s_{ij}$:
$$\mathbb{E}\, x_{ij} = 0, \qquad \mathbb{E}\, |x_{ij}|^2 = s_{ij}.$$
We call the $N \times N$ matrix $S$ whose $(i,j)$ entry is $s_{ij}$ the variance profile of $X$.
Assumption 2.1. The distribution of the entries of the matrix $X$ satisfies:
(1) (Uniform primitivity) There are a constant $\kappa_1 > 0$ and an integer $K$ such that $(S^K)_{ij} \geq \kappa_1 / N$ for all $i, j$.
(2) (Normalization) The spectral radius of $S$ is normalized to one; $\varrho(S) = 1$.
(3) (Bounded moments) For each $p \in \mathbb{N}$, there exists a $\varphi_p > 0$ such that $\mathbb{E}\,|x_{ij}|^p \leq \varphi_p N^{-p/2}$.
We consider $\kappa_1$, $K$, and $\varphi_p$ to be model parameters. Our estimates will be uniform over all models possessing the same model parameters. In particular, they are uniform in the dimension, $N$, of the random matrix.
Under Assumption 2.1, the Perron-Frobenius Theorem implies that $\varrho(S)$ is the largest eigenvalue in magnitude of $S$ and its associated eigenvector has positive entries. We label this eigenvector $v_r$. Similarly, we denote by $v_l$ the Perron-Frobenius eigenvector of $S^*$, i.e. $S v_r = v_r$ and $S^* v_l = v_l$.
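For concreteness, the Perron-Frobenius data can be computed by power iteration (a sketch with an arbitrary entrywise-positive profile, normalized so that $\varrho(S) = 1$):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 300
S = rng.uniform(0.5, 1.5, (N, N)) / N            # entrywise-positive profile
S /= np.abs(np.linalg.eigvals(S)).max()          # normalize rho(S) = 1

# power iteration converges to the Perron-Frobenius eigenvector v_r
v = np.ones(N)
for _ in range(200):
    v = S @ v
    v /= np.linalg.norm(v)

res = np.linalg.norm(S @ v - v)                  # S v_r = v_r since rho(S) = 1
print(v.min(), res)
```

The iterates stay entrywise positive, consistent with the Perron-Frobenius Theorem, and the residual confirms the eigenvalue 1.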
To the variance profile $(s_{ij})$ we associate an operator $\mathcal{S}$, and its adjoint $\mathcal{S}^*$, mapping $N \times N$ matrices to $N \times N$ matrices, given by $\mathcal{S}[R] := \operatorname{diag}(S\,\mathrm{d}(R))$, where $\mathrm{d}(R) := (R_{11}, \dots, R_{NN})$ denotes the diagonal of $R$. Equivalently, one could define $\mathcal{S}$ entrywise by $\mathcal{S}[R]_{ij} = \delta_{ij}\sum_k s_{ik} R_{kk}$. We denote by $\operatorname{Tr}_N = \frac{1}{N}\operatorname{Tr}$ the normalized trace. We use $\langle x, y \rangle = \frac{1}{N}\sum_i \bar{x}_i y_i$ to denote the standard normalized inner product on $\mathbb{C}^N$ and $\langle x \rangle = \langle \mathbb{1}, x \rangle$ the average of a vector, where $\mathbb{1}$ is the constant $N$-dimensional vector with every entry equal to one. Our main theorem computes the limit of the normalized trace of products of analytic functions of $X$ and $X^*$ as $N \to \infty$.
Theorem 2.3. Let $X$ satisfy Assumption 2.1 and let $f, g$ be analytic functions on a neighborhood of the closed disk with radius $1 + \mu$ centered at the origin, for some $\mu > 0$. Let $\gamma$ be a positively oriented circle, centered at the origin, with radius $1 + \mu$. Then there exists a universal constant $\xi$ such that for any $D$ and some positive constant $C_D$, which depends on the model parameters, the maximum of $f, g$ on $\gamma$, and $\mu$. Here $\bar\gamma$ traces the same circle as $\gamma$ but is negatively oriented.
Here and in the future we will typically omit the identity matrix or operator when multiplied by a complex number. Theorem 2.3 is proved after the statement of Theorem 2.9.
Remark 2.4. It will be clear from the proofs of Theorems 2.3 and 2.9 that these results also hold if the lower bound in the primitivity condition (1) is replaced with the assumption that the matrix $S$ is irreducible, meaning it cannot be brought into block upper-triangular form by conjugation with a permutation matrix, and that there is a $k > 0$ such that the entries of the Perron-Frobenius eigenvectors are bounded below by $k$. In the case of a constant variance profile this recovers the formula computed in [13], [25] in the Gaussian case.
In Section 4, we use Theorem 2.3 to compute the long time asymptotics of systems of linear ODEs, coupled by a random matrix, when averaged over initial conditions. This ODE is a popular model in theoretical neuroscience, see for instance [27,26,5].
Theorem 2.6. Let $X$ satisfy Assumption 2.1 and let $u_t \in \mathbb{C}^N$ solve the linear ODE
$$\partial_t u_t = -u_t + gXu_t, \qquad (2.2)$$
with non-Hermitian coefficient matrix $X$, coupling coefficient $0 < g \leq 1$, and initial value $u_0$ distributed uniformly on the $N$-dimensional unit sphere, $\{u : \|u\| = 1\} \subset \mathbb{C}^N$. Then there exist universal constants $\xi$ and $\delta$, as well as constants $c, C > 0$, such that for any $D > 0$ and some constant $C_D$. The function $J_0$ is the Bessel function of the first kind. Here $c, C$ depend only on the model parameters, $\mathbb{E}_{u_0}$ denotes expectation with respect to the initial condition, and $\mathbb{P}$ is the probability with respect to $X$.
Remark 2.7. In particular, in the critical case $g = 1$, using the asymptotics of the Bessel function, $\mathbb{E}_{u_0}\|u_t\|^2$ decays like $t^{-1/2}$ for large $t$, where the limit $N \to \infty$ holds in probability with respect to the randomness of $X$.
The coefficient $-1$ of the damping term in the differential equation is the negative of the square root of the Perron-Frobenius eigenvalue of $S$. Since the spectral radius of $X$ converges to $\varrho(S) = 1$, the condition $0 < g \leq 1$ ensures that the real parts of the eigenvalues of $gX$ are at most $1$, so the differential equation is stable. The long time asymptotics of (2.2) for the iid model were computed in [13,25]. Our results not only give a rigorous proof, but also prove universality: the asymptotics depend on the variance profile only through its spectral radius $\varrho(S)$ and its Perron-Frobenius eigenvectors.
In Section 4 we also show that, as a consequence of the square root behavior at the edge of the spectrum for Hermitian random matrices, the long time asymptotics in the Hermitian case instead decay like $t^{-3/2}$.
We also compute the autocorrelation function, averaged with respect to the Brownian motion and indices, of the solution to
$$du_t = (-u_t + gXu_t)\,dt + dB_t, \qquad (2.4)$$
with $0 < g < 1$, where $B_t$ is a standard Brownian motion, independent of $X$, in stationarity. In the case that all matrix entries have the same variance this was computed in [23], where the case in which the $(i,j)$ and $(j,i)$ entries are correlated was also considered. In the absence of these correlations, we show the autocorrelation exhibits universal decay; to leading order it depends only on $1 - g$.
Theorem 2.8. Let $X$ satisfy Assumption 2.1. Then there exist universal constants $\xi$ and $\delta$, as well as $c, C > 0$, such that the autocorrelation function of the solution satisfies the following for any $D > 0$ and some constant $C_D$. Here $c, C$ depend only on the model parameters, and $\mathbb{P}$ is the probability with respect to $X$.
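A numerical sketch of the object in Theorem 2.8 (our own route via the Lyapunov equation, not the paper's proof technique; real Gaussian iid entries and $g = 1/2$ are arbitrary choices): in stationarity the covariance $\Sigma$ of (2.4) solves $(gX - 1)\Sigma + \Sigma(gX - 1)^* = -I$, and the autocorrelation averaged over indices is $R(\tau) = \operatorname{Tr}_N\big(e^{(gX-1)\tau}\Sigma\big)$:

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

rng = np.random.default_rng(6)
N = 300
X = rng.standard_normal((N, N)) / np.sqrt(N)
g = 0.5
A = g * X - np.eye(N)              # stable: real parts of Spec(A) are negative

# stationary covariance: A Sigma + Sigma A^T = -I
Sigma = solve_continuous_lyapunov(A, -np.eye(N))

def R(tau):
    # stationary autocorrelation averaged over indices
    return np.trace(expm(tau * A) @ Sigma) / N

r0, r6 = R(0.0), R(6.0)
print(r0, r6 / r0)
```

The observed decay rate of $R(\tau)/R(0)$ is governed, to leading order, by $1 - g$, in line with the universality statement.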
The computation for general analytic functions in Theorem 2.3 will follow from Cauchy's Theorem and Theorem 2.9 below, concerning the behavior of the trace of the product of two resolvents at spectral parameters $\zeta_1, \zeta_2$ outside the support of the eigenvalues. To quantify the distance from the expected location of the spectrum we introduce the quantity $\Delta_\zeta$, where we use the shorthand $\zeta := (\zeta_1, \zeta_2) \in \mathbb{C}^2$. We will always work on sets of the form $\Xi_\delta$, for some $\delta > 0$. The constant $2$ can be replaced with any finite constant greater than $1$.
Theorem 2.9. Let $X$ satisfy Assumption 2.1. There exist universal constants $\xi > 0$ and $\delta > 0$ such that for any $D > 0$ and some constant $C_D$.
Proof of Theorem 2.3. Corollary 3.6 will show that with high probability there is a disk of radius $1 + N^{-\delta}$ centered at the origin such that outside the disk the resolvent $(X - \zeta_1)^{-1}$ is bounded. Thus for $N$ sufficiently large, all eigenvalues of $X$ lie inside the circle $\gamma$ of radius $1 + \mu$. We can thus apply Cauchy's Theorem,
$$\operatorname{Tr}_N f(X)g(X^*) = \frac{1}{(2\pi i)^2}\oint_{\gamma}\oint_{\bar\gamma} f(\zeta_1)\,g(\zeta_2)\,\operatorname{Tr}_N\big[(\zeta_1 - X)^{-1}(X^* - \zeta_2)^{-1}\big]\,d\zeta_2\,d\zeta_1, \qquad (2.6)$$
where $\gamma$ is positively oriented and $\bar\gamma$ is the same circle oriented negatively. Then applying Theorem 2.9 yields the claimed formula with an error $\epsilon_N$ satisfying $|\epsilon_N| \lesssim \|f\|_{\gamma,\infty}\|g\|_{\gamma,\infty} N^{-\xi}$ with probability at least $1 - C_D N^{-D}$, for any $D$ and $N$ sufficiently large.
We prove Theorem 2.9 at the end of this section by combining several technical lemmas. Section 3 is devoted to proving these technical lemmas. In Section 4 we prove Theorems 2.6 and 2.8. We now outline the proof of Theorem 2.9, listing the technical lemmas in the reverse order of the actual proof. This better motivates the study of our matrix Dyson equation. The actual proofs, however, avoid any forward referencing to yet unproved lemmas. Studying polynomials of matrices by instead considering larger block matrices has been successfully implemented to solve a number of problems in random matrix theory, see for instance [10,16]. Recently this program has been extended to the study of rational functions of random matrices [17]. We adapt this idea to the present problem.
Furthermore, the spectrum of non-Hermitian matrices, in contrast to the Hermitian setting, can be quite unstable, and many complex analytic techniques fail. To handle this problem we follow the Hermitization trick of Girko and reduce the problem to studying the spectrum of a family of Hermitian random matrices.
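A minimal sketch of the Hermitization trick (the sizes and the test point $\zeta = 1.5$ are our own choices): the Hermitian block matrix built from $X - \zeta$ has spectrum $\pm$ the singular values of $X - \zeta$, so questions about invertibility of $X - \zeta$ become spectral questions about a Hermitian matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 150
X = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)

def hermitize(zeta):
    # Girko's Hermitization: Spec(H) = { +/- singular values of X - zeta }
    A = X - zeta * np.eye(N)
    Z = np.zeros((N, N), dtype=complex)
    return np.block([[Z, A], [A.conj().T, Z]])

H = hermitize(1.5)
ev = np.sort(np.linalg.eigvalsh(H))
sv = np.sort(np.linalg.svd(X - 1.5 * np.eye(N), compute_uv=False))

symerr = np.abs(ev + ev[::-1]).max()   # spectrum is symmetric about 0
materr = np.abs(ev[N:] - sv).max()     # positive half matches singular values
gap = sv[0]                            # distance of Spec(H) from zero
print(symerr, materr, gap)
```

For $|\zeta| > 1$ the gap stays bounded away from zero with high probability, which is the kind of lower bound exploited below.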
We begin our analysis with a linearization of $(X - \zeta_1)^{-1}(X^* - \zeta_2)^{-1}$, which we then modify in order to compute its limit.
In the spirit of [13] and the Bethe-Salpeter equations, we will express the limit of $\operatorname{Tr}_N (X - \zeta_1)^{-1}(X^* - \zeta_2)^{-1}$ in terms of the limit of the resolvent of a larger block matrix whose blocks are linear in $(X - \zeta_i)$ for $i = 1, 2$. The resolvent of the linearized matrix is studied via the matrix Dyson equation. In order to construct this matrix we introduce the block matrix $L^\zeta_\alpha$, which, by the same logic, is a linearization, as the $(1,1)$ block of its inverse recovers the product of resolvents. As the matrix $L^\zeta_\alpha$ is non-Hermitian it can be sensitive to perturbations. Therefore, we consider its Hermitized version $H^\zeta_\alpha$ as well as the resolvent $G^\zeta_\alpha(z) := (H^\zeta_\alpha - z)^{-1}$, with $z = E + i\eta$, $\eta$ a non-negative number. Here and in the future we use boldface to denote $4N \times 4N$ matrices.
Let $E_i \in \mathbb{C}^{4N \times N}$, for $i = 1, 2, 3, 4$, be the $4 \times 1$ vector of $N \times N$ matrices with the identity matrix in the $i$-th component. For the following lemma we introduce $x \lesssim y$ to mean that there exists a constant $C$, depending only on the model parameters, such that $x \leq Cy$.
To study the resolvent $G^\zeta_\alpha(z)$ we introduce the matrix Dyson equation (MDE), whose self-energy operator $\mathcal{S} : \mathbb{C}^{4N \times 4N} \to \mathbb{C}^{4N \times 4N}$ acts on any $4 \times 4$ block matrix $R \in \mathbb{C}^{4N \times 4N}$ with $R_{ij} \in \mathbb{C}^{N \times N}$ in its $(i,j)$ block. In the following lemma we use the local law from [9] to show that the resolvent $G^\zeta_\alpha(z)$ is close to the solution of the matrix Dyson equation. We will prove this lemma in Section 3.4, as well as give a lower bound on the spectrum of $|H^\zeta_\alpha|$, showing $\Psi^\zeta = 1$ with high probability. Lemma 2.11. There exist universal constants $\delta > 0$, $p \in \mathbb{N}$ such that for any $\epsilon, D > 0$ and some constant $C_{\epsilon,D} > 0$.
In Section 3.1 we solve the MDE at α = z = 0 and then use this solution to show the equation is stable at this point. This stability is then used to show the solution is analytic in α and that the self-consistent density of states (defined below in (3.1)) associated to the MDE is bounded away from zero. In Section 3.2 we prove the following lemma and give properties of the solution to the matrix Dyson equation in a neighborhood of zero.
Lemma 2.12. Let $\zeta \in \Xi_\infty$ and $(\alpha, 0) \in \Upsilon^\zeta$. Then the stated expansion holds. With this regularity we expand the matrix Dyson equation in a series in $\alpha$ and express the order $\alpha$ terms through the order $1$ terms, leading to an equation for $\partial_\alpha|_{\alpha = 0} \operatorname{Tr}_N E_3^* M^\zeta_\alpha(0) E_1$, which we explicitly solve in Section 3.3, yielding the formula of Lemma 2.13. We conclude this section by putting together Lemmas 2.10 through 2.13 to prove Theorem 2.9.
Proof of Theorem 2.9. We use the triangle inequality to decompose the error. By Lemma 2.10 this holds for all $\zeta \in \Xi_\delta$; in Lemma 3.5 we will show that
$$\mathbb{P}\big(\Psi^\zeta = 1 \;\; \forall \zeta \in \Xi_\delta\big) \geq 1 - C_D N^{-D}$$
for any $D$, with some constant $C_D$. By Lemma 2.11, for every $\zeta \in \Xi_\delta$ we have the corresponding bound for any $D, \epsilon > 0$. Combining this with Lemmas 2.12 and 2.13, and choosing $\alpha = N^{-c'}$ with $c' > 0$ sufficiently small, completes the proof.
To the solution of the MDE we associate an $N$-vector of positive semidefinite matrix-valued measures $v^{\zeta,\alpha}_j \in \mathbb{C}^{4 \times 4}$ on the real line via
$$m^{\zeta,\alpha}_j(z) = \int_{\mathbb{R}} \frac{v^{\zeta,\alpha}_j(d\tau)}{\tau - z},$$
where $m^{\zeta,\alpha}_j$ is the $4 \times 4$ matrix whose $(k,l)$ entry is the $(j,j)$ entry of the $(k,l)$ block of $M^\zeta_\alpha(z)$. The existence and compact support of these measures are shown in [9], Proposition 3.2.
We define the self-consistent density of states to be the scalar-valued measure $\rho^{\zeta,\alpha}(d\tau) := \frac{1}{4N}\sum_{j=1}^{N} \operatorname{Tr} v^{\zeta,\alpha}_j(d\tau)$.
The distance between two solutions of the MDE with nearby parameters can be bounded by showing that the MDE is stable. This stability is quantified by the norm of the inverse of the stability operator $\mathcal{L}_\alpha$. In [9] this operator was used to prove the local law and to control the smoothness of the solution of the MDE in its parameters. Note that $\mathcal{L}_\alpha$ depends on $\zeta$ and $z$, but we omit this dependence from the notation.
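As a toy illustration of how such Dyson equations are solved and why stability matters (a scalar analogue of our own, not the $4N \times 4N$ MDE of this paper): the equation $m(z) = (-z - m(z))^{-1}$ for the Stieltjes transform of the semicircle law can be solved by a fixed-point iteration, which contracts precisely when the linearized (stability) factor stays bounded away from one:

```python
import numpy as np

# scalar Dyson equation m(z) = 1 / (-z - m(z)); its solution is the
# Stieltjes transform of the semicircle law on [-2, 2]
z = 2.0j
m = -1.0 / z                       # initial guess from the large-z expansion
for _ in range(100):
    m = -1.0 / (z + m)             # fixed-point map; contraction factor |m|^2

exact = (-z + np.sqrt(z * z - 4.0)) / 2.0   # branch with positive imaginary part
err = abs(m - exact)
print(err)
```

Away from the support the factor $|m|^2$ is small and convergence is geometric; near the spectral edge it approaches one, which is the scalar shadow of the stability-operator bounds below.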

Exact solution at 0.
At $\alpha = 0$, $z = 0$, $|\zeta_1|, |\zeta_2| > 1$ we define $M^\zeta_0(0) := -(A^\zeta)^{-1}$, which solves (2.10). In this section we bound the stability operator associated to this solution. We will use this stability and the implicit function theorem in the next section (Lemma 3.3) to control the solution of the matrix Dyson equation in a neighborhood of zero. This shows that $M^\zeta_0(0)$ in fact agrees with the extension to the real line of the unique solution of (2.10) with positive imaginary part. Alternatively, the analysis given in [8,14] could also be used to show this is the correct solution in the limit $\Im(z) \to 0$. We now introduce some notation and conventions. Products and fractions of vectors are taken componentwise, e.g. $(vw)_i = v_i w_i$. Given an $N$-dimensional vector $v$, we define $\operatorname{diag}(v)$ to be the $N \times N$ matrix with $v$ along the diagonal. Given a $4N \times 4N$ matrix $L$, we define the linear operator acting on $4N \times 4N$ matrices $\mathcal{C}_L[R] := LRL$. Given an $N \times N$ matrix $T$ and $1 \leq k, l \leq 4$, define $E_{kl}[T]$ to be the $4N \times 4N$ block matrix with $T$ in its $(k,l)$ block and zeros elsewhere. We use the shorthand $E_{kl} = E_{kl}[I]$. The norm $\|\cdot\|$, when applied to matrices, denotes the usual operator norm induced by the Euclidean metric and, when applied to operators acting on matrices, the induced operator norm. The norm $\|\cdot\|_{\mathrm{sp}}$ on operators acting on matrices denotes the norm induced by the Hilbert-Schmidt norm on matrices.
We define a unitary matrix $U$ and a transformation matrix, in terms of which the following identities are easy to verify, and where we introduce the operator $F$. The operator $F$ first appeared in [4]. In the current context we only use it in the limit $\Im(z) \to 0$.
By direct calculation one sees that the eigenvalues of largest magnitude of the non-trivial irreducible components of the self-adjoint operator $F$ are $\pm\frac{1}{|\zeta_i||\zeta_j|}$, for $i, j$ each equaling $1$ or $2$, with corresponding eigenmatrices. Now, using that $\mathcal{C}_U$ is a unitary operator, we obtain the corresponding bound in spectral norm. This bound can immediately be extended to a bound in the norm induced by the operator norm on matrices by using inequality (4.55) in [4].
Note that the final estimate used the uniform lower bound on the left and right Perron-Frobenius eigenvectors of S.

Solution near 0.
In the following lemma we show the solution of the matrix Dyson equation can be analytically extended from the upper half plane to a neighborhood of zero. We will continue to denote this extension by $M^\zeta_\alpha(z)$.
(2) the inverse of the stability operator satisfies the stated bound; (3) when $\alpha$ and $z$ are real, the solution $M^\zeta_\alpha(z)$ is real. Before proving this lemma we show that Lemma 2.12 follows as an immediate corollary.
Proof of Lemma 2.12. Lemma 3.3 shows that $M^\zeta_\alpha(0)$ is analytic in $\alpha$ and has a Taylor expansion in $\Upsilon^\zeta$.
Proof of Lemma 3.3. The derivative in the second coordinate in the direction $R$ is as displayed. The second inequality of (B.1) is satisfied as the norms of the operators are bounded. Once again the implied constant can be made sufficiently small with an appropriate choice of $c_D$. Thus by the implicit function theorem we conclude, for each such parameter pair, the existence of a solution $\widetilde{M}^\zeta_\alpha(z)$; additionally, this solution depends analytically on the parameters and satisfies the stated bound. Below we will show that $\widetilde{M}^\zeta_\alpha(z)$ has positive imaginary part when $\Im(z + A^\zeta + \alpha) > 0$, which by uniqueness of the solution of the MDE implies $\widetilde{M}^\zeta_\alpha(z) = M^\zeta_\alpha(z)$ and concludes the proof of part (1). Let $\mathcal{L}$ be the stability operator associated to $\widetilde{M}^\zeta_\alpha(z)$. We now use (3.9) to estimate its inverse by perturbation theory. Taking the inverse, and choosing $c_M$ and $c_D$ such that $\|M^\zeta_0(0) - \widetilde{M}^\zeta_\alpha(z)\|$ is sufficiently small, gives the stated bound, which will prove part (2) once we verify $\widetilde{M}^\zeta_\alpha(z) = M^\zeta_\alpha(z)$ in $\Upsilon^\zeta$. We now show the derivative along the imaginary axis of $M^\zeta_0(0)$ is positive, implying $\Im(\widetilde{M}^\zeta_\alpha(i\eta))$ is positive on the imaginary axis in a neighborhood of $0$; thus $\widetilde{M}^\zeta_\alpha$ agrees with $M^\zeta_\alpha$ in this region, and then in the upper half plane by analytic continuation. Taking the derivative in $\eta$ of (2.10) and using the formula for $M^\zeta_0(0)$ verifies the claim. For the remainder of the proof we use $M^\zeta_\alpha(z)$ to denote the analytic extension of the solution of the MDE to $\Upsilon^\zeta$.
To prove part (3) we follow the argument in the appendix of [9], reproduced here using our bounds on the stability operator. From the matrix Dyson equation we see that the difference $M^\zeta_\alpha(0) - M^\zeta_\alpha(z)$ satisfies a linear equation, since the operator $\mathcal{L}_0$ is invertible. Applying $\mathcal{L}_0^{-1}$ to (3.11) and using the implicit function theorem, as above but applied to the subspace of self-adjoint matrices, shows $M^\zeta_\alpha(z)$ is real.
Corollary 3.4. There is a universal constant $\kappa > 0$ such that $\operatorname{supp} \rho^{\zeta,\alpha}$ is contained in the interval $[\kappa\Delta_\zeta^2, \infty)$ whenever $(\alpha, 0) \in \Upsilon^\zeta$.
Proof. By Lemma 3.3, part (3), the imaginary part of $M^\zeta_\alpha(z)$ is zero when $(\alpha, z) \in \Upsilon^\zeta$. Therefore at any such $z$ the self-consistent density of states $\rho^{\zeta,\alpha}$ is zero, and we conclude the desired bound.
Proof of Lemma 2.13. Since this proof takes place at $z = 0$, we omit the $z$-dependence from our notation. By differentiating (2.10) with respect to $\alpha$, setting $\alpha$ equal to zero, and rearranging, we arrive at an equation for the derivative. Substituting $M^\zeta_0 = -(A^\zeta)^{-1}$ and using that only the $(2,4)$ entry is mapped by $\mathcal{L}_0^{-1}\mathcal{C}_{M^\zeta_0}$ to the $(3,1)$ entry, a short calculation gives the expression; taking the trace leads to the desired formula.
Recall that $S[v_r] = v_r$ and $S^*[v_l] = v_l$, i.e. $v_r$ and $v_l$ are the right and left Perron-Frobenius eigenvectors of $S$. Furthermore, define the spectral projection corresponding to the complement of the Perron-Frobenius eigenvalue.
We then further expand; in particular, if the variance profile is constant, the resulting expression simplifies.

Local Law.
We now use the local law from [9] to show that when $\operatorname{dist}(\operatorname{supp}(\rho^{\zeta,\alpha}), 0)$ is bounded away from zero, $M^\zeta_\alpha(z)$ is indeed a good approximation of $G^\zeta_\alpha(z)$. We first show that Corollary 3.4 can be combined with [9] to show there are no eigenvalues of $H^\zeta_\alpha$ near zero. Lemma 3.5. There is a universal constant $\delta > 0$ such that for any $D$ and some positive constant $C_D$.
Proof. For any fixed $\zeta \in \Xi_\delta$ and $\alpha$ such that $(\alpha, 0) \in \Upsilon^\zeta$, Corollary 3.4 gives a universal constant $\kappa$ such that $\operatorname{supp}\rho^{\zeta,\alpha}$ is contained in the interval $[\kappa\Delta_\zeta^2, \infty)$. Combining this with an appropriate choice of $\delta$ in $\Xi_\delta$, Theorem 4.7 in [9] gives the absence of eigenvalues near zero for this pair. This argument holds for every element of a regular grid $\Gamma$ of polynomial size in $N$ in the set of $(\zeta, \alpha)$ such that $\zeta \in \Xi_\delta$, $(\alpha, 0) \in \Upsilon^\zeta$. Thus by a simple union bound the statement holds simultaneously on the grid. On this grid $\Gamma$, the lower bound on the spectrum of $H^\zeta_\alpha$ implies the norm of the resolvent $G^\zeta_\alpha$ is $O(\Delta_\zeta^{-2})$ with high probability.
Finally, to extend the bound to arbitrary $\alpha, \zeta$, we take the nearest grid point. Choosing the grid to be sufficiently fine, we have the bound $\|G^\zeta_\alpha(0)\| \lesssim \Delta_\zeta^{-2}$ whenever this bound holds on the grid. Using that $\operatorname{Spec}|H^\zeta_\alpha| \subset [\|G^\zeta_\alpha(0)\|^{-1}, \infty)$ concludes the proof. Proof of Lemma 2.11. For any fixed $\zeta \in \Xi_\delta$ and $\alpha$ such that $(\alpha, 0) \in \Upsilon^\zeta$, the local law, Lemma B.1 (ii) of [9], provides a $p > 0$ such that the desired estimate holds. To extend the proof to all $\zeta \in \Xi_\delta$ and $\alpha$ such that $(\alpha, 0) \in \Upsilon^\zeta$ we follow the same grid argument as in Lemma 3.5.
The necessary Lipschitz continuity of $\operatorname{Tr}_N(E_3^* M^\zeta_\alpha(z) E_1)$ was established in Lemma 3.3, part (1). We also have the following corollary of Lemma 3.5.
Corollary 3.6. There is a universal constant δ > 0 and a constant C > 0 such that for any D and some positive constant C D .

Bound on Linearization.
Proof of Lemma 2.10. From the block structure of $H^\zeta_\alpha$ we deduce the corresponding block structure of $G^\zeta_\alpha(i\eta)$. We use the $\eta > 0$ regularization of the inverse to ensure all inverses exist in the derivation; in the final estimate we invoke the assumption that $\operatorname{Spec}|H^\zeta_\alpha| \subset [\kappa\Delta_\zeta^2/2, \infty)$ and safely set $\eta = 0$. We then compute the $(3,1)$ block of $G$ explicitly. To estimate the difference when $\alpha > 0$ is small, and to justify setting $\eta = 0$, we use standard resolvent estimates. Bounding the normalized trace by the operator norm and applying these estimates yields the desired bound on the event $\operatorname{Spec}|H^\zeta_\alpha| \subset [\kappa\Delta_\zeta^2/2, \infty)$.

Proofs of Theorems 2.6 and 2.8.
In this section we consider the differential equation $\partial_t u_t = -u_t + gXu_t$, with $0 < g \leq 1$ and initial value $u_0$ distributed uniformly on the sphere, and the stochastic differential equation $du_t = (-u_t + gXu_t)\,dt + dB_t$, where for each $t$, $B_t$ is a vector of standard Brownian motions, independent of $X$.
For the first equation we consider the large $N$ behavior of the squared norm, averaged over initial conditions,
$$\mathbb{E}_{u_0}\|u_t\|^2. \qquad (4.1)$$
The case $g = 1$ is of particular interest, as the damping term and coupling term are balanced. Before continuing, we briefly consider the behavior of such an expression when $g = 1$ if $X$ is replaced with $W$, a Wigner matrix, meaning $W$ is Hermitian and the entries on and above the diagonal are independent, identically distributed random variables with variance $1/(4N)$. Wigner's semicircle law asserts that the empirical spectral distribution is supported on $[-1,1]$ with density $\frac{2}{\pi}\sqrt{1 - x^2}$.
In this case, as $W = W^*$, the limiting analogue of (4.1) can be expressed as an integral against the limit of the empirical density; at large times it decays like $t^{-3/2}$. Indeed, by the semicircle law, as $t \to \infty$ we have
$$\mathbb{E}_{u_0}\|u_t\|^2 \to \int_{-1}^{1} e^{2t(x-1)}\,\frac{2}{\pi}\sqrt{1 - x^2}\,dx = \frac{2}{\pi}\int_{0}^{2} e^{-2tx}\sqrt{x(2-x)}\,dx = \frac{1}{2\sqrt{\pi}}\,t^{-3/2}\big(1 + O(t^{-1})\big),$$
where in the second equality we apply the change of variables $x \to 1 - x$, and we obtain the error term by extending the bounds of integration to infinity.
From this calculation we see that the asymptotic behavior of $\|u_t\|^2$ in the Hermitian case is governed by the behavior of the density at the edge. For Wigner-type random matrices with a variance profile, the square root behavior at the edge is universal [3]. Thus the $t^{-3/2}$ scaling is universal in the Hermitian case.
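The edge calculation is easy to confirm by quadrature (a numerical check of the $t^{-3/2}$ law; the constant $1/(2\sqrt{\pi})$ comes from the Laplace-type expansion at the edge, and $t = 50$ is an arbitrary choice):

```python
import numpy as np
from scipy.integrate import quad

def sq_norm(t):
    # integral of e^{2t(x-1)} against the semicircle density (2/pi) sqrt(1-x^2)
    val, _ = quad(lambda x: np.exp(2 * t * (x - 1)) * (2 / np.pi) * np.sqrt(1 - x * x),
                  -1.0, 1.0)
    return val

t = 50.0
ratio = sq_norm(t) * 2 * np.sqrt(np.pi) * t ** 1.5   # should approach 1 as t grows
print(ratio)
```

The ratio sits within a few percent of 1 already at moderate $t$, with the deviation of order $1/t$ as in the expansion above.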
We now return to the non-Hermitian setting.
Proof of Theorem 2.6. By Cauchy's Theorem, $\mathbb{E}_{u_0}\|u_t\|^2$ can be written as a double contour integral, where $\gamma$ is a positively oriented circle enclosing all the eigenvalues of $X$ and $\bar\gamma$ is the same circle negatively oriented. We apply Theorem 2.9 and then the decomposition in (3.13), computing each term separately. For $t \leq N^\delta$ we obtain the stated expansion, where $|\epsilon_N| < N^{-\xi}$ with probability $1 - N^{-D}$ for any $D$.
The first term in the integral evaluates to a Bessel function, where $J_0$ is the Bessel function of the first kind. For the second term in the integral we use Lemma A.1 to deform the contour into the region of modulus less than one, where we use that all the poles of the integrand lie in the disk of radius $1 - 2\epsilon$ centered at the origin, and that $|e^{t(g\zeta_1 + g\zeta_2 - 2)}| \leq e^{-2(1 - g - g\epsilon)t}$, with $\epsilon$ as obtained in Lemma A.1.
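The Bessel function enters through double contour integrals of this type. As a standalone model computation (not the paper's exact kernel, which also carries the variance-profile data), one can verify numerically that
$\frac{1}{(2\pi i)^2}\oint\oint \frac{e^{t(\zeta_1 + \zeta_2)}}{\zeta_1\zeta_2 - 1}\,d\zeta_2\,d\zeta_1 = I_0(2t) = J_0(2it)$
over positively oriented circles of radius $R > 1$:

```python
import numpy as np
from scipy.special import iv

t, R, M = 1.5, 2.0, 128
zs = R * np.exp(2j * np.pi * np.arange(M) / M)   # positively oriented circle

acc = 0.0 + 0.0j
for z1 in zs:
    # inner contour integral in z2; the pole at z2 = 1/z1 lies inside the circle
    acc += np.sum(np.exp(t * (z1 + zs)) / (z1 * zs - 1.0) * (1j * zs)) * (1j * z1)
acc *= (2 * np.pi / M) ** 2 / (2j * np.pi) ** 2

err = abs(acc - iv(0, 2 * t))                    # I_0(2t) = J_0(2it)
print(err)
```

Evaluating the inner residue $e^{t/\zeta_1}/\zeta_1$ and expanding gives $\sum_k t^{2k}/(k!)^2 = I_0(2t)$, which is how the $t^{-1/2}$ asymptotics arise.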
We now turn to the differential equation in stationarity and compute the autocorrelation function.
Proof of Theorem 2.8. The following computation of $R(0)$ shows that the solution of (2.4) has finite variance with respect to $B_t$ for large $N$, with high probability with respect to $X$.
The autocorrelation of this solution is computed by applying, once again, the decomposition in (3.13). The subleading term is bounded as before, giving faster exponential decay.
We conclude with $P$ the orthogonal projection onto the orthogonal complement of $\{(x, 0), (0, x)\}$. Using the spectral gap of the matrix (see for instance Lemma 5.6 in [2]) for some $\epsilon \sim 1$ implies (A.1).

Appendix B. Implicit Function Theorem.
Lemma B.1 (IFT). Suppose $\mathcal{C}_A$ and $\mathcal{C}_B$ are spaces equipped with norms, both denoted $\|\cdot\|$, and equip linear operators mapping between these spaces with the induced operator norms. Let $\epsilon_A, \epsilon_B > 0$, $a_0 \in \mathcal{C}_A$, $b_0 \in \mathcal{C}_B$, and let $T : B^A_{\epsilon_A}(a_0) \times B^B_{\epsilon_B}(b_0) \to \mathcal{C}_A$ be a continuously differentiable function with $T(a_0, b_0) = 0$, whose derivative $\nabla_A T(a_0, b_0)$ with respect to the first argument is invertible at $(a_0, b_0)$. Suppose further that the stated bound holds, where the supremum is taken over $(a, b) \in B^A_{\epsilon_A}(a_0) \times B^B_{\epsilon_B}(b_0)$. Then there is a unique function $f : B^B_{\epsilon_B}(b_0) \to B^A_{\epsilon_A}(a_0)$ such that $T(f(b), b) = 0$. The function $f$ is continuously differentiable, and if $T$ is analytic then so is $f$.