Continuum Approximation of Invasion Probabilities

In the last decade there has been growing criticism of the use of Stochastic Differential Equations (SDEs) to approximate discrete state-space, continuous-time Markov chain population models. In particular, several authors have demonstrated the failure of Diffusion Approximation, as it is often called, to approximate expected extinction times for populations that start in a quasi-stationary state. In this work we investigate a related, but distinct, population dynamics property for which Diffusion Approximation fails: invasion probabilities. We consider the situation in which a few individual are introduced into a population and ask whether their collective lineage can successfully invade. Because the population count is so small during the critical period of success or failure, the process is intrinsically stochastic and discrete. In addition to demonstrating how and why the Diffusion Approximation fails in the large population limit, we contrast this analysis with that of a sometimes more successful alternative WKB-like approach. Through numerical investigations, we also study how these approximations perform in an important intermediate regime. In a surprise, we find that there are times when the Diffusion Approximation performs well: particularly when parameters are near-critical and the population size is small to intermediate.

1. Introduction. Invasion events are fundamental in population biology. In the study of disease there are interesting examples both at the cellular and wholeorganism level: while studying the onset or recovery of an infection, one might look at the probability that a single virion can infect a target cell and proliferate [9]; in epidemiology, the goal might be to estimate the probability that a newly introduced pathogen will become endemic in a naïve host population [4]. The same multi-scale interest in invasions appears in the study of population genetics: at the multi-organism scale one studies the probability that a novel allele can fix in a population, possibly affecting the population's overall fitness [17,21,25]; at the cellular level, the invasion of a mutation in a stem cell population has been studied as an important first step in certain cancers [27]. What distinguishes the dynamics at different scales are the statedependent rates of events. While multi-organism models tend to feature movement and direct interaction among individuals [7], cellular dynamics may be more spatially static with indirect interactions (competition for resources, or signaling at a distance, for example) [29].
It is natural to model these population dynamics using continuous-time, discretestate-space Markov chains. It is easy to encode nonlinear interactions through statedependent transition rates and there are numerous straightforward simulation techniques available that can be exact, but costly [13], or inexact (with respect to boundary interactions), but efficient [3]. It is also straightforward to write down difference equations whose solutions represent the probability of invasion from a small number of individuals, or the mean extinction time starting from a large population size. Often when the population size is small or intermediate, and when the parameters of the problem form a system that is near-critical, we show numerically that the Diffusion Approximation captures invasion probabilities much better than the Exponential Approximation. Rather than declare one approximation or another the "winner" we think it would be most useful for practitioners to keep in mind the tradeoffs involved in choosing which method to implement in their own work.
1.1. Mathematical Framework. For a given population scale N , consider a continuous-time Markov chain X N (t) that takes its values in the set of integers n ∈ N. We consider the class of models with transitions of size one and the transition rates from the state n are given by n → n + 1 at rate λ N (n) n → n − 1 at rate µ N (n) with λ N (0) = µ N (0) = 0. While λ N (n) = 0 is allowable for some n > 0, we require that the death rate satisfies µ N (n) > 0 for all n ∈ N. It follows that 0 is the only absorbing state. Assumption 1.1 (Transition rate shape functions). There exist functions λ : R + → R + and µ : R + → R + such that λ N (n) = N λ n N and µ N (n) = N µ n N .
We will refer to x * as the minimal rate-balanced point. When the system is discrete, rate-balanced points are not always precisely achieved. For a given system size N , we define the discrete analogue N * as follows.
If the transition rate shape functions are differentiable at zero with λ (0) > 0 and µ (0) > 0, then we call the system asymptotically linear.
Example 1 (Stochastic epidemics). For a population size N , and constants b > 0 and d > 0, the size of the infectious population in a stochastic (non-density dependent) Susceptible-Infectious-Susceptible system is defined by the transition rates n → n + 1 at rate bn (N − n) N n → n − 1 at rate dn. The minimal rate-balanced point x * = 1 − d/b, and for the process X N , Example 2 (Density-dependent mortality). In this case, the population scale N plays the role of the carrying capacity of an environment. Again, let d > 0 and let b > 0 be the growth rate of the population when scarcity of resources is not a factor. Then we consider the dynamics set by the rate shape functions λ(x) = bx and µ(x) = dx 2 for x ∈ [0, ∞). We note that this is the standard logistic growth model with the intrinsic growth rate parameter b = r and death rate d = r/K for a carrying capacity K. The resulting transition rates are similar to the Logistic Branching Process presented by Amaury Lambert (2005) [20]. The rate-balanced value for this system is x * = b/d, and for the process X N , we have N * = bN/d Example 3 (Resource-constrained birth). Let b > 0, d > 0 and a > 0. We define our dynamics according to the rate shape functions Example 4 (General nonlinear single term model). Let b, β, d, δ > 0 with β = δ. We define our dynamics according to the rate shape functions λ(x) = bx β and µ(x) = dx δ for x ∈ [0, ∞). If β < δ we say that λ has the leading order. If β > δ we say that µ has the leading order. In this case, x * = (b/d) β−δ , and for the process X N , While conducting our asymptotic analysis, we will often prefer an alternative assumption for the transition rate functions that features explicit series expansions for the birth and the death rates. Assumption 1.3 (Series representation for the rate shape functions). Let λ and µ be as in Assumption 1.1. Additionally, we assume that there exist constants b, β, d, δ, > 0 with {b n } n∈N , {d n } n∈N ⊂ R, and an integer m ∈ N, such that For simplicity in the analysis, we also assume that β and δ are rational numbers and m is chosen such that βm and δm are integer values.

Approximations for Invasion
Probabilities. The conditions of Assumption 1.1 ensure that as N → ∞, the stochastic processes X N (t) converge pathwise to an associated ODE. Kurtz [19] defines the sense of this convergence rigorously as follows: Let λ and µ be locally Lipschitz functions. Define Y N (t) = 1 N X N (t), and suppose Y N (0) → x 0 . Define x(t) to be the solution tȯ A visual demonstration of this result is shown in Figure 1 for the stochastic Susceptible-Infectious-Susceptible epidemic model (Example 1 above). Conditioned on not hitting zero, the solutions converge to the heteroclinic connection between zero and the minimal rate-balanced point x * . In our case, zero is an unstable fixed point of the limiting ODE. Thus, there is an apparent contradiction since, although a solution with initial condition zero should be zero for all t ≥ 0, the fraction of solutions that avoid extinction does not go to zero as N → ∞. This conflict is resolved by noting that the theorem above only applies for a fixed time window. It is indeed true that for any fixed time t, X N (t) → 0 as N → ∞. However, it is also true that, conditioned on successful invasion, the hitting time for N * (Definition 1.2) tends to infinity as well.
We call the probability of hitting N * before 0 the invasion probability and adopt the following notation. Definition 1.4. For the process X N (t), we define the hitting time τ N as follows: We define the associated extinction probabilities q N (k), k ∈ {0, 1, . . . , N * } by From this we define the asymptotic probability of invasion p invasion (k) (conditioned on k individuals at time zero) to be It is possible to obtain an exact solution for these invasion probabilities, and we provide a formula in Section 2. In 1983, Frank Ball [4] introduced a slightly different definition of an invasion probability (which he referred to as the probability of a true epidemic) and, in the asymptotically linear case, proved that the probability of a true epidemic converges to the survival probability of an associated branching process. To define this associated branching process, consider the number of offspring that any given individual gives birth to before expiring. In the asymptotically linear case, the leading order terms of the birth and death rates are b and d, respectively. Therefore, the number of births before the individual dies has a Geometric distribution with "success probability" d/(b + d).
Definition 1.5 (Branching Process Approximation). Suppose that the transition rate shape functions λ and µ satisfy Assumptions 1.1 and 1.3 and that the leading order exponents satisfy β = δ = 1. Define the discrete time stochastic process {Z n } n≥0 to be Z 0 = k and for N ∈ N, where ξ i,j ∼ Geom(d/(b+d)), (i, j ∈ N) are independent and identically distributed.
Then, we define Ball [4] proved that p invasion (k) = p branch (k) in this setting. What the Branching Process Approximation lacks is a clear interpretation in asymptotically nonlinear cases (e.g. Example 4) and a means to consider what the invasion probability is for finite N . For these cases we consider the Diffusion Approximation and the Exponential Approximation. The derivations of these approximations are given in Sections 3 and 4, respectively. In what follows we introduce the small parameter , which corresponds to the inverse of the population scale parameter N . Table 1 Summary of asymptotic invasion probabilities (see Definition 1.4). * These results are only proved for the asymptotically linear case (i.e. when β = δ = 1). † We define the Exponential Approximation to be zero for subcritical regimes because the asymptotic limit tends to zero, but the approach to zero is through negative values.
Definition 1.6 (Diffusion Approximation). Let > 0 be given. Then for x ∈ (0, x * ), let u (x) be the solution to the boundary value problem with u (0) = 1 and u (x * ) = 0. Then, setting = 1/N , we define the Diffusion Approximation for invasion probabilities to be the relationship Definition 1.7 (Exponential Approximation). Let > 0 be given. Then for x ∈ (0, x * ), define w (x) to be where r (x) satisfies . Then the Exponential Approximation for invasion probabilities is defined by the relationship In Table 1 we summarize the results in Sections 3 and 4. Consistent with asymptotic analyses of the Diffusion Approximation for corresponding extinction probabilities, we find that the Diffusion Approximation disagrees with the Branching Process Approximation in the large N limit. Remarkably, when the leading orders of the transition rate shape functions λ and µ do not match, the Diffusion Approximation reduces to two cases, ignoring all detailed information contained in the rate functions. In contrast, the Exponential Approximation does provide the right large N limit. We note however, that the Exponential Approximation is only well-defined in the supercritical case.
In Section 5 we investigate the problem numerically and find mixed results. In the large N limit, the Exponential Approximation agrees with the Markov chain model, even in the asymptotically nonlinear cases we consider. However, when the population size parameter N is of small or intermediate size, say N = 50, it is common that the Diffusion Approximation more faithfully represents the Markov chain model, especially when the parameter set is near-critical.

Exact solution for invasion probabilities.
Proposition 2.1. For fixed N ∈ N, let X N (t) be a CTMC with transition rate shape functions λ and µ as defined in Assumption 1.1. Let q N (n) N * n=0 be the extinction probabilities as defined in Definition 1.4. Then the collection q N (n) N * n=0 satisfies the system of difference equations (8) λ N (n)q N (n + 1) + µ N (n)q N (n − 1) − (λ N (n) + µ N (n))q N (n) = 0, with q N (0) = 1 and q N (N * ) = 0.
Proof. Though the proof appears in standard texts, we include it for completeness. Fix N ∈ N and let an initial condition n ∈ {1, . . . , N * − 1} be given. Define T := inf{t > 0 : X N (t) = n}, the first time the process changes states. Then, applying the Strong Markov property, we have Applying the q N (n) notation from Definition 1.4 and multiplying through by λ N (n) + µ N (n), we have (8), where q N (0) = P {X(τ N ) = 0 | X N (0) = 0} = 1 and q N (N * ) = 0.
The following argument for finding the exact solution is similar to one presented on birth-death processes in Norris [23], for example, and has a similar form to the exact solutions for mean extinction times presented by Doering et al [11,12].
Recalling Definition 1.4, we have the boundary conditions q 0 = 1 and q N * = 0. It follows that Noticing that From this recursion we obtain and since q 0 = 1, the result follows.

Motivations for the Diffusion Approximation.
There are two points of view on the derivation of the Diffusion Approximation for invasion probabilities. They yield the same result. The first point of view (Motivation 1, below) is common in the physics literature [6]. It involves substituting a smooth function u into the difference equation (8) and then converting it to a differential equation by writing out Taylor expansions and matching terms. In a second point of view (Motivation 2, below), we define a Stochastic Differential Equation whose infinitesimal first and second moments are defined to match that of the original birth-death chain in a sense discussed in [26] and [2]. Then the same Diffusion Approximation results from computing the probability that this SDE hits the rate-balanced state before going extinct, starting from the value k = k/N , where k is the number of initially introduced individuals.

Motivation 1: Taylor series approximation
To begin this approach, we assume that u(x) is a smooth function and appeal to the following formal argument. We start with the difference equation λ n q n+1 − (λ n + µ n )q n + µ n q n−1 = 0 and rewrite it using the rate shape functions according to Assumption 1.1, λ n = N λ(n/N ) and µ n = N µ(n/N ), along with q n = u(n/N ): Writing x = n/N and = 1/N , We then apply a Taylor series approximation to the u(x + ) and u(x − ) terms: Substituting these into (9) and neglecting the higher order terms, we have which is the form seen in Definition 1.6.
Motivation 2: Hitting probabilities for an SDE approximation Let = 1/N and y = n/N , and consider the following calculation for the infinitesimal mean: In the last equality we used Assumption 1.1: λ N (n) = N λ(n/N ) and µ N (n) = N µ(n/N ). Note that the infinitesimal mean does not, in fact, depend on N . On the other hand, a factor of 1/ √ N does appear in the calculation of the infinitesimal second moment: Using these values, we define the SDE approximation as follows.
Proof. The proof follows quickly from stochastic calculus. For a general continuous sample-path SDE with infinitesimal generator L, the probability u(y) that the process hits a value a before b starting from y ∈ [a, b] satisfies the boundary value problem L u (y) = 0, for y ∈ [a, b], with u(a) = 1 and u(b) = 0 [18]. It remains to note that the generator of Y (t) is 3.2. Analysis of p diffusion (k). It is useful to introduce the functions We note that ψ(x) in (12) can be expressed exclusively in terms of the ratio ρ(x) : However, in order to retain the ability to directly interpret results in terms of explicit birth and death rate parameters, we generally perform our analyses and present our results using the rate shape functions λ and µ directly, as they are characterized in Assumption 1.
1. An exception arises in Section 4, where we find that it is most natural to analyze the Exponential Approximation in terms of the ratio of rate functions ρ(x).
Proposition 3.3. Let u (x) be the Diffusion Approximation for the extinction probabilities, satisfying (3) of Definition 1.6. Then Proof. It we let h(x) = u (x) then the ODE (3) becomes which has general solution h(x) = C 1 e −Ψ(x)/ . Integrating to get u (x) = x 0 h(y) dy and enforcing the boundary conditions u (0) = 1 and u (x * ) = 1 yields the solution (13).
We note that a similar formula can be found in Pakmadan et al. [24].
We summarize the asymptotic behavior of the Diffusion Approximation as follows.
Theorem 3.4. Suppose that the transition rate shape functions λ and µ each admit a series expansion as defined by Assumption 1.3. If the leading order exponents β and δ are equal, then for k ∈ N, Otherwise, if β = δ, then Remarkably, the Diffusion Approximation essentially ignores the detailed structure of the rate functions when they have different leading orders.
The integrals that appear in (13) are in a form compatible with applying Watson's Lemma [28]. In fact, Watson's Lemma can be applied directly to the denominator. The subtlety in the present analysis is that, in the numerator, the upper limit of integration depends on . To proceed, we follow Laplace's method and introduce a substitution. Suppose first that we are in the supercritical case, as defined by Assumption 1.1, so that for all x ∈ (0, x * ) we have that ψ(x) > 0. It follows that Ψ(x) is increasing on that interval and therefore invertible. Letting t = Ψ(y) we have (14) x 0 e −Ψ(y)/ dy = .
The proof follows from a combination of three lemmas. In Lemma 3.7 we assume that f (t) admits a series expansion and then perform an asymptotic analysis on the ratios that appear in integrals of the form (14) and (16). In Lemmas 3.5 and 3.6 we establish that ψ(x) and f (t) admit series expansions when λ(x) and µ(x) do, and provide the coefficients and powers of the first few terms. After presenting these lemmas and their proofs, we complete the proof of Theorem 3.4.
Lemma 3.5. Suppose that the transition rate shape functions λ and µ satisfy Assumption 1.3. Then there exists an expansion for ψ(x) of the form and 0, otherwise, and the value of c is given in the following table.
Proof. Substituting the expansions (1) in for λ and µ, we have (20) ψ We can factor out x β = x δ from the numerator and denominator. Then (20) simplifies to where ζ ± n := (bb n ± dd n )/(b ± d). Following the notation introduced in Proposition A.2 of the Appendix, we have c n x n/m so that in the notation of (18), we have c = 2(b − d)/(b + d) and κ = 0. The remaining coefficients can be expressed in terms of { ζ + n } and {ζ − n }: In this case we can factor bx β = dx δ from the numerator and denominator of (20), which leaves The form of ξ + n follows from Proposition A.2 in the Appendix. It follows that ψ(x) can be written in the form of (18) with the leading order exponent being κ/m and Case 3: β < δ In this case we begin by factoring out bx β from the numerator and denominator of equation (20). After cancellation we have From this form and Proposition A.2, we see that κ = 0 and c = 2. Furthermore, the existence of the sequence c n follows from our assumption that (β − δ)m is an integer.
Case 4: β > δ In this case we begin by factoring out dx δ from the numerator and denominator of equation (20). After cancellation and some rearrangement we have Similar to Case 3, we see that κ = 0 and c = −2.
From here, the proof proceeds in two steps. First we need to rewrite the integral Ψ(x) presented in (12) in a form that is compatible with Lemma 3.7.
Lemma 3.6. Suppose that the shape functions λ and µ satisfy Assumption 1.3 and we write ψ as described in Lemma 3.5. Furthermore, let κ ∈ Z + be as defined in Lemma 3.5. Then f (t) admits a series expansion of the form where m is the positive integer defined in Assumption 1.3 and Proof. It will be convenient to write the series expansion for ψ in the form with the convention that c 0 = 1. For the main part of the proof, we will take the c > 0 case. At the end, we will explain how the argument changes when c < 0. Integrating term-by-term, we have It will be useful to define h(x) := Ψ(x m ) with a series expansion written as h n x n+κ+m , where h n = cc n m n + κ + m .
Since c > 0, Ψ(x) is increasing on [0, x * ) and therefore h(x) is increasing on [0, x , and so on. We now turn our attention to the integral of interest, introducing the substitution z = y 1/m , we have Then, taking t = h(z), and observing that d dz h(z) = d dz Ψ(z m ) = ψ(z m )mz m−1 , we have To determine the series expansion for f (t) := 1/ψ h −1 (t) m , we need to analyze the expansion for the denominator: It is helpful to rewrite h −1 (t) as follows Recalling that ψ(x) = cx κ/m + cc 1 x (κ+1)/m + O(x (κ+2)/m ), we observe Then by applying Proposition A.1 twice, we obtain Proposition A.2 then yields After simplifying, the stated results for a, α and a 1 hold. When c < 0, we replace ψ and Ψ with ψ r and Ψ r respectively. The leading term of ψ r is then −c, which is positive and exact same procedure holds. Lemma 3.7. Suppose that f : R → R admits a series expansion of the form (21) with a > 0, α > −1, {a n } ∞ n=1 ⊂ R, κ ∈ Z + and m ∈ N. Furthermore, let g : R → R be a continuously differentiable function that is monotonically increasing or decreasing for all t ∈ (0, t * ), with g(0) = 0. Define Our method is to replace f (t) with its series expansion and consider the integration term-by-term. Our results will be expressed in terms of the incomplete gamma function Writing G for g(k ) and g(t * ) respectively, it follows that the numerator and the denominator appearing in In taking the ratio, the coefficients a α+1 cancel, and neglecting higher order terms, we have When α = 0, this takes the form For the plus-sign case, we have Since g is continuous, let M := max t∈[0,t * /2] g (t). Then by the Taylor Remainder Theorem, since g(0) = 0 we have for some ξ ∈ (0, t * /2). It follows that g(k ) ≤ kM for all t ∈ [0, t * /2]. So for the numerator we have To study the denominator, we again make the substitution y = t/ and obtain Noticing that the integral on the right hand side diverges to infinity, we rewrite the expression and apply l'Hôpital's rule to find that the denominator diverges to infinity as → 0. It follows that the ratio tends to 0 as → 0.
We introduced the notation f and g to emphasize that the contribution to the final value comes from the limit of integration, not the function f in the integrand. Noticing the familiar form of this expression, we apply Lemma 3.6 with g set to Ψ to rewrite p diffusion (k) in a form compatible with Lemma 3.7.
4. Exponential approximation. Doering et al [11,12] demonstrated that making a WKB-type ansatz of the form q n ≈ σ n e −Vn for some functions σ and V , can be an accurate method for constructing a continuum approximation for solving Kolmogorov equations. In the motivation that follows we provide a different analytical justification for this observation than has been presented elsewhere. We do this by transforming the system of equations defined by (8) into an equation for ratios instead of differences, then applying a Taylor series expansion technique similar to what is presented as Motivation 2 for the Diffusion Approximation.
Since q N (0) = 1, we can write We can rewrite the exponent on the right-hand side to look like a Riemann sum: We note that the above form of the extinction probability is identical to that of the mean extinction time found in Doering et al. (2007) [12], where in our case ∆V n = − ln(a N (k)). Mimicking the continuum approximations and Taylor expansions presented as motivation for the Diffusion Approximation, we introduce the function r (x) defined below. For x = n/N and = 1/N , we will write (27) a N (n) ≈ r n N = r (x).
ln r (y) dy and the heuristic assertion is that (26) becomes We note that this is the same form as presented in Definition 1.7 but it remains to characterize the function r (x). Indeed, following a Taylor series expansion technique, we will show that r (x) should have the form Dividing (25) through by µ N (n) and multiplying by a N (n + 1) we have λ N (n) µ N (n) + a N (n)a N (n + 1) = λ N (n) µ N (n) + 1 a N (n + 1).
Now we perform a Taylor expansion on the r 0 (x + ), r 1 (x + ), and r 2 (x + ) terms.
Organizing the terms by powers of , we find the following system of equations: The first of these equations yields two solutions for r 0 (x): r 0 (x) = ρ(x) and r 0 (x) = 1.
To show that the former must be true, consider that if r 0 (x) ≡ 1, then r 0 (x) ≡ 0. Substituting into the 1 equation, it follows that r 1 (x) = 0. Continuing in this way, we find r i (x) = 0 for all i ≥ 1. This corresponds to the solution q N (n) = 1 for all n ∈ {0, . . . , N * }, which does not satisfy the boundary condition q N (N * ) = 0. Therefore, we adopt the solution r 0 (x) = ρ(x). Feeding this into the order 1 equation allows us to solve for r 1 (x). Similarly, substituting these solutions into the 2 equation yields a solution for r 2 (x) and so on:

Analysis of p exp (k)
. For the analysis that follows, we define what we call an nth order exponential approximation. Let (30) r ,n (x) = r 0 (x) + r 1 (x) + · · · + n r n (x) where the functions r j (x) are defined according to the procedure given in the preceding section. Then, we define Theorem 4.1 (Leading Order Exponential Approximation for Supercritical Systems). Suppose that the transition rate shape functions λ and µ form a supercritical system, as defined in Assumption 1.1. Further suppose that λ and µ each admit a series expansion, as defined by Assumption 1.3.
Remark 4.2. When λ and µ form a subcritical system this method of approximation is not well defined. In this regime, the function w ,n (k ) is often greater than one, meaning that 1 − w ,n (k ) is negative and not a reasonable estimate for a probability. For example, note that this is what happens to equation (31) when β = δ and b < d. The proposed value for p invasion (k) is negative and cannot be an invasion probability. A practitioner could take the Exponential Approximation to be zero whenever the system of study is subcritical.
The proof of Theorem 4.1 is an immediate results of the following lemma. Lemma 4.3. Suppose that the shape functions λ and µ each admit a series expansion as defined by Assumption 1.3, then w ,0 (k ) has the form: Proof. First we find a series expansion for ρ(x). From the definition of ρ in equation (29) and the series expansions for λ and µ assumed in Assumption 1.3, we have By applying Proposition A.2 and separating out the leading order terms, we obtain Collecting ordered terms yields the expansion To consider the leading order approximation we use one term of r (x), i.e. r (x) = ρ(x). Then by plugging equation (33) into Definition 1.7, we have: which can then be expressed in terms of having k individuals introduced Considering the first term in the product of equation (34), Then using an expansion for ln(1 + x) and integrating the result, the third term in the product becomes Thus the third term in the product of equation (34) can be rewritten as Rearranging and integrating the exponent from the second term in equation (34), we have Therefore, the second term becomes The result follows by rewriting equation (34) using equations (35), (36), and (37).
In Lemma 4.3, we show that the the Exponential Approximation exhibits a nontrivial dependence on k, the number of individuals introduced in the population. This dependence results from the relationship between the number of introduced individuals and the population scale N . We explicitly included the number of individuals introduced as a fraction of the population (k ) in the error term to record that error terms do not necessarily approach zero with if the number of individuals introduced is also allowed to vary.
To conclude our analysis, we return our attention to the functions r n (x) contained within equation (30). While we do not currently have a complete result for analyzing higher-order exponential approximations p exp,n (k) with n > 0, we would like to report the following recursion formula that can be used to generate these terms.
Proof. We begin by recalling the equation We then rewrite r (x) as a series in powers of Substituting in Taylor expansions for the r i (x + ) terms we obtain Since r 0 (x) = ρ(x) this reduces to To solve for r n (x) we will consider the n terms. First we consider the term n i=0 n j=0 i+j r (j) i (x) j! and notice that the n terms are contained in the series Now considering the second term in (39), we proceed by isolating the n−i terms from the double summation in that product Thus, the n terms in (39) can be expressed as It follows from (39) that Then for the first term we have After also separating out the r n (x) term from the second series we have The final result is obtained by solving for r n (x).

Numerical observations for invasion probability approximations.
In the preceding sections, our main analytical results for the Branching Process, Diffusion, and Exponential Approximations focus on the large population (N → ∞) limit. In this section, we validate our asymptotic results numerically and then shift our attention to consider the behavior of the approximations when the population size is small or intermediate. We challenge the methods in their ability to best approximate the exact solution as defined and calculated in Section 2. We focus our investigation on the specific examples presented in the introduction, and obtain exact solutions for the approximations whenever possible. We use numerical integration when it is not possible to obtain an exact solution for an approximation. In this way, we verify our analytical results and explore how the conditions of a system (e.g. population size and subcritical vs. supercritical dynamics) have an impact on whether a particular approximation method should be deemed fit for use. We set the death rate parameter (d) to be one by default.

Diffusion approximation methods fail for large populations when dynamics are supercritical.
To complement our analysis in the previous section, we used exact solutions when possible and numerical integration when necessary (Simpson's method coded in R) to evaluate the Diffusion Approximation and Exponential Approximation for invasion probabilities. In Figure 2, the results are shown for Examples 1, 2, and 3 for a range of population sizes and parameter value choices. We chose the exponents and coefficients so that the dynamics are supercritical and the Exponential Approximation is well defined. In each panel, we highlight that as the total population size becomes large, the values calculated numerically approach their corresponding analytically determined limit (indicated by "*"). , Example 2 density dependent mortality (β = 1 < δ = 2, middle row), and Example 3 resourceconstrained birth (β = δ = 1, bottom row). In all panels, we set the death rate parameter d = 1. For the resource-constrained birth example, we set a = 1. Numerical integration was used to calculate the values for both approximation techniques for Example 3.
For the supercritical systems displayed in Figure 2 the Diffusion Approximation yields a different answer than the exact solution in the large population size limit. This discrepancy between the Diffusion Approximation and the exact solution confirms our asymptotic analysis and is most apparent when the leading coefficients of the birth and death rates are dissimilar. As displayed in the last column of Figure 2, the Diffusion Approximation fails to match the exact solution when the dynamics are far from critical. For population sizes greater than 100 in Figure 3, the Diffusion Approximation fails to match the exact solution for both near critical and far from critical dynamics. The inability of the Diffusion Approximation to approach one is predicted by our asymptotic result that p diffusion (k) = 1 − e −2k when β < δ. This can be seen in the middle row of Figure 2 and the top row of Figure 3. These plots also highlight the phenomenon that, if the leading order exponents satisfy β < δ, then in the large population limit, the Diffusion Approximation completely ignores the parameters of the rate functions.
By contrast, the Diffusion Approximation succeeds in characterizing the large population size behavior when the dynamics are dominated by the death rate (µ is leading order). This is seen in the bottom row of Figure  x x 3. Mismatched Leading Orders. Invasion probabilities for Example 4, in which the birth and death rates are single terms with mismatched exponents. In the top row, the leading order is in the birth rate equation λ(x). In the bottom row, the leading order is in the death rate equation µ(x). Since the Exponential Approximation is invalid in the subcritical case, we have omitted the approximation from the bottom row, where β > δ.

Diffusion approximation methods can work well for small populations that exhibit near critical dynamics.
In the examples we have studied, the Diffusion Approximation consistently outperforms the Exponential Approximation when the population is small and the parameter set is near critical. This result is first demonstrated in Figure 2. Moving from left to right in this multi-panel figure, the parameter characterizing the birth rate (b) moves farther away from the death rate parameter (d) which is set to one by default. As such, the first column shows that the Diffusion Approximation closely approximates the exact solution for the invasion probability for small population sizes.
The Diffusion Approximation's transition from exceptional to poor performance is even more clearly demonstrated when studying Example 4. In the top row of Figure  3 we see that the Diffusion Approximation captures non-monotonic features of the exact solution that the Exponential Approximation misses entirely. In this example, the dynamics are dictated by the birth rate since λ features the lower leading order term. When the dynamics are near critical b = 0.7 and b = 1.1, there is a range of small population sizes where the Diffusion Approximation tracks the exact solution.
5.3. When leading order terms match, higher order terms matter: for small, but not large population sizes. When the leading order of the "birth" and "death" rates are the same and their leading coefficients are equal, subtleties in the outcomes are determined by the first pair of mismatched coefficients. This result is displayed prominently in Figure 4 for which we chose both rate functions to be asymptotically linear (β = δ = 1), to share the same leading coefficient (b = d = 1), but to have different values for the coefficients b 1 and d 1 (see Assumption 1.3).  Fig. 4. Identical Leading Order Terms. Comparison between the diffusion and exponential approximations when both rate equations are asymptotically linear (β = δ = 1) and the leading coefficients match (b = d = 1). In both panels, the invasion probability tends to zero as the population size (N ) becomes large. Differences between the left (supercritical) and right (subcritical) panels are driven by the leading pair of terms with mismatched coefficients (b 1 = d 1 ).
In the left panel of Figure 4, b 1 > d 1 and the system is supercritical (as defined in Assumption 1.1). In this case, both the Diffusion Approximation and Exponential Approximation are well defined. As the population size becomes large, the approximations correctly predict that the invasion probability approaches zero. The corresponding analytical results are presented in Theorems 3.4 and 4.1, respectively, with their asymptotic limits indicated in the plot as black and gray "*"s. Limitations of the numerical integration procedure prevent us from displaying the Diffusion Approximation for the full range of population sizes (i.e. up to 10,000). The main difference between the two approximations in this regime is the rate of convergence to zero, with respect to increases in population size. Consistent with the Diffusion Approximation's success for relatively small population sizes, it initially tracks the exact solution.
In the right panel of Figure 4, b 1 < d 1 and the dynamics are subcritical. In Theorem 4.1, we noted that the Exponential Approximation does not hold when b < d. As a direct consequence of equation (32), we further observe that the Exponential Approximation will be invalid for sufficiently small population sizes. In particular, when d = b and β = δ, we have Since the Exponential Approximation is 1 − w (k ), this returns an invalid value less than zero. From a practical point-of-view, one could simply define the Exponential Approximation to be zero in such a circumstance.

5.4.
Approximation success depends on the initial number of individuals introduced in the population. When more than one individual is initially introduced in a population, the probability of invasion increases. In Figure 5, we display results for each approximation method along with the exact solution for the probability of extinction when 1 ≤ k < N x * individuals are initially introduced in the population. By definition, for all larger values of k, the invasion probability is one.
As shown in the top row of Figure 5, when the leading order terms match, the Diffusion Approximation performs well and tracks the exact solution over the full range of initial numbers of individuals introduced. Typically the Exponential Approximation with two terms (dark gray dashed line) better approximates the exact solution than the Exponential Approximation (light gray dashed line). However, when k is close to N x * the Exponential Approximation with two terms sharply turns up, away from the exact solution. This numerical result is in line with expectations from our analytical results in equation (28) since the additional higher order term is undefined for ρ(x) = 1, i.e. when λ(x) = µ(x).
In special cases, it is possible to compute the Exponential Approximation by hand using Definition 1.7. We validated the Exponential Approximation for the epidemic extinction probability (Example 1) by comparing the exact result for the leading order approximation, with the numerical integration result plotted in Figure 5. There was not a visible difference between the value found by hand and the numerically obtained value.
We also compared the exact and numerical values for the Exponential Approximation with the analytical approximation found in Lemma 4.3. During this investigation, we found that since the range of potential number of introduced individuals scales with the total population size, it is important to keep track of the k parameter in the error term in equation (32). As k approaches N x * , the error term becomes significant (even for large population sizes). For fixed k, the population size can be chosen large enough to yield an approximation with the desired level of accuracy.
6. Conclusion. Applied mathematicians, physicists, and biologists use a variety of techniques to approximate Markov chain models for population processes. The widespread use of SDEs, in particular, raises the question of whether these approximations faithfully reproduce results for basic probability questions, especially, hitting times and splitting probabilities. In this work we focused on studying the probability that the lineage of a few newly introduced individuals will invade a larger population. We analyzed the Diffusion Approximation and a simple Exponential Approximation rigorously in the large population limit and numerically for finite size populations. We have were able to show that both population size and the state of sub-versus super-criticality play an important role in determining which approximation method performs better.
Similar to recent analogous results for mean hitting times, we found that, in the large population limit, the Diffusion Approximation does not agree with the asymptotic invasion probability of the original Markov chain system (Theorem 3.4). In fact, when the leading order terms are mismatched (see the case when β = δ in Theorem 3.4 and Example 2 in the bottom row of Figure 5), the Diffusion Approximation takes on a value that does not depend on the parameters of the Markov chain's rate functions at all. Interestingly though, when the dynamics are near critical and the population of interest is small, we found that the Diffusion Approximation often performs quite well. This can be seen throughout the figures generated by our numerical investigation, which is described in Section 5. By contrast, for supercritical systems that are far from critical, the Exponential Approximation nearly matches the exact solution for large populations, while the Diffusion Approximation visibly misses the mark. The rigorous expression of this observation can be found in Theorem 4.1. There, we show that for asymptotically linear supercritical systems the Exponential Approximation provides the correct limiting result. This is displayed for a stochastic epidemic model (Example 1) and a resource-constrained population model (Example 3) in Figures 2 and 5.
Our work highlights that invasion probabilities are an important testing ground for evaluating approximation methods. Unlike mean hitting time calculations, the simplicity of the fundamental difference equations determining invasion probabilities permits the explicit evaluation of limits. At this stage, it remains an open question to completely understand which approximation is best for a given set of circumstances. It is possible, but not at all clear, that taking higher order approximations of the Exponential Approximation or a fully formulated WKB-type approximation would succeed under all conditions. In any case, it remains true that careful consideration should be taken when choosing an approximation method for evaluating properties of continuous-time, discrete state-space Markov chains.
Proof. The proof begins by factoring out the leading term in the given series ∞ n=0 a n x n/m+γ ν = (a 0 x γ ) ν 1 + ∞ n=1 a n a 0 x n/m ν .
Then make the substitution y = ∞ n=1 an a0 x n/m and find the Taylor expansion of (1 + y) ν at 1, (1 + y) ν = 1 + νy + ν(ν − 1) 2 They by plugging in the series for y and collecting terms of the same power, we have 1 + ∞ n=1 a n a 0 x n/m ν = 1 + νa 1 a 0 x 1/m + ν a 0 a 2 + (ν − 1)a 2 The form of the new series is found by multiplying each term by the original leading order term (a 0 x γ ) ν .
Proposition A.2 (Series Expansion II). Suppose a power series is given in the form, ∞ n=0 a n x n/m , with a 0 = 0, a n ∈ R for n ≥ 1, and m > 0. Whenever x is such that ∞ n=1 a n x n/m < |a 0 |, then 1 ∞ n=0 a n x n/m = ∞ n=0 a n x n/m , where a 0 = 1 a 0 , a 1 = − a 1 a 0 and a 2 = a 2 1 − a 2 a 0 .
Proof. Suppose there is a given power series that satisfies the conditions of the proposition. Further suppose that the inequality ∞ n=1 a n x n/m < |a 0 | holds. By rearranging our original expression, we find a familiar form 1 ∞ n=0 a n x n/m = 1 a 0 1 + 1 a0 ∞ n=1 a n x n/m .
Since we have assumed that ∞ n=1 a n x n/m < |a 0 |,  The first few coefficients of the resulting series are obtained by distributing the (a 0 ) −1 in the last line.
Let t = f (x). Then the inverse of the series f can be expressed as x = ∞ n=1 a n t n/ν , as t → 0 + , where Proof. We first rewrite f (x) as follows a n x n .
Setting t = f (x), we have Choose a 1 = a −1/ν 0 so that it cancels the leading coefficient of t 1/ν . From here we use powers of t 1/ν to find an expansion of the form x ∼ ∞ n=1 a n t n/ν . By collecting ordered terms, we find the corresponding two term approximation, a 1 t 1/ν + a 2 t 2/ν = x + a 1 νa 0 + a 2 a 2/ν 0 We then choose a 2 = − a 1 νa 1+2/ν 0 so that the coefficient for the x 2 term is zero. Increasingly precise approximations of x can be found by keeping track of the higher order t terms and choosing each a n to appropriately cancel out these terms.