STATIONARY DISTRIBUTIONS AND CONDENSATION IN AUTOCATALYTIC REACTION NETWORKS

. We investigate a broad family of stochastically modeled reaction networks by look- ing at their stationary distributions. Most known results on stationary distributions assume weak reversibility and zero deﬁciency. We ﬁrst explicitly give product-form stationary distributions for a class of mostly non-weakly-reversible autocatalytic reaction networks of arbitrary deﬁciency. We provide examples of interest in statistical mechanics (inclusion process), life sciences, and robotics (collective decision making in ant and robot swarms). The product-form nature of the stationary distribution then enables the study of condensation in particle systems that are generalizations of the inclusion process.


Introduction.
Understanding the dynamics of reaction networks (CRNs) is of central importance in a variety of contexts in life sciences and complex systems, including molecular and cellular systems biology, which are some of the most vital areas in bioscience. Two approaches are used to model CRN systems, either a deterministic or a stochastic model. The first is realized as a vector with concentrations of each molecular species as a state space governed by a system of ordinary differential equations (ODEs), whereas the second is described by a continuous-time Markov chain acting on discrete molecular counts of each molecular species. Typically the stochastic model is used for cases with low molecular numbers where stochasticity is essential for the proper description of the dynamics. Au contraire, the deterministic model is used for cases with many molecules in each species and where it is assumed that coupled ODEs well approximate the concentrations.
The study of the dynamics of the deterministic model, mass-action kinetics in particular with complex balanced states [26,16], is a well-studied subject going back more than 100 years [46,36]. Understanding of such, and more general ODEs, from chemical reaction network theory developed to more subtle questions, like, e.g., multistationarity, persistence, etc. [23]. Conversely, the stochastic system is analyzed via the master equation. No analytic solutions are known for most systems, even concerning stationary distributions. Consequently simulation methods and approximation schemes of different exactness, roughness, and rigor were developed in order to understand such systems [21,18,8,37], making a systematic investigation of fundamental effects of noise and statistical inference a demanding job. Our results are a step towards the rigorous analysis of the product-form stationary distribution of non-weakly-reversible ergodic stochastic CRNs of arbitrary deficiency. We exhibit product-form stationary distributions π N for a large class of autocatalytic mass preserving CRNs, including the models in [43,40,6,5,33,34] (which were studied via simulation and approximations) and generalizing results of [28,24]. The emerging infinite family of product-form functions in the stationary distributions (with Poisson form as a particular case) is also possibly interesting from the view of natural computation, where it extends the range of designable probability distributions of CRNs; see, e.g., [11,44]. We illustrate the occurrence of such CRNs in interacting particle system theory and life sciences for collective decision making processes in ant or robot swarms.
The relation between the deterministic and the stochastic model as well as their differences are a focus of current research [6,5,28,43,40,10,3]. Kurtz [35] linked the short term behavior of the adequately scaled continuous-time Markov chain to the dynamics of the ODE model. These results are based on the classical mean field scaling which assumes that the system is well mixed. Then the probability that a set of molecules meet in a small volume is proportional to the product of the molecular concentrations x i /V , where x i denotes the absolute number of molecules of type S i , and where V is the volume which is assumed to be large. Within this modeling framework, the orbits of the continuous-time Markov chain describing the stochastic CRNs converges as V → ∞ towards the orbits of the mass-action ODEs. This convergence was also considered recently from the point of view of large deviation theory [1]. The insight of the complex balanced deterministic model was recently transferred to the stochastic model: A deterministic system is complex balanced if and only if the stochastically modeled system has product-form of Poisson type [3,10], where the parameter of the Poisson distributions are given by the stable equilibrium values of the related deterministic mass-action dynamic.
CRNs with stochastic behavior differing from the behavior of the deterministic CRN due to molecular discreteness and stochasticity were identified. The mathematical analysis is based on approximations [43,40,6,5,33,34] in the ergodic case, or on the analysis of absorbing states for absolute concentration robust CRNs [4,2,15]. In the ergodic case such behavior appeared in the literature as noise-induced bi-/multistability [6,5], small-number effect [43,40], or noise-induced transitions [27]. Our setting includes examples from [6,5,33] and some examples of [43,40]. Hence we shed light on such instances by providing product-form stationary distributions and enabling exact analysis for the class of autocatalytic CRNs (see Definition 3.1 and Remark 3.3). We inspect them asymptotically when the total number of molecules N is large. Taking inspiration from previous works on particle systems [19,20,24,7], we consider non-mean-field transition mechanisms where particles (or molecules) are located at the nodes of a graph. Particles located at some node i (or of type S i ) can move to nearest neighbor nodes j. Within this new modeling framework, the rate at which a particle moves from site i to site j (or that a molecule of type S i is converted into a molecule of type S j ) is related to the absolute numbers x i and x j of species S i and S j . While a classical mean field scaling with V = N would lead to convergence of π N as N → ∞ towards a point mass centered at the positive equilibrium of the deterministic mass-action ODE, the new scaling regime leads to the emergence of condensation: The stationary distribution π N of autocatalytic CRNs can under some conditions converge towards limiting probability measures with supports located on the faces of the probability simplex. In other words, the set of molecules concentrates as N → ∞ on a strict subset of the set of species. We investigate the asymptotic behavior of the product-form stationary distribution π N , putting emphasis on the cases of up to molecularity three in our model with respect to three different forms of condensation. We observe that monomolecular autocatalytic CRNs (see Definition 3.1 and Remark 3.3) and complex balanced CRNs do not satisfy any form of condensation. We generalize a theorem from [24] to allow more general product-form functions and prove, for the up to bimolecular case, a weak form of condensation and a weak law of large numbers. In the three-molecular and higher case, we show that such systems exhibit the strongest form of condensation.

Reaction networks.
A reaction network is a triple G = (S, C, R), where S is the set of species S = {S 1 , . . . , S n }, C is the set of complexes, and R is the set of Complexes are made up of linear combinations of species over Z ≥0 , identified with vectors in Z n ≥0 . Reactions consist of ordered tuples (ν, ν ) ∈ R with ν, ν ∈ C. Such a reaction consumes the reactant ν and creates the product ν . We will typically write such a reaction in the form ν → ν . We will often write complexes ν ∈ Z n ≥0 in the form ν = n i=1 ν i S i . Accordingly we slightly abuse notation at times for complexes by identifying ν with n i=1 ν i S i . We usually describe a reaction network by its reaction graph which is the directed graph with vertices C and edge set R. A connected component of the reaction graph of G is termed a linkage class. We say ν ∈ C reacts to ν ∈ C if ν → ν is a reaction. A reaction network G is reversible if ν → ν ∈ R whenever ν → ν ∈ R (different to reversibility of stochastic processes), and it is weakly reversible if for any reaction ν → ν ∈ R, there is a sequence of directed reactions beginning with ν as a source complex and ending with ν as a product complex. If it is not weakly reversible we say it is nonweakly-reversible. The molecularity of a reaction ν → ν ∈ R is equal to the number of molecules in the reactant |ν| = i ν i . Correspondingly we call such reactions unimolecular, bimolecular, three-molecular, or n-molecular reactions. Alternatively we say a reaction has molecularity one, two, three, or n. The stochiometric subspace is defined as and for v ∈ R n , the sets (v + T ) ∩ R n ≥0 are stochiometric compatibility classes of G. The following invariant has proven to be important in the study of complex balanced CRNs. The deficiency of a reaction network G is defined as where is the number of linkage classes.
For each reaction ν → ν we consider a positive rate constant κ ν→ν ; the vector of reaction weights is defined by κ ∈ R R >0 and the CRN with rates is denoted by (G, κ). For examples of reaction networks see section 3.3

Deterministic model.
Here we review the main notions connected to the deterministic model. This setting is usually termed deterministic mass-action kinetics. The system of ODEs associated with the CRN (G, κ) with mass-action kinetics is where for a, b ∈ R n ≥0 we define a b = Si∈S a bi i with convention 0 0 = 1. The system then follows this ODE started from initial condition x 0 = x(0) ∈ R n and the dynamics of x(t) ∈ R n models the vector of concentrations at time t.
Note that if a CRN is detailed balanced or complex balanced, then it is necessarily weakly reversible. Also deficiency zero weakly reversible CRNs are complex balanced independent of the rate [17].

Stochastic model.
Here we introduce the main notions connected to the stochastic model. The setting we focus on is usually termed stochastic mass-action kinetics. The progression of the species follows the law of a continuous-time Markov chain on state space Z n ≥0 . The state at time t is described by a vector X(t) = x ∈ Z n ≥0 which can change according to a reaction ν → ν by going from x to x + ν − ν with transition rate λ ν→ν (x), corresponding to the consumption of ν and the production of ν . The Markov process with intensity functions λ ν→ν : Z n ≥0 → R ≥0 can then be given by Accordingly, the generator A is given by for h : Z n → R. We focus on the usual choice, stochastic mass-action kinetics, where the transition intensity associated with the reaction ν → ν is This uniform sampling scheme corresponds to the mean field situation where the system is well-stirred in the sense that all particles move randomly and uniformly in the medium. The transition intensities with constants κ ν→ν model the probability that such molecules meet in a volume element. The study of these models goes back to [32,47]. In the following we fix a CRN (G, κ) and introduce the main terminology from stochastics.
Definition 2.2 (decomposition of state space). We say the following: -G is almost essential if the state space is a union of irreducible components except for a finite number of states.

Stationary distribution and product-form stationary distribution.
The stationary distribution π Γ on an irreducible component Γ describes the long-term behavior of the Markov chain in the positive recurrent case. Then π Γ is unique and corresponds to the limiting distribution (see [41]). Note that on a finite irreducible component the stationary distribution always exists. 1 Let X(t) denote the underlying stochastic process associated with a reaction network on a finite irreducible component Γ. Then given that the stochastic process X(t) starts in Γ, we have The stationary distribution is determined by the master equation of the underlying Markov chain: for all x ∈ Γ. Inserting the rate functions following mass-action kinetics gives where c ∈ R n >0 is a point of complex balance and M Γ is a normalizing constant. So each deterministic complex balanced CRN has its stochastic counterpart with product-form stationary distribution of Poisson-type. One can prove (see, e.g., [3]) that, for zero deficiency CRNs, a network is complex balanced if and only if it is weakly reversible. This explains why most results on product-form distributions assume zero deficiency. We will go beyond this setting in the forthcoming sections. On the other hand by [10, Theorem 5.1] any almost essential stochastic reaction network with product-form stationary distribution of Poisson-type is deterministically complex balanced. Notice that since complex balanced implies weakly reversible, these results do not apply to non-weakly-reversible CRNs.

Reaction vector balance CRNs.
The notion of reversibility plays a fundamental role in Markov chain theory.
Note that the notions of reversible and detailed balanced for CRNs are not the same as the same terms used for Markov chains. A definition similar to detailed balanced (see Definition 2.1) for the stochastic model of CRNs was recently termed as reaction vector balanced [9,30].
Rewriting (2.6) as we see that π is reaction vector balanced if and only if the Markov chain transition rates given by q(x + a, x) are reversible. If a CRN is detailed balanced (Definition 2.1), many results are known. Detailed balance implies complex balanced, so the stochastic model has product-form stationary distribution of Poisson-type. However, more is known; by [47, Lemma 3.1, p. 157] and [30] for reversible reaction networks this is the case if and only if the corresponding stochastic model is Whittle stochastic detailed balanced, which implies its reversibility as a Markov chain.

Generalized balanced CRNs.
In section 4, Remark 4.3 and Example 4.4, we indicate how to combine complex balanced and autocatalytic CRNs. This context was not considered before hence we review notions of [9] to adapt and encapsulate it into our setting.
Remark 2.7. The notion of generalized balanced covers 1. reaction balanced with index given by reactions, i.e., the tuples of subsets are complex balanced with index given by complexes, i.e., the tuples of subsets are defined for C ∈ C, 3. reaction vector balanced with index given by a ∈ Z n , i.e., the tuples of subsets are defined for a ∈ Z n , [9], including combinations and other possibilities (see, e.g., Remark 4.3).
The following proposition generalizes [9, Theorem 4.3] and follows from the same principle applied to the system of equations defining the master equation.
Proof. We have to check that the master equation is satisfied for all x ∈ Γ, so consider a fixed x ∈ Γ. By definition we have a decomposition of the reactions of the form Since the original master equation (2.2) is comprised of these equations we conclude that π is a stationary measure for (G, κ) on Γ.
3. Autocatalytic CRNs. The class of autocatalytic reaction networks we study is a relatively broad class of mass-preserving non-weakly-reversible CRNs of arbitrary deficiency. It is inspired by both the inclusion process [24] and the misanthrope process [14] with nontrivial intersection (also see section 5.1). Therefore it naturally generalizes both models studied in CRN literature [33,6,5,28,43,40] and some models of homogeneous and inhomogeneous interacting particle systems on finite lattices [24,38].

Notations.
All reactions in autocatalytic CRNs will have a net consumption of one S i and a net production of one S j and will be of the following form: where m ≥ 1. We use the following notation for the reaction rates for such reactions: where n i,j is the highest integer m with a reaction of the form Denote the collection of reactions net consuming one S i and net producing one S j by

Autocatalytic reaction network.
. . , S n } satisfies the following rules: 1. All reactions are of the form (3.1).
2. If there is a reaction net consuming one S k and net producing one S l , then . Set for convenience n k := n j,k = n l,k , and denote the normalized rates by where n k is the highest integer with a reaction of the form Remark 3.2. All autocatalytic CRNs are mass preserving, meaning that every reaction ν → ν of the CRN satisfies i∈S ν i = i∈S ν i . Hence the stochastic dynamics are confined to irreducible components of the form Γ N := {x ∈ Z n ≥0 ||x| = N }; similarly the deterministic ODE dynamics are restricted to corresponding stochiometric compatibility classes. Furthermore (1) of Definition 3.1 means that only monomers are exchanged in a particle system interpretation, while (4) and (5) ascertain that the CRN has a product-form stationary distribution (see proof of Theorem 4.1). Remark 3.3. Note that this expression was already used in different contexts. A definition of autocatalytic CRNs can be found for weakly reversible CRNs in [22] where it is utilized in the study of persistence and siphons for such CRNs. Other definitions of autocatalytic reaction and autocatalytic set can be found in numerous references, most of them focusing on the framework of origin of life (see, e.g., [25,45]), whose examination in this context can be traced back to Kauffman [31].

Examples of autocatalytic reaction networks.
Here we introduce the notions and illustrate applications and the model. For the CRNs here on two species, conditions (1)-(4) of Definition 3.1 are easily seen to be satisfied, and condition (5) is trivial. For a frameworks of interest for autocatalytic CRNs with more species we refer to section 5.1.
All examples (see Table 1) are autocatalytic CRNs (Definition 3.1). Example (A) is reversible and of deficiency 0 and coincides with motif E of [40]. Example (B) contains asymmetric transitions, is non-weakly-reversible with deficiency 1, and corresponds to motif F of [40]. Example (C) is non-weakly-reversible with deficiency 2 and is a generalized model of [6,33,43], which also appears as a special case of motif I of [40].
We next remark on some applications of autocatalytic reaction networks. Examples (C) and (D) have found applications in several interdisciplinary fields. Example (C) can model a colony of foraging ants collecting food from two sources [5]; it was exploited for decision-making processes in a swarm of agents [33] and apart from that corresponds to the Moran model on two competing alleles with bidirectional mutation [39,28]. Example (D) was introduced as a high-density model for decision-making processes in swarms of agents and ants [33]. Then the trimolecular reactions of example (D) model the majority rule, where the majority convinces the minority to change its opinion in collective decision-making systems (or food source in ants). We provide the stationary distribution in closed form in Theorem 4.1 for all autocatalytic CRNs, leading to exact known stationary behavior in all examples above.

A nonstandard product-form stationary distribution.
Here we derive product-form stationary distributions for autocatalytic CRNs (see Definition 3.1). This class of CRNs and Theorem 4.1 are stimulated both by the inclusion process [24] and the misanthrope process [14] and contain models studied in the CRN literature [33,6,5,28,43,40] and models of homogeneous and inhomogeneous interacting particle systems on finite lattices [24,38]. For a proof in the misanthrope case see, e.g., [13].
Proof. First remark that condition (5) of Definition 3.1 holds if and only if for each i, j such that R i,j = 0 (of (3.3)) there exists a c(i, j) > 0 such that .
We show that π is reaction vector balanced for any irreducible component Γ by separating the master equation into parts according to reaction vector balance (2.6). According to conditions (1) and (2) given in Definition 3.1, we can partition the set of reactions using the various sets R i,j and R j,i (of (3.3)) and, hence, subdivide the master equation according to this partitioning. Let i, j be such that Claim 4.2. π as defined in (4.1) satisfies the respective (2.6) associated with R i,j for all x ∈ Γ ⊂ Z n ≥0 . Proof. In the following we omit the coefficients x l for l = i, j in the equation from π, since the other coordinates are equal and we prove π has product form. We only get reactions R i,j on the left side and reactions R j,i on the right side of (2.6): We must thus check that the f i solve Observe that this equation vanishes on both sides for ( where one can reduce the second identity to the first on the domain we consider. Set (both for i, j) .
By shortening fractions this is equivalent to so this ansatz solves the equation. Observe that along (4.4) x i + x j is the same on the left-hand and on the right-hand sides, so any functions are also solutions to (4.4). However we have to choose the product-form functions compatibly taking into account all i, j with R i,j = ∅. Now we shall show that for all i, j with R i,j = ∅ we find a d(i, j) > 0 such that we arrive at the same product-form functions and that they correspond to f i . For this we use (4.3) to set Then the g i (m) can be written as With this we write as required. Notice that the f i (m) as the resulting product-form functions are welldefined and do not depend on specific pairs i, j, using both conditions (4) and (5) from Definition 3.1.
Hence we conclude that this CRN has a product-form stationary distribution of the form given above. Remark 4.3. Notice that autocatalytic CRNs considered in Theorem 4.1 can be combined with complex balanced CRNs to obtain a bigger class of CRNs for which the stationary distribution is known and of product form. This is thanks to the product form and Proposition 2.8. The incoming reactions in the autocatalytic part which are also part of a complex balanced CRN are, however, restricted to be monomolecular.
We give an example to outline this and indicate the principle.
Example 4.4. In this example the CRN is composed of the upper part which is reaction vector balanced and corresponds to reactions between S 1 , S 2 and the lower part which is complex balanced and corresponds to reactions between S 1 , S 3 .
The stationary distribution is is a point of complex balance of the lower CRN (i.e., complex balanced for the CRN that consists only of reactions between S 1 , S 3 ). Since the balance equation for the upper CRN is reaction vector balanced, while the lower is complex balanced, the CRN is, overall, generalized balanced on Γ N , N ≥ 2 (see Definition 2.6).

Asymptotic behavior of product-form functions of Theorem 4.1.
The product-form functions which appear in Theorem 4.1 are of the form Here we use of the following notations (for 1-/2-/3-molecular incoming reactions): ). We study asymptotic growth behavior and the problem of normalizability of the different product-form functions. These considerations will be related to condensation in section 5.2. Identifying the product-form functions with sequences, the latter is equivalent to the existence of a finite positive radius of convergence of the associated power series. We say a sequence (a n ) n ∈ R N ≥0 is normalizable if there is c > 0 such that ∞ n=0 a n c n < ∞. We omit the proof of the following lemma which is standard. Lemma 4.5 (asymptotic growth behavior).
1. lim n→∞ The same quotient of product-form functions coming from molecularity higher than 3 also diverges. In particular the limits of (1), (3), and (4) do not depend on the parameter λ i .
Via a ratio test we get that only the power series of the functions h(n) have a finite positive radius of convergence (λ i β 2 i ) −1 of the associated power series. g(n) has an infinite convergence radius and p(n) has a convergence radius of zero. 1.

(
n m=0 φ m p(m)) n does not converge for any 0 < φ 4. Product-form functions coming from molecularity higher than 3 are also not normalizable as in 3.
We have three different behaviors with respect to normalizability, g(m)φ m can be normalized for any 0 < φ, h(m)φ m can only be normalized up to 0 < φ < 0(λ i β 2 i ) −1 , whereas p(m)φ m cannot be normalized independently of the value 0 < φ. Due to the conservative nature of mass-preserving CRNs or conservative interacting particle systems (IPS), rescaling all the product-form functions by the same φ does not change the distribution (for stochastic particle systems this parameter φ is called fugacity [24,42]).

The classical mean field scaling.
Denote by |ν| = Si∈S ν i the number of molecules involved in a reaction, and designate by V the scaling parameter usually taken to be the volume times Avogadro's number. Then in some situations it is reasonable to rescale the transition rates of the stochastic model according to the volume as corresponding to the following change of the reaction rate, This way of rescaling the transition rates is adopted by considering the probability that a set of |ν| molecules meet in a small volume element to react [3,35]. The above mean field scaling assumes that a particular molecule of type S i will meet a molecule of type S j with a probability proportional to the concentration of type S j molecules. Kurtz [35] linked the short term behavior of the properly scaled continuoustime Markov chain to the dynamics of the ODE model. Within the classical scaling regime, Theorem 4.1 becomes the following.
Theorem 4.7. Let (G * , κ) be autocatalytic (see Definition 3.1). Then the associated stochastic CRN, with rate function as in (4.6), possesses the product-form stationary distribution with product-form functions with the stochastic dynamics confined to irreducible components Γ as specified in Remark 3.2.
It is then natural to check the large V behavior of the stationary distribution given in Theorem 4.7. Recently, large deviation theory has been developed for some class of strongly endotactic mean field CRNs in [1], but these results do not apply to autocatalytic networks. We will consider a non-mean-field regime in section 5 and just illustrate the mean field scaling limit by an application of [12,Theorem 3.1] to the following example. Consider the autocatalytic CRN, Let V = N be the total number of molecules, and let X(t) be the number of molecules of type S 2 at time t ≥ 0, which is a birth and death process evolving in the set {0, . . . , N}. It is easy to see that the conditions for application of [12,Theorem 3.1] are satisfied. Then the rescaled process Y N (t) = X(t)/N approaches the dynamics of the mass- Focusing on Y N , the stationary distribution given in Theorem 4.7 translates into a probability measure π N defined on the unit interval [0, 1], which satisfies a large deviation priciple for this invariant probability measure [12,Theorem 3.1], where the stationary distribution concentrates exponentially fast as N → ∞ on the set of minimizers of the free energy function, which are precisely the linearly stable equilibria of the associated deterministic mass-action dynamic.
One can check that for generic constants α 1 1,2 , α 1 2,1 , and α 3 2,1 the mass-action ODE has a single stable equilibrium which is located in the positive orthant; this follows since it is enough to confirm it for dy/dt = F (y) as above (polynomial in one variable). Section 5 considers a different scaling regime for autocatalytic processes in which condensation occurs. In the above example, the stationary distribution converges to the point mass δ 0 centered at y = 0; see Theorem 5.12 and Corollary 5.10.

Application: Condensation in particle systems.
We investigate the asymptotic behavior of mass preserving autocatlytic networks when the total number of molecules N is large. When considering large volume limits, CRN theory usually considers the classical mean field scaling limit; see section 4.3. We focus on a different mechanism that leads to a CRN (or a particle system) where molecules do not move at random in a mean field regime, but are located at the nodes of a graph. Molecules located at some node i (or of type S i ) can move to nearest neighbor sites j. In this modeling framework the rate at which a molecule of type S i moves to site j (or is converted into a molecule of type S j ) will be a function of the absolute number of particles of type S i and S j , so that the rate constant κ ν→ν will be independent of N .
This will model the autocatalytic effect where the move of a molecule from site i to site j is a consequence of the attraction of molecules of type j on molecules of type i. In this setting, a new phenomenon appears: Under some conditions, the molecules will concentrate on a subset of the set of species, leading thus to condensation on a subset of the state space. We first illustrate this phenomenon by considering the socalled inclusion process. We then study condensation by investigating the asymptotic behavior of the product-form stationary distribution π N , putting emphasis on the cases of up to molecularity three. We introduce three different forms of condensation and investigate the limiting distributions for autocatalytic CRNs. We observe that monomolecular autocatalytic CRNs (see Definition 3.1 and Remark 3.3) and complex balanced CRNs do not satisfy any form of condensation. We prove for the up to bimolecular case a weak form of condensation and a weak law of large numbers. In the three-molecular and higher cases we show that such systems exhibit the strongest form of condensation.

Condensation in inclusion processes.
The inclusion process, introduced in [19,20], is a particle system which is dual to the Brownian energy model where every particle of type S i can attract particles from type S j at rate p ji , where the p ij are the transition probabilities of a Markov chain. When p ij = p ji , one speaks of a symmetric inclusion process. This particle system evolves in Z S ≥0 , where S is the set of species. It is defined as a time-continuous Markov chain of generator L of the form where h denotes any function. In the homogeneous case this is a special case of the misanthrope process on a finite lattice [14]. The symmetric inclusion process defines in fact a stochastic reaction network for the set of reactions R ij given by with α 1 i,j = p ij m 2 and α 2 i,j = p ij . R ij is thus a multispecies version of example (C) of section 3.3. The authors of [24] studied such processes and provided interesting results on asymmetric CRNs. Notice that such CRNs can be autocatalytic when the Markov chain of transition probabilites p ij is reversible and when conditions (4) and (5) of Definition 3.1 are satisfied. Such processes are mass conservative. Let N be the total number of particles, and let π N be the stationary distribution associated with the process restricted to the irreducible component Λ N = {x ∈ Z S ≥0 ; i∈S x i = N }. The authors of [24] provide an interesting one dimensional process, called an asymmetric inclusion process, where p ii+1 = p and p ii−1 = q on the state space S = {1, . . . , n} with factorized stationary distributions as in Theorem 4.1 with λ i = (p/q) i of (5) of Definition 3.1.
A new interesing phenomenon appears in such a process: In the limit N → ∞ and when p > q, the process condensates on the right edge, that is π N (X n ≤ (1−δ)N ) −→ 0 for all δ ∈ (0, 1). The authors argued that at first sight one might be tempted to think that this is just a consequence of the asymmetry p > q, and proved that this argument is not correct since a CRN having the same coefficient α 1 i,j but vanishing second-order coefficient α 2 i,j ≡ 0 would have a Poissonian product-form stationary distribution and no condensation would occur.
Building on this work, the authors of [7] considered a reversible inclusion process which is reversible as a Markov chain with λ i p ij ≡ λ j p ji (as in (5) of Definition 3.1), where the diffusion constant m N depends on the total number of particles N in such a way that m N ln(N ) −→ 0 as N → ∞. They proved that the process condensates on the set of species where the stationary distribution λ attains its maximum value. We will extend these results to autocatalytic CRNs of arbitrary molarity.

Condensation in autocatalytic reaction networks.
Consider a sequence of random vectors (X N ) N ∈N indexed by N , where X N = (X 1 , . . . , X n ) N takes values in Denote the corresponding sequence of discrete probability distributions by π N , i.e., We use this setting to first make some general observations and statements. Let x i and proj i ( We allow the following abuse of notation for simplicity, where q : R → R is a function, and write Following [24,42,7] we introduce three notions for condensation. Definition 5.1. In the setting of (5.2) we define the following notions of condensation: Next, we show (C1) is the strongest and (C3) is the weakest notion given in definition 5.1; the simple proof is omitted. ( (2) holds this implies (C2); and if (3) holds this implies (C3).
Remark 5.4. If a random vector X = (X 1 , . . . , X n ) takes value in Z n ≥0 , then conditioning on the sum being N gives a sequence of random variables as in (5.2).
Both inclusion processes on Z S ≥0 (or other conservative IPS) and mass-preserving CRNs (see Remark 3.2) are continuous-time Markov chains with positive recurrent stationary stochastic dynamics confined to finite sets of the form (5.1), indexed by N . We consider this setting as in the beginning of section 5.2 with n = |S|. We only treat product-form stationary distributions and assume they are given by a family of (product-form) probability distributions of the form along sets (resp., irreducible components for CRNs) of the form (5.1) where for simplicity w i (0) = 1 and μ i > 0. For a fixed mass-preserving CRN (G, κ) we denote the stationary distributions on the irreducible component with total molecule number equal to N by π G,N (x). If we make a more general statement we stick to the notation π N (x). Observe also that the definitions of condensation are independent of the product-form assumption of (5.3). We first check that for Poisson product-form stationary distributions we have no condensation. One can reduce the statement to two species using the multinomial theorem, from which it is easy to deduce the conclusion. Remark 5.7. Note that the limit of the quotient (5.4) exists for w(m) if and only if the limit for q(m) exists.
The conditions (1) and (2) of Theorem 5.8 only require the product-function for coordinate i * to dominate the others, which gets rid of the assumption of both existence of limit (5.4) and normalizability. So for a big class of product-form function when paired with asymmetric μ i the stochastic dynamics show a weak form of condensation as in Definition 5.1 (C3).
be a family of probability measures given by product-form functions w i for x ∈ Z n ≥0 , |x| = N , and where Z N is the normalizing constant defined by Assume there is an S * i ∈ S such that 1. μ i * > μ j when j = i * ; 2. for all S j ∈ S and all α > 0 there is c α,j ∈ R >0 and an M c ∈ N such that for any M > M c and all r ∈ {0, . . . , M} we have Then π N condensates on the subset S * = {S i * }, that is, for all δ ∈ (0, 1), i.e., we have a weak form of condensation as in Definition 5.1 (C3) and we have a strong law of large numbers X i * N → 1 a.s. as N → ∞. Proof. Let δ ∈ (0, 1) be arbitrary; we will show the equivalent statement that probability of the complement to this set goes to zero. We want to estimate We first use the inequality .
We assume that μ 1 = max Sj ∈S\S i * μ j . We will recursively apply the second hypothesis of Theorem 5.8 |S| times with a fixed α > 0 chosen such that So we can apply |S| times the second hypothesis in the form w j (x j )w i * (r) ≤ c α,j e α(xj +r) w i * (x j + r) ≤ c α,j e αN w i * (x j + r).
We will not write explicit dependence on the constants c α,j (where S j ∈ S) and just write c for their assembly, since the results derived are asymptotic and hold up to multiplication by constants. With this we derive the inequality Applying this inequality together with Then utilizing this inequality at the same time as a rough inequality for the number of integer points on the simplex we get We exploit that for to obtain the following inequality Since the terms in the sum do not depend on m we estimate to upper bound the number of terms in the sum and get Now observe ( μ1 μ i * ) δ e α|S| < 1 by (5.5) and the other factor is a polynomial in N so that we conclude that this expression goes to zero for N → ∞. Since X i * ≤ N we use the Borel-Cantelli lemma applied to sums of and conclude X i * N → 1 a.s. as N → ∞. The finiteness of the series follows by combination of the direct comparison test and the ratio test for sequences applied to the final inequality term we derived in (5.6).
Remark 5.9. Let (a n ) n∈N be a sequence of positive real numbers with If b = 1, then this implies that for all α > 0 there exists c α such that for all m ∈ Z ≥0 If 0 ≤ b < 1, then this implies that for all α > 0 there exists c α such that for all m ∈ Z ≥0 a m ≤ c α e αm .
These conclusions follow directly from (5.7) by limited growth (arguments, e.g., as used in the ratio test for series); also see [24, 3.2 Generalizations].
As an application of Theorem 5.8 and Remark 5.9 we show that some asymmetric autocatalytic CRNs exhibit condensation as in Definition 5.1 (C3) if they have at least bimolecular reactions.
Corollary 5.10. Let (G * , κ) be an autocatalytic CRN with highest molecularity denoted by n * . Assume n * ≥ 2 and that there is an S i * ∈ S with incoming reaction of molecularity n i * = n * such that for all other species S j ∈ S of the same molecularity n j = n * we have κ) shows a weak form of condensation as in Definition 5.1 (C3) and we have a strong law of large numbers X i * N → 1 a.s. as N → ∞. Proof. By assumption and Theorem 4.1 we have that for x in the corresponding irreducible component. It is enough to show that we can find product-form functions μ i , w i (m) such that the conditions of Theorem 5.8 are satisfied with μ m i w i (m) = λ m i p i (m) for all m ∈ N. We distinguish the following cases: (n * = 2) Let λ j β 2 j = max{λ i β 2 i |S i ∈ S \ S i * }, and we can assume λ j β 2 j = 0 since otherwise the statement is trivial. Then for the species S k with β 2 k = 0 we choose and for species with β 2 k = 0 we choose a small > 0 such that To prove (2) we first recall the asymptotic description of the Gamma function following Wendel's inequality from [29]. This gives Γ(x + y) Γ(x) x y for y ≥ 0, x → ∞.
Applying this to our product-form functions w k (m) of species with β 2 k = 0 gives for a constant c > 0. In particular we have that the limit lim m→∞ w k (m + 1) w k (m + 1) = b exists in both cases, if β 2 k = 0, then b = 0, and if β 2 k = 0, then b = 1. From this and Remark 5.9 it is easy to see that (2) is satisfied. (n * > 2) The same principle applies to cases with higher molecularity; condition (2) is then a special case of Lemma A.2.
Remark 5.11. Theorem 5.8 also allows an interpretation along remark 5.4 with a condensation phenomenon for a family of independent random variables Y 1 , . . . , Y n with values in Z ≥0 , as in [24, 3.2 Generalizations] but without assumption of existence of limit (5.4) and normalizability.
Next we show that the stationary distribution asymptotically concentrates on the disjoint singleton sets {x ∈ Γ N |x j = N } ⊂ Z n ≥0 , where S j is a species with maximal product-form function f j = f (maximal as in the sense below). This confirms the existence of the strongest version of condensation as in the Definition 5.1 (C1) in the three-or higher molecular autocatalytic CRN. By Lemma 5.2 this implies all other forms of condensation. The result is similar to [7, Proposition 2.1], where they studied a reversible inclusion process whose diffusion constant decreases along irreducible components. Moreover the proof of [7, Proposition 3.2] follows a similar strategy.
Proof. We will repeatedly use that for S i ∈ S \ {S 1 , . . . , S k } We are done by combining Lemma A.2 with (5.9) by distinguishing the cases f = f 2 or λ 1 β n1 1 > λ 2 β n2 i . 2. |S| = n → |S| = n + 1: Assume S \ S n+1 has j species with product-form function f . We denote for a subset of species A ⊆ S to write the partition function as follows: We apply the induction hypothesis on the Z N −i,S\Sn+1 and get (1)].
Factorizing out f (N ) we obtain We apply Lemma A.2 and get (1) .
Now if f n+1 is also a maximal product-form function f , we obtain Z N = f (N )(j + 1 + o(1)), otherwise, using identity 5.9 we stay with Z N = f (N )(j + o (1)).
Remark 5.13. Both Corollary 5.10 and Theorem 5.12 are based on the asymptotic analysis of the product-form functions in the stationary distribution. Hence the results carry over to mass-preserving CRNs which have a complex balanced (hence Poisson product-form function) and an autocatalytic part (see Remark 4.3, Example 4.4, or Definition 2.6). The asymptotic analysis of the product-form should also be related to corresponding open CRNs.