Computation and optimal perturbation of finite-time coherent sets for aperiodic flows without trajectory integration

Understanding the macroscopic behavior of dynamical systems is an important tool to unravel transport mechanisms in complex flows. A decomposition of the state space into coherent sets is a popular way to reveal this essential macroscopic evolution. To compute coherent sets from an aperiodic time-dependent dynamical system we consider the relevant transfer operators and their infinitesimal generators on an augmented space-time manifold. This space-time generator approach avoids trajectory integration, and creates a convenient linearization of the aperiodic evolution. This linearization can be further exploited to create a simple and effective spectral optimization methodology for diminishing or enhancing coherence. We obtain explicit solutions for these optimization problems using Lagrange multipliers and illustrate this technique by increasing and decreasing mixing of spatial regions through small velocity field perturbations.


Introduction
Analysing complicated flows through their transport and mixing behavior has been and still is attracting a great amount of attention [MMP84, RKLW90, Wig92, HP98, Are02, JW02, Wig05, SLM05, FP09, Thi12, FPG14, KK16, KR18,HKK18], both from geometric and probabilistic points of view. Non-autonomous time-aperiodic dynamics poses additional difficulties, especially the case of finite time, where asymptotic notions cannot be applied.
The current work has two main contributions.
(i) Extending the work from [FK17], that deals with the time-periodic case, to the situation of general aperiodic finite-time dynamics. We detail a method to compute finite-time coherent sets [Fro13] of aperiodic flows that does not require any time-integration of trajectories.
(ii) A technique to find a small perturbation of the underlying aperiodic vector field in a prescribed ball in a space or subspace of vector fields, which optimally enhances or destroys the existing finite-time coherent sets. This extends optimization results in [FS17], which considered the time-periodic setup of [FK17], to (a) aperiodic dynamics and to (b) infinitedimensional velocity field space.

Augmentation
The key construction in [FK17] is the representation of a τ -periodically forced flow on phase space X ⊂ R d as an autonomous flow on time-augmented phase space τ S 1 ×X, where S 1 denotes the unit circle. On this time-expanded phase space, the time coordinate is simply advanced at a constant rate: x (t) = v(t, x(t)) θ (t) = 1, Finite-time coherent sets on the time interval [0, τ ], as introduced in [FSM10,Fro13], were extracted from singular functions of the transfer operator P 0,τ , where the transfer operator is the linear operator describing the evolution of distributions under the dynamics subject to a small random perturbation. The crucial observation is that singular modes of P 0,τ are eigenmodes of P * 0,τ P 0,τ , where P * 0,τ denotes the dual of P 0,τ , and that for area-preserving dynamics (corresponding to divergence-free velocity fields v) the dual is the transfer operator of the time-reversed dynamics, i.e., the (again, slightly stochastically perturbed) flow governed by the time-reflected velocity field (t, x) → −v(τ − t, x). The dynamic interpretation of this operator-based characterization is that (finite-time) coherent are those sets that are to a large extent mapped back to themselves by a noisy forward-backward evolution of the dynamics. This operator-based framework not only gives a qualitative framework for coherence; the singular values of P 0,τ provide quantitative bounds for coherence [Fro13,FPG14]-namely the closer the singular value is to one, the less mixing occurs between the coherent set and its exterior under the noisy dynamics.
By concatenating the "forward-time" and "backward-time" velocity fields on time intervals [0, τ ] and [τ, 2τ ]-see (9) below-we will construct a system on the augmented space [0, 2τ ] × X that mimics the forward-backward evolution of the dynamics. In this way, the eigenmodes of the (Fokker-Planck-) generatorĜ of this augmented system yield the eigenmodes of P * 0,τ P 0,τ ; i.e., the singular modes of P 0,τ , and from these singular modes the desired finite-time coherent sets. Again, the corresponding eigenvalues of the generator can be used to give quantitative bounds on coherence/mixing, see Theorem 8. We note that a numerical approach to extract coherent sets from P 0,τ by solving the Fokker-Planck equation has been described in [DJM16]. In contrast to [DJM16] we do not require time-integration over [0, τ ], which is especially advantageous once we consider the optimization of coherence and mixing.
Connecting the spectral properties of the generator of the augmented-space system with the (finite-time) dynamical properties of the original system is a generalization of the results from [FK17], where it has been done for time-periodic velocity fields on infinite time intervals. Through time-reflection of the finite-time problem we construct a time-periodic one, to which we apply the concepts of [FK17]. We also remove an assumption from [FK17] (the "niceness") on the so computed sets, thus strengthening the approach. There are several further non-trivial adjustments needed to fit the theory of [FK17] to this time-reflected setting, and the necessary details are covered in sections 2 and 4.
The interplay between the spectra of the dynamics in augmented space and the non-autonomous dynamics in the original space has strong connections to the correspondence between evolution semigroups [How74] and two-parameter evolution families, as elaborated in, e.g., [CL99]; see also [EN00, Section VI.9] for a general introduction. We mention that by a similar construction, spatio-temporal dynamical patterns were extracted in [GD17] by considering the generator of the Koopman operator (the adjoint of the transfer operators considered here) associated with the augmented-space dynamics.

Manipulation of coherence and mixing
There are several different ways to measure mixing and mixedness under (stochastically perturbed) dynamics, such as considering dispersion statistics or the change of variation in a concentration field; see e.g. [Pro99,LH04,TDG04,Thi08]. Multiscale norms of mixing measure how "oscillatory" a concentration field is [MMP05]; see [Thi12] for a review. The most widely used approaches to the problem of mixing optimization search for switching protocols between some fixed velocity fields in order to optimize some topological [BAS00] or other mixing measure [MMG + 07, CAG08,OBPG15]. Other strategies include optimising the diffusion component of the dynamics [FGTW16], the optimal distribution of concentration sources [TP08] and geometric dynamical systems techniques [Bal15]. An interesting theoretical result is that arbitrary mixedness under advection-diffusion can be achieved in finite time solely by sufficiently increasing the strength of the (otherwise fixed) advective flow [CKRZ08]. If there are no restrictions to the choice of the velocity field, one can choose the one that is optimally mixing the actual concentration at every time instance [LTD11]. We also note that a related problem to mixing enhancement arises in statistical mechanics [LNP13] where the convergence toward the stationary distribution should be accelerated, e.g., to increase the efficiency of sampling.
Instead of focusing on one fixed concentration field, we will bound the mixing characteristics of a flow in terms of the objects that most inhibit mixing: coherent sets. As we mentioned above, finite-time coherent sets are characterized by the singular vectors of the transfer operator, and, equivalently, by the eigenvectors of the generator of the augmented-space process, while the corresponding eigenvalue delivers an upper bound on transport between the coherent set and its exterior. Thus, we can quantitatively access the mixing behavior of a flow on finite time through the spectrum of the augmented generatorĜ, and can target these eigenvalues if we want to enhance or diminish mixing.
Given a default velocity field v and non-autonomous perturbations u ∈ U from an admissible space U of divergence-free velocity fields, our approach considers "small" u that change a targeted eigenvalue µ ofĜ (thus also the singular value σ of P 0,τ ) locally optimally. This procedure can be iterated to obtain a larger perturbation in a gradient-method fashion. Optimizing singular values of the transfer operator P 0,τ directly is difficult as it would necessarily involve the variation of the nonlinear dynamics under the velocity field u. Instead, a linearized optimization of the eigenvalue of the generatorĜ leads to a very simple optimization problem (44), which can be solved via a linear system of the same dimension as U . Moreover, the theory holds for infinite-dimensional perturbation spaces U as well, since Fréchet differentiability of the transfer operator and its spectrum with respect to perturbing velocity fields has been established in [KLP19].

Overview
This work is structured as follows. In section 2 we introduce the L 2 -function based formalism to study advection-diffusion systems by the Fokker-Planck equation and its evolution operator, the transfer operator, in forward and backward time. In section 3 we consider purely advective transport between a family of sets and its exterior in terms of fluxes through the boundary of the family; this is a geometric analogue of the operator-based considerations that follow. Section 4 introduces the new reflected augmented generator needed to handle aperiodic, finitetime driving. We state formal connections between the spectrum of the reflected augmented generator and the reflected transfer operator, and provide spectral-based bounds on the maximal possible coherence of sets in phase space under the aperiodic dynamics. Section 5 contains a numerical demonstration of the efficacy of our trajectory-free approach. Section 6 describes the formal setup of the optimization problem designed to manipulate the position of the dominant spectral values of the reflected augmented generator, culminating in an explicit expression for the optimal time-dependent local perturbation of the velocity field. Section 7 specializes the infinite-dimensional results of section 6 to the numerical setting via discretization and includes a variety of examples of coherence reduction and enhancement. We conclude in section 8.

Advective-diffusive dynamics
Let X ⊂ R d be a bounded and open set with compact and smooth (piecewise C 4 ) boundary. We consider the time interval [0, τ ] and the dynamics with reflecting boundary conditions for v ∈ C (1,1) ([0, τ ] × X; R d ) 1 and (w t ) t≥0 being a standard Wiener process in R d . The initial point x 0 is distributed according to some initial density f 0 ∈ L 2 (X). The evolution of the density of the governing equation (1) where ∂ ∂n is the normal derivative on the boundary. Associated to (2) is an evolution operator P 0,t : L 2 (X) → L 2 (X) that transports a density f 0 ∈ L 2 (X) at time 0 to the solution density of (2) at time t. The evolution operator P 0,t is an integral operator with stochastic 2 kernel k(t, ·, ·) : X × X → R + that satisfies [Fro13, Assumptions 1 and 2].

Construction of a forward-backward process
For simplicity of presentation we assume that the velocity field v(t, ·) is divergence free for all t ∈ [0, τ ]. We note that the remaining arguments in this section may be carried through for general velocity fields. Denote by ·, · H the canonical scalar product of a Hilbert space H. Following [Fro13] in the volume-preserving setting 3 , coherent sets over the time interval [0, τ ] are extracted from the eigenfunctions of P * 0,τ P 0,τ corresponding to large eigenvalues, where P * 0,τ is the L 2 -adjoint of P 0,τ , defined to be the unique linear operator satisfying for all f, g ∈ L 2 (X). The eigenvalues of P * 0,τ P 0,τ (the singular values of P 0,τ ) are known to lie in the interval [0, 1] (cf. [Fro13,p.3]). The rationale behind the operator P * 0,τ P 0,τ is that P 0,τ describes evolution in forward time, P * 0,τ describes evolution under the time-reversed dynamics, and coherent sets are characterized exactly by the property that they are "stable" under a noisy forward-backward evolution of the dynamics.
differentiable in t and cotinuously differentiable in x. 2 Doubly stochastic if the flow is volume-preserving. 3 The operators P0,τ and P * 0,τ are replaced by normalised versions for nonzero divergence velocity fields; these are denoted by L and L * in [Fro13].
The adjoint operator P * t,τ is the solution operator to the Kolmogorov backward equation [PS08]: The operator P * t,τ maps a density g τ at time t = τ backward in time according to (4) to produce a density g t at time t < τ . We may simplify (4) using volume preservation: Thus we may write (4) as Reversing time in (4) to obtain an initial value problem we get (2) and (7) we see that the natural evolution of the adjoint problem (the Kolmogorov backward equation) corresponds to the forward problem (the Kolmogorov forward equation) of the time reversed dynamics. We wish to construct the process over the time interval [0, 2τ ] that corresponds to the operator P * 0,τ P 0,τ . We view P * 0,τ as evolution on the time interval [τ, 2τ ] and we therefore shift (7) by τ time units, definingṽ(t, x) :=v(t − τ, x) = −v(2τ − t, x) to obtain a forward problem on [τ, 2τ ]: We denote the solution operator of this problem asP τ,t (= P * 2τ −t,τ ). Finally, we concatenate the two forward problems (2) and (8) to make a single process over [0, 2τ ]. We mark objects that live on this extended interval [0, 2τ ] with a hatˆ. Define the velocity fieldv using the reflection map The resulting velocity fieldv exhibits discontinuities in 0, τ and 2τ whenever it does not vanish there, but one-sided derivatives exist 4 In what follows, we will solve the Fokker-Planck equation over the interval t ∈ [0, 2τ ]; more precisely on (0, τ ) ∪ (τ, 2τ ) with L 2 -continuous concatenation at t = τ (see Prop. 5). Let us summarize the above construction with the following proposition.

Cumulative flux from a reflected family of sets
Before proceeding with the operator-based description of finite-time coherence, in this section we analyse the reflected dynamics by its flux through the boundary of a moving (possibly coherent) set. Our intention behind connecting this to a flux in augmented space (i.e., space-time) in Proposition 3 is partially to set the stage for the augmented-space operator-based description in section 4. Apart from strengthening the intuition for the forward-backward construction that is used in here, the results of the section 3.2 are not formally needed for the rest of the paper.

Augmentation and reflection
For a family of sets {A t } t∈[0,τ ] , A t ⊂ X, we consider the augmented set in the augmented state space X := [0, τ ] × X. Let n(x) denote the unit outer normal on ∂A at x ∈ X and v the augmented velocity field defined by v(x) = (1, v(θ, x)) .
We define a reflected family of sets and the augmented reflected set see Figure 1.

Outflow flux
We consider a family of d-dimensional sets {A t } t∈[0,τ ] , A t ⊂ X ⊂ R d satisfying the following assumptions (the boundaries are piecewise smooth in space and differentiable in time): Assumption 2.
1. There exists a co-dimension 1 parameterisation set R ⊂ R d−1 such that for each t ∈ [0, τ ] there is a bijective function a(t, ·) : R → ∂A t with a being piece-wise smooth, 2. The mapping b(t, x) := ∂a ∂t (t, r) is well defined for all t ∈ [0, τ ] and all x ∈ ∂A t , where a(t, r) = x.
The cumulative outflow flux under the vector field v(t, ·), t ∈ [0, τ ] from a family of sets {A t } t∈[0,τ ] is given by where (·) + denotes the positive part, S(x) is the d − 1 dimensional surface measure and n t (x) is the outer normal unit vector. The result [FK17, Theorem 2] shows that (14) is equal to the instantaneous outflow flux from the set A, defined by with S(x) denoting the d-dimensional surface measure. We extend this result to the reflected velocity fieldv(t, ·), t ∈ [0, 2τ ], generated by a general aperiodic velocity field.
In particular, recalling ζ from (10), for every t ∈ [0, 2τ ] the boundary ∂Â t has a parametrizationâ(t, ·) = a(ζ(t), ·) : R → ∂Â t and 5 C 1 in t and C 0 in r At t = τ the right-and the left-sided partial derivatives of a with respect to t exists but they may not be equal. The family of normal vectors is mirrored in time (see Figure 1):n t (x) = n ζ(t) (x) for t ∈ [0, 2τ ].
Proposition 3. The cumulative outflow flux from the family of setsÂ t , t ∈ [0, 2τ ] under the vector fieldv(t, ·), t ∈ [0, 2τ ], is equal to the cumulative absolute flux in and out of the family of Furthermore, the time-integrated flux (17) is equal to the instantaneous absolute flux in augmented space: Proof. Let us first prove the first equality. Therefore we do not need objects of the augmented setting yet. We split the integral and see that the first integral is already a part of what is needed. So we only need to treat the second integral (19). We use Fubini's theorem and substitute using ζ with g ζ(t) (r) being the Gram determinant. (It is important to note that we need and use substitution in one dimension, the time dimension, because we need the sign we get from substitution, which we would not get in higher dimensions.) Combining the two calculations above we get the desired result. The second equality involving the objects of the augmented setting follows analogously to [FK17, Theorem 2].

Coherent families of sets and the generator on augmented phase space
In this section we create a so-called spectral mapping theorem for our reflected augmented process (Proposition 5) and derive a bound for the finite-time coherence of a family of sets {A t } t∈[0,τ ] in terms of the second eigenvalue of a generator (the infinitesimal operator) of our augmented reflected advection-diffusion process (Theorem 8). To do this we build on discretetime theory from [Fro13] with the periodic continuous-time theory from [FK17].

The evolution operator for the reflected process
It is well known that P s,s+t , t > 0, is a compact, integral preserving, real and positive operator on L 2 (X) while t → P s,s+t f is continuous as a mapping from [0, ∞) to L 2 (X) for any fixed f ∈ L 2 (X) 6 . FurthermoreP 0,2τ = P * 0,τ P 0,τ is a self-adjoint operator on L 2 (X) with simple largest eigenvalue λ 1 (P 0,2τ ) = 1. Following [Fro13] one has that the second eigenvalue λ 2 (P 0,2τ ) satisfies where in the volume-preserving 7 setting µ 0 and ν τ are both simply the Lebesgue measure. We now consider the problem (11) introduced in section 2 as a time-periodic problem on 2τ S 1 × X (we extendv periodically). Following the considerations of section 2 the evolution operator P s,s+t starting from time s, w.l.o.g. s ∈ [0, 2τ ], flowing for time t ≥ 0 to s + t = kτ + r, k = s+t τ ∈ N ∪ {0} and (s + t) mod τ = r ∈ [0, τ ), is given bŷ The situation when t is exactly 2τ is of particular importance: Note thatP s,s+2τ is self-adjoint for s = kτ , k ∈ N ∪ {0}.

The time-augmented generator and evolution family
We now turn to the augmented reflected system inX := 2τ S 1 × X, and note that (ŵ t ) t≥0 is a standard Wiener process in R d ; in particular it is not constructed by time reflection. We define augmented versions ofx t ,v, ε, andŵ t , denoting them with bold symbols: andŵ t is a d + 1 dimensional standard Wiener process. The augmented system is Considering the time-periodic version of problem (11) withv onX we formulate a Fokker-Planck equation in augmented space. To avoid confusion with the (new) state variable θ, we write dependence on time t as a subscript of the augmented function, i.e., f t :X → R for all t ≥ 0. The augmented Fokker-Planck equation is Note there is no diffusion in the θ direction, as per the definition of ε.
We will now consider the augmented Fokker-Planck equation as a linear differential equation in the space L 2 (X). Its right-hand side is given by the so-called (augmented) infinitesimal generatorĜ : D(Ĝ) ⊂ L 2 (X) → L 2 (X), with domain D(Ĝ) defined as the subspace of L 2 (X) on which the generator is well-defined in terms of semigroup theory [Paz83,EN00]. The augmented Fokker-Planck equation (24) in augmented space then reads as We may also write (24) and (25) in terms of the non-autonomous ("unaugmented") dynamics: whereĜ(θ) is the right-hand side operator of (11) at time t = θ ∈ 2τ S 1 , i.e., the time-θ differential operator of the Fokker-Planck equation (on [0, 2τ ]).

Remark 4.
(a) The periodicity in θ and the boundary conditions from (25) are, as is common in semigroup theory, encoded in the domain ofĜ. This domain D(Ĝ) enforces continuity conditions in θ at 0, τ, 2τ .
(b) For the purposes of the current work we will not require the well-posedness of the problem (25). We will only need the operatorĜ defining the right-hand side of this equation, and its relation to the transfer operator familyP s,s+t , s < t. This will be the focus of section 4.3. For additional theory for these augmented problems, objects, and related results we refer the reader to [CL99].
(c) Any non-constant solution f : (t, θ, x) → f t (θ, x) to (25) has three input variables, t, θ and x, and may have different regularity properties in each variable. In the following we will focus on eigenfunctions ofĜ that are of course constant in t.
We now comment on the crucial connection between solutions of (25) andP s,s+t , neglecting the issue of solvability. Note that the stochastic augmented differential equation (23) allows for evolving the non-autonomous equation (1) from any initial time s by setting θ 0 = s. In an analogous manner, the augmented Fokker-Planck equation (25) with initial condition f 0 evolves every initial condition f 0 (s, ·), s ∈ [0, 2τ ),-i.e., a configuration of initial conditions-by the non-autonomous reflected Fokker-Planck equation (11). More precisely, the following holds for the evolution of (25): In the terminology of semigroup theory [CL99,EN00] the solution operators of (25), here formally denoted by (e tĜ ) t≥0 , form an evolution semigroup (or Howland semigroup), and it is given exactly by (27). Informally, the action of e tĜ in the context ofP θ,θ+t can be described as follows. On the left-hand side of (27), e tĜ takes the initial configuration f 0 (on all ofX) and evolves the entire configuration for the time duration t, to obtain f t . The result is then evaluated at the θ + t mod 2τ fiber. The θ + t mod 2τ fiber of f t corresponds to the θ fiber of f 0 evolved for time t due to the constant drift in the θ variable: −∂ θ , see (26). That equation (27) indeed gives the solutions to (25), is a conseqence of [CL99, Theorem 6.20] adapted to the concatenation of the forward and backward evolutions described by the reflected system (11). As we will not require a result of this generality, we omit the details. For our purposes it will be sufficient to consider the special case, where f 0 = f is an eigenfunction ofĜ. This is done next.

Eigenfunctions of the time-augmented generator
Analogously to [FK17, Lemma 22] the following result holds. It can be obtained from (27) by noting that every eigenpair (µ, f ) ofĜ gives a solution to (25) by f t = e µt f . However, to highlight the intuitive connection between the augmented generator and the non-autonomous problem, we prove the following proposition by other means.
Proposition 5. Let f be an eigenfunction ofĜ corresponding to the eigenvalue µ ∈ C. One has thenP s,s+t f (s, ·) = e µt f (s + t mod 2τ, ·) for all s ∈ 2τ S 1 and t ≥ 0.
Proof. We will prove (28) following ideas from [FK17]. First we apply Theorem 19 piecewise on [0, τ ] and [τ, 2τ ] concerning well-posedness and regularity. Therefore we consider the original problem (2) and the reflected, shifted, time-reversed problem (11). Now theorem 19 guarantees for any initial condition f 0 ∈ L p (X), p ∈ (1, ∞), the unique existence of a function f with the regularity and the properties Further f (t) ∈ D(Ĝ(t)) holds for all t ∈ (0, 2τ ]. Now we can proceed as in [FK17,Lemma 22]. Let µ ∈ C and f ∈ D(Ĝ) withĜf = µf . According to the construction above (26) we know and this implies, in accordance with (25), for all θ ∈ 2τ S 1 \{0, τ } NowP θ,θ+t is the evolution operator to the evolution equation Therefore the function e −µtP θ,θ+t f (θ) solves (29) uniquely and 19 guarantees continuity in θ. Therefore we can connect the eigenfunctions of the augmented reflected generatorĜ with the evolution given byP s,s,+t for all s, t as stated in the claim.
Let µ be an eigenvalue ofĜ with an eigenfunction f . Inserting s = 0 and t = 2τ into Proposition 5 yieldsP This is a spectral mapping theorem type of result, as it connects the eigenvalues and eigenfunctions of the evolution operatorP s,s+2τ with those of an associated (infinitesimal) gen-eratorĜ. We refer the reader to standard literature on classical results for operator semigroups [Paz83,EN00].
Remark 6. Theorem 19 guarantees that for initial conditions f s ∈ D(G(s)) the solution P s,s+t f s = f (t) is in t a continuous mapping to the domain of the generator, D(G(s + t)) = D p . Theorem 20 further gives for each eigenfunction f that f : θ → f (θ, ·) ∈ C(2τ S 1 ; D p ) and that f ∈ C(X). This regularity is utilized in the proof of Theorem 8 below.

Coherent families of sets
In the specific case where the velocity field v is periodic in time, [FK17] shows that the families of sets have an escape rate (see [FK17,Definition 8]) of at most Re(µ 2 ), where µ 2 is the first nontrivial eigenvalue ofĜ corresponding to the eigenfunction f . Because we consider the dynamics on a finite time interval, this notion of escape rate is replaced by the concept of a coherence ratio [Fro13]. In the general setting of aperiodic v we will quantify the coherence of families {A ± θ } θ∈[0,τ ] and provide a construction of highly coherent families with associated rigorous coherence bound.
It was shown in [FK17, Appendix A.6] that for a family of sets with sufficient regularity (called "sufficient niceness" therein) the quantity (32) is well defined. Here we will alleviate this requirement entirely by showing regularity of the augmented eigenfunctions f , and showing that this is sufficient to prove the desired results.
Theorem 8 makes a link between the coherence of a particular family of sets defined by zero super/sublevel sets of an eigenfunction ofĜ and the corresponding eigenvalue µ. It shows that the probability of a trajectory remaining in a family of sets constructed from the positive and negative parts of eigenfunctions ofĜ decays no faster than the rate given by the corresponding eigenvalues. This result extends to aperiodically driven continuous-time systems, similar results for autonomous systems in discrete time [FS10] and continuous time [FJK13], and periodically driven dynamics in continuous time [FK17].
where |A| denotes the non-normalised Lebesgue measure of the set A. In particular, eigenfunctions at eigenvalues µ ≈ 0 yield families of sets with high coherence ratio.
Proof. See Appendix B. We note that by Corollary 21 f is continuous onX.
Intuitively, the left hand side of (33) quantifies the likelihood of escape from the family of sets, and the right hand side of (33) is a (scaled) measure of mixing; the bound says that the likelihood of escape is less than the mixing incurred over the same time duration. The bound is not intended to be sharp; we remark that one could optimise the level set cutoff to improve the ratio ρ m , as has been done in previous work on coherent sets [FP09].
In Section 5.3, we consider the dominant 6 eigenvectors ofĜ and apply sparse eigenbasis approximation (SEBA) [FRS18] to find a sparsity-inducing rotation of this eigendata and separate individual slow escape / slow mixing subdomains. The following proposition generalises Theorem 8 so that it may apply to vectors formed from linear combinations of eigenvectors.
The proof Proposition 9 is deferred to Appendix B.
In the computations performed in the next sections we will numerically approximateĜ, compute its upper spectrum and associated eigenfunctions, and plot super/sublevel sets of the eigenfunctions. In Section 5.3 we will additionally apply SEBA, plot the sparse basis functions, and one of the superlevel sets.

Numerical discretization
We use the "Ulam's discretization for the generator" approach developed in [FJK13] for autonomous flows and extended in [FK17, Sections 7.2 and 7.3] for nonautonomous flows. In brief, referring to the above papers for further details, the Ulam discretization for the generator yields a matrix that may be interpreted as a rate matrix of a finite-state, continuous-time Markov chain, with the states corresponding to a partition of τ S 1 × X into hypercubes (hyperrectangles) in R d+1 . The entries which correspond to rates between boxes adjacent in the temporal coordinate direction (in which the time evolution is a rigid rotation of constant velocity 1) are given by 1/h, where τ S 1 is discretized into intervals of length h. The entries in the remaining d space directions are computed from the rate of flux out of the hypercube faces by numerical integration of the component of the velocity field normal to (and pointing out of) the face; see the expression for G drift n in [FK17, Section 7.2]. The entries of the rate matrix corresponding to diffusive dynamics (in the d space coordinates only, there is no diffusion in the time coordinate) are computed from a finite-difference approximation of the Laplace operator; see the expression for G diff n in [FK17, Section 7.2]. We then set G n := G drift n + G diff n . The matrix G n can also be interpreted as a rate matrix for a finite-state Markov chain; it has an eigenvalue 0 and its spectrum is confined to the left half of the complex plane.
The reflected velocity fieldv from (9) may be substituted for the velocity field v used in [FK17] and the methodology of [FK17] employed; this is the approach taken in the numerical experiments below.
Remark 10. The computation of G drift n uses only the outward-pointing velocity field values on the faces of the partition elements-similarly to how the outward flux is defined through the positive part of the inner product in (15)-, discarding the inward-pointing parts. Because of the reflected structure ofv, a slightly more efficient implementation would be to store the evaluations of the velocity field normal to hypercube faces in both directions (not only in the outward-pointing direction). The outward-pointing components would be used on the time interval [0, τ ], while the inward-pointing components would be used on the time interval (τ, 2τ ), where they are outwardpointing because of the sign flip in (9)-similarly as it happens in the proof of Proposition 3. This would reduce by half the computational effort in evaluating the velocity field components normal to the hypercube faces. However, the assembly of the generator matrices is relatively fast anyway, and we have not tried to optimize our implementation of Ulam's method for the generator in this reflected setting.

Example: Periodically driven double gyre
We consider the periodically driven double gyre system [SLM05]: x and the parameters are A = 0.25, Ω = 2π, and γ = 0.25, implying the forcing period 1, on the spatial domain X = [0, 2] × [0, 1]. This system has been a standard example of coherent sets [FPG14]. The purpose of this section is to show that our method reliably computes the singular functions and values of P 0,τ ; further analysis is deferred to later sections. In particular, we will revisit this example in context of optimal manipulation of these coherent sets in sections 7.3 and 7.4. The augmented reflected generator approach with a resolution of 40 × (100 × 50) and noise intensity ε = 0.1 gives us the non-trivial dominant eigenvectors ofĜ at time t = 0 shown in Figure 2. Ordered by ascending magnitude, the 3rd, 4th and 6th eigenvalues (Table 1) and eigenvectors (not shown) correspond to features also detected in [FK17], where they were connected to complex non-companion eigenvalues-the concept of companion eigenvalues will µ 1 0 µ 4 −0.35061 σ 1 1 σ 4 0.24599 µ 2 −0.09033 µ 5 −0.44766 σ 2 0.69674 σ 5 0.16685 µ 3 −0.34938 µ 6 −0.45702 σ 3 0.24720 σ 6 0.16072 Table 1: Eigenvalues (µ k ) ofĜ ordered in ascending magnitude and corresponding approximate singular values (σ k ) of P 0,τ according to (31).
be introduced around (35) below. These features become less coherent, i.e., their respective real eigenvalues decrease compared with the others, as the length τ of the time interval increases.

Example: Bickley jet
We now apply the reflected augmented generator approach to a perturbed Bickley Jet [RBBV + 07].
The following model describes an idealized zonal jet in a band around a fixed latitude, assuming incompressibility, on which two traveling Rossby waves are superimposed. The velocity field v = (− ∂Ψ ∂y , ∂Ψ ∂x ) is induced by the stream function A n cos (k n (x − c n t)) .  9]. For good numerical tractability, we resolve our reflected space-time manifold with a spatially somewhat coarse 108 × (120 × 36) grid, that is uniform in space (120 × 36) leads to square boxes needed for isotropic diffusion) and sufficiently finely resolved in time (108). We choose ε = 0.1. The system described above is equipped with homogeneous Dirichlet boundary conditions instead of homogeneous Neumann conditions on ∂X. This leads to a slightly different spectral structure of the generator, which now generates a semigroup of sub-Markovian operators. Its leading eigenvalue is strictly less than zero. We expect Theorem 8 to hold in the case of Dirichlet boundary conditions. One possible theoretical justification would require "close" the open system that is represented by the homogeneous Dirichlet boundary conditions by introducing a virtual "external" state, then apply the Neumann theory to that system. The details would lead beyond the scope of this work, and will be discussed elsewhere.
We highlight that the computations we are about to perform here are different to those performed in [FK17, Section 7.6] in at least two respects. Firstly, the Bickley jet under investigation is aperiodically driven, in contrast to the periodically driven Bicklet jet in [FK17,Section 7.6] (we use slightly different parameters in the velocity field). Secondly, we wish to find functions that decay the least under finite-time evolution, in contrast to the problem considered in [FK17], which sought functions that decayed at the slowest time-asymptotic (t → ∞) rate under periodic driving. In particular, even for a periodically driven Bickley jet, the finite-time question considered in the present paper is different to the infinite-time question addressed in [FK17]. Thus, even though we chose a flow interval of length 9 as in [FK17], the problem in consideration is different.
Analogously to [FK17, Section 7] our time-augmentation produces companion eigenvalues. Companion eigenmodes denote eigenmodes that are "higher order harmonics" of existing eigenmodes differing only in temporal modulation, and encoding the same coherence information; see below. For more details on the companion eigenvalues for the Ulam-discretization we refer to [FK17, Section 7.3]. We will use and verify the relations derived there. Therefore we calculate the eigenvalues and vectors ofĜ with the smallest magnitude instead of largest real part using eigs(G,10,'SM') in Matlab. Table 2: Eigenvalues (µ k ) ofĜ ordered in ascending magnitude and corresponding approximate singular values (σ k ) of P 0,τ according to (31). The eigenvalues µ 7 , µ 8 , µ 10 correspond to companion modes. They do not yield purely real singular values σ 7 , σ 8 , σ 10 through the exponentiation (31), because the numerically computed companions (35) contain a bias induced by discretization; see [FK17, Section 7.3] for further details. Table 2 shows a gap after the first and sixth eigenvalue. Let us first discuss the leading 6 eigenvectors. Figure 3 shows the eigenvectors corresponding to the dominant (i.e., smallest real part) 6 eigenvalues. The first eigenvector, the quasistationary (or conditionally invariant) distribution highlights (in blue, see Figure 3(a)) parts of the domain that get pushed out of the region X of consideration. The red regions are those parts of phase space that remain longest in X under the diffusive dynamics (1). This effect is due to the outflow conditions. Note that this example is in this sense explorative, as our theory in the previous sections was only considering reflecting and not outflow boundary conditions. The second eigenvector indicates an upper/lower separation. The other four eigenvectors show combinations of coherent vortices.
To investigate which elements of the spectrum ofĜ contain genuinely new dynamical information, we checked that the complex eigenvalues ofĜ from µ 7 to µ 50 are all companion  eigenvalues equal to (recall from section 5.1 that h is the temporal grid spacing) for an eigenvalue µ ofĜ and a k ∈ Z, as derived in [FK17,Section 7.3]. Under the assumption that the eigenvector w is sufficiently smooth in time and time is sufficiently resolved (i.e., w t ≈ w t−h ), each eigenpair (µ, w) of the discretized generatorĜ has an approximate companion pair (µ−µ (k) , wψ k ), where wψ k is understood as pointwise multiplication and ψ k (t) = ω kt , which only varies in time but not in space. For additional verification we can check the correlation of the companion eigenvectors c m n (k) := ψ k w m , w n ψ k w m 2 w n 2 as in [FK17, Section 7.5]. For instance, by looking at µ 1 in Table 2 and noting that µ (±1) = 0.01015 ± 0.34887i (h = 18/108), we find that µ 7 , µ 8 ≈ µ 1 − µ (±1) are candidates for companion eigenvalues for µ 1 . The small difference in the numerical values of µ 1 − µ 7,8 and the shift µ (±1) , around 2.3 · 10 −3 in magnitude, is due to the first eigenvector not being constant in time, i.e., merely w t ≈ w t+h . Nonetheless, the complex eigenvalues µ 7 and µ 8 are companion eigenvalues to µ 1 . This is supported by the correlation for the corresponding eigenvectors is a companion to a real eigenvalue with smaller magnitude. The correlations using (36) yield results similar to those stated in the special case above.

Vortex isolation by sparse eigenbasis approximation
The space-time signatures of six coherent vortices in the Bickley flow are captured in the leading six singular vectors shown in Figure 3 (note only the initial time slice is displayed). In order to isolate these six vortices in space-time, we apply an orthogonal rotation and some sparsification to the six-dimensional subspace of R 108×(120×36) spanned by the leading six (space-time) eigenvectors shown in Figure 3. The orthogonal rotation is chosen so as to construct an approximating basis of six sparse vectors. To find such a sparse approximating basis, we applied the SEBA (Sparse EigenBasis Approximation) algorithm (see [FRS18, Algorithm 3.1]). The six sparse basis vectors ϕ k , k = 1, . . . , 6, produced by this algorithm are shown in Figure 4, and each of these vectors strongly isolates a single vortex. We emphasise that we input the full space-time vectors to the SEBA algorithm, but in Figure 4 display only the initial time slice. In Figure 5 we seed particles inside the calculated vortical features (in the super-level set {ϕ k (0, ·) > 0.4} if ϕ k is scaled to have maximum-norm 1) and evolve them forward in time to visualize the coherence. In addition to the deterministic evolution we also visualize a stochastic evolution using ε = 0.1 as in our calculations above. Both simulations use a fourth-order Runge-Kutta (-Maruyama) scheme with step size 9 4·108 = 1 48 . Figures 5(b) and (c) demonstrate the coherence of the single vortex in Figure 5(a).
We wish to apply Proposition 9 to further demonstrate that the positive parts of the ϕ k (0, ·) represent coherent sets. Because the ϕ k (0, ·), k = 1, . . . , 6 do not exactly span the leading six-dimensional eigenspace ofĜ, Proposition 9 does not directly apply. Nevertheless, using the linear combinations of eigenfunctions f i that result in the SEBA-features ϕ k (0, ·) we found that  the hypotheses of Proposition 9 were satisfied with the following modification: wherever the contributions (34) were negative, the corresponding α i in (34) were set to zero. Any α i that needed to be treated in this way was very close to zero. We do not depict these slightly modified linear combinations as they are still very close to the SEBA-features in Figure 4.

Optimization
Having developed an efficient means of computing singular vectors of P 0,τ as eigenfunctions of the augmented generatorĜ, we turn our attention to manipulating these eigenfunctions. These manipulations will be used to control the mixing properties of aperiodic flows. Theorem 8 provides a construction of a family of coherent sets {A ± t } t∈[0,τ ] from eigenfunctions ofĜ, with a coherence guarantee controlled by the corresponding eigenvalues. Our goal now is to either enhance or diminish the coherence of a family of sets related to an eigenvalue µ k by small timedependent perturbations u(t, x) of the velocity field v(t, x). This will be achieved by optimally manipulating v(t, x) to increase or decrease the real part of the second eigenvalue µ 2 ofĜ.
For m ≥ 2 let H m ((0, τ ) × X, R d ) denote the Sobolev space of vector fields on (0, τ ) × X whose weak derivatives of order up to m are square integrable 8 . As we have previously assumed that div x v = 0 to simplify our presentation, for consistency we consider the subspace D 0 of H m ((0, τ ) × X, R d ) consisting of spatially divergence-free vector fields that satisfy homogeneous Neumann boundary conditions: ∂u ∂n = 0 on ∂X. We consider small perturbations u lying in a bounded, closed and strictly convex subset C ⊂ D 0 .
We adopt the approach of [FS17], who select a perturbation u so as to maximise the derivative of the real part of µ 2 (or a group of leading eigenvalues) with respect to the perturbation. The perturbation was made to the Ulam-discretized generator of the vector field, and optimized using linear programming; the perturbed velocity field could then be inferred from the optimized generator. In [FS17] the velocity field was assumed to be time-periodic, however, the same optimization approach could be applied to aperiodic velocity fields using the time-reflected velocity field introduced in this work; one would need only to additionally impose the relevant time-reflection constraints in the optimization problem for the Ulam-discretized generator. In the present work, we consider aperiodic velocity fields and in contrast to [FS17] we perturb the velocity field directly and solve the resulting optimization problem by Lagrange multipliers. This potentially allows for greater flexibility in the discretization scheme.
To theoretically justify our approach in the infinite-dimensional setting we need some results from [KLP19] regarding the regularity of the spectrum of P 0,τ with respect to perturbations of the velocity field. Transferring these results to the regularity of the spectrum ofĜ with respect to velocity field perturbations, we derive a first variation of µ k with respect to u and detail the steps below in section 6.1. In section 6.2 we specify the constraints and we then proceed with discussing necessary and sufficient (section 6.3) conditions for the optimization. Finally we summarize the result of the construction of our optimization.

Objective functional and its smoothness
We choose an eigenvalue µ k ofĜ corresponding to a feature we want to enhance or diminish. We want to alter the real part of µ k with a perturbation u of the original velocity field v as much as possible within our constraints. This is because by Theorem 8 the real part of µ k is a measure for the coherence of the family of features highlighted by the corresponding eigenvector. Our chosen objective functional should be a good measure of the change of µ k with respect to u. The response of an eigenvalue or a singular value with respect to a perturbation in this infinite-dimensional setting is in general complicated. Therefore we approximate it locally via linearization; that is by computing a first variation or first-order Taylor expansion. In what follows, we assume that µ k is real; the obvious modifications can be made if µ k is complex by considering the real parts.
Proof. Using (38), the following estimate shows that c is continuous.
Here · ∞ denotes the canonical L ∞ ((0, 2τ ) × X) norm. The Fréchet differentiability of c is straightforward because c is linear.
We will prove further relevant properties of c in section 6.3.

Constraints
As mentioned above we consider perturbations u ∈ C, a bounded, closed and strictly convex subset of D 0 ⊂ H m ((0, τ ) × X; R d ). For our objective functional c to be a valid estimate of the change in µ k due to the perturbation u, we restrict u to be small using R > 0. We consider a ball or ellipsoid in the form of the following energy constraint. For multi-indices α and a weight vector ω = (ω α ) |α|≤m with 0 < ω α ∈ R for all |α| ≤ m we require where D α = (∂ α 1 1 · · · ∂ α d d ). The functional h is continuously Fréchet differentiable in u from H m to R since B ω is a bounded positive definite bilinear form. We consider C = U ∩ {h ≤ 0} where U is a proper subspace of H m ((0, τ ) × X; R d ) and might only have a relative interior. 10 U describes the set of admissible perturbations to the original vector field v. By construction in (39) there are constants γ > 0 (B ω is bounded) and β > 0 (B ω is positive definite) such that

Now, Theorem 22 ensures the existence of a linear, bounded, injective and self-adjoint operator
We will use J B (ω) to derive an explicit formula for the optimal solution.

Optimality conditions
We have an optimization problem with a continuous linear objective c and a closed, bounded, strictly convex feasible set C, defined by a single constraint {h ≤ 0} on a subspace U . Lagrange multipliers provide a convenient and explicit solution to this problem, e.g. [Lue97], however, first we establish a general existence and uniqueness result.

Unique optimum
Lemma 12. Let C be a closed, bounded, and strictly convex subset of H m containing the zero element in its (relative) interior and c : C → R be a bounded linear functional that does not uniformly vanish on C. Then the optimization problem min u∈C c(u) has a unique solution u * ∈ C.
Proof. Continuity of c and boundedness of C imply inf u∈C c(u) = α > −∞. Let u k ∈ C be such that lim k→∞ c(u k ) = α. This sequence is bounded, and so there is a weakly convergent subsequence u n k u * . The set C is closed and convex and therefore also weakly closed, which implies u * ∈ C. By the definition of weak convergence c(u * ) = lim k→∞ c(u n k ) = α follows. Therefore u * is a solution for our optimization problem.
In order to demonstrate uniqueness assume we have two solutions u 1 = u 2 with c(u 1 ) = c(u 2 ) = α. Strict convexity of C implies that u 3 := u 1 /2 + u 2 /2 ∈ int(C), where the relative interior is meant. Linearity of c implies c(u 3 ) = α. Let r > 0 be such that an open ball of radius r centred at u 3 is contained in int(C).
Because c does not vanish on C and the zero vector is in the relative interior of C, there exists a v ∈ C such that c(v) < 0. By linearity of c we have c(u 3 + (r/2)v) < α, contradicting optimality of u 3 and establishing uniqueness of the optimum.

Necessary conditions
The following property will be used for the necessary conditions in the theory of Lagrange multipliers in Lemma 14.
Lemma 13. The unique optimal solution u * is a regular point for h : U → R.
In order to distinguish between the point at which the derivative is taken and the input of the resulting mapping, we will use square brackets for the reference point of the derivative and round brackets for the input of the resulting mapping. 11 Proof. Following the definition on [Lue97,p. 240 The uniqueness argument in the proof of Lemma 12 shows that we may replace our constraint h(u) ≤ 0 with h(u) = 0. We now use [Lue97, Theorem 1, p. 243], stated below. We thus obtain the two necessary conditions: h(u * ) = 0. (41)

Sufficient conditions
We now prove that the necessary conditions (40) and (41) are in fact also sufficient. Because our objective is linear and our constraint is of inner product form, we take a direct approach to developing sufficient conditions, avoiding more complicated general theory.
Proposition 15. Let C = U ∩ {h ≤ 0}. There are exactly two elements of C that satisfy (40) and (41). One is the unique minimizer (with z > 0) and the other is the unique maximizer (with z < 0).
Proof. Lemma 12 guarantees the existence of at least two extrema (one minimum and one maximum), and therefore at least two distinct elements u, w ∈ C satisfying (40) and (41). We show that these are the only such elements. There exist z u , z w ∈ R such that (u) = 2 u, u m,ω = 2 u 2 m,ω = 2R 2 and similarly that w 2 m,ω = R 2 . Thus, either z u = z w or z u = −z w . If z u = z w , then we have u = w, while if z u = −z w , then u = −w. Thus, the only possibility for distinct u and w is that u = −w, and therefore that there are at most two functions satisfying the necessary conditions. Finally, without loss, assume that u is a minimum. Since c [u](u) + z u h [u](u) = c(u) + z u R 2 = 0 and c(u) < 0 if u is a minimum, we must have z u > 0. Therefore z w < 0, implying that c(w) > 0 and that w is a maximum.
Using the injective operator J B (ω) we can solve the necessary and sufficient conditions (40) and (41) for the optimal solution u * , leading to (44) below. First we can transform (40) into an equation in H m using the Riesz representation theorem. We now know that u * exists and fulfills the following equation.
Here c R ∈ H m is the Riesz representation of the functional c on H m . Thus c R is in the range of J B (ω) and we can apply J B (ω) −1 to c R . Now we can solve (42) for J B (ω)u * and for u * because Proposition 15 guarantees z = 0, giving Inserting these expressions into (41) leads to Solving (43) for z > 0 (minimizer) and using this z leads to the following explicit expressions

Optimization of µ k numerically
In this section we apply the results derived in section 6 to some examples.

Discrete optimization problem
Before discretising the objective functional we want to construct C N , a finite dimensional version of the constraint set C. Therefore we choose finitely many basis elements {ϕ } =1,...,N spanning the admissible subspace U N := span{ϕ } =1,...,N for our perturbations. We then intersect U N with C and represent elements by their coefficient vectors in R N with respect to the chosen basis. Hence we define R N ⊃ C N C ∩ U N ⊂ H m . Coefficient vectors will be denoted with a bar¯in the following. We will omit theˆin the following calculations, but note that it can be done analogously using augmented reflected objects. The energy neighborhood constraint (39) can be expressed as a quadratic constraint in the coefficient vectorū ∈ R N , This constraint describes a strictly convex set (ball or ellipsoid) in R N . Regarding the objective functional c(u) = g k , Ef k L 2 , we have to account for the two possibly different bases for discretization: (i) the discretization of u, and (ii) the discretization ofĜ, which can also involve a test and an ansatz basis. Let us denote the basis functions for the discretization of f k by {χ j } j=1,...,m and g k by {ξ i } i=1,...,n for now. Then for f k = m j=1f k j χ j and g k = n i=1ḡ k i ξ i we have Using this last equation we can calculate a finite-dimensional representation of E acting from span{ξ j } j to span{χ i } i . Due to linearity we may decompose in and separately compute E in a similar way to the numerical approximation of G, outlined in Section 5.1. We use Ulam's method to discretize the generators, taking m = n and (ξ j ) j = (χ i ) i to be indicator functions of space-time boxes. The cost vector can be constructed bȳ The discretized optimization problem then has a linear objective and a single quadratic constraint maxc Tū Since the energy constraint is induced by a scalar product, the matrix B ω is invertible by typical arguments for Galerkin discretization (i.e., B ω is symmetric positive definite). Thus the optimal solution can be obtained with Lagrange multipliers, identical to the analysis leading to (44), and is given by In practice we choose the "admissible energy" R sufficiently small to ensure the approximate validity of our linearised objective functional. We can then iterate the optimization process as a gradient ascent/decent method to invest more cumulative energy in the perturbation. Each step consists of constructingc and solving the equations (45). The construction ofc requires the calculation of G, g k , f k , and E (or the respective augmented reflected objects); the latter are fixed through all optimization steps and do not need to be updated. We apply this procedure in the following examples.
Remark 16. All of our finitely many perturbation ansatz functions (ϕ ) =1,...,N , introduced in the next section 7.2, are C ∞ thus · H m and · L 2 are equivalent on span{ϕ l } l . Therefore for numerical convenience, in the numerical examples we use m = 0 to calculate B ω , although the functional c is only strictly well defined for m ≥ 1.

Perturbing fields
We construct a suitable basis for velocity field perturbation as follows. Our spatial domain will be a rectangle. In order to impose zero velocity normal to the boundary and divergencefreeness, we construct the spatial components of the basis vectors {ϕ } =1,...,N with a possible constant movement in x or y direction from smooth stream functions Ψ kl and then multiply these components with time-dependent scalar (amplitude) functions φ r . For a rectangular domain [a x , b x ] × [a y , b y ] we take the streamfunctions k = 1, . . . , K, l = 1, . . . , L, which are slightly modified Fourier modes that induce a velocity field ψ kl := (− ∂Ψ kl ∂y , ∂Ψ kl ∂x ) with k horizontal gyres and l vertical gyres that satisfy the homogeneous Neumann boundary conditions in space and are divergence free in space. These fields may travel in x direction with speed c x or in y direction with speed c y .
We use L 2 ((0, τ ) × X)-normalized versions of the functions where φ r is a scalar (temporal) modulation of the amplitude of the spatial Fourier modes: We omit using r = 1 for the sin-modulation, since it would have both positive and negative values, meaning a sign change of the perturbing velocity field during the evolution. Such perturbing fields proved to be less efficient in early numerical experiments. Thus, the increasing time-linear modulation is assigned r = −1 to avoid confusion. In summary, time-modulation of the perturbing fields is described by φ r (t), r ∈ {−1, 0, 2}, and we have 3KL = N basis functions {ϕ } =1,...,N in total.

Increasing coherence: forced double gyre flow
Extending the example from section 5.2, our goal in this section is to increase coherence of the left-right separation captured by the 2nd eigenvalue ofĜ (equivalently, the left 2nd singular vector of the transfer operator P 0,τ ), Figure 2 (a). The original velocity field has a total energy (space-time L 2 norm) of ≈ 1.6. We will use an optimization budget of R = 0.05per step to increase the 2nd eigenvalue µ 2 ofĜ which encodes the left-right separation of the domain. We iterate our optimization procedure 8 times to invest a total energy of 0.4, which is 25% of the energy of the original velocity field. This seems like a moderate investment, but we note that while in general it is easy to destroy coherence by almost any perturbation, to increase it, the dynamics and the perturbation need to work together-making it harder to increase coherence than to decrease it. Following the formula (45) to optimize coherence (i.e, minimize mixing) we use −ū, where we recall thatū denotes the coefficient vector representing the optimal perturbation. Of course the other eigenvalues of the generator also change as the velocity field is perturbed. In each iterative step we check a posteriori that the second eigenvalue did indeed increase, which we consider an indicator for the validity of our objective functional. Our perturbation library consists of the functions from (46) for k = 1, . . . , 5, l = 1, 2, 3 with c x = 0 = c y and the time modulation r = −1, 0, 2, hence N = 45 = 5 · 3 · 3. After the 8 iteration steps we arrive at the solution vectorū (8) ∈ R N and the effective change of the second eigenvalue singular value: Each step of the iteration increases the eigenvalue roughly by 0.003. The result is visualized in Figure 6.  . We repeated the same noisy evolution procedure for ε = 0.01; then the 9% of particles changing sides originally were reduced to 5% for the coherence-improved velocity field. This is shown in Figure 6 (b)-(d), where the seeded particles are shown at initial time and colored according to whether they end up left or right of the line x = 1 after this noisy evolution, at time t = τ . Note that the time direction is unimportant and we could have colored at the initial time t = 0 and evolved forward.
The optimization of the left-right coherence in the periodically forced double gyre has also been considered in [FS17, Section 6.3]; see, in particular Figure 18 therein, where the regular regions of the flow were dramatically increased.

Decreasing coherence: forced double gyre flow
We now wish to diminish the coherence of various coherent features. Firstly, the left-right separation discussed in Section 7.3. Using the same optimization protocols as in Section 7.3, but switching the sign of the objective, we produce a velocity field that should increase mixing across the left-right separatrix. This is indeed indicated in Figure 6 (d). These results are consistent with the results of [FS17], where the "lobes" of stable and unstable manifold intersections greatly increased [FS17, Figures 11 and 14].
We now turn our attention to the two central vortices, which are encoded in the 5th eigenmode (and after one iteration step of the optimization in the 6th eigenmode). We use the same perturbation basis and energy criterion for the iteration as above (8 iterative optimization steps). After a first optimization step of the iteration described at the end of section 7.1 the gyre feature (initially 5th eigenvalue) is pushed to the 6th eigenvalue spot, i.e., there is an interchange position with the mentioned features in terms of ranking with respect to their coherence. Thus, we have to keep track of the ranking of the eigenvalues during our iterative optimization procedure, which we do here manually between each step. This could be done in an automated fashion similarly to section 5.3, by computing correlations between eigenvectors of successive iterates.
We obtain the following change in the eigen-and (corresponding) singular values: Next we seed particles in the vortices induced by the level sets of the 5th eigenvector shown in Figure 2 and evolve them with and without perturbation. The results are visualized in Figure 7. We note that increasing mixing in the double gyre flow has been considered in [FS17, Section

Targeted manipulation of distinguished coherent features
In certain situations it may be of interest to manipulate the mixing of parts of phase space that do not arise as eigenmodes. For example, these parts of phase space may be individual (or combinations of) SEBA vectors as described in section 5.4, or they may be related to the phase space geometry. We describe a procedure to accomplish this. We recall Proposition 5 and Theorem 8, where we showed and that the sign structure of f (recall f necessarily has zero mean) indicates a family of finite-time coherent sets. Further, the coherence ratio of the family is bounded by an expression involving µ, indicating more coherence the closer µ is to 0. By normalising f so that f = 1 we see smaller Ĝ f corresponds to more strongly coherent features encoded in f . Now let ϕ ∈ D(Ĝ) be a general normalised, zero-mean space-time feature; that is, ϕ is not an eigenfunction. We might think of ϕ being mean-removed SEBA vector or a mollified (such that it is in the domain D(Ĝ) of the generator) version of ϕ = 1 C − |C|1 X , the meancentred 12 indicator function of a possibly coherent family C ⊂ X of sets in augmented-space representation, where |C| denotes the augmented-space Lebesgue measure of C. As we would in general like ϕ to represent a finite-time coherent set, we should restrict our attention to features satisfying ϕ(t, ·) ≈ ϕ(2τ − t, ·).
Analogously to the case of an eigenfunction f , to quantify the coherence of a feature ϕ that is not necessarily an eigenfunction, we employ the heuristic of measuring Ĝ ϕ . The rationale for this is as follows. If a family of sets encoded by the eigenvector f is completely coherent (in the absence of diffusion), then the temporal change ("movement") of the sets at any time θ, namely ∂ θ f (θ), would be identical to how the dynamics transports the mass located in the set, i.e., ∂ θ f (θ) =Ĝ(θ)f (θ, ·) by the Fokker-Planck equation (11). Thus, if the coherence of the feature ϕ is strong, one has ∂ θ ϕ(θ, ·) −Ĝ(θ)ϕ(θ, ·) ≈ 0 for all θ, leading to Ĝ ϕ ≈ 0. Section 3 gives a geometric view on the very same situation: in (14) and (17), if the boundary of a time-dependent set moves with a velocity b(t, x) that is approximately equal to the velocity field v(t, x) driving the dynamics, then the outflow from this family of sets will be small-and this can analogously be quantified by the space-time flux (18).
Thus, to destroy a coherent feature encoded in ϕ we could maximize Ĝ ϕ 2 with respect to the perturbing fields u. Again, as this is a nonlinear problem, we approach it by local optimization, and aim to maximize the objective function given by the local linear change, subject to constraints on the perturbation u. Conversely, if we wish to enhance a coherent feature ϕ we should minimize Ĝ ϕ 2 . If we would simultaneously like to destroy coherence of a feature ϕ 1 and enhance the coherence of other features encoded in ϕ 2 , then we would maximize with weights α 1 , α 2 > 0. 12 We note that the removal of the mean from ϕ makes no difference for the optimization of the objective function in (48) below, sinceĜ1 X = 0, however we keep this for the intuitive connection with "eigenfeatures" and Theorem 8.

Non-eigenfeature optimization: traveling wave
We consider a traveling wave example [Pie91,SW06,FLS10] given by . Here, two rotating gyres move in x direction with speed ν = 0.25 and are superimposed with a constant drift c drift = 1. To account for this constant speed and periodicity in x-direction we choose k = 2, 4, . . . , 20, l = 1, . . . , 5 and c x = ν for our perturbation dictionary; see Section 7.2 and equation (46). We remark that Balasuriya [Bal15] has investigated a similar dynamical system and considered single "one at a time" (as opposed to general linear combinations of) perturbations drawn from a family similar to ours. In [Bal15], flux out of a small "gate" connecting a stable and unstable manifold is taken as a measure of mixing. In contrast, we measure mixing through the L 2 -norm decay of an initial concentration field, and we find the unique perturbation in a convex subset of a 150-dimensional subspace that maximizes the change in mixing rate. We use the resolution 80×(80×40) and ε = 0.1. We now take a feature that is not described by an eigenfunction and aim to increase its coherence. The feature we choose is the mean-centered version of the following time-constant and horizontally constant profile which is shown in the coloring of Figure 8 (a). We iteratively update the perturbation by solving problem (48) in each step. We iterate for 35 steps with an energy budget of R = 0.1 (∼ 1% of the original energy of v) per step. Figure 8   We next investigate which perturbing basis functions are favored by the optimization. First, we note that the basis function induced by ϕ 0,2,1 is equal to the original velocity field up to the constant horizontal drift c drift . Thus, it is conceivable that this basis function is heavily used in order to partly cancel the original velocity field and slow the flow down.
The amplitudes of all of the streamfunctions in the optimized solution are shown in Figure 9, ordered according to amplitude and grouped according to the spatial mode. The streamfunctions of the spatial modes with the largest amplitudes in the optimal solution (first to fourth) are shown in Figure 10. Altogether, these modes combined to perturb the traveling double gyre towards a laminar horizontal flow (Figure 11 (c)). Figure 9: Plot of optimal perturbation coefficientsū ordered in decreasing magnitude and grouped by spatial mode ψ kl (cf. (46)). The corresponding k and l are labels on upper and lower x axes, respectively. The plot is cut off at y = 0.75 for visualization purposes. The first red bar has height 7.6.
We additionally extended the above computation to 100 steps to investigate the asymptotic behavior under large perturbations. Around step 95 the increase in coherence diminishes rapidly, as the value of the objective function Ĝ (v + u)ϕ 2 approaches a plateau. The corresponding velocity field approaches a purely laminar flow (results not shown here), as the optimal perturbation cancels the rotational part of the original velocity field.

Conclusions
Froyland and Koltai [FK17] introduced a time-augmented construction to enable the efficient numerical construction of the infinitesimal generator of a periodically driven flow; [FK17] built on earlier work for steady flows [FJK13]. In the first part of this work, we extended these results to a finite-time flows with general aperiodic driving, by a novel "reflected" process. Proposition 3 provided a formula for the accumulated outflow from a general time-dependent family of sets. Proposition 5 derived a spectral mapping theorem for our reflected process, connecting the spectrum of the time-augmented generator with the spectrum of the corresponding reflected evolution operator. Using the sign structure of the time-augmented eigenfunctions of the augmented generator we built a family of coherent sets and Theorem 8 lower bounded the coherence ratio of this family in terms of the corresponding eigenvalue.
In the second part of this work, we built on the optimization techniques of [FS17] to optimally enhance or destroy coherent features encoded in eigenfunctions. We directly manipulated the underlying drift field, subject to energy constraints, and proved that the "small perturbation" problem has a unique optimum (Proposition 15). Using Lagrange multipliers, we derived an explicit formula for the infinitesimal drift field perturbation; this would be very difficult to achieve without using a generator framework. In Section 7 we implemented a multi-step gradient descent method, utilising the efficient time-augmented generator framework, and the explicit Lagrange multiplier solution, to optimize over relatively large energy budgets.
An advantage of our optimization approach is that the basis of perturbing velocity fields is fixed; thus their generators need only be computed once. On the other hand, the generators are very large, sparse matrices, of which we need to compute the spectrum with the smallest real part. Hybrid spectral discretization techniques as in [FK17] or multilevel solvers can be a remedy to this. Finally, the approach could be extended to the non-zero divergence case, and to open systems by considering homogeneous Dirichlet boundary conditions in the Fokker-Planck equation (2).  (c) Streamfunction of final velocity field v + u. Figure 11: Streamfunctions of the original velocity field, optimal perturbation, and the perturbed velocity field at t = 0.
(3) The coefficients of G(·) are bounded and uniformly continuous in x on X [Tan96, sec. 6.13].
(4) The differential operator G(·) and the boundary operator B(t, x)(ξ) := n(x), ξ , satisfy the complementing conditions [Tan96,p.131] for all t ∈ [0, τ ]. Before we continue with the proof, let us remark, that most of the proof has already been done in [Tan96] and some conditions ((3),(4) and (5)) become trivial in our case because the highest order part (called principal part in [Tan96]) of G(t) is the time-independent Laplace operator ∆.
Proof. Lemma 18 ensures that we can apply the results from [Tan96] that we will use in the following. The results from [Tan96, sec. 6.13] that use more abstract results from Acquistapace and Terreni ( [AT86] and [AT87]) give the existence of a unique solution f to (50) [Tan96, Thm. 6.6] and a corresponding two parameter family of solution operators (called fundamental solution in [Tan96]) (P s,t ).
• For all t > s we the solution can be expressed as f (t) = P s,t f (s) with the two parameter solution family (P s,t ) t≥s . This follows from [Tan96, Thm. 6.5].
• For every t > s the operator P s,t is compact on L p (X). The result [Tan96, Thm. 6.6], i.e., P s,t f (s) = f (t) ∈ D p for f (s) ∈ L p (X)), gives that for t > s P s,t : L p (X) → D p ⊂ W 2,p (X) is a bounded linear operator from L p (X) to W 2,p (X). Now the Rellich-Kondrachov embedding theorem [AF03, Thm. 6.3] states that W 2,p (X) is compactly embedded in L p (X). Thus P s,t is a compact operator from L p (X) to L p (X), because it maps bounded sets in L p to bounded sets in W 2,p (X) which are relatively compact sets in L p (X).
Due to the common domain of all G(t) we get a better regularity for the the solution if the initial condition is more regular. More precisely, we use the result [Lun95, Cor. 6.1.9 (iv)] to prove the following.

Proof.
(i) This result is basically [Lun95, Cor. 6.1.9 (iv)] with D p = D(G(t)) being the common domain of the family of unbounded operators (G(t)) t∈[0,τ ] in the Banach space L p (X) (D p is dense in L p (X)), and the resulting two parameter evolution family is (P s,t ) t≥s .
(ii) For every t ∈ [0, τ ] we have f (t, ·) ∈ W m,p (X). From the appropriate Sobolev embedding theorem [AF03,Thm. 4.12] (also cf. Morrey) follows that for domains as smooth as our X ⊂ R n the space W m,p (X) is continuously embedded in C α (X) for 0 < α ≤ m − d p , i.e., there exists a K > 0 such that g C α ≤ K g W 2,p for all g ∈ W 2,p (X). This with the regularity f ∈ C([0, τ ]; W m,p (X)) further implies that the constant for the continuous embedding for (f (t)) t∈[0,τ ] into C α can be taken as independent of t.

B. Proof of Theorem 8 and Proposition 9
Proof of Theorem 8. The proof strongly follows the lines of those of [FK17, Theorems 16 and 19], which, in turn, borrows ideas from [FS10,FS13]. It consists of two main steps. First, we consider the events E n := ω | x r i (ω) ∈ A + r i , ∀ i = 1, . . . , n for a dense sequence of times, and show that E n ↓ E := r∈[0,τ ] {ω | x r (ω) ∈ A + r }. Second, we use this approximation to bound the retention probability in the family, hence the coherence ratio.
Step 1. Let (r i ) i∈N be a dense sequence in [0, τ ] such that r 1 = τ ; this latter condition is needed such that the decomposition in (53) below is always possible. The events E n are clearly measurable. Since the paths t → x t (ω) of the process (1) are continuous, so is t → f (t, x t (ω)) =: F t (ω). By corollary 21 we have that f ∈ C([0, 2τ ] × X). To see E n ↓ E, note that by the continuity of t → F t it holds For a finite measure π let P π := P(· | x 0 ∼ π) denote the law of the process (1) with initial distributions π. If π is not a probability measure, then we denote P π = π(Ω)P π/π(Ω) , where Ω is the entire event space. If, additionally, π is a signed measure, then P ν := P π + − P π − , where π + and π − denote the positive and negative parts of the signed measure π, respectively, in the sense of the Hahn decomposition, i.e. π = π + − π − . Now, the σ-additivity of P yields also P π (E n ) → P π (E) as n → ∞.
Step 2. By Proposition 5 we have that f (0, ·) is an eigenfunction of the integral preserving operator P * 0,τ P 0,τ at the eigenvalue e 2µτ < 1. Thus we have X f (0, ·) dm = 0, and with it X f (t, ·) dm = e −µt X P 0,t f (0, ·) dm = 0 again by the integral-preserving property. We define the signed measure ν via dν(x) = f (0, x) dm(x). With the given scaling of f we have that A ± τ f (τ, ·) dm = ±1. By Proposition 5 we have that P 0,t f (0, ·) = e µt f (t, ·), thus for every n ∈ N we have e µτ = e µτ A + τ f (τ, ·)dm The last equality follows from the decomposition of the event {x r 1 ∈ A + τ } = {x τ ∈ A + τ } into disjoint events {x r i ∈ A + r i for all r i > r j , but x r j / ∈ A + r j }, j = 2, . . . , n, and {x r i ∈ A + r i for all i = 1, . . . , n}. One can see [FJK13] that p j ≤ 0, because the set of initial conditions x r j / ∈ A + r j is contained in the non-positive support of ν. It follows that e µτ ≤ P ν (x r i ∈ A + r i , i = 1, . . . , n) = P ν (E n ).
Proof of Proposition 9. The proof is entirely analogous to that of Theorem 8, except that the system of equations containing (53) is altered. The deviating part is the system of inequalities, which follows by the assumptions of the proposition: