A new approach for American option pricing: The Dynamic Chebyshev method

We introduce a new method to price American options based on Chebyshev interpolation. In each step of a dynamic programming time-stepping we approximate the value function with Chebyshev polynomials. The key advantage of this approach is that it allows us to shift the model-dependent computations into an offline phase prior to the time-stepping. In the offline part a family of generalised (conditional) moments is computed by an appropriate numerical technique such as a Monte Carlo, PDE or Fourier transform based method. Thanks to this methodological flexibility the approach applies to a large variety of models. Online, the backward induction is solved on a discrete Chebyshev grid, and no (conditional) expectations need to be computed. For each time step the method delivers a closed-form approximation of the price function along with the option's delta and gamma. Moreover, the same family of (conditional) moments yields multiple outputs, including option prices for different strikes, maturities and payoff profiles. We provide a theoretical error analysis and find conditions that imply explicit error bounds for a variety of stock price models. Numerical experiments confirm the fast convergence of prices and sensitivities. An empirical investigation of accuracy and runtime also shows an efficiency gain compared with the least-squares Monte Carlo method introduced by Longstaff and Schwartz (2001).


Introduction
A challenging task for financial institutions is the computation of prices and sensitivities for large portfolios of derivatives such as equity options. Typically, equity options have an early exercise feature and can either be exercised at any time until maturity (American type) or at a set of pre-defined exercise dates (Bermudan type). In the absence of explicit solutions, different numerical methods have been developed to tackle this problem. One of the first algorithms to compute American put option prices in the Black-Scholes model was proposed by Brennan and Schwartz (1977). In this approach, the related partial differential inequality is solved by a finite difference scheme. A rich literature further developing the PDE approach has accrued since, including methods for jump models (Levendorskiȋ (2004), Hilber et al. (2013)), extensions to two dimensions (Haentjens and in 't Hout (2015)) and combinations with complexity reduction techniques (Haasdonk et al. (2013)). Besides PDE based methods, a variety of other approaches has been introduced, many of which trace back to the solution of the optimal stopping problem by the dynamic programming principle, see e.g. Peskir and Shiryaev (2006). For Fourier based solution schemes we refer to Lord et al. (2008) and Fang and Oosterlee (2009). Simulation based approaches are of fundamental importance; the most prominent representative of this group is the Least-squares Monte Carlo (LSM) approach of Longstaff and Schwartz (2001), and we refer to Glasserman (2003) for an overview of different Monte Carlo methods. Fourier and PDE methods are typically highly efficient compared to simulation; however, they are less flexible towards changes in the model and particularly in the dimensionality.
In order to reconcile the advantages of the PDE and Fourier approach with the flexibility of Monte Carlo simulation, we propose a new approach. We consider a dynamic programming time-stepping. Let X_t be the underlying Markov process; with time steps t_0 < t_1 < . . . < t_n = T and payoff function g, the value function V_{t_u} is given by the backward induction
V_{t_n}(x) = g(x), V_{t_u}(x) = max{ g(x), E[V_{t_{u+1}}(X_{t_{u+1}}) | X_{t_u} = x] }. (1.1)
The computational challenge is to compute E[V_{t_{u+1}}(X_{t_{u+1}}) | X_{t_u} = x] for all time steps t_u and all states x, where V_{t_{u+1}} depends on all later time steps.
In order to tackle this problem, we approximate the value function in each time step by Chebyshev polynomial interpolation. We thus express the value function as a finite sum of Chebyshev polynomials, V_{t_u}(x) ≈ Σ_{j=0}^{N} c_j^{t_u} T_j(x). The choice of Chebyshev polynomials is motivated by the promising properties of Chebyshev interpolation, such as • The vector of coefficients (c_j^{t_u})_{j=0,...,N} is explicitly given as a linear combination of the values V_{t_u}(x_k) at the Chebyshev grid points x_k. For this, equation (1.1) needs to be solved at the Chebyshev grid points x = x_k only.
• Exponential convergence of the interpolation for analytic functions and polynomial convergence for differentiable functions, with rate depending on the order of differentiability.
• The interpolation can be implemented in a numerically stable way.
The computation of the continuation value at a single time step coincides with the pricing of a European option. Its interpolation with Chebyshev polynomials is proposed in Gaß et al. (2018), where the method proves to be highly promising and exponential convergence is established for a large set of models and option types.
Moreover, the approximation of the value function with Chebyshev polynomials has already proven to be beneficial for optimal control problems in economics, see Judd (1998) and Cai and Judd (2013).
The key advantage of our approach to American option pricing is that it collects all model-dependent computations in the generalized conditional moments Γ_{j,k} = E[T_j(X_{t_{u+1}}) | X_{t_u} = x_k]. If there is no closed-form solution, their calculation can be shifted into an offline phase prior to the time-stepping. Depending on the underlying model, a suitable numerical technique such as a Monte Carlo, PDE or Fourier transform method can be chosen, which reveals the high flexibility of the approach. Once the generalized conditional moments Γ_{j,k} are computed, the backward induction is solved on a discrete Chebyshev grid, which avoids any computation of conditional expectations during the time-stepping. For each time step the method delivers a closed-form approximation of the price function x ↦ Σ_j c_j^t T_j(x) along with the option's delta and gamma. Since the family of generalized conditional moments Γ_{j,k} is independent of the value function, it can be used to generate multiple outputs, including option prices for different strikes, maturities and payoff profiles. The structure of the method is also beneficial for the calculation of the expected future exposure, which is the computational bottleneck in the computation of CVA, as investigated in Glau et al. (2018).
The offline-online decomposition separates model and payoff yielding a modular design. We exploit this structure for a thorough error analysis and find conditions that imply explicit error bounds. They reflect the modularity by decomposing into a part stemming from the Chebyshev interpolation, from the time-stepping and from the offline computation. Under smoothness conditions the asymptotic convergence behaviour is deduced.
We perform numerical experiments using the Black-Scholes model, Merton's jump diffusion model and the Constant Elasticity of Variance (CEV) model as a representative of a local volatility model. For the computation of the generalized conditional moments we accordingly use different techniques, namely numerical integration based on Fourier transforms and Monte Carlo simulation. Numerical experiments confirm the fast convergence of the option prices along with their deltas and gammas. A comprehensive comparison with the LSM reveals the potential efficiency gain of the new approach, particularly when several options on the same underlying are priced.
The rest of the article is organized as follows. We introduce the problem setting and the new method in Section 2 and provide the error analysis in Section 3. Section 4 discusses general traits of the implementation and Section 5 presents the numerical experiments. Section 6 concludes the article, followed by an appendix with the proof of the main result.

The Chebyshev method for Dynamic programming problems
First, we present the Bellman-Wald equation as a specific form of dynamic programming. Second, we provide the necessary notation for the Chebyshev interpolation. Then we are in a position to introduce the new approach and its application to American option pricing.

Optimal stopping and Dynamic Programming
Let X = (X_t)_{0≤t≤T} be a Markov process with state space R^d defined on a filtered probability space (Ω, F, (F_t)_{0≤t≤T}, P); the pricing of options with early exercise features leads to optimal stopping problems, see Peskir and Shiryaev (2006). In discrete time, optimal stopping problems can be solved with dynamic programming. Namely, with time-stepping t_0 < . . . < t_n = T, the solution of the optimal stopping problem can be calculated via the backward induction
V_{t_n}(x) = g(t_n, x),
V_{t_u}(x) = max{ g(t_u, x), E[V_{t_{u+1}}(X_{t_{u+1}}) | X_{t_u} = x] }, u = n−1, . . . , 0.
Note that n refers to the number of time steps between t_0 and T. For notational convenience, we indicate the value function at each time step with the subscript t_u to refer directly to the time step t_u. For a detailed overview of optimal stopping problems in discrete time we refer to Peskir and Shiryaev (2006).

Chebyshev polynomial interpolation
The univariate Chebyshev polynomial interpolation, as described in detail in Trefethen (2013), has a tensor-based extension to the multivariate case, see e.g. Sauter and Schwab (2010). Usually, Chebyshev interpolation is defined for a function on the domain [−1, 1]^D. For an arbitrary hyperrectangle X = [x_1, x̄_1] × . . . × [x_D, x̄_D], we introduce a linear transformation τ_X : [−1, 1]^D → X, componentwise defined by τ_X(z)_i = x_i + (x̄_i − x_i)(z_i + 1)/2. The Chebyshev polynomials are defined for z ∈ [−1, 1]^D and j ∈ J := {0, . . . , N_1} × . . . × {0, . . . , N_D} by T_j(z) = Π_{i=1}^{D} cos(j_i arccos(z_i)), and the j-th Chebyshev polynomial on X as p_j(x) = T_j(τ_X^{−1}(x)) 1_X(x). The Chebyshev points are given by z_k = (cos(π k_1/N_1), . . . , cos(π k_D/N_D)) for k ∈ J, and the transformed Chebyshev points by x_k = τ_X(z_k). The Chebyshev interpolation of a function f : X → R with Π_{i=1}^{D}(N_i + 1) summands can be written as a sum of Chebyshev polynomials, I_N(f)(x) = Σ''_{j∈J} c_j p_j(x) with coefficients c_j = (Π_{i=1}^{D} 2/N_i) Σ''_{k∈J} f(x_k) T_j(z_k), (2.3) where Σ'' indicates that a summand is multiplied by 1/2 if k_i = 0 or k_i = N_i.
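A minimal univariate sketch of these definitions in Python; the interval endpoints and the test function f are illustrative choices, and numpy's chebval evaluates the plain sum Σ_j c_j T_j:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Chebyshev interpolation on X = [x_lo, x_hi] via the transform tau_X;
# interval and test function are illustrative, not taken from the text.
x_lo, x_hi, N = np.log(60.0), np.log(140.0), 12
z = np.cos(np.pi * np.arange(N + 1) / N)            # Chebyshev points z_k
xk = x_lo + 0.5 * (x_hi - x_lo) * (z + 1.0)         # transformed points tau_X(z_k)

f = np.exp                                          # smooth test function on X
k = np.arange(N + 1)
F = (2.0 / N) * np.cos(np.pi * np.outer(k, k) / N)  # coefficient map c = F f(x_k)
F[:, 0] *= 0.5; F[:, -1] *= 0.5                     # double-prime sum over k
c = F @ f(xk)
c[0] *= 0.5; c[-1] *= 0.5                           # double-prime sum over j

x = np.log(100.0)                                   # evaluate the interpolant
approx = C.chebval(2.0 * (x - x_lo) / (x_hi - x_lo) - 1.0, c)
```

The coefficients are, as stated above, a fixed linear map of the nodal values; for the analytic function e^x the interpolation error is already negligible at N = 12.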

The Dynamic Chebyshev method
In this section, we present the new approach to solve a dynamic programming problem via backward induction using Chebyshev polynomial interpolation.
Definition 2.1. We consider a Dynamic Programming Problem (DPP) with value function
V_{t_n}(x) = g(t_n, x), (2.4)
V_{t_u}(x) = f( g(t_u, x), E[V_{t_{u+1}}(X_{t_{u+1}}) | X_{t_u} = x] ), u = n−1, . . . , 0, (2.5)
for a payoff function g and a Lipschitz continuous function f combining exercise and continuation values.
At the initial time step T = t_n, we apply Chebyshev interpolation to the function g(t_n, ·), i.e. for x ∈ X, V̂_{t_n}(x) = Σ''_{j∈J} c_j(t_n) p_j(x). At the first backward step t_{n−1}, the computation of E[g(t_n, X_{t_n}) | X_{t_{n−1}} = x] is replaced by Σ''_{j∈J} c_j(t_n) E[p_j(X_{t_n}) | X_{t_{n−1}} = x]. At time step t_{n−1} the value function V̂_{t_{n−1}} needs only to be evaluated at the specific Chebyshev nodes. Hence, denoting by x_k = (x_{k_1}, . . . , x_{k_D}) the Chebyshev nodes, it suffices to evaluate V̂_{t_{n−1}}(x_k) = f( g(t_{n−1}, x_k), Σ''_{j∈J} c_j(t_n) E[p_j(X_{t_n}) | X_{t_{n−1}} = x_k] ). A linear transformation of (V̂_{t_{n−1}}(x_k))_{k∈J} yields the Chebyshev coefficients according to (2.3), which determines the Chebyshev interpolation V̂_{t_{n−1}} = Σ''_j c_j(t_{n−1}) p_j. We apply this procedure iteratively, as described in detail in Algorithm 1.
The stochastic part is gathered in the conditional expectations of the Chebyshev polynomials at the Chebyshev nodes, i.e. Γ_{j,k}(t_u) = E[p_j(X_{t_{u+1}}) | X_{t_u} = x_k]. (2.7) Moreover, if an equidistant time-stepping t_{u+1} − t_u = Δt is applied, the computation can be further simplified. If the underlying stochastic process satisfies Law(X_{t_{u+1}} | X_{t_u} = x) = Law(X_{Δt} | X_0 = x) for u = 0, . . . , n − 1, then the conditional expectations need to be computed for one time step only, see Algorithm 2. One can pre-compute these conditional expectations and thus the method allows for an offline/online decomposition.
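To make the offline/online decomposition concrete, the following Python sketch prices a Bermudan put in the Black-Scholes model: the matrix of generalized conditional moments is pre-computed once by Gauss-Hermite quadrature (offline), and the backward induction then runs on the Chebyshev grid without any further conditional expectations (online). All parameter values are illustrative, and the truncation modification discussed later in the paper is omitted, so this is a plain sketch of the scheme rather than the authors' tuned implementation:

```python
import numpy as np

# Dynamic Chebyshev pricing of a Bermudan put under Black-Scholes dynamics.
# Illustrative parameters (not taken from the paper's experiments).
r, sigma, K, T, n, N = 0.03, 0.25, 100.0, 1.0, 100, 100
x_lo, x_hi = np.log(K) - 3.0, np.log(K) + 3.0      # truncated log-price domain X
dt = T / n
z = np.cos(np.pi * np.arange(N + 1) / N)           # Chebyshev points on [-1, 1]
xk = x_lo + 0.5 * (x_hi - x_lo) * (z + 1.0)        # transformed nodes tau_X(z_k)

def coeffs(values):
    """Nodal values -> Chebyshev coefficients (double-prime sums), with the
    j = 0 and j = N coefficients pre-halved for plain evaluation."""
    k = np.arange(N + 1)
    F = (2.0 / N) * np.cos(np.pi * np.outer(k, k) / N)
    F[:, 0] *= 0.5; F[:, -1] *= 0.5
    c = F @ values
    c[0] *= 0.5; c[-1] *= 0.5
    return c

# Offline: Gamma[k, j] = E[p_j(X_dt) | X_0 = x_k] via Gauss-Hermite quadrature,
# using X_dt = x_k + (r - sigma^2/2) dt + sigma sqrt(dt) Z with Z ~ N(0, 1).
t, w = np.polynomial.hermite.hermgauss(64)
y = xk[:, None] + (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(2.0 * dt) * t
zq = 2.0 * (y - x_lo) / (x_hi - x_lo) - 1.0
Tj = np.zeros((N + 1, N + 1, t.size))              # T_j at the quadrature points
Tj[0], Tj[1] = 1.0, zq
for j in range(2, N + 1):                          # Chebyshev three-term recursion
    Tj[j] = 2.0 * zq * Tj[j - 1] - Tj[j - 2]
Tj *= (np.abs(zq) <= 1.0)                          # p_j vanishes outside X
Gamma = np.einsum('q,jkq->kj', w, Tj) / np.sqrt(np.pi)

# Online: backward induction on the Chebyshev grid only.
payoff = np.maximum(K - np.exp(xk), 0.0)
V, disc = payoff.copy(), np.exp(-r * dt)
for _ in range(n):
    V = np.maximum(payoff, disc * (Gamma @ coeffs(V)))

def price(S0):
    """Closed-form (polynomial) approximation of the time-0 price."""
    z0 = 2.0 * (np.log(S0) - x_lo) / (x_hi - x_lo) - 1.0
    return np.polynomial.chebyshev.chebval(z0, coeffs(V))
```

`price(K)` then gives the at-the-money Bermudan put value; the same matrix Gamma can be reused to reprice other strikes or maturities by rerunning only the cheap online loop.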

Error Analysis
In this section we analyse the error of Algorithm 1, i.e. the maximal error ε_{t_u} := sup_{x∈X} |V_{t_u}(x) − V̂_{t_u}(x)| at each time step t_u.
Two different error sources occur at t_u: the classical interpolation error of the Chebyshev interpolation and a distortion error at the nodal points. The latter comes from the fact that the computed values V̂_{t_u}(x_k) are approximations of V_{t_u}(x_k). The behaviour of the interpolation error depends on the regularity of the value function. Here, we assume analyticity of the value function. The concept can be extended to further cases such as assuming differentiability or piecewise analyticity. The latter is discussed in preliminary form in Mahlstedt (2017, Section 5.3) and is further investigated in a follow-up paper. Hence, we need a convergence result for the Chebyshev interpolation which incorporates a distortion error at the nodal points.
First, we introduce the required notation. A Bernstein ellipse B([−1, 1], ̺) with ̺ > 1 is defined as the open region in the complex plane bounded by an ellipse with foci ±1 and semiminor and semimajor axis lengths summing to ̺. We define a generalized Bernstein ellipse around the hyperrectangle X with parameter vector ̺ ∈ (1, ∞)^D as B(X, ̺) := τ_{[x_1,x̄_1]}(B([−1, 1], ̺_1)) × . . . × τ_{[x_D,x̄_D]}(B([−1, 1], ̺_D)), where each transformation τ_{[x_i,x̄_i]} is extended to the complex plane.

Proposition 3.1. Let f : X → R have an analytic extension to B(X, ̺) with sup_{x∈B(X,̺)} |f(x)| ≤ B, and let the nodal values f(x_k) be distorted by errors ε_k with max_{k∈J} |ε_k| ≤ ε. Then the Chebyshev interpolation I_N(ε) based on the distorted values f(x_k) + ε_k satisfies ‖f − I_N(ε)(f)‖_{L∞(X)} ≤ ε_int(̺, N, D, B) + Λ_N ε, where ε_int(̺, N, D, B) denotes the interpolation error bound (3.2) of the undistorted Chebyshev interpolation and Λ_N the Lebesgue constant of the Chebyshev nodes.

Proof. Using the linearity of the interpolation operator we obtain for the Chebyshev interpolation ‖f − I_N(ε)(f)‖_{L∞(X)} ≤ ‖f − I_N(f)‖_{L∞(X)} + ‖I_N(f) − I_N(ε)(f)‖_{L∞(X)}. (3.3) The tensor-based multivariate Chebyshev interpolation I_N(ε) can be written in Lagrange form, so that the second summand is bounded by Λ_N ε, where Λ_N is the Lebesgue constant of the (multivariate) Chebyshev nodes. From Trefethen (2013, Theorem 15.2) we obtain for the univariate Chebyshev interpolation Λ_N ≤ (2/π) log(N + 1) + 1 and hence Λ_N ≤ Π_{i=1}^{D} ((2/π) log(N_i + 1) + 1) in the multivariate case. Therefore, the proposition follows directly from (3.3) and the interpolation error bound of Sauter and Schwab (2010).
We use this result to investigate the error of the Dynamic Chebyshev method. First, we introduce the following assumption.
Assumptions 3.2. We assume X ∋ x → V tu (x) is a real valued function that has an analytic extension to a generalized Bernstein ellipse B(X , ̺ tu ) with ̺ tu ∈ (1, ∞) D and sup x∈B(X ,̺t u ) |V tu (x)| ≤ B tu for u = 1, . . . , n.
Proposition 3.5 provides conditions on the process X and the functions f and g that guarantee Assumptions 3.2. Under these assumptions, we can apply Proposition 3.1 to obtain an error bound for the Dynamic Chebyshev method at each time step. This error bound has a recursive structure, since the values of V̂_{t_u} depend on the conditional expectation of V̂_{t_{u+1}}. The interpolation error of the final time step is of the form (3.2). At any other time step t_u, an additional distortion error comes into play from approximating the function values V_{t_u}(x_k) at the nodal points by V̂_{t_u}(x_k). Proposition 3.1 yields ε_{t_u} ≤ ε_int(̺_{t_u}, N, D, B_{t_u}) + Λ_N F_{t_u}, where the term F_{t_u} depends on the function f and the interpolation error at the previous time step t_{u+1}.
Moreover, two additional error sources can influence the error bound. If there is no closed-form solution for the generalized moments E[p_j(X_{t_{u+1}}) | X_{t_u} = x_k], a numerical technique, e.g. numerical quadrature or a Monte Carlo method, introduces an additional error. The former is typically deterministic and bounded whereas the latter is stochastic. In order to incorporate this error into the following error analysis we introduce some additional notation. The conditional expectation can be seen as a linear operator Γ_{t_u}^k which operates on the vector space of all continuous functions on X. Define the subspace of all D-variate polynomials P_N(X) := span{p_j, j ∈ J}, equipped with the L∞-norm. We assume the operator Γ_{t_u}^k is approximated by a linear operator Γ̂_{t_u}^k : P_N(X) → R on P_N(X) which fulfils one of the two following conditions. For all u = 0, . . . , n, the approximation is either deterministic and the error is bounded by a constant δ, or the approximation is stochastic, uses M samples of the underlying process, and the polynomials p may have stochastic coefficients; in this case we assume the error bound (3.4). Note that in the deterministic case (3.1) and (3.4) coincide. Additionally, a truncation error is introduced by restricting to a compact interpolation domain X. We assume that the conditional expectation of the value function outside this set is bounded by a constant ε_tr, which we refer to as condition (TR). The following theorem provides an error bound for the Dynamic Chebyshev method.
Theorem 3.3. Let the DPP be given as in Definition 2.1. Assume the regularity Assumptions 3.2 hold and the boundedness (TR) of the truncation error. Then the error bound (3.5) holds for every time step t_u.

Proof. The proof of the theorem can be found in the appendix.
The following corollary provides a simplified version of the error bound (3.5), presenting its decomposition into three different error sources: the interpolation error ε_int, the truncation error ε_tr and the error ε_gm from the numerical computation of the generalized moments.
Corollary 3.4. Let the setting be as in Theorem 3.3. Then the error is bounded by ε_{t_u} ≤ (ε_int(̺, N, D, B) + ε_tr + ε_gm V̄) C̃^{n+1−u}, (3.6) with C̃ = max{2, C} and V̄ = max_j V_j. Moreover, if ε_tr = 0, L_f = 1 and N = N_i for i = 1, . . . , D, the error bound can be simplified further.
Proof. Assuming C > 2 and using the geometric series, the first term in the error bound (3.5) can be rewritten as ε_int(̺, N, D, B) Σ_{i=0}^{n−u} C^i ≤ ε_int(̺, N, D, B) C^{n+1−u} for ̺ = min_{1≤u≤n} ̺_u and B = max_{1≤u≤n} B_{t_u}. For C ≤ 2 the sum is bounded by ε_int 2^{n+1−u}. Similarly, we obtain for the second term in the error bound (3.5) the bound β C^{n+1−u} with β_j = ε_tr + ε_gm V_j and β = max_j β_j. Moreover, we used Λ_N L_f ≤ Λ_N L_f (1 + ε_gm) = C in the last step. Thus, we obtain the error bound ε_{t_u} ≤ (ε_int + β) C̃^{n+1−u} = (ε_int(̺, N, D, B) + ε_tr + ε_gm V̄) C̃^{n+1−u}, where C̃ = max{2, C} and V̄ = max_j V_j, which shows (3.6).
Furthermore, using the definition of the error bound (3.2) and N = N_i, i = 1, . . . , D, we conclude that ε_int(̺, N, D, B) ≤ c_1 ̺^{−N} for a constant c_1 > 0. For the Lebesgue constant of the Chebyshev interpolation there exists a constant c_2 > 0 such that Λ_N ≤ c_2 (log(N + 1))^D. The following proposition provides conditions under which the value function has an analytic extension to some generalized Bernstein ellipse, so that Assumptions 3.2 hold.
Proposition 3.5. Consider a DPP as defined in (2.4) and (2.5) with equidistant time-stepping and g(t, x) = g(x). Let X = (X_t)_{0≤t≤T} be a Markov process with stationary increments. Assume e^{⟨η,·⟩} g(·) ∈ L¹(R^D) for some η ∈ R^D and that g has an analytic extension to the generalized Bernstein ellipse B(X, ̺_g). Furthermore, assume f : R × R → R has an analytic extension to ℂ². If (i) the characteristic function ϕ_x of X_Δt with X_0 = x is in L¹(R^D) for every x ∈ X, and (ii) for every z ∈ R^D the mapping x ↦ ϕ_x(z − iη) has an analytic extension to B(X, ̺_ϕ) and there are constants α ∈ (1, 2] and c_1, c_2 > 0 such that sup_{x∈B(X,̺_ϕ)} |ϕ_x(z)| ≤ c_1 e^{−c_2 |z|^α} for all z ∈ R^D, then the value function x ↦ V_{t_u}(x) of the DPP has an analytic extension to B(X, ̺) with ̺ = ̺_g.
Proof. At T, the value function x ↦ V_{t_n}(x) is analytic since V_{t_n}(x) = g(x) and g has an analytic extension by assumption. Moreover, e^{⟨η,·⟩} g(·) ∈ L¹(R^D) for some η ∈ R^D. We assume inductively that e^{⟨η,·⟩} V_{t_{u+1}}(·) ∈ L¹(R^D) and that V_{t_{u+1}} has an analytic extension to B(X, ̺). Then the function x ↦ E[V_{t_{u+1}}(X_{t_{u+1}}) | X_{t_u} = x] has an analytic extension. From Gaß et al. (2018, Conditions 3.1) we obtain conditions (A1)-(A4) under which a function of the form (p_1, p_2) ↦ E[f_{p_1}(X^{p_2})] is analytic. In our case we only have the parameter p_2 = x, and so X^{p_2} = X^x_{Δt}. Condition (A1) is satisfied since e^{⟨η,·⟩} V_{t_{u+1}}(·) ∈ L¹(R^D), and for (A2) we have to verify that the Fourier transform satisfies |F[V_{t_{u+1}}](−z − iη)| ≤ c_1 e^{c_2 |z|} for constants c_1, c_2 > 0.
Often, the discrete time problem (2.4) and (2.5) is an approximation of a continuous time problem and thus, we are interested in the error behaviour for n → ∞.
Remark 3.6. Assume the setup of Corollary 3.4 and, moreover, that ε_tr = ε_gm = 0. If we let N and n go to infinity, we have to ensure that the error bound tends to zero. We use that ε_int(̺, N, D, B) ≤ C_1 ̺^{−N} for a constant C_1 > 0 and N = min_i N_i. The error bound then behaves as C̃^n ̺^{−N}, so convergence is ensured if n and N are coupled such that n log(C̃) − N log(̺) → −∞, e.g. if N grows at least linearly in n with N/n > log(C̃)/log(̺).

Implementational aspects of the Dynamic Chebyshev method
In this section we discuss several approaches to compute the generalized moments (2.7) which contain the model dependent part. Moreover, preparing the numerical experiments we tailor the Dynamic Chebyshev method to the pricing of American put options.

Derivation of generalized moments
Naturally, the question arises how the generalized moments (2.7) can be derived. Here, we present four different ways and illustrate all approaches in the one-dimensional case X = [x, x̄]. Similar formulas can be obtained for a multidimensional domain.

Probability density function
For the derivation of E[p_j(X_{t_{u+1}}) | X_{t_u} = x_k], let the density function of the random variable X_{t_{u+1}} given X_{t_u} = x_k be denoted by f_{u,k}. Then the conditional expectation can be derived by solving the integral E[p_j(X_{t_{u+1}}) | X_{t_u} = x_k] = ∫ T_j(τ_X^{−1}(y)) 1_{[x,x̄]}(y) f_{u,k}(y) dy, using p_j(y) = T_j(τ_X^{−1}(y)) 1_{[x,x̄]}(y). This approach is rather intuitive and easy to implement.
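For example, with the Gaussian transition density of Black-Scholes log-returns (parameters and domain are illustrative choices), the integral can be approximated by elementary trapezoidal quadrature:

```python
import numpy as np

# Gamma[k, j] = E[p_j(X_{t_{u+1}}) | X_{t_u} = x_k] via the density approach:
# integrate T_j(tau^{-1}(y)) f_{u,k}(y) over X = [x_lo, x_hi].
r, sigma, dt = 0.03, 0.25, 0.1                     # illustrative BS parameters
x_lo, x_hi, N = -1.0, 1.0, 10
xk = x_lo + 0.5*(x_hi - x_lo)*(np.cos(np.pi*np.arange(N + 1)/N) + 1.0)

y = np.linspace(x_lo, x_hi, 4001)                  # quadrature grid on X
h = y[1] - y[0]
zy = 2.0*(y - x_lo)/(x_hi - x_lo) - 1.0            # tau^{-1}(y)
mu, sd = (r - 0.5*sigma**2)*dt, sigma*np.sqrt(dt)  # one-step mean shift, stdev

def trapz(f):                                      # trapezoidal rule on the grid
    return h*(f.sum() - 0.5*f[0] - 0.5*f[-1])

Gamma = np.empty((N + 1, N + 1))
for k in range(N + 1):
    # transition density f_{u,k}: normal with mean x_k + mu, stdev sd
    dens = np.exp(-0.5*((y - xk[k] - mu)/sd)**2)/(sd*np.sqrt(2.0*np.pi))
    for j in range(N + 1):
        Gamma[k, j] = trapz(np.cos(j*np.arccos(zy))*dens)
```

Since |T_j| ≤ 1 and the density integrates to at most one over X, every entry of Gamma lies in [−1, 1], which gives a quick sanity check on an implementation.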

Fourier Transformation
Assume the process X has stationary increments and that the characteristic function ϕ of X_Δt is explicitly available. We apply Parseval's identity, see Rudin (1973), to express the conditional expectation as an integral of the Fourier transform of p_j^{x_k} against ϕ, where p_j^{x_k}(x) = p_j(x + x_k). Using the definition of τ_{[x,x̄]}, we can express the Fourier transform of p_j^{x_k} with the help of the Fourier transform of the Chebyshev polynomial T_j; this yields (4.1). The Fourier transforms of the Chebyshev polynomials T_j are presented in Dominguez et al. (2011), and the authors also provide a Matlab implementation.

Truncated moments
In this approach, we use that each one-dimensional Chebyshev polynomial can be represented as a sum of monomials, i.e. T_j(z) = Σ_{l=0}^{j} a_l z^l.
The coefficients a_l, l = 0, . . . , j, can easily be derived using the chebfun function poly(), see Driscoll et al. (2014). Then E[p_j(X_{t_{u+1}}) | X_{t_u} = x_k] = Σ_{l=0}^{j} a_l E[(τ_X^{−1}(X_{t_{u+1}}))^l 1_X(X_{t_{u+1}}) | X_{t_u} = x_k]. As τ_X is linear, the computation of the generalized moments has thus been reduced to deriving truncated moments.
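In Python, the analogue of chebfun's poly() is numpy's cheb2poly. The sketch below assembles E[T_4(X) 1_X(X)] for a standard normal X on X = [−1, 1] from truncated monomial moments; the distribution and domain are illustrative, and the truncated moments are here obtained by brute-force trapezoidal quadrature rather than a closed formula:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Generalized moment E[T_j(X) 1_X(X)] from truncated monomial moments
# m_l = E[X^l 1_X(X)], for a standard normal X on X = [-1, 1] (illustrative).
j = 4
a = C.cheb2poly(np.eye(j + 1)[j])            # T_4(z) = 8 z^4 - 8 z^2 + 1

y = np.linspace(-1.0, 1.0, 20001)
w = np.full_like(y, y[1] - y[0]); w[0] *= 0.5; w[-1] *= 0.5  # trapezoid weights
pdf = np.exp(-0.5 * y**2) / np.sqrt(2.0 * np.pi)
m = np.array([np.sum(w * y**l * pdf) for l in range(j + 1)])  # truncated moments
val = a @ m                                  # generalized moment E[T_j(X) 1_X]
```

The same monomial coefficients work for any model once its truncated moments are available, which is exactly the reduction described above.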

Monte-Carlo simulation
Lastly, especially in cases where neither a probability density function nor a characteristic function of the underlying process is available, Monte Carlo simulation is a suitable choice. For every nodal point x_k one simulates N_MC paths X^i_{t_{u+1}} of X_{t_{u+1}} with starting value X_{t_u} = x_k. These simulations can then be used to approximate E[p_j(X_{t_{u+1}}) | X_{t_u} = x_k] ≈ (1/N_MC) Σ_{i=1}^{N_MC} p_j(X^i_{t_{u+1}}) for every j ∈ J. For an overview of Monte Carlo simulation of SDEs and variance reduction techniques we refer to Glasserman (2003).
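A sketch of this estimator for one-step Black-Scholes log-return dynamics (all parameters and the domain are illustrative choices):

```python
import numpy as np

# Monte-Carlo estimate of Gamma[k, j] = E[p_j(X_{t_{u+1}}) | X_{t_u} = x_k]:
# simulate one time step from every Chebyshev node and average T_j over paths.
rng = np.random.default_rng(0)
r, sigma, dt = 0.03, 0.25, 0.1
x_lo, x_hi, N, n_mc = -1.0, 1.0, 10, 100_000
xk = x_lo + 0.5*(x_hi - x_lo)*(np.cos(np.pi*np.arange(N + 1)/N) + 1.0)

# one-step simulation started in every node x_k (one row per node)
steps = (xk[:, None] + (r - 0.5*sigma**2)*dt
         + sigma*np.sqrt(dt)*rng.standard_normal((N + 1, n_mc)))
z = 2.0*(steps - x_lo)/(x_hi - x_lo) - 1.0         # tau^{-1} of the samples
inside = np.abs(z) <= 1.0                          # indicator 1_X
Gamma = np.empty((N + 1, N + 1))
for j in range(N + 1):
    Tj = np.cos(j*np.arccos(np.clip(z, -1.0, 1.0)))
    Gamma[:, j] = np.mean(Tj*inside, axis=1)       # p_j = T_j(tau^{-1}) 1_X
```

Note that only a single time step is simulated per node, not whole paths, which is the source of the runtime advantage over path-based methods discussed in the numerical section.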

American Put Option
In the numerical section we use the Dynamic Chebyshev method to price American put options. Assuming an asset model of the form S_t = e^{X_t}, the DPP becomes V_{t_n}(x) = (K − e^x)^+ and V_{t_u}(x) = max{ (K − e^x)^+, E[V_{t_{u+1}}(X_{t_{u+1}}) | X_{t_u} = x] }. Typically, the support of the underlying process X_t is R, and restricting the domain to X = [x, x̄] introduces a truncation error. We reduce this error by exploiting the asymptotic behaviour of the payoff. If X_{t_u} is below the exercise boundary, the option is exercised at the value K − e^{X_{t_u}}, which we exploit for X_{t_u} < x. The function x ↦ V_{t_u}(x) tends to zero from above as x → ∞, and thus for x̄ large enough the truncation to zero for x > x̄ is justified. Hence, we introduce the following modification of the Dynamic Chebyshev method: we set V̂_{t_u}(x) = K − e^x for x < x and V̂_{t_u}(x) = 0 for x > x̄, with x small and x̄ large enough. One can precompute E[(K − e^{X_{t_{u+1}}}) 1_{X_{t_{u+1}} < x} | X_{t_u} = x_k]. We emphasize that similar modifications to reduce the truncation error can be found for other payoff profiles, e.g. for digitals, butterfly options or any other combination of different put options. Moreover, we also modify the first time step. Instead of approximating the payoff with Chebyshev polynomials at t_n = T, we use the price of a European option to compute the continuation value at t_{n−1}. The kink of the payoff is thereby "smoothed" and convergence is improved.
The option's sensitivities Delta and Gamma can be computed by taking the first or second derivative of S ↦ V_0(log(S)) = Σ_{j∈J} c_j(t_0) p_j(log(S)).
Thus Delta and Gamma are expressed as sums of derivatives of Chebyshev polynomials. In particular, their computation comes at no additional computational cost.
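In Python this amounts to differentiating the Chebyshev series (numpy's chebder) and applying the chain rule for x = log(S); the domain and the coefficients c_j(t_0) below are purely hypothetical placeholders:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Delta and Gamma from a Chebyshev series V_0(x) = sum_j c_j T_j(tau^{-1}(x)),
# with x = log(S). Domain and coefficients are hypothetical placeholders.
x_lo, x_hi = np.log(60.0), np.log(140.0)
scale = 2.0 / (x_hi - x_lo)                  # derivative of tau_X^{-1}
c = np.array([10.0, -6.0, 1.5, -0.3])        # hypothetical c_j(t_0)

def V0(S):
    z = scale * (np.log(S) - x_lo) - 1.0
    return C.chebval(z, c)

def delta(S):                                # dV/dS = V'(x) / S
    z = scale * (np.log(S) - x_lo) - 1.0
    return C.chebval(z, C.chebder(c)) * scale / S

def gamma(S):                                # d2V/dS2 = (V''(x) - V'(x)) / S^2
    z = scale * (np.log(S) - x_lo) - 1.0
    Vx = C.chebval(z, C.chebder(c)) * scale
    Vxx = C.chebval(z, C.chebder(c, 2)) * scale**2
    return (Vxx - Vx) / S**2
```

Finite differences of V0 reproduce these values, confirming that the Greeks come for free once the coefficients are known.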

Numerical experiments
In this section, we use the Dynamic Chebyshev method to price American put options and we numerically investigate the convergence of the method. Moreover, we compare the method with the Least-squares Monte-Carlo method of Longstaff and Schwartz (2001).

Stock price models
For the convergence analysis we use three different stock price models.

The Black-Scholes model:
In the classical model of Black and Scholes (1973) the stock price process is modelled by the SDE dS_t = rS_t dt + σS_t dW_t, where r is the risk-free interest rate and σ > 0 is the volatility. In this model the log-returns X_t = log(S_t) are normally distributed, and explicit formulas for the (double) truncated moments are available: Kan and Robotti (2017) present results for the (multivariate) truncated moments and provide a Matlab implementation.
The Merton jump diffusion model: The jump diffusion model introduced by Merton (1976) adds jumps to the classical Black-Scholes model. For S_t = S_0 e^{X_t}, the log-returns X_t follow a jump diffusion with volatility σ and jumps arriving at rate λ > 0 with normally distributed jump sizes according to N(α, β²). The characteristic function of X_t is given by ϕ_{X_t}(z) = exp( t( ibz − σ²z²/2 + λ(e^{iαz − β²z²/2} − 1) ) ) with risk-neutral drift b = r − σ²/2 − λ(e^{α+β²/2} − 1).
The Constant Elasticity of Variance model: The Constant Elasticity of Variance (CEV) model as stated in Schroder (1989) is a local volatility model based on the stochastic process dS_t = rS_t dt + σ S_t^{β/2} dW_t. Hence the stock volatility σ S_t^{(β−2)/2} depends on the current level of the stock price. For the special case β = 2 the model coincides with the Black-Scholes model. However, from market data one typically observes β < 2. The CEV model is one example of a model which has neither a probability density nor a characteristic function in closed form.
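The Merton characteristic function can be implemented directly. In the sketch below, the drift b is the standard risk-neutral choice making e^{−rt}S_t a martingale, and all parameter values are illustrative:

```python
import numpy as np

# Characteristic function of the Merton jump-diffusion log-return X_t
# (standard formula; parameter values are illustrative).
r, sigma, lam, alpha, beta = 0.03, 0.2, 0.3, -0.1, 0.15

def phi_merton(z, t):
    # risk-neutral drift: E[exp(X_t)] = exp(r t)
    b = r - 0.5 * sigma**2 - lam * (np.exp(alpha + 0.5 * beta**2) - 1.0)
    jump = np.exp(1j * z * alpha - 0.5 * beta**2 * z**2) - 1.0
    return np.exp(t * (1j * z * b - 0.5 * sigma**2 * z**2 + lam * jump))
```

Two quick sanity checks are ϕ(0) = 1 and ϕ(−i) = E[e^{X_t}] = e^{rt} under the martingale drift.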

Convergence analysis
In this section we investigate the convergence of the Dynamic Chebyshev method. We price American put options along with the options' Delta and Gamma in the Black-Scholes and the Merton jump diffusion model, where we can use the COS method of Fang and Oosterlee (2009) as benchmark. The COS method is based on the Fourier-cosine expansion of the density function and provides fast and accurate results for the class of Lévy models.
For the experiments, we use the following parameter sets in the Black-Scholes and the Merton model. For both models the generalized moments are computed by the Fourier approach as stated in (4.1). We truncate the integral at |ξ| ≤ 250 and use Clenshaw-Curtis quadrature with 500 nodes for the numerical integration. For the Fourier transforms of the Chebyshev polynomials the implementation of Dominguez et al. (2011) is used. We run the Dynamic Chebyshev method for an increasing number of Chebyshev nodes N = 50, 100, . . . , 750. Then, option prices and their sensitivities delta and gamma are calculated on a grid of values of S_0 equally distributed between 60% and 140% of the strike K. The resulting prices and Greeks are compared with the COS method as benchmark and the maximum error over the grid is calculated. Here we use the implementation provided in von Sydow et al. (2015).

Dynamic Chebyshev with Monte-Carlo
So far we have empirically investigated the error decay of the method for option prices and their derivatives. In this section we compare the Dynamic Chebyshev method with the Least-Squares Monte Carlo approach of Longstaff and Schwartz (2001) in terms of accuracy and runtime.
We run both methods for an increasing number of Monte Carlo paths, M ∈ {2500, 5000, 10000, 20000, 40000, 80000}.
The convergence of the DC method depends on both the number of Chebyshev nodes N and the number of Monte Carlo paths M. For an optimal convergence behaviour one needs to find a reasonable relationship between these two factors. For the following experiments, we fix the number of Chebyshev nodes as N = √(2M). Figure 5.2 shows the price surface and the error surface for N = 400 and M = 80000. The error was estimated using the COS method as benchmark. We reach a maximal error below 4 · 10⁻² on the whole option surface.
In Figure 5.3 the log10-error is shown as a function of the log10-runtime for both methods. The left figure compares the total runtimes and the right figure compares the online runtimes. For the Dynamic Chebyshev method the total runtime includes the offline phase and the online phase. The offline phase consists of the simulation of one time step of the underlying asset process X_Δt for N + 1 starting values X_0 = x_k and of the computation of the conditional expectations E[p_j(X_Δt) | X_0 = x_k] for j, k = 0, . . . , N. The online phase is the actual pricing of the American option for all strikes and maturities. Similarly, the total runtime of the Least-Squares Monte Carlo method includes the simulation of the Monte Carlo paths (offline phase) and the pricing of the option via backward induction (online phase).
We observe that the Dynamic Chebyshev method reaches the same accuracy in a much lower runtime. For example, a maximum error of 0.1 is reached within a total runtime of 0.5s by the Dynamic Chebyshev method, whereas the LSM approach needs 98s; the Dynamic Chebyshev method is thus nearly 200 times faster at the same accuracy. For the actual pricing in the online phase, the gain in efficiency is even higher. We observe that the Dynamic Chebyshev method outperforms the Least-Squares Monte Carlo method in terms of both the total runtime and the pure online runtime. Moreover, we observe that the performance gain from splitting the computation into an offline and an online phase is much higher for the Dynamic Chebyshev method. For instance, in the example above the online runtime of the Dynamic Chebyshev method is 0.05s whereas the LSM takes 95s, a factor of 1900.
The main advantage of the Dynamic Chebyshev method is that once the conditional expectations are calculated, they can be used to price the whole option surface. The pure pricing, i.e. the online phase, is highly efficient. Furthermore, one only needs to simulate one time step Δt of the underlying stochastic process instead of complete paths. We investigate this efficiency gain by varying the number of options and the number of time steps (exercise rights). Figure 5.4 compares the total runtime of the DC method with the total runtime of the LSM method for an increasing number of options and for an increasing number of time steps. As expected, we can empirically confirm that the efficiency gain of the Dynamic Chebyshev method increases with the number of options and the number of exercise rights. In both cases, the runtime of the DC method stays nearly constant whereas the runtime of the LSM method increases approximately linearly. In Figure 5.6 the log10-error is shown as a function of the log10-runtime for both methods. The left figure compares the total runtimes and the right figure compares the online runtimes. Again, we observe that the Dynamic Chebyshev method is faster at the same accuracy and that it profits more from the offline-online decomposition. For example, the total runtime of the DC method to reach an accuracy below 0.03 is 3.5s whereas the LSM takes 136s. For the online runtimes this out-performance is 1s to 122s. Investigating this efficiency gain further, we look at the performance for different numbers of options and time steps (exercise rights). Similarly to the last section, Figure 5.7 compares the total runtime of the DC method with the total runtime of the LSM method for an increasing number of options and time steps. In both cases, the runtime of the DC method stays nearly constant whereas the runtime of the LSM method increases approximately linearly.
This observation is consistent with the findings for the Black-Scholes model in the previous section.

Conclusion and Outlook
We have introduced a new approach to price American options via backward induction by approximating the value function with Chebyshev polynomials. Thereby, the computation of the conditional expectation of the value function in each time step is reduced to the computation of conditional expectations of polynomials. The proposed method separates the pricing of an option into a part which is model-dependent (the computation of the conditional expectations) and the pure pricing of a given payoff, which becomes independent of the underlying model. The first step, the computation of the conditional expectations of the Chebyshev polynomials, is the so-called offline phase of the method. The design of the method admits several qualitative advantages: • Once the conditional expectations are set up, we can use them for the pricing of many different options. Thus, the actual pricing in the online step becomes very simple and fast.
• In the pre-computation step one can combine the method with different techniques, such as Fourier approaches and Monte-Carlo simulation. Hence the method can be applied in a variety of models.
• The proposed approach is very general and flexible and thus not restricted to the pricing of American options. It can be used to solve a large class of optimal stopping problems.
• We obtain a closed-form approximation of the option price as a function of the stock price at every time step. This approximation can be used to compute the option's sensitivities Delta and Gamma at no additional cost. This holds for all models and payoff profiles, even if Monte-Carlo is required in the offline phase.
• The method is easy to implement and to maintain. The pre-computation step is well-suited for parallelization to speed up the method.
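To make the offline/online split concrete, the following is a minimal Python sketch of the backward induction for a Bermudan put in a Black-Scholes model. All parameter values, the one-step Monte-Carlo estimation of the generalised moments, and the crude domain truncation by clipping are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Illustrative setup: Bermudan put under Black-Scholes (assumed parameters).
r, sigma, T, K = 0.03, 0.25, 1.0, 100.0
n_steps, N = 50, 30                        # exercise dates, Chebyshev degree
dt = T / n_steps
a, b = np.log(K) - 2.0, np.log(K) + 2.0    # truncated log-price domain X

z = np.cos(np.pi * np.arange(N + 1) / N)   # Chebyshev-Lobatto points in [-1, 1]
x = a + (b - a) * (z + 1.0) / 2.0          # mapped nodal points in [a, b]

def tau(y):
    """Map log-prices to [-1, 1]; clipping is a crude domain truncation."""
    return np.clip(2.0 * (y - a) / (b - a) - 1.0, -1.0, 1.0)

def payoff(y):
    return np.maximum(K - np.exp(y), 0.0)

# Offline phase: generalised moments Gamma[k, j] ~ E[T_j(tau(X_dt)) | X_0 = x_k],
# here by one-step Monte-Carlo. One matrix serves every time step because the
# log-price increments of geometric Brownian motion are time-homogeneous.
rng = np.random.default_rng(0)
W = rng.standard_normal(200_000)
Gamma = np.empty((N + 1, N + 1))
for k, xk in enumerate(x):
    x_next = xk + (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * W
    Gamma[k] = C.chebvander(tau(x_next), N).mean(axis=0)

# Online phase: backward induction on the Chebyshev grid only, no further
# simulation and no conditional expectations.
V = payoff(x)                              # value function at maturity
for _ in range(n_steps):
    c = C.chebfit(z, V, N)                 # Chebyshev coefficients of current V
    cont = np.exp(-r * dt) * Gamma @ c     # continuation values at the nodes
    V = np.maximum(payoff(x), cont)        # dynamic programming step

# Closed-form approximation of the price function at t_0; the delta follows by
# differentiating the interpolant (chain rule for the domain map and log-price).
c0 = C.chebfit(z, V, N)
price = lambda s: C.chebval(tau(np.log(s)), c0)
delta = lambda s: C.chebval(tau(np.log(s)), C.chebder(c0)) * (2.0 / (b - a)) / s
```

Changing the payoff, strike or number of exercise dates only touches the online loop; the offline matrix `Gamma` is reused unchanged, which is the source of the efficiency gain reported above.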
We have investigated the theoretical error behaviour of the method and introduced explicit error bounds. We put particular emphasis on the combination of the method with Monte-Carlo simulation. Numerical experiments confirm that the method performs well for the pricing of American options. A detailed comparison of the method with the Least-Squares Monte-Carlo approach proposed by Longstaff and Schwartz (2001) confirmed a high efficiency gain, especially when a large number of options is priced, for example a whole option price surface. In this case, the Dynamic Chebyshev method profits greatly from the offline-online decomposition: once the conditional expectations are computed, they can be used to price options with different maturities and strikes. Besides the efficiency gain, the closed-form approximation of the price function is a significant advantage because it allows us to calculate the sensitivities. Since Longstaff and Schwartz (2001) introduced their method, various modifications have been proposed, either to increase efficiency or to compute the sensitivities. For example, the simulation algorithm of Jain and Oosterlee (2015) is comparable to LSM in terms of efficiency but is able to compute the Greeks at no additional cost. Moreover, dual approaches were developed to obtain upper bounds for the option price, see Rogers (2002) and more recently Belomestny et al. (2018).
The presented error analysis of the method under an analyticity assumption is the starting point for further theoretical investigations in the case of piecewise analyticity and (piecewise) differentiability. The former allows one to rigorously cover the American option pricing problem; a preliminary version is presented in Mahlstedt (2017). The qualitative merits of the method can be exploited in a variety of applications. Glau et al. (2018) take advantage of the closed-form approximation to efficiently compute the expected exposure of early-exercise options as a step in CVA calculation. Moreover, the method can be used to price different options such as barrier options, swing options or multivariate American options.
Assume that the regularity Assumption 3.2 and the assumption on the truncation error (TR) hold. Then we have to distinguish between the deterministic case (GM) and the stochastic case (GM*); in the first case, the expectation in the error bound can simply be ignored. First, we apply Proposition 3.1. At the time point $T$ there is no random part and no distortion error, thus the error at maturity reduces to the interpolation error. For ease of notation we will from now on write $\varepsilon^j_{\mathrm{int}} = \varepsilon_{\mathrm{int}}(\varrho_{t_j}, N, D, M_{t_j})$. We obtain for the error at $t_u$
$$\max_{x\in\mathcal{X}} E\big[|V_{t_u}(x) - \widehat{V}_{t_u}(x)|\big] \;\le\; \varepsilon^u_{\mathrm{int}} + \Lambda_N F(f, t_u) \tag{A.1}$$
with maximal distortion error
$$F(f, t_u) = \max_{k\in J} E\big[|V_{t_u}(x^k) - \widehat{V}_{t_u}(x^k)|\big].$$
Note that whether (GM) or (GM*) holds, an approximation error of the conditional expectation of $\widehat{V}_{t_{u+1}}$ is made, i.e.
$$E\big[\widehat{V}_{t_{u+1}}(X_{t_{u+1}})\,\big|\,X_{t_u}=x^k\big] = \Gamma^k_{t_u}(\widehat{V}_{t_{u+1}}) \approx \widehat{\Gamma}^k_{t_u}(\widehat{V}_{t_{u+1}}).$$
The Lipschitz continuity of $f$ yields
$$|V_{t_u}(x^k) - \widehat{V}_{t_u}(x^k)| \le L_f\Big(\big|\Gamma^k_{t_u}(V_{t_{u+1}}) - \Gamma^k_{t_u}(V_{t_{u+1}}1_{\mathcal{X}})\big| + \big|\Gamma^k_{t_u}(V_{t_{u+1}}1_{\mathcal{X}}) - \Gamma^k_{t_u}(\widehat{V}_{t_{u+1}})\big| + \big|\Gamma^k_{t_u}(\widehat{V}_{t_{u+1}}) - \widehat{\Gamma}^k_{t_u}(\widehat{V}_{t_{u+1}})\big|\Big).$$
Next, we consider the expectation of each of the three error terms. For the first term, assumption (TR) gives
$$E\big[|\Gamma^k_{t_u}(V_{t_{u+1}}) - \Gamma^k_{t_u}(V_{t_{u+1}}1_{\mathcal{X}})|\big] \le \varepsilon_{tr},$$
and for the second term we have
$$E\big[|\Gamma^k_{t_u}(V_{t_{u+1}}1_{\mathcal{X}}) - \Gamma^k_{t_u}(\widehat{V}_{t_{u+1}})|\big] \le \max_{x\in\mathcal{X}} E\big[|V_{t_{u+1}}(x) - \widehat{V}_{t_{u+1}}(x)|\big] = \varepsilon_{t_{u+1}}.$$
For the last term we have to distinguish two cases. If we assume (GM) holds, the operator norm yields
$$\big|\Gamma^k_{t_u}(\widehat{V}_{t_{u+1}}) - \widehat{\Gamma}^k_{t_u}(\widehat{V}_{t_{u+1}})\big| \le \delta\,\|\widehat{V}_{t_{u+1}}\|_\infty.$$
Next, we consider the second case and assume that (GM*) holds. Then we have
$$E\big[\big|\Gamma^k_{t_u}(\widehat{V}_{t_{u+1}}) - \widehat{\Gamma}^k_{t_u}(\widehat{V}_{t_{u+1}})\big|\big] \le \delta^\star(M)\,\|\widehat{V}_{t_{u+1}}\|_\infty.$$
Hence, in either case the last term is bounded by $\varepsilon_{gm}\|\widehat{V}_{t_{u+1}}\|_\infty$ with $\varepsilon_{gm} = \delta$ if assumption (GM) holds and $\varepsilon_{gm} = \delta^\star(M)$ if assumption (GM*) holds. We need an upper bound for the maximum of the Chebyshev approximation,
$$\|\widehat{V}_{t_{u+1}}\|_\infty \le \overline{V}_{u+1} + \varepsilon_{t_{u+1}} \quad\text{with}\quad \overline{V}_{u+1} := \max_{x\in\mathcal{X}}|V_{t_{u+1}}(x)|.$$
Hence, the error bound (A.1) becomes
$$\varepsilon_{t_u} \;\le\; \varepsilon^u_{\mathrm{int}} + \Lambda_N L_f\big((1+\varepsilon_{gm})\varepsilon_{t_{u+1}} + \varepsilon_{tr} + \varepsilon_{gm}\overline{V}_{u+1}\big).$$
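As a brief sketch of the consequence (not part of the original argument): iterating the one-step bound backward from maturity, where the error reduces to the interpolation error $\varepsilon^n_{\mathrm{int}}$, and abbreviating $C := \Lambda_N L_f (1+\varepsilon_{gm})$ and $B_u := \Lambda_N L_f (\varepsilon_{tr} + \varepsilon_{gm}\overline{V}_{u+1})$, one obtains an explicit accumulated bound at $t_0$:

```latex
% Unrolling  \varepsilon_{t_u} \le \varepsilon^u_{\mathrm{int}} + C\,\varepsilon_{t_{u+1}} + B_u
% from u = 0 down to maturity t_n, where \varepsilon_{t_n} \le \varepsilon^n_{\mathrm{int}}:
\varepsilon_{t_0}
  \;\le\; \sum_{j=0}^{n} C^{\,j}\,\varepsilon^{j}_{\mathrm{int}}
  \;+\; \sum_{j=0}^{n-1} C^{\,j}\,B_j .
```

This makes visible how the interpolation, truncation and moment-approximation errors accumulate geometrically with the factor $C$ over the time steps.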