Methods and Algorithms for Scientific Computing

A Stochastic Newton MCMC Method for Large-Scale Statistical Inverse Problems with Application to Seismic Inversion

Abstract

We address the solution of large-scale statistical inverse problems in the framework of Bayesian inference. The Markov chain Monte Carlo (MCMC) method is the most popular approach for sampling the posterior probability distribution that describes the solution of the statistical inverse problem. MCMC methods face two central difficulties when applied to large-scale inverse problems: first, the forward models (typically in the form of partial differential equations) that map uncertain parameters to observable quantities make the evaluation of the probability density at any point in parameter space very expensive; and second, the high-dimensional parameter spaces that arise upon discretization of infinite-dimensional parameter fields make the exploration of the probability density function prohibitive. The challenge for MCMC methods is to construct proposal functions that simultaneously provide a good approximation of the target density while being inexpensive to manipulate. Here we present a so-called Stochastic Newton method in which MCMC is accelerated by constructing and sampling from a proposal density that builds a local Gaussian approximation based on local gradient and Hessian (of the log posterior) information. Thus, the method exploits tools (adjoint-based gradients and Hessians) that have been instrumental for fast (often mesh-independent) solution of deterministic inverse problems. Hessian manipulations (inverse, square root) are made tractable by a low-rank approximation that exploits the compact nature of the data misfit operator. This is analogous to a reduced model of the parameter-to-observable map. The method is applied to the Bayesian solution of an inverse medium problem governed by 1D seismic wave propagation. We compare the Stochastic Newton method with a reference black box MCMC method as well as a gradient-based Langevin MCMC method, and observe at least two orders of magnitude improvement in convergence for problems with up to 65 parameters. Numerical evidence suggests that a 1025 parameter problem converges at the same rate as the 65 parameter problem.

Keywords

  1. MCMC
  2. Stochastic Newton
  3. inverse problems
  4. uncertainty quantification
  5. Langevin dynamics
  6. low-rank Hessian

MSC codes

  1. 35Q62
  2. 35Q93
  3. 35Q86
  4. 49M15
  5. 74J20
  6. 74J25
  7. 65C40
  8. 65C60
  9. 65M32

Get full access to this article

View all available purchase options and get full access to this article.

References

1.
V. Akçelik, G. Biros, O. Ghattas, J. Hill, D. Keyes, and B. van Bloeman Waanders, Parallel algorithms for PDE-constrained optimization, in Parallel Processing for Scientific Computing, M. A. Heroux, P. Raghaven, and H. D. Simon, eds., SIAM, Philadelphia, 2006, pp. 291--322.
2.
A. Apte, M. Hairer, A. M. Stuart, and J. Voss, Sampling the posterior: An approach to non-Gaussian data assimilation, Phys. D, 230 (2007), pp. 50--64.
3.
S. R. Arridge, J. P. Kaipio, V. Kolehmainen, M. Schweiger, E. Somersalo, T. Tarvainen, and M. Vauhkonen, Approximation errors and model reduction with an application in optical diffusion tomography, Inverse Problems, 22 (2006), pp. 175--195.
4.
V. Asokan, B. Narayanan, and N. Zabaras, Stochastic inverse heat conduction using a spectral approach, Internat. J. Numer. Methods Engrg., 60 (2004), pp. 1569--1593.
5.
G. Biros and O. Ghattas, Parallel Lagrange--Newton--Krylov--Schur methods for PDE--constrained optimization. Part II: The Lagrange--Newton solver and its application to optimal control of steady viscous flows, SIAM J. Sci. Comput., 27 (2005), pp. 714--739.
6.
S. Brooks and A. Gelman, General methods for monitoring convergence of iterative simulations, J. Comput. Graph. Statist., 7 (1998), pp. 434--456.
7.
T. Bui-Thanh, C. Burstedde, O. Ghattas, J. Martin, G. Stadler, and L. C. Wilcox, Extreme-scale UQ for Bayesian inverse problems governed by PDEs, in Proceedings of SC12, Gordon Bell Prize, submitted.
8.
T. Bui-Thanh and O. Ghattas, Analysis of the Hessian for inverse scattering problems. Part II: Inverse medium scattering of acoustic waves, Inverse Problems, 28 (2012), 055002.
9.
T. Bui-Thanh, O. Ghattas, and D. Higdon, Adaptive Hessian-based non-stationary Gaussian process response surface method for probability density approximation with application to Bayesian solution of large-scale inverse problems, SIAM J. Sci. Comput., submitted.
10.
T. Bui-Thanh, K. Willcox, and O. Ghattas, Parametric reduced-order models for probabilistic analysis of unsteady aerodynamic applications, AIAA J., 46 (2008), pp. 2520--2529.
11.
C. Burstedde and O. Ghattas, Algorithmic strategies for full waveform inversion: $1$D experiments, Geophys., 74 (2009), pp. WCC37--WCC46.
12.
J. A. Christen and C. Fox, Markov chain Monte Carlo using an approximations, J. Comput. Graph. Statist., 14 (2005), pp. 795--810.
13.
J. A. Christen and C. Fox, A general purpose sampling algorithm for continuous distributions (the t-walk), Bayesian Anal., 5 (2010), pp. 263--283.
14.
P. Dostert, Y. Efendiev, T. Y. Hou, and W. Luo, Coarse-gradient Langevin algorithms for dynamic data integration and uncertainty quantification, J. Comput. Phys., 217 (2006), pp. 123--142.
15.
Y. Efendiev, T. Hou, and W. Luo, Preconditioning Markov chain Monte Carlo simulations using coarse-scale models, SIAM J. Sci. Comput., 28 (2006), pp. 776--803.
16.
D. Galbally, K. Fidkowski, K. Willcox, and O. Ghattas, Nonlinear model reduction for uncertainty quantification in large-scale inverse problems, Internat. J. Numer. Methods Engrg., 81 (2010), pp. 1581--1608.
17.
J. Geweke and H. Tanizaki, On Markov chain Monte Carlo methods for nonlinear and non-Gaussian state-space models, Comm. Statist. Simulation Comput., 28 (1999), pp. 867--894.
18.
J. Geweke and H. Tanizaki, Note on the sampling distribution for the Metropolis-Hastings algorithm, Comm. Statist. Theory Methods, 32 (2003), pp. 775--789.
19.
R. G. Ghanem and P. D. Spanos, Stochastic Finite Elements: A Spectral Approach, Springer-Verlag, New York, 1991.
20.
M. Girolami and B. Calderhead, Riemann manifold Langevin and Hamiltonian Monte Carlo methods, J. R. Stat. Soc. Ser. B Stat. Methodol., 73 (2011), pp. 123--214.
21.
H. Haario, M. Laine, A. Miravete, and E. Saksman, DRAM: Efficient adaptive MCMC, Stat. Comput., 16 (2006), pp. 339--354.
22.
K. M. Hanson and G. S. Cunningham, Posterior sampling with improved efficiency, in Medical Imaging: Image Processing, SPIE, Bellingham, WA, 1998, pp. 371--382.
23.
D. Higdon, J. Gattiker, B. Williams, and M. Rightley, Computer model calibration using high-dimensional output, J. Amer. Statist. Assoc., 103 (2008), pp. 570--583.
24.
D. Higdon, H. Lee, and C. Holloman, Markov chain Monte Carlo-based approaches for inference in computationally intensive inverse problems, in Bayesian Statistics 7, J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith, and M. West, eds. Oxford University Press, New York, 2003, pp. 181--197.
25.
D. Higdon, C. S. Reese, J. D. Moulton, J. A. Vrugt, and C. Fox, Posterior exploration for computationally intensive forward models, in Handbook of Markov Chain Monte Carlo, Chapman & Hall/CRC Press, Baca Raton, FL, 2010, pp. 401--418.
26.
M. Hinze, R. Pinnau, M. Ulbrich, and S. Ulbrich, Optimization with PDE Constraints, Springer-Verlag, New York, 2009.
27.
J. Kaipio and E. Somersalo, Statistical and Computational Inverse Problems, Appl. Math. Sci. 160, Springer-Verlag, New York, 2005.
28.
M. C. Kennedy and A. O'Hagan, Bayesian calibration of computer models, J. R. Stat. Soc. Ser. B Statist. Methodol., 63 (2001), pp. 425--464.
29.
C. Lieberman, K. Willcox, and O. Ghattas, Parameter and state model reduction for large-scale statistical inverse problems, SIAM J. Sci. Comput., 32 (2010), pp. 2523--2542.
30.
Y. Marzouk and D. Xiu, A stochastic collocation approach to Bayesian inference in inverse problems, Commun. Comput. Phys., 6 (2009), pp. 826--847.
31.
Y. M. Marzouk and H. N. Najm, Dimensionality reduction and polynomial chaos acceleration of Bayesian inference in inverse problems, J. Comput. Phys., 228 (2009), pp. 1862--1902.
32.
Y. M. Marzouk, H. N. Najm, and L. A. Rahn, Stochastic spectral methods for efficient Bayesian solution of inverse problems, J. Comput. Phys., 224 (2007), pp. 560--586.
33.
N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, Equation of state calculations by fast computing machines, J. Chem. Phys., 21 (1953), pp. 1087--1092.
34.
R. M. Neal, MCMC using Hamiltonian dynamics, in Handbook of Markov Chain Monte Carlo, Chapman & Hall/CRC Press, Baca Raton, FL, 2010, pp. 113--162.
35.
N. C. Nguyen, G. Rozza, D. B. P. Huynh, and A. T. Patera, Reduced basis approximation and a posteriori error estimation for parametrized parabolic PDEs; Application to real-time Bayesian parameter estimation, in Large-Scale Inverse Problems and Quantification of Uncertainty, L. Biegler, G. Biros, O. Ghattas, M. Heinkenschloss, D. Keyes, B. Mallick, L. Tenorio, B. van Bloemen Waanders, and K. Willcox, eds., John Wiley & Sons, New York, 2011, pp. 151--178.
36.
Y. Qi and T. P. Minka, Hessian-based Markov chain Monte-Carlo algorithms, in Proceedings of the First Cape Cod Workshop on Monte Carlo Methods, Cape Cod, MA, 2002.
37.
C. P. Robert and G. Casella, Monte Carlo Statistical Methods, Springer Texts Statist., Springer-Verlag, Secaucus, NJ, 2004.
38.
G. O. Roberts and J. S. Rosenthal, Optimal scaling for various Metropolis-Hastings algorithms, Statist. Sci., 16 (2001), pp. 351--367.
39.
G. O. Roberts and R. L. Tweedie, Exponential convergence of Langevin distributions and their discrete approximations, Bernoulli, 2 (1996), pp. 341--363.
40.
Y. Saad, Iterative Methods for Sparse Linear Systems, 2nd ed., SIAM, Philadelphia, 2003.
41.
A. M. Stuart, Inverse problems: A Bayesian perspective, Acta Numer., 19 (2010), pp. 451--559.
42.
A. M. Stuart, P. Wiberg, and J. Voss, Conditional path sampling of SDEs and the Langevin MCMC method, Commun. Math. Sci., 2 (2004), pp. 685--697.
43.
A. Tarantola, Inverse Problem Theory and Methods for Model Parameter Estimation, SIAM, Philadelphia, 2005.
44.
L. Tierney, Markov chains for exploring posterior distributions, Ann. Statist., 22 (1994), pp. 1701--1762.
45.
C. R. Vogel, Computational Methods for Inverse Problems, Frontiers Appl. Math. 23, SIAM, Philadelphia, 2002.
46.
J. Wang and N. Zabaras, Using Bayesian statistics in the estimation of heat source in radiation, Int. J. Heat Mass Tran., 48 (2005), pp. 15--29.
47.
N. Zabaras and B. Ganapathysubramanian, A scalable framework for the solution of stochastic inverse problems using a sparse grid collocation approach, J. Comput. Phys., 227 (2008), pp. 4697--4735.

Information & Authors

Information

Published In

cover image SIAM Journal on Scientific Computing
SIAM Journal on Scientific Computing
Pages: A1460 - A1487
ISSN (online): 1095-7197

History

Submitted: 24 August 2011
Accepted: 13 February 2012
Published online: 5 June 2012

Keywords

  1. MCMC
  2. Stochastic Newton
  3. inverse problems
  4. uncertainty quantification
  5. Langevin dynamics
  6. low-rank Hessian

MSC codes

  1. 35Q62
  2. 35Q93
  3. 35Q86
  4. 49M15
  5. 74J20
  6. 74J25
  7. 65C40
  8. 65C60
  9. 65M32

Authors

Affiliations

Metrics & Citations

Metrics

Citations

If you have the appropriate software installed, you can download article citation data to the citation manager of your choice. Simply select your manager software from the list below and click Download.

Cited By

View Options

View options

PDF

View PDF

Figures

Tables

Media

Share

Share

Copy the content Link

Share with email

Email a colleague

Share on social media