Abstract

Over forty years ago, average-case error was proposed in the applied mathematics literature as an alternative criterion with which to assess numerical methods. In contrast to worst-case error, this criterion relies on the construction of a probability measure over candidate numerical tasks, and numerical methods are assessed based on their average performance over those tasks with respect to the measure. This paper goes further and establishes Bayesian probabilistic numerical methods as solutions to certain inverse problems based upon the numerical task within the Bayesian framework. This allows us to establish general conditions under which Bayesian probabilistic numerical methods are well defined, encompassing both the nonlinear and non-Gaussian contexts. For general computation, a numerical approximation scheme is proposed and its asymptotic convergence established. The theoretical development is extended to pipelines of computation, wherein probabilistic numerical methods are composed to solve more challenging numerical tasks. The contribution highlights an important research frontier at the interface of numerical analysis and uncertainty quantification, and a challenging industrial application is presented.

Keywords

  1. probabilistic numerics
  2. Bayesian methods
  3. numerical analysis
  4. information-based complexity

MSC codes

  1. 65N21
  2. 65N75
  3. 62-02
  4. 62C10
  5. 62G08
  6. 62M40

Supplementary Material

Index of Supplementary Materials

Title of paper: Bayesian Probabilistic Numerical Methods

Authors: Jon Cockayne, Chris J. Oates, T. J. Sullivan and Mark Girolami

File: M113935SupMat.pdf

Type: PDF

Contents:
SM1: Proofs
SM2: Philosophical Status of the Belief Distribution
SM3: Dichotomy of Existing PNMs
SM4: Decision-Theoretic Treatment
SM5: Monte Carlo Methods for Numerical Disintegration

Information

Published In

SIAM Review
Pages: 756 - 789
ISSN (online): 1095-7200

History

Submitted: 18 July 2017
Accepted: 14 February 2019
Published online: 6 November 2019

Funding Information

Freie Universität Berlin https://doi.org/10.13039/501100007537 : 1114
Lloyd's Register Foundation https://doi.org/10.13039/100008885
National Science Foundation https://doi.org/10.13039/100000001 : DMS-1127914
Engineering and Physical Sciences Research Council https://doi.org/10.13039/501100000266 : EP/R034710/1, EP/R018413/1, EP/R004889/1, EP/P020720/1, EP/J016934/3
Royal Academy of Engineering https://doi.org/10.13039/501100000287
