High-Dimensional Gaussian Sampling: A Review and a Unifying Approach Based on a Stochastic Proximal Point Algorithm
Abstract
Efficient sampling from a high-dimensional Gaussian distribution is an old but high-stakes issue. Vanilla Cholesky samplers imply a computational cost and memory requirements that can rapidly become prohibitive in high dimensions. To tackle these issues, multiple methods have been proposed from different communities ranging from iterative numerical linear algebra to Markov chain Monte Carlo (MCMC) approaches. Surprisingly, no complete review and comparison of these methods has been conducted. This paper aims to review all these approaches by pointing out their differences, close relations, benefits, and limitations. In addition to reviewing the state of the art, this paper proposes a unifying Gaussian simulation framework by deriving a stochastic counterpart of the celebrated proximal point algorithm in optimization. This framework offers a novel and unifying revisiting of most of the existing MCMC approaches while also extending them. Guidelines to choosing the appropriate Gaussian simulation method for a given sampling problem in high dimensions are proposed and illustrated with numerical examples.
1. , Over-relaxation method for the Monte Carlo evaluation of the partition function for multiquadratic actions , Phys. Rev. D , 23 ( 1981 ), pp. 2901 -- 2904 .
2. . Figueiredo, Fast image recovery using variable splitting and constrained optimization , IEEE Trans. Image Process. , 19 ( 2010 ), pp. 2345 -- 2356 .
3. . Figueiredo, An augmented Lagrangian approach to the constrained optimization formulation of imaging inverse problems , IEEE Trans. Image Process. , 20 ( 2011 ), pp. 681 -- 695 .
4. , Numerical approximation of the product of the square root of a matrix with a vector , Linear Algebra Appl. , 310 ( 2000 ), pp. 167 -- 181 , https://doi.org/10.1016/S0024-3795(00)00068-9.
5. , Sampling from a multivariate Gaussian distribution truncated on a simplex: A review , in
Proc. IEEE-SP Workshop Stat. and Signal Process. (SSP) ,Gold Coast, Australia , 2014 , pp. 113 -- 116 , https://doi.org/10.1109/SSP.2014.6884588.6. , MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia , Nat. Genet. , 30 ( 2002 ), pp. 41 -- 47 , https://doi.org/10.1038/ng765.
7. , Iterative numerical methods for sampling from high dimensional Gaussian distributions , Stat. Comput. , 23 ( 2013 ), pp. 501 -- 521 , https://doi.org/10.1007/s11222-012-9326-8.
8. , Iterative Solution Methods , Cambridge University Press , 1994 , https://doi.org/10.1017/CBO9780511624100.
9. , Clone MCMC: Parallel high-dimensional Gaussian Gibbs sampling , in Adv. in Neural Information Process. Systems , 2017 , pp. 5020 -- 5028 .
10. , Improving stochastic relaxation for Gaussian random fields , Probab. Eng. Inform. Sci. , 4 ( 1990 ), pp. 369 -- 389 .
11. , Convex Analysis and Monotone Operator Theory in Hilbert Spaces , 2 nd ed., Springer , 2017 .
12. , On conditional and intrinsic autoregressions , Biometrika , 82 ( 1995 ), pp. 733 -- 746 , https://doi.org/10.2307/2337341.
13. , Pattern Recognition and Machine Learning , Springer-Verlag , 2006 .
14. ,
Time Series Analysis: Forecasting and Control , 3 rd ed., Prentice Hall , 1994 .15. , Distributed optimization and statistical learning via the alternating direction method of multipliers , Found. Trends Mach. Learn. , 3 ( 2011 ), pp. 1 -- 122 .
16. , Gaussian process for nonstationary time series prediction , Comput. Statist. Data Anal. , 47 ( 2004 ), pp. 705 -- 712 , https://doi.org/10.1016/j.csda.2004.02.006.
17. , A proximal point analysis of the preconditioned alternating direction method of multipliers , J. Optim. Theory Appl. , 173 ( 2017 ), pp. 878 -- 907 , https://doi.org/10.1007/s10957-017-1112-5.
18. , Stan: A probabilistic programming language , J. Stat. Softw. , 76 ( 2017 ), pp. 1 -- 32 , https://doi.org/10.18637/jss.v076.i01.
19. , A first-order primal-dual algorithm for convex problems with applications to imaging , J. Math. Imaging Vision , 40 ( 2011 ), pp. 120 -- 145 , https://doi.org/10.1007/s10851-010-0251-1.
20. , Sur la résolution numérique des systèmes d'équations linéaires , Bull. Soc. Amis Bibl. École Polytech. , 39 ( 1910 ), pp. 81 -- 95 .
21. , Preconditioned Krylov subspace methods for sampling multivariate Gaussian distributions , SIAM J. Sci. Comput. , 36 ( 2014 ), pp. A588 -- A608 , https://doi.org/10.1137/130920587.
22. , A fast divide-and-conquer algorithm for computing the spectra of real symmetric tridiagonal matrices , Appl. Comput. Harmonic Anal. , 34 ( 2013 ), pp. 379 -- 414 , https://doi.org/10.1016/j.acha.2012.06.003.
23. , Statistics for Spatial Data , 2 nd ed., Wiley , 1993 .
24. , Generating large stochastic simulations---the matrix polynomial approximation method , Math. Geosci. , 19 ( 1987 ), pp. 99 -- 107 , https://doi.org/10.1007/BF00898190.
25. . Knapp, Exploring classification strategies with the CoEPrA 2006 contest , Bioinformatics , 26 ( 2010 ), pp. 603 -- 609 , https://doi.org/10.1093/bioinformatics/btq021.
26. , Fast and exact simulation of stationary Gaussian processes through circulant embedding of the covariance matrix , SIAM J. Sci. Comput. , 18 ( 1997 ), pp. 1088 -- 1107 , https://doi.org/10.1137/S1064827592240555.
27. , Hybrid Monte Carlo , Phys. Lett. B , 195 ( 1987 ), pp. 216 -- 222 .
28. , A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science , SIAM J. Imaging Sci. , 3 ( 2010 ), pp. 1015 -- 1046 , https://doi.org/10.1137/09076934X.
29. , Bayesian semiparametric regression analysis of multicategorical time-space data , Ann. Inst. Statist. Math. , 53 ( 2001 ), pp. 11 -- 30 , https://doi.org/10.1023/A:1017904118167.
30. , Gradient scan Gibbs sampler: An efficient algorithm for high-dimensional Gaussian distributions , IEEE J. Sel. Topics Signal Process. , 10 ( 2016 ), pp. 343 -- 352 , https://doi.org/10.1109/JSTSP.2015.2510961.
31. , Convergence in variance of Chebyshev accelerated Gibbs samplers , SIAM J. Sci. Comput. , 36 ( 2014 ), pp. A124 -- A147 , https://doi.org/10.1137/120900940.
32. , Accelerated Gibbs sampling of normal distributions using matrix splittings and polynomials , Bernoulli , 23 ( 2017 ), pp. 3711 -- 3743 , https://doi.org/10.3150/16-BEJ863.
33. , A dual algorithm for the solution of nonlinear variational problems via finite element approximation , Comput. Math. Appl. , 2 ( 1976 ), pp. 17 -- 40 , https://doi.org/10.1016/0898-1221(76)90003-1.
34. , Random phase textures: Theory and synthesis , IEEE Trans. Image Process. , 20 ( 2011 ), pp. 257 -- 267 , https://doi.org/10.1109/TIP.2010.2052822.
35. , Bayesian Data Analysis , Chapman & Hall , 1995 .
36. , Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images , IEEE Trans. Pattern Anal. Mach. Intell. , 6 ( 1984 ), pp. 721 -- 741 .
37. , Efficient Gaussian sampling for solving large-scale inverse problems using MCMC , IEEE Trans. Signal Process. , 63 ( 2015 ), pp. 70 -- 80 , https://doi.org/10.1109/TSP.2014.2367457.
38. , Regularization and Bayesian Methods for Inverse Problems in Signal and Image Processing , 1 st ed., Wiley-IEEE Press , 2015 .
39. , Decomposable graphical Gaussian model determination , Biometrika , 86 ( 1999 ), pp. 785 -- 801 , https://doi.org/10.1093/biomet/86.4.785.
40. , Sur l'approximation, par éléments finis d'ordre un, et la résolution, par pénalisation-dualité d'une classe de problèmes de Dirichlet non linéaires , RAIRO Anal. Numer. , 9 ( 1975 ), pp. 41 -- 76 , http://www.numdam.org/item/M2AN_1975__9_2_41_0. , http://www.numdam.org/item/M2AN_1975__9_2_41_0.
41. , Matrix Computations , 2 nd ed., The Johns Hopkins University Press , 1989 .
42. , Multigrid Monte Carlo method. Conceptual foundations , Phys. Rev. D , 40 ( 1989 ), pp. 2035 -- 2071 , https://doi.org/10.1103/PhysRevD.40.2035.
43. Carlo computation and Bayesian model determination , Biometrika , 82 ( 1995 ), pp. 711 -- 732 , https://doi.org/10.1093/biomet/82.4.711.
44. , A divide-and-conquer algorithm for the symmetric tridiagonal eigenproblem , SIAM J. Matrix Anal. Appl. , 16 ( 1995 ), pp. 172 -- 191 , https://doi.org/10.1137/S0895479892241287.
45. , Computing $A^\alpha, \log(a)$, and related matrix functions by contour integrals , SIAM J. Numer. Anal. , 46 ( 2008 ), pp. 2505 -- 2523 , https://doi.org/10.1137/070700607.
46. , Gamma: Exploring Euler's Constant , Princeton University Press , 2003 .
47. , Methods of conjugate gradients for solving linear systems , J. Res. Natl. Inst. Stand. , 49 ( 1952 ), pp. 409 -- 436 .
48. , A primer on space-time modeling from a Bayesian perspective , in Statistical Methods for Spatio-Temporal Systems, B. Finkenstadt, L. Held, and V. lsham , eds., Chapman & Hall/CRC , 2007 , pp. 217 -- 279 .
49. , Honest exploration of intractable probability distributions via Markov chain Monte Carlo , Stat. Sci. , 16 ( 2001 ), pp. 312 -- 334 , https://doi.org/10.1214/ss/1015346317.
50. , ed., Bayesian Approach to Inverse Problems , Wiley , 2008 , https://doi.org/10.1002/9780470611197.
51. , Bayesian computations and efficient algorithms for computing functions of large sparse matrices , ANZIAM J. , 45 ( 2004 ), pp. 504 -- 518 , https://eprints.qut.edu.au/22511/.
52. , A restarted Lanczos approximation to functions of a symmetric matrix , IMA J. Numer. Anal. , 30 ( 2009 ), pp. 1044 -- 1061 , https://doi.org/10.1093/imanum/drp003.
53. , An invariant form for the prior probability in estimation problems , Proc. R. Soc. A , 186 ( 1946 ), pp. 453 -- 461 .
54. , Analyzing Hogwild parallel Gaussian Gibbs sampling , in Adv. in Neural Information Process. Systems , 2013 , pp. 2715 -- 2723 .
55. , Markov chain Monte Carlo in practice: A roundtable discussion , Amer. Statist. , 52 ( 1998 ), pp. 93 -- 100 .
56. , Contextual classification of multispectral pixel data , Image Vis. Comput. , 2 ( 1984 ), pp. 13 -- 29 , https://doi.org/10.1016/0262-8856(84)90040-4.
57. , Gradient-based learning applied to document recognition , Proc. IEEE , 86 ( 1998 ), pp. 2278 -- 2324 , https://doi.org/10.1109/5.726791.
58. . Toh, A majorized ADMM with indefinite proximal terms for linearly constrained convex composite optimization , SIAM J. Optim. , 26 ( 2016 ), pp. 922 -- 950 , https://doi.org/10.1137/140999025.
59. , Markov Random Field Modeling in Image Analysis , 3 rd ed., Springer , 2009 .
60. , Efficient sampling methods for truncated multivariate normal and Student-t distributions subject to linear inequality constraints , J. Stat. Theory Pract. , 9 ( 2015 ), pp. 712 -- 732 , https://doi.org/10.1080/15598608.2014.996690.
61. , A variational approach for Bayesian blind image deconvolution , IEEE Trans. Signal Process. , 52 ( 2004 ), pp. 2222 -- 2233 .
62. , Monte Carlo Strategies in Scientific Computing , Springer , 2001 .
63. , Information Theory, Inference and Learning Algorithms , Cambridge University Press , 2003 .
64. , A Wavelet Tour of Signal Processing: The Sparse Way , 3 rd ed., Academic Press , 2008 .
65. , The turning bands method for simulation of random fields using line generation by a spectral method , Water Resour. Res. , 18 ( 1982 ), pp. 1379 -- 1394 , https://doi.org/10.1029/WR018i005p01379.
66. , A data augmentation approach for sampling Gaussian models in high dimension , in
Proc. European Signal Process. Conf. (EUSIPCO) ,Coruna, Spain , 2019 , pp. 1 -- 5 .67. . Pesquet, An auxiliary variable method for Markov chain Monte Carlo algorithms in high dimension , Entropy , 20 ( 2018 ), art. 110 , https://doi.org/10.3390/e20020110.
68. Régularisation d'inéquations variationnelles par approximations successives , ESAIM Math. Model. Numer. Anal. , 4 ( 1970 ), pp. 154 -- 158 , http://www.numdam.org/item/M2AN_1970__4_3_154_0. , http://www.numdam.org/item/M2AN_1970__4_3_154_0.
69. point fixe dâune application pseudocontractante. Cas de l'application prox , C. R. Math. Acad. Sci. Paris , 274 ( 1972 ), pp. 163 -- 165 .
70. , Chebyshev Polynomials , CRC Press , 2002 .
71. , On the synthesis of random field sampling from the spectrum: An application to the generation of hydrologic spatial processes , Water Resour. Res. , 10 ( 1974 ), pp. 705 -- 711 , https://doi.org/10.1029/WR010i004p00705.
72. , Duke Math. J. , 29 ( 1962 ), pp. 341 -- 346 , https://doi.org/10.1215/S0012-7094-62-02933-2.
73. , The Doctrine of Chances: Or, a Method of Calculating the Probability of Events in Play , 1 st ed., W. Pearson , 1718.
74. , Blind deconvolution using a variational approach to parameter, image, and blur estimation , IEEE Trans. Image Process. , 15 ( 2006 ), pp. 3715 -- 3727 .
75. , Using spatial models as priors in astronomical image analysis , J. Appl. Stat. , 16 ( 1989 ), pp. 193 -- 206 .
76. , Proximité et dualité dans un espace hilbertien , Bull. de la Soc. Math. de France , 93 ( 1965 ), pp. 273 -- 299 , https://doi.org/10.24033/bsmf.1625.
77. , Sampling high-dimensional Gaussian distributions for general linear inverse problems , IEEE Signal Process. Lett. , 19 ( 2012 ), pp. 251 -- 254 , https://doi.org/10.1109/LSP.2012.2189104.
78. , Bayesian estimation of regularization and point spread function parameters for Wiener--Hunt deconvolution , J. Opt. Soc. Am. A , 27 ( 2010 ), pp. 1593 -- 1607 , https://doi.org/10.1364/JOSAA.27.001593.
79. , Gaussian sampling by local perturbations , in Adv. in Neural Information Process. Systems , 2010 , pp. 1858 -- 1866 .
80. , The Bayesian Lasso , J. Amer. Stat. Assoc. , 103 ( 2008 ), pp. 681 -- 686 , https://doi.org/10.1198/016214508000000337.
81. , Sampling Gaussian distributions in Krylov spaces with conjugate gradients , SIAM J. Sci. Comput. , 34 ( 2012 ), pp. B312 -- B334 , https://doi.org/10.1137/110831404.
82. , Efficient simulation of Gaussian Markov random fields by Chebyshev polynomial approximation , Spat. Stat. , 31 ( 2019 ), art. 100359 , https://doi.org/10.1016/j.spasta.2019.100359.
83. , Bayesian inference for logistic models using Pólya-Gamma latent variables , J. Amer. Stat. Assoc. , 108 ( 2013 ), pp. 1339 -- 1349 , https://doi.org/10.1080/01621459.2013.829001.
84. , Numerical Recipes: The Art of Scientific Computing , 3 rd ed., Cambridge University Press , 2007 .
85. , Gaussian processes to speed up hybrid Monte Carlo for expensive Bayesian integrals , in Bayesian Statistics 7 , Oxford University Press , 2003 , pp. 651 -- 660 .
86. , The Bayesian Choice: A Decision-Theoretic Motivation , Springer , 2001 .
87. , Monte Carlo Statistical Methods , Springer , 2004 .
88. , Updating schemes, correlation structure, blocking and parameterization for the Gibbs sampler , J. Roy. Stat. Soc. Ser. B , 59 ( 1997 ), pp. 291 -- 317 , https://doi.org/10.1111/1467-9868.00070.
89. , Exponential convergence of Langevin distributions and their discrete approximations , Bernoulli , 2 ( 1996 ), pp. 341 -- 363 , https://doi.org/10.2307/3318418.
90. , Monotone operators and the proximal point algorithm , SIAM J. Control Optim. , 14 ( 1976 ), pp. 877 -- 898 , https://doi.org/10.1137/0314056.
91. , Fast sampling of Gaussian Markov random fields , J. Roy. Stat. Soc. Ser. B , 63 ( 2001 ), pp. 325 -- 338 , https://doi.org/10.1111/1467-9868.00288.
92. , Gaussian Markov Random Fields: Theory and Applications , Chapman & Hall/CRC , 2005 .
93. , Iterative Methods for Sparse Linear Systems , 2 nd ed., SIAM , 2003 , https://doi.org/10.1137/1.9780898718003.
94. , On the generation of normal random vectors , Technometrics , 4 ( 1962 ), pp. 278 -- 281 , https://doi.org/10.2307/1266625.
95. , A Krylov subspace method for covariance approximation and simulation of random processes and fields , Multidimens. Syst. Signal Process. , 14 ( 2003 ), pp. 295 -- 318 , https://doi.org/10.1023/A:1023530718764.
96. , Normalized cuts and image segmentation , IEEE Trans. Pattern Anal. Mach. Intell. , 22 ( 2000 ), pp. 888 -- 905 , https://doi.org/10.1109/34.868688.
97. . Jan, Digital simulation of random processes and its applications , J. Sound Vib. , 25 ( 1972 ), pp. 111 -- 128 , https://doi.org/10.1016/0022-460X(72)90600-1.
98. , Fast Sampling from a Gaussian Markov Random Field Using Krylov Subspace Approaches , 2008 , https://eprints.qut.edu.au/14376/.
99. , Scalable Iterative Methods for Sampling from Massive Gaussian Random Vectors, preprint, https://arxiv.org/abs/1312.1476 , 2013 . , https://arxiv.org/abs/1312.1476.
100. : Eigensystems ,
SIAM, 2001 1805 8. , https://doi.org/10.1137/1.9780898718058.101. , Gaussian random number generators , ACM Comput. Surv. , 39 ( 2007 ), art. 11 , https://doi.org/10.1145/1287620.1287622.
102. , An algorithm for the inversion of finite Toeplitz matrices , SIAM J. Appl. Math. , 12 ( 1964 ), pp. 512 -- 522 , https://doi.org/10.1137/0112045.
103. , Global Earth structure: Inference and assessment , Geophys. J. Int. , 137 ( 1999 ), pp. 381 -- 407 , https://doi.org/10.1046/j.1365-246X.1999.00823.x.
104. , Split-and-augmented Gibbs sampler: Application to large-scale inference problems , IEEE Trans. Signal Process. , 67 ( 2019 ), pp. 1648 -- 1661 , https://doi.org/10.1109/TSP.2019.2894825.
105. , Asymptotically exact data augmentation: Models, properties, and algorithms , J. Comput. Graph. Statist. , 30 ( 2021 ), pp. 335 -- 348
106. , Bayesian learning via stochastic gradient Langevin dynamics , in
Proc. Int. Conf. Machine Learning (ICML) , 2011 , pp. 681 -- 688 .107. , tmvtnorm: A package for the truncated multivariate normal distribution , The R Journal , 2 ( 2010 ), pp. 25 -- 29 , https://doi.org/10.32614/RJ-2010-005.
108. , Gaussian Processes for Machine Learning , MIT Press , 2006 .
109. , Simulation of stationary Gaussian processes in $[0, 1]^d$ , J. Comput. Graph. Stat. , 3 ( 1994 ), pp. 409 -- 432 , https://doi.org/10.1080/10618600.1994.10474655.
110. , A unified primal-dual algorithm framework based on Bregman iteration , J. Sci. Comput. , 46 ( 2011 ), pp. 20 -- 46 , https://doi.org/10.1007/s10915-010-9408-8.