Abstract.

This paper studies the learning of linear operators between infinite-dimensional Hilbert spaces. The training data comprises pairs of random input vectors in a Hilbert space and their noisy images under an unknown self-adjoint linear operator. Assuming that the operator is diagonalizable in a known basis, this work solves the equivalent inverse problem of estimating the operator’s eigenvalues given the data. Adopting a Bayesian approach, the theoretical analysis establishes posterior contraction rates in the infinite data limit with Gaussian priors that are not directly linked to the forward map of the inverse problem. The main results also include learning-theoretic generalization error guarantees for a wide range of distribution shifts. These convergence rates quantify the effects of data smoothness and true eigenvalue decay or growth, for compact or unbounded operators, respectively, on sample complexity. Numerical evidence supports the theory in diagonal and nondiagonal settings.
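To make the diagonal setting described above concrete, the following minimal sketch (not taken from the paper; the truncation level, eigenvalue and prior decay rates, and variable names such as `l_true`, `gamma`, and `sigma2` are illustrative assumptions) simulates noisy images of random inputs under an operator that is diagonal in a known basis and computes the conjugate Gaussian posterior mean of each eigenvalue, mode by mode.

```python
import numpy as np

# Sketch of the diagonal model: observe y_n = T x_n + noise for n = 1,...,N,
# where T is self-adjoint and diagonal in a known basis, truncated to J modes.
rng = np.random.default_rng(0)
J, N, gamma = 50, 200, 0.1                                 # modes, sample size, noise level
l_true = np.array([j ** -2.0 for j in range(1, J + 1)])    # decaying eigenvalues (compact case)

X = rng.normal(size=(N, J))                                # basis coefficients of random inputs
Y = X * l_true + gamma * rng.normal(size=(N, J))           # noisy per-mode observations

# Independent Gaussian priors l_j ~ N(0, sigma_j^2) give a conjugate Gaussian
# posterior for each eigenvalue; its mean shrinks the least-squares estimate
# toward zero, more strongly for modes with less data or tighter priors.
sigma2 = np.array([j ** -1.0 for j in range(1, J + 1)])
post_var = 1.0 / (np.sum(X ** 2, axis=0) / gamma ** 2 + 1.0 / sigma2)
post_mean = post_var * np.sum(X * Y, axis=0) / gamma ** 2

print("relative L2 error:", np.linalg.norm(post_mean - l_true) / np.linalg.norm(l_true))
```

Increasing N drives the relative error to zero in this sketch, which is the kind of infinite-data contraction behavior, with rates depending on the eigenvalue and prior decay, that the paper quantifies.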

Keywords

  1. operator learning
  2. linear inverse problems
  3. Bayesian inference
  4. posterior consistency
  5. statistical learning theory
  6. distribution shift

MSC codes

  1. 62G20
  2. 62C10
  3. 68T05
  4. 47A62

Acknowledgments.

The authors thank Kamyar Azizzadenesheli and Joel A. Tropp for helpful discussions about statistical learning. The authors are also grateful to the associate editor and two anonymous referees for their helpful feedback. The computations presented in this paper were performed at the Resnick High Performance Computing Center, a facility supported by the Resnick Sustainability Institute at the California Institute of Technology.

Information & Authors

Published In

SIAM/ASA Journal on Uncertainty Quantification
Pages: 480–513
ISSN (online): 2166-2525

History

Submitted: 30 August 2021
Accepted: 1 November 2022
Published online: 11 May 2023

Authors

Affiliations

Maarten V. de Hoop
Simons Chair in Computational and Applied Mathematics and Earth Science, Rice University, Houston, TX 77005 USA.
Nikola B. Kovachki
NVIDIA AI, NVIDIA, Santa Clara, CA 95051 USA.
Nicholas H. Nelsen
Division of Engineering and Applied Science, California Institute of Technology, Pasadena, CA 91125 USA.
Andrew M. Stuart
Division of Engineering and Applied Science, California Institute of Technology, Pasadena, CA 91125 USA.

Funding Information

Funding: The first author is supported by the Simons Foundation under the MATH + X program, U.S. Department of Energy, Office of Basic Energy Sciences, Chemical Sciences, Geosciences, and Biosciences Division under grant DE-SC0020345, the National Science Foundation (NSF) under grant DMS-1815143, and the corporate members of the Geo-Mathematical Imaging Group at Rice University. The third author is supported by the NSF Graduate Research Fellowship Program under grant DGE-1745301. The fourth author is supported by NSF (grant DMS-1818977) and AFOSR (MURI award FA9550-20-1-0358—Machine Learning and Physics-Based Modeling and Simulation). The second, third, and fourth authors are supported by NSF (grant AGS-1835860) and ONR (grant N00014-19-1-2408).
