Abstract.

Using rough path techniques, we provide a priori estimates for the output of deep residual neural networks in terms of both the input data and the (trained) network weights. As trained network weights are typically very rough when seen as functions of the layer, we propose to derive stability bounds in terms of the total \(p\)-variation of trained weights for any \(p\in [1,3]\). Unlike the \(C^1\)-theory underlying the neural ODE literature, our estimates remain bounded even in the limiting case of weights behaving like Brownian motions, as suggested in [A.-S. Cohen, R. Cont, A. Rossier, and R. Xu, Proceedings of the 38th International Conference on Machine Learning, JMLR, Cambridge, MA, 2021, pp. 2039–2048]. Mathematically, we interpret residual neural networks as solutions to (rough) difference equations, and analyze them based on recent results on discrete-time signatures and rough path theory.
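To make the central quantity concrete, the following is a minimal sketch of how the total \(p\)-variation of a finite sequence of weights, indexed by layer, could be computed. It assumes the standard definition (the supremum over all partitions \(0 = t_0 < \dots < t_m = N\) of \(\big(\sum_i |x_{t_{i+1}} - x_{t_i}|^p\big)^{1/p}\)) and the Euclidean norm. The function name `p_variation` and the simple quadratic-time dynamic program are illustrative choices, not taken from the article.

```python
import math
from typing import Sequence


def p_variation(path: Sequence[Sequence[float]], p: float) -> float:
    """Total p-variation of a discrete path (hypothetical helper).

    path[k] is the (flattened) weight vector at layer k.
    best[j] stores the largest value of sum |path[t_{i+1}] - path[t_i]|^p
    over all partitions 0 = t_0 < ... < t_m = j.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    n = len(path)
    if n < 2:
        return 0.0
    best = [0.0] * n
    for j in range(1, n):
        # The last partition interval is [i, j]; try every admissible i.
        best[j] = max(
            best[i] + math.dist(path[i], path[j]) ** p for i in range(j)
        )
    return best[-1] ** (1.0 / p)


# A zigzag sequence: every increment has size 1.
zigzag = [[0.0], [1.0], [0.0], [1.0]]
print(p_variation(zigzag, 1))  # total variation
print(p_variation(zigzag, 2))  # 2-variation
```

For \(p = 1\) this reduces to the usual total variation. For \(p > 1\) the supremum is in general not attained on the finest partition, which is why the sketch searches over all partitions.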

Keywords

  1. residual neural networks
  2. rough paths
  3. \(p\)-variation
  4. stability

MSC codes

  1. 60L90
  2. 68T07
  3. 39A30

Acknowledgment.

We are grateful for related discussions with Terry Lyons and Gitta Kutyniok.

References

1.
T. Cass, C. Litterer, and T. Lyons, Integrability and tail estimates for Gaussian rough differential equations, Ann. Probab., 41 (2013), pp. 3026–3050, https://doi.org/10.1214/12-AOP821.
2.
A.-S. Cohen, R. Cont, A. Rossier, and R. Xu, Scaling properties of deep residual networks, in Proceedings of the 38th International Conference on Machine Learning, M. Meila and T. Zhang, eds., Proc. Mach. Learn. Res. (PMLR) 139, JMLR, Cambridge, MA, 2021, pp. 2039–2048.
3.
A. M. Davie, Differential equations driven by rough paths: An approach via discrete approximation, Appl. Math. Res. Express. AMRX, 2008 (2008), abm009, https://doi.org/10.1093/amrx/abm009.
4.
A. Deya, M. Gubinelli, M. Hofmanová, and S. Tindel, A priori estimates for rough PDEs with application to rough conservation laws, J. Funct. Anal., 276 (2019), pp. 3577–3645, https://doi.org/10.1016/j.jfa.2019.03.008.
5.
J. Diehl, K. Ebrahimi-Fard, and N. Tapia, Time-warping invariants of multidimensional time series, Acta Appl. Math., 170 (2020), pp. 265–290, https://doi.org/10.1007/s10440-020-00333-x.
6.
W. E, A proposal on machine learning via dynamical systems, Commun. Math. Stat., 5 (2017), pp. 1–11.
7.
D. Feyel and A. de La Pradelle, Curvilinear integrals along enriched paths, Electron. J. Probab., 11 (2006), pp. 860–892, https://doi.org/10.1214/ejp.v11-356.
8.
P. K. Friz and N. B. Victoir, Multidimensional Stochastic Processes as Rough Paths: Theory and Applications, Cambridge Stud. Adv. Math. 120, Cambridge University Press, Cambridge, UK, 2010, https://doi.org/10.1017/cbo9780511845079.
9.
M. Gubinelli, Controlling rough paths, J. Funct. Anal., 216 (2004), pp. 86–140, https://doi.org/10.1016/j.jfa.2004.01.002.
10.
E. Haber and L. Ruthotto, Stable architectures for deep neural networks, Inverse Problems, 34 (2018), 014004, https://doi.org/10.1088/1361-6420/aa9a90.
11.
E. Haber, L. Ruthotto, E. Holtham, and S.-H. Jun, Learning across scales—multiscale methods for convolution neural networks, in Proceedings of the AAAI Conference on Artificial Intelligence, 2018.
12.
M. Hairer and D. Kelly, Geometric versus non-geometric rough paths, Ann. Inst. Henri Poincaré Probab. Stat., 51 (2015), pp. 207–251, https://doi.org/10.1214/13-AIHP564.
13.
K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, in Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, https://doi.org/10.1109/cvpr.2016.90.
14.
K. He, X. Zhang, S. Ren, and J. Sun, Identity Mappings in Deep Residual Networks, preprint, arXiv:1603.05027, 2016.
15.
P. Kidger, J. Morrill, J. Foster, and T. Lyons, Neural controlled differential equations for irregular time series, in Advances in Neural Information Processing Systems 33, H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, eds., Curran Associates, Red Hook, NY, 2020, pp. 6696–6707.
16.
R. Kruse, Strong and Weak Approximation of Semilinear Stochastic Evolution Equations, Lecture Notes in Math. 2093, Springer, Cham, 2014, https://doi.org/10.1007/978-3-319-02231-4.
17.
T. J. Lyons, Differential equations driven by rough signals, Rev. Mat. Iberoam., 14 (1998), pp. 215–310, https://doi.org/10.4171/rmi/240.
18.
J. Morrill, C. Salvi, P. Kidger, and J. Foster, Neural rough differential equations for long time series, in Proceedings of the 38th International Conference on Machine Learning, M. Meila and T. Zhang, eds., Proc. Mach. Learn. Res. (PMLR) 139, JMLR, Cambridge, MA, 2021, pp. 7829–7838.

Published In

SIAM Journal on Mathematics of Data Science
Pages: 50 - 76
ISSN (online): 2577-0187

History

Submitted: 18 January 2022
Accepted: 3 October 2022
Published online: 3 February 2023

Affiliations

Weierstrass Institute, 10117 Berlin, Germany.
Weierstrass Institute and Technische Universität Berlin, 10623 Berlin, Germany.
Weierstrass Institute and TU Berlin, Berlin, Germany.

Funding Information

Deutsche Forschungsgemeinschaft (DFG): EXC-2046/1, EF1-5, EF1-13
Funding: This work was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy—The Berlin Mathematics Research Center MATH+ (EXC-2046/1, projects EF1-5 and EF1-13).
