Abstract

In this paper, we propose neural networks that address the stability and limited field-of-view of convolutional neural networks. As an alternative to increasing a network's depth or width to improve performance, we propose integral-based, spatially nonlocal operators related to the global weighted Laplacian, the fractional Laplacian, and the inverse fractional Laplacian, operators that arise in several problems in the physical sciences. The forward propagation of such networks is inspired by partial integro-differential equations (PIDEs). We test the effectiveness of the proposed neural architectures on benchmark image classification datasets and on semantic segmentation tasks in autonomous driving. Moreover, we investigate the extra computational cost of these dense operators and the stability of forward propagation of the proposed neural networks.
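The forward propagation sketched in the abstract couples local dynamics with a nonlocal diffusion term. As a rough illustration only (not the authors' implementation), the snippet below applies the spectral fractional Laplacian on a periodic grid via the FFT and takes one forward-Euler step of the resulting diffusion flow; the function names, the periodic-boundary assumption, and the step size are illustrative choices.

```python
import numpy as np

def fractional_laplacian(u, s):
    """Spectral (-Laplacian)^s of u on a periodic n-by-n grid.

    Uses the Fourier-multiplier definition: the symbol of (-Laplacian)^s
    is |k|^(2s), applied mode-by-mode in frequency space.
    """
    n = u.shape[0]
    k = np.fft.fftfreq(n, d=1.0 / n)              # integer frequencies
    kx, ky = np.meshgrid(k, k, indexing="ij")
    symbol = (kx**2 + ky**2) ** s                 # |k|^(2s), zero at k = 0
    return np.real(np.fft.ifft2(symbol * np.fft.fft2(u)))

def residual_step(u, s=0.5, h=0.1):
    """One forward-Euler step u <- u - h (-Laplacian)^s u of the flow."""
    return u - h * fractional_laplacian(u, s)
```

On the plane wave u(x, y) = cos(2x), the multiplier reduces to (2^2)^s, so s = 1 recovers the classical (negative) Laplacian, which is a quick sanity check for the symbol.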

Keywords

  1. deep neural networks
  2. field-of-view
  3. nonlocal operators
  4. partial integro-differential equations
  5. fractional Laplacian
  6. pseudodifferential operator

MSC codes

  1. 65D15
  2. 65L07
  3. 68T05
  4. 68W25
  5. 47G20
  6. 47G30

Supplementary Material


PLEASE NOTE: These supplementary files have not been peer-reviewed.


Index of Supplementary Materials

Title of paper: Deep Neural Networks and PIDE Discretizations

Authors: Bastian Bohn, Michael Griebel, and Dinesh Kannan

File: supplement.pdf

Type: PDF

Contents: Plots and figures showing the stability of each of the proposed neural networks.


Information & Authors

Published In

SIAM Journal on Mathematics of Data Science
Pages: 1145 - 1170
ISSN (online): 2577-0187

History

Submitted: 3 August 2021
Accepted: 5 May 2022
Published online: 22 August 2022

Funding Information

Deutsche Forschungsgemeinschaft https://doi.org/10.13039/501100001659 : 390685813, CRC 1060
