Abstract

The active subspace method is a dimension reduction technique widely used in the uncertainty quantification community. In this paper, we propose analyzing the internal structure and vulnerability of deep neural networks using active subspaces. First, we employ the active subspace to measure the number of “active neurons” at each intermediate layer, which indicates that the number of neurons can be reduced from several thousand to several dozen. This motivates us to change the network structure and to develop a new, more compact network, referred to as ASNet, that has significantly fewer model parameters. Second, we analyze the vulnerability of a neural network using the active subspace by finding an additive universal adversarial perturbation that causes the network to misclassify samples from a dataset with high probability. Our experiments on CIFAR-10 show that ASNet achieves a 23.98x reduction in parameters and a 7.30x reduction in FLOPs. In our numerical experiments, the universal active subspace attack vector achieves an attack ratio around 20% higher than existing approaches. The PyTorch code for this paper is available online.
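
At the core of both uses is the standard active-subspace recipe: estimate the matrix C = E[grad f(x) grad f(x)^T] from Monte Carlo gradient samples, eigendecompose it, and keep only the directions with dominant eigenvalues; the number of such directions is the “active” dimension. The sketch below (Python/NumPy) illustrates this generic recipe on a toy quadratic function. It is only an illustration of the active-subspace computation, not the paper's ASNet or attack construction; the test function, sample size, and energy threshold are assumptions made for the example.

    import numpy as np

    def active_subspace(grad_samples, energy=0.99):
        """Estimate an active subspace from Monte Carlo gradient samples.

        grad_samples: (M, n) array whose i-th row is the gradient of f at the
        i-th sample point.  Returns the eigenvalues, eigenvectors, and the
        number k of directions needed to capture the requested fraction of
        the total eigenvalue energy.
        """
        M, _ = grad_samples.shape
        # Monte Carlo estimate of C = E[grad f(x) grad f(x)^T].
        C = grad_samples.T @ grad_samples / M
        eigvals, eigvecs = np.linalg.eigh(C)            # ascending order
        eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
        k = int(np.searchsorted(np.cumsum(eigvals) / eigvals.sum(), energy)) + 1
        return eigvals, eigvecs, k

    # Toy example: f(x) = ||A x||^2 varies strongly in only 2 of 50 input
    # directions, so the estimated active dimension should be about 2.
    rng = np.random.default_rng(0)
    n, M = 50, 2000
    A = rng.normal(size=(2, n))
    X = rng.normal(size=(M, n))
    grads = 2.0 * (X @ A.T) @ A                         # gradient of ||A x||^2
    _, W, k = active_subspace(grads)
    print("estimated active dimension:", k)

In the paper's setting, the gradient samples would instead be taken with respect to an intermediate layer's neurons, and k plays the role of the number of “active neurons” at that layer.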

Keywords

  1. active subspace
  2. deep neural network
  3. network reduction
  4. universal adversarial perturbation

MSC codes

  1. 90C26
  2. 15A18
  3. 62G35

Published In

SIAM Journal on Mathematics of Data Science
Pages: 1096 - 1122
ISSN (online): 2577-0187

History

Submitted: 28 October 2019
Accepted: 22 July 2020
Published online: 29 October 2020

Funding Information

  1. Ministry of Education and Science of the Russian Federation (https://doi.org/10.13039/501100003443), grant 14.756.31.0001
  2. University of California, Santa Barbara (https://doi.org/10.13039/100007183)
