Abstract.

Minimax problems of the form \(\min_x \max_y \Psi(x,y)\) have attracted increased interest largely due to advances in machine learning, in particular generative adversarial networks and adversarial learning. These are typically trained using variants of stochastic gradient descent for the two players. Although convex-concave problems are well understood, with many efficient solution methods to choose from, theoretical guarantees outside of this setting are sometimes lacking even for the simplest algorithms. In particular, this is the case for alternating gradient descent ascent, where the two agents take turns updating their strategies. To partially close this gap in the literature, we prove a novel global convergence rate for the stochastic version of this method for finding a critical point of \(\psi(\cdot) := \max_y \Psi(\cdot,y)\) in a setting which is not convex-concave.
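
In alternating (stochastic) gradient descent ascent, the minimizing player takes a gradient step in \(x\) and the maximizing player then takes an ascent step in \(y\) that already uses the updated \(x\). The sketch below illustrates this update pattern on a toy nonconvex-concave objective; the objective \(\Psi(x,y) = y\sin(x) - \tfrac{1}{2}y^2\), the step sizes, the noise model, and the iteration count are assumptions chosen for illustration and are not the setting or parameters analyzed in the paper.

    # Minimal sketch of alternating stochastic gradient descent ascent:
    # the two players take turns, so the ascent step in y already sees the new x.
    # Toy objective: Psi(x, y) = y*sin(x) - 0.5*y**2, nonconvex in x and
    # strongly concave in y; here psi(x) = max_y Psi(x, y) = 0.5*sin(x)**2.
    import numpy as np

    rng = np.random.default_rng(0)

    def grad_x(x, y, xi):
        # noisy partial derivative of Psi with respect to x
        return y * np.cos(x) + xi

    def grad_y(x, y, xi):
        # noisy partial derivative of Psi with respect to y
        return np.sin(x) - y + xi

    x, y = 1.0, 0.0
    tau, sigma = 1e-3, 1e-1  # step sizes of the min and max player (illustrative)
    for _ in range(10_000):
        x = x - tau * grad_x(x, y, 0.01 * rng.standard_normal())    # descent step in x
        y = y + sigma * grad_y(x, y, 0.01 * rng.standard_normal())  # ascent step uses updated x

    print(f"approximate critical point of psi: x = {x:.3f}")

With a small step size for \(x\) and a larger one for \(y\), the \(y\)-iterate approximately tracks the inner maximizer, so \(x\) effectively performs stochastic gradient descent on \(\psi\); this two-timescale intuition is only meant to motivate the method, not to reproduce the rates proved in the paper.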

Keywords

  1. minimax
  2. saddle point
  3. nonconvex-concave
  4. complexity
  5. prox-gradient method
  6. stochastic gradient descent

MSC codes

  1. 90C47
  2. 90C15
  3. 90C25

Information & Authors

Published In

SIAM Journal on Optimization
Pages: 1884–1913
ISSN (online): 1095-7189

History

Submitted: 14 December 2021
Accepted: 12 February 2023
Published online: 2 August 2023

Authors

Affiliations

Faculty of Mathematics, University of Vienna, Vienna, Austria.
Faculty of Mathematics, University of Vienna, Vienna, Austria.

Funding Information

Funding: The research of the second author was supported by the doctoral program Vienna Graduate School on Computational Optimization (VGSCO), FWF (Austrian Science Fund), project W 1260.
