Abstract.

This paper discusses the efficiency of Hybrid Primal-Dual (HPD) type algorithms for approximately solving discrete Optimal Transport (OT) and Wasserstein Barycenter (WB) problems, with and without entropic regularization. Our first contribution is an analysis showing that these methods attain state-of-the-art convergence rates, both theoretically and practically. Next, we extend the HPD algorithm with the linesearch proposed by Malitsky and Pock in 2018 to the setting where the dual space is equipped with a Bregman divergence and the dual function is strongly convex relative to the Bregman kernel. This extension yields a new method for OT and WB problems, based on smoothing of the objective, that also achieves state-of-the-art convergence rates. Finally, we introduce a new Bregman divergence based on a scaled entropy function that makes the algorithm numerically stable and reduces the smoothing, leading to sparse solutions of OT and WB problems. We complement our findings with numerical experiments and comparisons.
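
To make the setting concrete, the following is a minimal sketch of a plain Euclidean primal-dual iteration, in the style of Chambolle and Pock's first-order primal-dual algorithm, applied to the unregularized discrete OT linear program \(\min_{X \ge 0} \langle C, X\rangle\) subject to \(X\mathbf{1} = a\), \(X^\top \mathbf{1} = b\). It is not the method analyzed in the paper: it uses fixed step sizes, with no linesearch, no Bregman divergence, and no smoothing. The function name `primal_dual_ot`, its parameters, and the step-size constant 0.9 are assumptions made for illustration.

```python
# Illustrative sketch only: fixed-step Euclidean primal-dual iteration for
# discrete OT. All names here are hypothetical and not taken from the paper.

def primal_dual_ot(C, a, b, iters=2000):
    """Approximately solve min <C, X> s.t. X 1 = a, X^T 1 = b, X >= 0.

    Saddle-point form: min_{X >= 0} max_{u, v}
        <C, X> + <u, X 1 - a> + <v, X^T 1 - b>.
    """
    n, m = len(a), len(b)
    # The marginal operator K X = (X 1, X^T 1) satisfies ||K||^2 = n + m,
    # so this choice gives tau * sigma * ||K||^2 = 0.81 < 1.
    tau = sigma = 0.9 / (n + m) ** 0.5
    X = [[0.0] * m for _ in range(n)]
    u = [0.0] * n
    v = [0.0] * m
    for _ in range(iters):
        # Primal step: gradient step on the Lagrangian, then projection
        # onto the nonnegative orthant.
        X_new = [
            [max(X[i][j] - tau * (C[i][j] + u[i] + v[j]), 0.0) for j in range(m)]
            for i in range(n)
        ]
        # Extrapolation of the primal iterate.
        X_bar = [
            [2.0 * X_new[i][j] - X[i][j] for j in range(m)] for i in range(n)
        ]
        # Dual ascent step on the row and column marginal residuals.
        u = [u[i] + sigma * (sum(X_bar[i]) - a[i]) for i in range(n)]
        v = [
            v[j] + sigma * (sum(X_bar[i][j] for i in range(n)) - b[j])
            for j in range(m)
        ]
        X = X_new
    return X, u, v


# Usage on a toy 2 x 2 problem with zero cost on the diagonal.
C = [[0.0, 1.0], [1.0, 0.0]]
a = [0.5, 0.5]
b = [0.5, 0.5]
X, u, v = primal_dual_ot(C, a, b)
cost = sum(C[i][j] * X[i][j] for i in range(2) for j in range(2))
print(X, cost)
```

On this toy instance the optimal plan places all mass on the diagonal, so the returned plan should be close to that after enough iterations. Each iteration costs \(O(nm)\) operations, the same order as one Sinkhorn sweep.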

Keywords

  1. primal-dual method
  2. optimal transport
  3. Wasserstein barycenter
  4. saddle-point

MSC codes

  1. Primary: 65Y20, 49Q22
  2. Secondary: 90C05, 90C06, 90C08, 90C47

References

1.
Z. Allen-Zhu, Y. Li, R. Oliveira, and A. Wigderson, Much faster algorithms for matrix scaling, in Proceedings of the 58th Annual IEEE Symposium on Foundations of Computer Science—FOCS 2017, IEEE Computer Society, Los Alamitos, CA, 2017, pp. 890–901, https://doi.org/10.1109/FOCS.2017.87.
2.
J. Altschuler, J. Niles-Weed, and P. Rigollet, Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration, in Advances in Neural Information Processing Systems 30, 2017, https://proceedings.neurips.cc/paper/2017/file/491442df5f88c6aa018e86dac21d3606-Paper.pdf.
3.
M. Arjovsky, S. Chintala, and L. Bottou, Wasserstein generative adversarial networks, in Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML’17, 2017, pp. 214–223.
4.
H. H. Bauschke, J. Bolte, and M. Teboulle, A descent lemma beyond Lipschitz gradient continuity: First-order methods revisited and applications, Math. Oper. Res., 42 (2017), pp. 330–348, https://doi.org/10.1287/moor.2016.0817.
5.
H. H. Bauschke, J. M. Borwein, and P. L. Combettes, Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces, Commun. Contemp. Math., 3 (2001), pp. 615–647, https://doi.org/10.1142/S0219199701000524.
6.
J.-D. Benamou, G. Carlier, M. Cuturi, L. Nenna, and G. Peyré, Iterative Bregman projections for regularized transportation problems, SIAM J. Sci. Comput., 37 (2015), pp. A1111–A1138, https://doi.org/10.1137/141000439.
7.
J. Bigot, R. Gouet, T. Klein, and A. López, Geodesic PCA in the Wasserstein space by convex PCA, Ann. Inst. H. Poincaré Probab. Statist., 53 (2017), pp. 1–26, https://doi.org/10.1214/15-AIHP706.
8.
J. Blanchet, A. Jambulapati, C. Kent, and A. Sidford, Towards Optimal Running Times for Optimal Transport, preprint, https://arxiv.org/abs/1810.07717, 2018.
9.
M. Blondel, V. Seguy, and A. Rolet, Smooth and sparse optimal transport, in Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, PMLR 84, 2018, pp. 880–889, https://proceedings.mlr.press/v84/blondel18a.html.
10.
N. Bonneel, J. Rabin, G. Peyré, and H. Pfister, Sliced and Radon Wasserstein barycenters of measures, J. Math. Imaging Vision, 51 (2015), pp. 22–45, https://doi.org/10.1007/s10851-014-0506-3.
11.
N. Bonneel, M. Van De Panne, S. Paris, and W. Heidrich, Displacement interpolation using Lagrangian mass transport, in Proceedings of the 2011 SIGGRAPH Asia Conference, 2011, pp. 1–12.
12.
A. Chambolle and T. Pock, A first-order primal-dual algorithm for convex problems with applications to imaging, J. Math. Imaging Vision, 40 (2011), pp. 120–145, https://doi.org/10.1007/s10851-010-0251-1.
13.
A. Chambolle and T. Pock, On the ergodic convergence rates of a first-order primal-dual algorithm, Math. Program., 159 (2016), pp. 253–287, https://doi.org/10.1007/s10107-015-0957-3.
14.
G. Chen and M. Teboulle, Convergence analysis of a proximal-like minimization algorithm using Bregman functions, SIAM J. Optim., 3 (1993), pp. 538–543, https://doi.org/10.1137/0803026.
15.
M. B. Cohen, A. Madry, D. Tsipras, and A. Vladu, Matrix scaling and balancing via box constrained Newton’s method and interior point methods, in Proceedings of the IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), 2017, pp. 902–913, https://doi.org/10.1109/FOCS.2017.88.
16.
M. Cuturi, Sinkhorn distances: Lightspeed computation of optimal transport, in Proceedings of the 26th International Conference on Neural Information Processing Systems, 2013, pp. 2292–2300, https://proceedings.neurips.cc/paper/2013/file/af21d0c97db2e27e13572cbf59eb343d-Paper.pdf.
17.
D. Dvinskikh and D. Tiapkin, Improved complexity bounds in Wasserstein barycenter problem, in Proceedings of the International Conference on Artificial Intelligence and Statistics, PMLR, 2021, pp. 1738–1746.
18.
P. Dvurechensky, A. Gasnikov, and A. Kroshnin, Computational optimal transport: Complexity by accelerated gradient descent is better than by Sinkhorn’s algorithm, in Proceedings of the International Conference on Machine Learning, PMLR, 2018, pp. 1367–1376.
19.
M. Essid and J. Solomon, Quadratically regularized optimal transport on graphs, SIAM J. Sci. Comput., 40 (2018), pp. A1961–A1986, https://doi.org/10.1137/17M1132665.
20.
R. Flamary, N. Courty, A. Gramfort, M. Z. Alaya, A. Boisbunon, S. Chambon, L. Chapel, A. Corenflos, K. Fatras, N. Fournier, L. Gautheron, N. T. Gayraud, H. Janati, A. Rakotomamonjy, I. Redko, A. Rolet, A. Schutz, V. Seguy, D. J. Sutherland, R. Tavenard, A. Tong, and T. Vayer, POT: Python optimal transport, J. Mach. Learn. Res., 22 (2021), pp. 1–8, http://jmlr.org/papers/v22/20-451.html.
21.
S. Guminov, P. Dvurechensky, N. Tupitsa, and A. Gasnikov, On a combination of alternating minimization and Nesterov’s momentum, in Proceedings of the International Conference on Machine Learning, PMLR, 2021, pp. 3886–3898.
22.
W. Guo, N. Ho, and M. Jordan, Fast algorithms for computational optimal transport and Wasserstein barycenter, in Proceedings of the International Conference on Artificial Intelligence and Statistics, PMLR, 2020, pp. 2088–2097.
23.
N. Ho, X. Nguyen, M. Yurochkin, H. H. Bui, V. Huynh, and D. Phung, Multilevel clustering via Wasserstein means, in Proceedings of the International Conference on Machine Learning, PMLR, 2017, pp. 1501–1509.
24.
A. Jambulapati, A. Sidford, and K. Tian, A direct \(\tilde{O}(1/\varepsilon )\) iteration parallel algorithm for optimal transport, in Advances in Neural Information Processing Systems, 2019.
25.
X. Jiang and L. Vandenberghe, Bregman primal-dual first-order method and application to sparse semidefinite programming, Comput. Optim. Appl., 81 (2022), pp. 127–159, https://doi.org/10.1007/s10589-021-00339-7.
26.
A. Kroshnin, N. Tupitsa, D. Dvinskikh, P. Dvurechensky, A. Gasnikov, and C. Uribe, On the complexity of approximating Wasserstein barycenters, in Proceedings of the International Conference on Machine Learning, PMLR, 2019, pp. 3530–3540.
27.
M. Kusner, Y. Sun, N. Kolkin, and K. Weinberger, From word embeddings to document distances, in Proceedings of the International Conference on Machine Learning, PMLR, 2015, pp. 957–966.
28.
G. Lan and Y. Zhou, An optimal randomized incremental gradient method, Math. Program., 171 (2018), pp. 167–215, https://doi.org/10.1007/s10107-017-1173-0.
29.
C. Lee, H. Luo, C. Wei, and M. Zhang, Linear Last-iterate Convergence in Constrained Saddle-point Optimization, preprint, https://arxiv.org/abs/2006.09517, 2020.
30.
C.-W. Lee, C. Kroer, and H. Luo, Last-iterate Convergence in Extensive-form Games, preprint, https://arxiv.org/abs/2106.14326, 2021.
31.
Y. T. Lee and A. Sidford, Efficient inverse maintenance and faster algorithms for linear programming, in Proceedings of the IEEE 56th Annual Symposium on Foundations of Computer Science, IEEE, 2015, pp. 230–249.
32.
T. Lin, N. Ho, X. Chen, M. Cuturi, and M. Jordan, Fixed-support Wasserstein barycenters: Computational hardness and fast algorithm, in Advances in Neural Information Processing Systems 33, Curran Associates, 2020, pp. 5368–5380, https://proceedings.neurips.cc/paper/2020/file/3a029f04d76d32e79367c4b3255dda4d-Paper.pdf.
33.
T. Lin, N. Ho, and M. Jordan, On efficient optimal transport: An analysis of greedy and accelerated mirror descent algorithms, in Proceedings of the 36th International Conference on Machine Learning, 2019, pp. 3982–3991.
34.
H. Lu, R. M. Freund, and Y. Nesterov, Relatively smooth convex optimization by first-order methods, and applications, SIAM J. Optim., 28 (2018), pp. 333–354, https://doi.org/10.1137/16M1099546.
35.
Y. Malitsky and T. Pock, A first-order primal-dual algorithm with linesearch, SIAM J. Optim., 28 (2018), pp. 411–432, https://doi.org/10.1137/16M1092015.
36.
A. Nemirovski, Prox-method with rate of convergence \(O(1/t)\) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems, SIAM J. Optim., 15 (2004), pp. 229–251, https://doi.org/10.1137/S1052623403425629.
37.
Y. Nesterov, Smooth minimization of non-smooth functions, Math. Program., 103 (2005), pp. 127–152, https://doi.org/10.1007/s10107-004-0552-5.
38.
Y. Nesterov, Dual extrapolation and its applications to solving variational inequalities and related problems, Math. Program., 109 (2007), pp. 319–344, https://doi.org/10.1007/s10107-006-0034-z.
39.
Y. Nesterov, Efficiency of coordinate descent methods on huge-scale optimization problems, SIAM J. Optim., 22 (2012), pp. 341–362, https://doi.org/10.1137/100802001.
40.
G. Peyré and M. Cuturi, Computational Optimal Transport, preprint, https://arxiv.org/abs/1803.00567, 2018.
41.
J. Rabin, S. Ferradans, and N. Papadakis, Adaptive color transfer with relaxed optimal transport, in Proceedings of the 2014 IEEE International Conference on Image Processing (ICIP), IEEE, 2014, pp. 4852–4856.
42.
R. T. Rockafellar, Convex Analysis, Princeton Landmarks in Mathematics, Princeton University Press, Princeton, NJ, 1997; reprint of the 1970 original, Princeton Paperbacks.
43.
L. Rüschendorf and L. Uckelmann, On the n-coupling problem, J. Multivariate Anal., 81 (2002), pp. 242–258, https://doi.org/10.1006/jmva.2001.2005.
44.
J. Sherman, Area-convexity, \(\ell ^\infty\) regularization, and undirected multicommodity flow, in Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, 2017, pp. 452–460.
45.
A. Silveti-Falls, C. Molinari, and J. Fadili, A Stochastic Bregman Primal-Dual Splitting Algorithm for Composite Optimization, preprint, https://arxiv.org/abs/2112.11928, 2021.
46.
P. Tseng, On Accelerated Proximal Gradient Methods for Convex-Concave Optimization, unpublished, 2008, https://www.mit.edu/~dimitrib/PTseng/papers/apgm.pdf.
47.
A. B. Tsybakov, Introduction to Nonparametric Estimation, Springer Ser. Statist., Springer, New York, 2009, https://doi.org/10.1007/b13794, revised and extended from the 2004 French original, translated by Vladimir Zaiats.

Information & Authors

Information

Published In

SIAM Journal on Mathematics of Data Science
Pages: 1369 - 1395
ISSN (online): 2577-0187

History

Submitted: 9 March 2022
Accepted: 24 August 2022
Published online: 20 December 2022

Authors

Affiliations

Antonin Chambolle
CEREMADE, CNRS & Université Paris-Dauphine, PSL University, Paris, 91128 France.
Universidad Adolfo Ibáñez, Facultad de Ingeniería y Ciencias, 7941169 Santiago, Chile, and Departamento de Ingeniería Industrial, Universidad Católica del Norte, 1780000 Antofagasta, Chile.

Funding Information

Funding. The work of the second author was supported by a doctoral scholarship from ANID-PFCHA/Doctorado Nacional/2019-21190161.
