Abstract

In this paper, we present a new stochastic algorithm, namely, the stochastic block mirror descent (SBMD) method for solving large-scale nonsmooth and stochastic optimization problems. The basic idea of this algorithm is to incorporate block coordinate decomposition and an incremental block averaging scheme into the classic (stochastic) mirror descent method, in order to significantly reduce the cost per iteration of the latter algorithm. We establish the rate of convergence of the SBMD method along with its associated large-deviation results for solving general nonsmooth and stochastic optimization problems. We also introduce variants of this method and establish their rate of convergence for solving strongly convex, smooth, and composite optimization problems, as well as certain nonconvex optimization problems. To the best of our knowledge, all these developments related to the SBMD methods are new in the stochastic optimization literature. Moreover, some of our results seem to be new for block coordinate descent methods for deterministic optimization.
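
To make the algorithmic idea concrete, the following is a minimal Python sketch (not taken from the paper) of one possible stochastic block mirror descent loop: at every iteration a single block of variables is sampled, a stochastic subgradient step is taken on that block only, and the iterates are averaged. The Euclidean prox, constant stepsize, uniform block sampling, and the names sbmd_sketch, stoch_grad, and blocks are illustrative assumptions rather than the paper's exact scheme; in particular, the paper's incremental block averaging updates the running average blockwise, whereas this sketch averages the full vector for brevity.

    import numpy as np

    def sbmd_sketch(stoch_grad, x0, blocks, num_iters, step=0.1, seed=0):
        """Illustrative sketch of a stochastic block mirror descent loop.

        Assumptions (not from the paper): Euclidean distance-generating
        functions on every block, a constant stepsize, and uniform block
        sampling. stoch_grad(x, i) is a user-supplied oracle returning a
        stochastic subgradient of the objective restricted to block i.
        """
        rng = np.random.default_rng(seed)
        x = np.array(x0, dtype=float)
        x_avg = x.copy()                       # averaged output iterate
        for k in range(1, num_iters + 1):
            i = rng.integers(len(blocks))      # sample one block uniformly
            idx = blocks[i]
            g_i = stoch_grad(x, i)             # stochastic subgradient on block i
            # Mirror (prox) step on block i only; with the Euclidean setup
            # this reduces to a plain subgradient step on that block.
            x[idx] = x[idx] - step * g_i
            # For simplicity the full vector is averaged here; the paper's
            # incremental block averaging updates the average blockwise so
            # that the per-iteration cost stays proportional to one block.
            x_avg += (x - x_avg) / k
        return x_avg

For example, with blocks = [np.arange(0, 500), np.arange(500, 1000)] and a stochastic least-squares subgradient oracle, each iteration of the loop touches only 500 of the 1000 coordinates, which is the per-iteration cost reduction the abstract refers to.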

Keywords

  1. stochastic optimization
  2. mirror descent
  3. block coordinate descent
  4. nonsmooth optimization
  5. stochastic composite optimization
  6. metric learning

MSC codes

  1. 62L20
  2. 90C25
  3. 90C15
  4. 68Q25


Published In

SIAM Journal on Optimization
Pages: 856 - 881
ISSN (online): 1095-7189

History

Submitted: 10 September 2013
Accepted: 5 February 2015
Published online: 30 April 2015
