Abstract

The estimation of large sparse inverse covariance matrices is a ubiquitous statistical problem in application areas such as mathematical finance, geology, and health. The $\ell_1$-regularized Gaussian maximum likelihood (ML) method is a common approach for recovering inverse covariance matrices from datasets with a very limited number of samples. A highly efficient ML-based method is the quadratic approximate inverse covariance (QUIC) method. In this work, we build on the QUIC algorithm by introducing a highly performant sparse version, SQUIC, for large-scale applications. The proposed algorithm exploits the potential sparsity in three components of the QUIC algorithm: the construction of the sample covariance matrix, the matrix factorization, and the matrix inversion operations. For each component, we present two approaches and provide supporting numerical results based on a set of synthetic datasets and a stylized financial autoregressive model. Tests conducted on a single modern multicore machine show that, using advanced sparse matrix technology, SQUIC can recover large-scale inverse covariance matrices of datasets with up to $1$ million random variables within minutes. Compared to competing ML-based algorithms, SQUIC is orders of magnitude faster with comparable recovery rates.
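
For context, the optimization problem underlying both QUIC and SQUIC is the $\ell_1$-regularized Gaussian log-likelihood (graphical lasso) objective

  $\Theta^{\ast} = \arg\min_{\Theta \succ 0} \, \mathrm{tr}(S\Theta) - \log\det\Theta + \lambda \|\Theta\|_1$,

where $S$ is the sample covariance matrix, $\Theta$ is the estimated inverse covariance (precision) matrix, and $\lambda > 0$ is a regularization parameter controlling the sparsity of the estimate.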

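To make the first of the three components above concrete, the following sketch constructs a sparse sample covariance matrix by hard-thresholding small entries. It is a minimal illustration under stated assumptions, not SQUIC's implementation: the function name and the threshold parameter tau are hypothetical, and the dense intermediate product is exactly what a method targeting $1$ million variables must avoid forming.

    import numpy as np
    from scipy import sparse

    def thresholded_sample_covariance(Y, tau):
        """Hypothetical illustration: form the dense sample covariance of a
        (p, n) data matrix Y (p variables, n samples), then hard-threshold
        entries below tau to obtain a sparse approximation. Naive O(p^2 n)
        cost, shown only to make the idea of a sparse sample covariance
        concrete."""
        p, n = Y.shape
        Yc = Y - Y.mean(axis=1, keepdims=True)  # center each variable
        S = (Yc @ Yc.T) / n                     # dense sample covariance
        mask = np.abs(S) >= tau                 # keep only the large entries ...
        np.fill_diagonal(mask, True)            # ... but always keep the diagonal
        return sparse.csr_matrix(S * mask)

    # Few-samples regime described in the abstract: n << p.
    rng = np.random.default_rng(0)
    Y = rng.standard_normal((500, 50))          # p = 500 variables, n = 50 samples
    S_sparse = thresholded_sample_covariance(Y, tau=0.5)
    print(f"{S_sparse.nnz} nonzeros out of {500 * 500}")
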
Keywords

  1. covariance matrix
  2. inverse covariance matrix estimation
  3. sparse matrices
  4. approximate inverse matrices

MSC codes

  1. 65N55
  2. 65F10
  3. 65N22


Published In

SIAM Journal on Scientific Computing
Pages: A380 - A401
ISSN (online): 1095-7197

History

Submitted: 15 September 2017
Accepted: 6 November 2018
Published online: 29 January 2019

Funding Information

Swiss Platform for Advanced Scientific Computing
Hoover Institution
