Abstract

In cryo-electron microscopy (cryo-EM), a microscope generates a top view of a sample of randomly oriented copies of a molecule. The problem of single particle reconstruction (SPR) from cryo-EM is to use the resulting set of noisy two-dimensional projection images taken at unknown directions to reconstruct the three-dimensional (3D) structure of the molecule. In some situations, the molecule under examination exhibits structural variability, which poses a fundamental challenge in SPR. The heterogeneity problem is the task of mapping the space of conformational states of a molecule. It has been previously suggested that the leading eigenvectors of the covariance matrix of the 3D molecules can be used to solve the heterogeneity problem. Estimating the covariance matrix is challenging, since only projections of the molecules are observed, but not the molecules themselves. In this paper, we formulate a general problem of covariance estimation from noisy projections of samples. This problem has intimate connections with matrix completion problems and high-dimensional principal component analysis. We propose an estimator and prove its consistency. When there are finitely many heterogeneity classes, the spectrum of the estimated covariance matrix reveals the number of classes. The estimator can be found as the solution to a certain linear system. In the cryo-EM case, the linear operator to be inverted, which we term the projection covariance transform, is an important object in covariance estimation for tomographic problems involving structural variation. Inverting it involves applying a filter akin to the ramp filter in tomography. We design a basis in which this linear operator is sparse and thus can be tractably inverted despite its large size. We demonstrate via numerical experiments on synthetic datasets the robustness of our algorithm to high levels of noise.
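As a rough illustration of the statement that the estimator can be found as the solution to a linear system, the following is a minimal sketch on a toy problem, not the paper's implementation. It assumes a generic model in which each observation is `P_s @ x_s` plus white noise of known variance, and it assumes least-squares estimators for the mean and covariance; the abstract does not give these formulas. The projections here are small random Gaussian matrices rather than the cryo-EM projection operator, and all names (`estimate_mean_and_covariance`, `class_a`, `offset`, and so on) are hypothetical.

```python
import numpy as np


def estimate_mean_and_covariance(images, projections, noise_var):
    """Least-squares mean and covariance estimates from projected samples.

    Assumed toy model: image_s = P_s @ x_s + noise, noise ~ N(0, noise_var * I).
    """
    n = len(images)
    p = projections[0].shape[1]

    # Mean: solve (1/n) sum_s P_s^T P_s mu = (1/n) sum_s P_s^T image_s.
    A = np.zeros((p, p))
    b = np.zeros(p)
    for P, y in zip(projections, images):
        A += P.T @ P
        b += P.T @ y
    mean = np.linalg.solve(A / n, b / n)

    # Covariance: solve (1/n) sum_s M_s Sigma M_s = B, with M_s = P_s^T P_s.
    # For symmetric M and row-major flattening, vec(M X M) = kron(M, M) vec(X).
    L = np.zeros((p * p, p * p))
    B = np.zeros((p, p))
    for P, y in zip(projections, images):
        M = P.T @ P
        r = y - P @ mean  # residual after removing the projected mean
        L += np.kron(M, M)
        # Subtract the known noise contribution before back-projecting.
        B += P.T @ (np.outer(r, r) - noise_var * np.eye(P.shape[0])) @ P
    cov = np.linalg.solve(L / n, (B / n).ravel()).reshape(p, p)
    return mean, 0.5 * (cov + cov.T)  # symmetrize against round-off


# Toy data: two equally likely "conformations" in R^5, observed through
# random 3 x 5 projections with additive white noise.
rng = np.random.default_rng(0)
p, q, n, noise_var = 5, 3, 4000, 0.25
class_a = rng.standard_normal(p)
offset = rng.standard_normal(p)
offset *= 2.0 / np.linalg.norm(offset)
class_b = class_a + offset
cov_true = 0.25 * np.outer(offset, offset)  # rank one for two classes

projections, images = [], []
for _ in range(n):
    P = rng.standard_normal((q, p))
    x = class_a if rng.random() < 0.5 else class_b
    projections.append(P)
    images.append(P @ x + np.sqrt(noise_var) * rng.standard_normal(q))

mean_est, cov_est = estimate_mean_and_covariance(images, projections, noise_var)
rel_err = np.linalg.norm(cov_est - cov_true) / np.linalg.norm(cov_true)
eigenvalues = np.linalg.eigvalsh(cov_est)[::-1]  # descending order
print("relative error:", rel_err)
print("eigenvalues:", eigenvalues)
```

In this toy setting the covariance of a mixture of C classes has rank at most C - 1, so with two classes one eigenvalue of `cov_est` should stand out from the rest. The dense Kronecker system used here has size p^2 by p^2 and is only feasible for tiny p; the abstract's sparse basis for the projection covariance transform addresses exactly that scaling issue and is not reproduced in this sketch.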

Keywords

  1. cryo-electron microscopy
  2. X-ray transform
  3. inverse problems
  4. structural variability
  5. classification
  6. heterogeneity
  7. covariance matrix estimation
  8. principal component analysis
  9. high-dimensional statistics
  10. Fourier projection slice theorem
  11. spherical harmonics

MSC codes

  1. 92C55
  2. 44A12
  3. 92E10
  4. 68U10
  5. 33C55
  6. 62H30
  7. 62J10

Information & Authors

Information

Published In

SIAM Journal on Imaging Sciences
Pages: 126 - 185
ISSN (online): 1936-4954

History

Submitted: 3 September 2013
Accepted: 22 September 2014
Published online: 22 January 2015
