Moment-based metrics for molecules computable from cryogenic electron microscopy images

Andy Zhang; Oscar Mickelin; Joe Kileel; Eric J. Verbeke; Nicholas F. Marshall; Marc Aurèle Gilles; Amit Singer

doi:10.1017/S2633903X24000023

Moment-based metrics for molecules computable from cryogenic electron microscopy images

Published online by Cambridge University Press: 23 February 2024

and

Andy Zhang*: Affiliation:
Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
Oscar Mickelin: Affiliation:
Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
Joe Kileel: Affiliation:
Department of Mathematics and Oden Institute, University of Texas at Austin, Austin, TX, USA
Eric J. Verbeke: Affiliation:
Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
Nicholas F. Marshall: Affiliation:
Department of Mathematics, Oregon State University, Corvallis, OR, USA
Marc Aurèle Gilles: Affiliation:
Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA
Amit Singer: Affiliation:
Program in Applied and Computational Mathematics, Princeton University, Princeton, NJ, USA Department of Mathematics, Princeton University, Princeton, NJ, USA
*: Corresponding author: Andy Zhang; Email: [email protected]

Article contents

Abstract
Impact Statement
Introduction
Background
Definition of Kam’s metrics
Results
Limitations and future work
Discussion
Data availability statement
Author contribution
Funding statement
Competing interest
References

Rights & Permissions

Abstract

Single-particle cryogenic electron microscopy (cryo-EM) is an imaging technique capable of recovering the high-resolution three-dimensional (3D) structure of biological macromolecules from many noisy and randomly oriented projection images. One notable approach to 3D reconstruction, known as Kam’s method, relies on the moments of the two-dimensional (2D) images. Inspired by Kam’s method, we introduce a rotationally invariant metric between two molecular structures, which does not require 3D alignment. Further, we introduce a metric between a stack of projection images and a molecular structure, which is invariant to rotations and reflections and does not require performing 3D reconstruction. Additionally, the latter metric does not assume a uniform distribution of viewing angles. We demonstrate the uses of the new metrics on synthetic and experimental datasets, highlighting their ability to measure structural similarity.

Keywords

alignment-free metric Kam’s method method of moments protein structural similarity rotationally invariant distance structural search

Type: Research Article
Information: Biological Imaging , Volume 4 , 2024 , e3

DOI: https://doi.org/10.1017/S2633903X24000023 [Opens in a new window]
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2024. Published by Cambridge University Press

Impact Statement

Single-particle cryogenic electron microscopy (cryo-EM) is a popular method to obtain three-dimensional (3D) reconstructions of biological molecules from noisy two-dimensional (2D) tomographic projection images. Many iterative techniques for this reconstruction require initializations sufficiently close to the unknown structure to obtain high-quality reconstructions. To help select an initialization from a database of known structures, this paper introduces a metric to compare the similarity of known 3D structures with a stack of noisy 2D tomographic projection images of an unknown structure. We show that this metric distinguishes differing structures and present an efficient method to compute it, notably without performing 3D reconstruction.

1. Introduction

Single-particle cryogenic electron microscopy (cryo-EM) enables high-resolution reconstruction of three-dimensional (3D) biological macromolecules from a large collection of noisy and randomly oriented projection images. Kam’s method⁽Reference Kam¹⁾ is one of the earliest methods proposed for homogeneous reconstruction in cryo-EM. It is a statistical method-of-moments approach applied to the cryo-EM reconstruction problem, where the computation of low-order statistics of two-dimensional (2D) images allows for the estimation of 3D structure through solving a polynomial system. Kam’s method has helped push the theoretical understanding of the reconstruction process – under certain conditions, it is a provable algorithm and provides bounds for the estimated structure’s quality in terms of the noise level and the number of images.⁽Reference Bhamre, Zhang and Singer ² ^–Reference Sharon, Kileel, Khoo, Landa and Singer ⁸ ⁾ Kam’s method also enjoys other remarkable properties:

1. It bypasses the need for angular assignment, typically a large computational burden in competing methods.
2. It is a streaming algorithm and is thus theoretically much faster than iterative methods.
3. It can – in theory – break the detection limit of the minimal size of proteins that can be observed in cryo-EM.⁽Reference Bendory, Hadi and Sharon ⁹ ⁾

While theoretically attractive, Kam’s method has not been able to yield high-resolution reconstructions as yet. One direction that is currently being explored to resolve this issue is the development of better priors, for instance, based on the sparsity of the solution as in the study by Bendory et al.⁽Reference Bendory, Khoo, Kileel, Mickelin and Singer ⁷ ⁾ Separately, there has been considerable, continued interest in utilizing the vast amount of solved structures stored in the Protein Data Bank (PDB)⁽Reference Berman ¹⁰ ⁾ to improve cryo-EM reconstructions.

Leveraging the PDB as a prior, we propose a method to match either projection images or molecular volumes to a database of previously solved structures (Section 3). We use this procedure as a rotationally and reflectionally invariant metric that can be directly computed from raw image datasets without needing a 3D reconstruction process. Importantly, our metric neither relies on prior knowledge of rotations nor assumes a uniform rotational distribution, making it applicable to typical datasets.

To demonstrate the efficacy of our metric, we compare it to existing methods and show empirically that it achieves similar performance to alignment-based metrics. As an application, we use our metric to compute a low-dimensional embedding of a subset of the PDB into the Euclidean plane, visually showcasing how structurally similar proteins are embedded near each other (Section 4.2). Further, we apply the version of the metric that can be directly computed from stacks of 2D images and show that it gives an efficient methodology to identify the nearest neighbors in a database to a given set of experimental moments on synthetic and real datasets (Sections 4.3 and 4.4).

2. Background

This section presents the mathematical preliminaries needed to define our metric. Let $ \Phi :{\mathrm{\mathbb{R}}}^3\to \mathrm{\mathbb{R}} $ be the electrostatic potential of a molecule and $ \hat{\Phi}:{\mathrm{\mathbb{R}}}^3\to \mathrm{\mathbb{C}} $ be its Fourier transform, which we define by

$$ \hat{\Phi}\left(\xi \right)={\int}_{{\mathrm{\mathbb{R}}}^3}\Phi (x){e}^{- i\xi \cdot x} dx. $$

A single projection image is given by

$$ I=h\hskip0.35em \ast \hskip0.35em PR\Phi +\varepsilon, $$

and its Fourier transform is

$$ \hat{I}=H\cdot SR\hat{\Phi}+\hat{\varepsilon}, $$

where $ P $ is the projection operator, $ S $ is the slicing operator, $ h $ is a point spread function, $ H $ is the contrast transfer function (CTF), $ \varepsilon $ is noise, and $ R\in \mathrm{SO}(3) $ is a rotation operator. We assume that the Fourier transform $ \hat{\Phi} $ can be expanded in a spherical harmonic expansion:

(1)

$$ \hat{\Phi}\left(r,\theta, \varphi \right)=\sum \limits_{\mathrm{\ell}=0}^L\sum \limits_{m=-\mathrm{\ell}}^{\mathrm{\ell}}{A}_{\mathrm{\ell},m}(r){Y}_{\mathrm{\ell}}^m\left(\theta, \varphi \right), $$

where $ \left(r,\theta, \phi \right) $ are spherical coordinates, and $ {Y}_{\mathrm{\ell}}^m $ denotes the complex spherical harmonic:

$$ {Y}_{\mathrm{\ell}}^m\left(\theta, \varphi \right):= {\left(\frac{\left(\mathrm{\ell}-m\right)!\left(2\mathrm{\ell}+1\right)}{4\pi \left(\mathrm{\ell}+m\right)!}\right)}^{1/2}{e}^{im\varphi}{P}_{\mathrm{\ell}}^m\left(\cos \theta \right), $$

where $ {P}_{\mathrm{\ell}}^m $ are the associated Legendre polynomials, $ {A}_{\mathrm{\ell},m}(r) $ are $ r $ -dependent coefficients, and $ L $ is a bandlimit parameter. See Eq. 14.30.1 in⁽Reference Olver, Olde Daalhuis, Lozier, Schneider, Boisvert, Clark, Miller, Saunders, Cohl and McClain ¹¹ ⁾ for the definitions of $ {Y}_{\mathrm{\ell}}^m $ and $ {P}_{\mathrm{\ell}}^m $ .

Let $ \rho :\mathrm{SO}(3)\to \mathrm{\mathbb{R}} $ be the probability density function of the rotational distribution, which without loss of generality is invariant to in-plane rotations and reflections. (Note that by augmenting the dataset with in-plane rotations and reflections of all 2D images, one can always reduce to the case of such an invariant distribution $ \rho $ ; for example, see the study by Ponce and Singer⁽Reference Ponce and Singer ¹² ⁾). More formally, $ \rho $ is a function on $ SO(3)/O(2)\simeq {\unicode{x1D54A}}^2/\left\{\pm 1\right\} $ _, which is formed by identifying antipodal points on the sphere $ {\unicode{x1D54A}}^2 $ _.⁽Reference Bredon ¹³ ⁾ Thus, we model the density as a function $ \rho :\mathrm{SO}(3)\to \mathrm{\mathbb{R}} $ with an even-degree spherical harmonic expansion:

(2)

$$ \rho (R)=\sum \limits_{\mathrm{\ell}=0}^P\sum \limits_{m=-2\mathrm{\ell}}^{2\mathrm{\ell}}{B}_{2\mathrm{\ell},m}\overline{Y_{2\mathrm{\ell}}^m}\left(\theta (R),\varphi (R)\right), $$

where $ \left(\theta (R),\varphi (R)\right) $ represents the third column of the rotation matrix given by $ R $ in spherical coordinates, and $ P\in {\mathrm{\mathbb{Z}}}_{\ge 0} $ is a bandlimit parameter (see Section 4.2 in the study by Sharon et al.⁽Reference Sharon, Kileel, Khoo, Landa and Singer ⁸ ⁾). The analytical first and second moments $ {\mathbf{m}}_1 $ and $ {\mathbf{m}}_2 $ of the Fourier-transformed projection images with respect to $ \rho $ are

(3)

$$ {\displaystyle \begin{array}{c}{\mathbf{m}}_1\left(x,y\right)=\int \left(R\cdot \hat{\Phi}\right)\left(x,y,0\right)\rho (R) dR,\hskip2em \mathrm{and}\\ {}{\mathbf{m}}_2\left(\left(x,y\right),\Big({x}^{\prime },{y}^{\prime}\Big)\right)=\int \left(R\cdot \hat{\Phi}\right)\left(x,y,0\right)\left(R\cdot \hat{\Phi}\right)\left({x}^{\prime },{y}^{\prime },0\right)\rho (R) dR,\end{array}} $$

where $ dR $ denotes integration with respect to the Haar measure on $ SO(3) $ . It will be convenient to write $ \left(x,y\right) $ and $ \left({x}^{\prime },{y}^{\prime}\right) $ in terms of polar coordinates $ \left(r,\phi \right) $ and $ \left({r}^{\prime },{\phi}^{\prime}\right) $ , respectively. In Appendix A.1, we show in (12) and (13) that the first moment only depends on $ r $ , that is, $ {\mathbf{m}}_1={\mathbf{m}}_1(r):{\mathrm{\mathbb{R}}}_{\ge 0}\to \mathrm{\mathbb{C}} $ , and that the second moment only depends on $ r,{r}^{\prime } $ and $ \Delta \phi =\phi -{\phi}^{\prime } $ , that is, $ {\mathbf{m}}_2={\mathbf{m}}_2\left(r,{r}^{\prime },\Delta \phi \right):{\mathrm{\mathbb{R}}}_{\ge 0}\times {\mathrm{\mathbb{R}}}_{\ge 0}\times \left[-2\pi, 2\pi \right]\to \mathrm{\mathbb{C}} $ . We write $ {\mathbf{m}}_1={\mathbf{m}}_1\left(\hat{\Phi},\rho \right) $ and $ {\mathbf{m}}_2={\mathbf{m}}_2\left(\hat{\Phi},\rho \right) $ to denote the first and second moments defined by $ \hat{\Phi} $ and $ \rho $ when discussing multiple structures. The basis of Kam’s method is that the moments in (3) can be estimated from images and related to expansion coefficients for the potential $ \hat{\Phi} $ ; see Appendix A for explicit formulas.

3. Definition of Kam’s metrics

We now use metrics between the moments in (3) to define similarity between proteins and stacks of images. A first function is used to measure the similarity of two known structures by the moments of their potential as defined in (3). The second is used to measure the similarity between a known structure and the unknown structure observed in a dataset of images.

Crucially, the metrics can be computed without performing 3D alignment of the structures, reducing their computational costs compared to other approaches. Moreover, one of the metrics can be directly computed from noisy and CTF-affected projection images. This enables a nearest neighbor search among known structures to determine an initialization for the 3D reconstruction pipeline, especially in the expectation–maximization procedure.⁽Reference Scheres ¹⁴ ^, Reference Scheres ¹⁵ ⁾

3.1. Kam’s volume metric $ {d}_{\mathrm{vKam}} $

Here, we introduce the first of Kam’s metrics, which measures the similarity of two 3D structures. We use this to perform dimensionality reduction to visualize the relationship between structures from a subset of the PDB.

In detail, given two 3D structures $ {\Phi}_1 $ and $ {\Phi}_2 $ , we define the distance between them through their first and second moments $ {\mathbf{m}}_1 $ and $ {\mathbf{m}}_2 $ under a uniform distribution of viewing directions, which we denote by $ \rho ={\rho}_u $ . We will derive the explicit equations for the uniform case in (17) and (18). We then measure the resulting weighted deviation of the first and second moments by

(4)

$$ \parallel {\mathbf{m}}_2\left({\hat{\Phi}}_1,{\rho}_u\right)-{\mathbf{m}}_2\left({\hat{\Phi}}_2,{\rho}_u\right){\parallel}_{L^2\left({\mathrm{\mathbb{R}}}^2\times {\mathrm{\mathbb{R}}}^2\right)}^2+\lambda \parallel {\mathbf{m}}_1\left({\hat{\Phi}}_1,{\rho}_u\right)-{\mathbf{m}}_1\left({\hat{\Phi}}_2,{\rho}_u\right){\parallel}_{L^2\left({\mathrm{\mathbb{R}}}^2\right)}^2, $$

where $ \lambda \ge 0 $ is a hyperparameter that we set to 1 for all experiments. The moments will be represented on a discretized voxel grid, and we therefore replace the continuous norms with discrete norms. More specifically, we will represent the second moment using a grid $ r,{r}^{\prime}\in \left\{{r}_1,\dots, {r}_{\left\lfloor N/2\right\rfloor}\right\} $ and , where $ N $ is the number of pixels of one side of the discretized volume. We define the grid points $ {r}_k=\delta k/N $ , $ \Delta {\phi}_j=2\pi j/N-2\pi $ , for $ k=1,\dots, \left\lfloor N/2\right\rfloor $ , and $ j=1,\dots, 2N $ , where $ \delta $ is the side length of the volume grid in angstroms. We then use the following two approximations to the continuous norms above

(5)

$$ \parallel \mathbf{M}{\parallel}_{w_2}^2:= \sum \limits_{\begin{array}{c}j=1,\dots, 2N\\ {}{k}_1=1,\dots, \left\lfloor N/2\right\rfloor \\ {}{k}_2=1,\dots, \left\lfloor N/2\right\rfloor \end{array}}{\left|\mathbf{M}\left(\Delta {\phi}_j,{r}_{k_1},{r}_{k_2}\right)\right|}^2{r}_{k_1}{r}_{k_2},\hskip1em \parallel \mathbf{N}{\parallel}_{w_1}^2:= \sum \limits_{k=1,\dots, \left\lfloor N/2\right\rfloor }{\left|\mathbf{N}\left({r}_k\right)\right|}^2{r}_k. $$

With these norms, we define the metric comparing two sets of moments of two 3D structures by

(6)

$$ {d}_{\mathrm{vKam}}\left({\Phi}_1,{\Phi}_2\right)\hskip0.35em := {\left(\parallel {\mathbf{m}}_2\left({\hat{\Phi}}_1,{\rho}_u\right)-{\mathbf{m}}_2\Big({\hat{\Phi}}_2,{\rho}_u\left){\parallel}_{w_2}^2+\lambda \parallel {\mathbf{m}}_1\right({\hat{\Phi}}_1,{\rho}_u\left)-{\mathbf{m}}_1\right({\hat{\Phi}}_2,{\rho}_u\Big){\parallel}_{w_1}^2\right)}^{1/2}, $$

This distance is rotationally invariant since for any rotation $ R $ , we have $ \hat{R\cdot \Phi}=R\cdot \hat{\Phi} $ and the moments $ {\mathbf{m}}_1 $ and $ {\mathbf{m}}_2 $ in (2) satisfy

(7)

$$ {\mathbf{m}}_i(R\cdot \hat{\Phi},\rho )={\mathbf{m}}_i(\hat{\Phi},{R}^T\cdot \rho ), $$

as can be seen through a change of variables in (3). When $ \rho ={\rho}_u $ is uniform, clearly $ {R}^T\cdot \rho =\rho $ , which therefore shows rotational invariance of the cost function in (4), up to the discretization of the volume grid. Note that this bypasses the need for an alignment step. We detail the procedure for computing $ {\mathbf{m}}_1 $ , $ {\mathbf{m}}_2 $ and therefore $ {d}_{\mathrm{vKam}} $ in Appendix A.1. Under certain conditions, it has been demonstrated that the second moment of the image collection identifies the 3D structure uniquely⁽Reference Bhamre, Zhang and Singer ² ^–Reference Levin, Bendory, Boumal, Kileel and Singer ⁴ ^, Reference Huang, Zehni, Dokmanić and Zhao ⁶ ^, Reference Bendory, Khoo, Kileel, Mickelin and Singer ⁷ ⁾ or up to a finite list of candidate structures.⁽Reference Sharon, Kileel, Khoo, Landa and Singer ⁸ ⁾ In Section 4.2, we show that our metric is alike other similarity scores but remarkably does not rely on alignment.

3.2. Kam’s image metric $ {d}_{\mathrm{iKam}} $

We now introduce a metric between the empirical moments computed from a set of experimental projection images and the moments computed from the atomic coordinates of a known structure that compares images to the known structure. We detail the procedure for computing these moments in Appendix A.1.

If the distribution of poses in the experimental dataset would be known to be uniform, the empirical moments could directly be substituted for $ {\mathbf{m}}_1 $ and $ {\mathbf{m}}_2 $ in (6) and the metric could be defined as the deviation between the moments of the two structures. In practice, however, the distribution of angles is not uniform and is unknown. Since the moments are functions of this distribution, it must therefore be inferred.

We will show in (12) and (13) that $ {\mathbf{m}}_1 $ and $ {\mathbf{m}}_2 $ depend linearly on the expansion coefficients $ {B}_{p,u} $ of the distribution of viewing directions. The optimization problem minimizing the discrepancy between the moments of the two structures is, therefore, a linear least-squares problem in $ {B}_{p,u} $ . It follows from Table 3 of⁽Reference Sharon, Kileel, Khoo, Landa and Singer ⁸ ⁾ that this linear least-squares is Zariski-generically full-rank (although not necessarily well-conditioned) for various small bandlimits $ L $ and $ P. $ Solving this optimization problem efficiently eliminates the unknown rotational distribution. We then define the metric between the moments of the structure $ \Phi $ and the experimental moments $ {\tilde{\mathbf{m}}}_1,{\tilde{\mathbf{m}}}_2 $ by

(8)

$$ \underset{\rho \in \mathcal{P}}{\min}\parallel {\tilde{\mathbf{m}}}_2-{\mathbf{m}}_2\left(\hat{\Phi},\rho \right){\parallel}_{L^2\left({\mathrm{\mathbb{R}}}^2\times {\mathrm{\mathbb{R}}}^2\right)}^2+\lambda \parallel {\tilde{\mathbf{m}}}_1-{\mathbf{m}}_1\left(\hat{\Phi},\rho \right){\parallel}_{L^2\left({\mathrm{\mathbb{R}}}^2\right)}^2, $$

where $ \lambda \ge 0 $ is a hyperparameter which we set to 1 for all experiments and

(9)

$$ \mathcal{P}=\left\{\rho (R):\mathrm{SO}(3)\to \mathrm{\mathbb{R}}\;\mathrm{s}.\mathrm{t}.\rho (R)=\sum \limits_{\mathrm{\ell}=0}^P\sum \limits_{m=-2\mathrm{\ell}}^{2\mathrm{\ell}}{B}_{2\mathrm{\ell},m}\overline{Y_{2\mathrm{\ell}}^m}\Big(\theta (R),\varphi (R)\Big),\mathrm{and}\;{\int}_{\mathrm{SO}(3)}\rho (R) dR=1\right\}, $$

is the set of admissible distributions of viewing directions that are invariant to global reflections and in-plane rotations, where $ \left(\theta (R),\varphi (R)\right) $ is as in (2). To simplify the optimization problem and lead to faster algorithms, note that we do not impose positivity of the distributions $ \rho \in \mathcal{P} $ , though this could be enforced, for instance, by imposing linear constraints $ \rho \left({R}_i\right)\ge 0 $ for a suitable choice of $ {R}_i\in \mathrm{SO}(3) $ . Moreover, the constraint $ {\int}_{\mathrm{SO}(3)}\rho (R) dR=1 $ is equivalent to imposing $ {B}_{0,0}=1 $ _,⁽Reference Sharon, Kileel, Khoo, Landa and Singer ⁸ ⁾ which can be achieved by removing $ {B}_{0,0} $ from the set of optimization variables and fixing its value to $ 1 $ . The values of the bandlimit parameters $ L,P $ and the hyperparameter $ \lambda $ used in our numerical experiments are given in Appendix B.1.

Just as in the previous section, we replace the continuous norms in (8) by discrete norms to define the metric between empirical moments and the moments from a 3D structure as

(10)

$$ {d}_{\mathrm{iKam}}\left(\left({\tilde{\mathbf{m}}}_1,{\tilde{\mathbf{m}}}_2\right),\Phi \right):= {\left(\underset{\rho \in \mathcal{P}}{\min}\parallel {\tilde{\mathbf{m}}}_2-{\mathbf{m}}_2\left(\hat{\Phi},\rho \right){\parallel}_{w_2}^2+\lambda \parallel {\tilde{\mathbf{m}}}_1-{\mathbf{m}}_1\Big(\hat{\Phi},\rho \Big){\parallel}_{w_1}^2\right)}^{1/2}. $$

The cost function in (8) is rotationally invariant, in that it does not depend on the orientation of $ \Phi $ , since (7) implies that

(11)

$$ {\displaystyle \begin{array}{l}\underset{\rho \in \mathcal{P}}{\min}\hskip0.3em \parallel {\tilde{\mathbf{m}}}_2-{\mathbf{m}}_2\left(\hat{R\cdot \Phi},\rho \right){\parallel}_{L^2\left({\mathrm{\mathbb{R}}}^2\times {\mathrm{\mathbb{R}}}^2\right)}^2+\lambda \parallel {\tilde{\mathbf{m}}}_1-{\mathbf{m}}_1\left(\hat{R\cdot \Phi},\rho \right){\parallel}_{L^2\left({\mathrm{\mathbb{R}}}^2\right)}^2\\ {}=\underset{\rho \in \mathcal{P}}{\min}\hskip0.3em \parallel {\tilde{\mathbf{m}}}_2-{\mathbf{m}}_2\left(\hat{\Phi},{R}^T\cdot \rho \right){\parallel}_{L^2\left({\mathrm{\mathbb{R}}}^2\times {\mathrm{\mathbb{R}}}^2\right)}^2+\lambda \parallel {\tilde{\mathbf{m}}}_1-{\mathbf{m}}_1\left(\hat{\Phi},{R}^T\cdot \rho \right){\parallel}_{L^2\left({\mathrm{\mathbb{R}}}^2\right)}^2\\ {}=\underset{\rho \in \mathcal{P}}{\min}\hskip0.3em \parallel {\tilde{\mathbf{m}}}_2-{\mathbf{m}}_2\left(\hat{\Phi},\rho \right){\parallel}_{L^2\left({\mathrm{\mathbb{R}}}^2\times {\mathrm{\mathbb{R}}}^2\right)}^2+\lambda \parallel {\tilde{\mathbf{m}}}_1-{\mathbf{m}}_1\left(\hat{\Phi},\rho \right){\parallel}_{L^2\left({\mathrm{\mathbb{R}}}^2\right)}^2.\end{array}} $$

where the last equality follows because $ {R}^T\cdot \rho $ lies in $ \mathcal{P} $ , since rotating a viewing angle distribution over $ \mathrm{SO}(3) $ results in another viewing angle distribution over $ \mathrm{SO}(3) $ .

At the cost of solving the small linear system detailed in Appendix A.3, our method allows for the comparison between a stack of images and a resolved structure, without performing a 3D reconstruction. Furthermore, we precompute the least-squares matrices necessary for optimization, after which the distance function can be calculated in real time. With sufficient storage and precomputation, the procedure is scalable to the entirety of the PDB.

In particular, $ {d}_{\mathrm{iKam}} $ can be used in an efficient scheme to match a stack of synthetic images to the potentials of nearby PDB structures. By selecting a subset of the PDB database, one can efficiently compute $ {d}_{\mathrm{iKam}}\left(\left({\tilde{\mathbf{m}}}_1,{\tilde{\mathbf{m}}}_2\right),\Phi \right) $ for each $ \Phi $ in the subset and find the nearest neighbors. The method for processing image moments in practice is detailed in Appendix A.4, and the computational complexity of the metric is derived in Appendix A.5.

4. Results

4.1. Existing measures of structural similarity

There are several existing methods for reporting structural similarity between two known volumes. We list two approaches based on computing alignment and Zernike moments. We compare both $ {d}_{\mathrm{iKam}} $ and $ {d}_{\mathrm{vKam}} $ to these approaches in the experiments in the following subsections. Note that the following existing metrics are limited to measuring similarity between two structures and cannot compare images to structures, whereas $ {d}_{\mathrm{iKam}} $ can.

1. Euclidean alignment: A classical approach for comparing the similarity of two structures is to sample the volumes on a 3D grid and calculate the Euclidean distance between pairs over rotations and translations. However, this method is expensive to compute since optimization over $ \mathrm{SO}(3) $ is required to align the structures. Accelerated methods for computing these alignments by maximizing the correlation between two volume maps over rotations and translations have been implemented in various programs, for example, via gradient ascent in Chimera.⁽Reference Pettersen, Goddard, Huang, Meng, Couch, Croll, Morris and Ferrin ¹⁶ ⁾ Further acceleration can be achieved by calculating volumetric correlations by expanding the volumes in a well-chosen basis and applying dimensionality reduction⁽Reference Rangan ¹⁷ ⁾ or by maximizing the correlation between common lines in projection images generated from the volumes.⁽Reference Harpaz and Shkolnisky ¹⁸ ⁾ Similar alignment methods, such as those described in the study by Bartesaghi et al. and Xu et al.⁽Reference Bartesaghi, Sprechmann, Liu, Randall, Sapiro and Subramaniam ¹⁹ ^, Reference Xu, Beck and Alber ²⁰ ⁾, are also used in electron tomography for sub-volume similarity. In this paper, we use a Bayesian optimization algorithm to minimize an Euclidean loss function, as described in the study by Singer and Yang,⁽Reference Singer and Yang ²¹ ⁾ to compute the alignment and minimum distance between two volumes.
2. Zernike moments: Another metric for structural similarity is to expand the molecule’s structure in Zernike polynomials and compute a metric from the Zernike expansion coefficients, as described in the study by Guzenko et al.⁽Reference Guzenko, Burley and Duarte ²² ⁾, which is used by the PDB for structural similarity search.

4.2. Applying $ {d}_{\mathrm{vKam}} $ to a PDB subset

To test the ability of $ {d}_{\mathrm{vKam}} $ to discern the similarity between 3D structures, we first generate a database using 1420 structures downloaded from the PDB.⁽Reference Berman ¹⁰ ⁾ The subset chosen here was selected by filtering for human proteins with an experimental structure at resolution between 1 and 3 Å and a molecular weight between 150 and 250 kDa. We use this subset because it encompasses a diverse range of shapes and symmetries as well as many homologous structures. Additionally, the weight range reflects a smaller and more challenging protein size for a typical cryo-EM experiment.⁽Reference Cianfrocco and Kellogg ²³ ⁾ In the future, a larger database containing the entire PDB can also be generated.

Using our database, we first generate a discretized potential for each structure as described in Appendix A.2. The first and second moments of each structure can then be computed using (3). We then compute $ {d}_{\mathrm{vKam}} $ in (6) pairwise for all structures in the database.

To compare the performance of $ {d}_{\mathrm{vKam}} $ against existing metrics, we calculate pairwise scores using $ {d}_{\mathrm{vKam}} $ , Euclidean alignment, and the Zernike metric. We then plot the returned scores against each other and calculate a ranking similarity using normalized discounted cumulative gain⁽Reference Järvelin and Kekäläinen ²⁴ ⁾ (NDCG). We use this metric since it is a popular method to quantify the similarity between sets of rankings; its calculation is given in Appendix A.6.

In Figure 1, we report the NDCG scores between pairs of metrics. All NDCG scores are close to 1, indicating strong agreement among the three different metrics on which structures are most similar. However, the alignment metric and $ \log \left({d}_{\mathrm{vKam}}\right) $ share the highest average NDCG score. To verify the statistical significance of this agreement, we report a t-test by selecting $ 10 $ different subsets, showing that the NDCG score between $ {d}_{\mathrm{vKam}} $ and the alignment metric is statistically significantly higher (with a p-value $ p\approx 8\times {10}^{-9} $ ) than the NDCG score between the alignment metric and the Zernike metric. We thus conclude that $ {d}_{\mathrm{vKam}} $ provides a fast and accurate alternative for the alignment metric.

Figure 1. Comparison between $ {d}_{\mathrm{vKam}} $ , the Zernike metric, and Euclidean alignment. a-c) A random size 100 subset of the database is selected. Then, pairwise similarity metrics are calculated and plotted, where each point represents a pair of structures. The NDCG score is calculated using the metric on the $ y $ -axis as the predicted metric, and the metric on the $ x $ -axis as the true metric. d) The procedure is repeated with 10 randomly selected size 100 subsets, and the mean ( $ \mu $ ) and standard deviation ( $ \sigma $ ) of the NDCG scores are calculated. The error bars and points visualize $ \mu \pm \sigma $ .

Although it is the most interpretable metric, Euclidean alignment is computationally expensive to execute for all pairs of structures in a database. To achieve a manageable runtime for alignment, we calculate pairwise Euclidean alignment distances for a subset of the database of size 100. Pairwise alignment on this subset took 8 hours on a 2.6 GHz Intel Skylake Central Processing Unit (CPU) running AVX-512 using 16 physical cores and 80 GB random-access memory (RAM). To do pairwise alignment via Bayesian optimization for the entire database of 1420 structures would require 46 days of computation, whereas using $ {d}_{\mathrm{vKam}} $ (including precomputation) to calculate pairwise distances between all 1420 structures in the database requires 3 minutes on the same hardware. Despite containing an alignment component, the Zernike metric is also fast, taking 3 minutes to compute pairwise distances for the entire database.

After observing high agreement between $ {d}_{\mathrm{vKam}} $ and the other metrics, we compute a 2D embedding of the similarity between structures in our database using t-distributed stochastic neighbor embedding (t-SNE)⁽Reference Van der Maaten and Hinton ²⁵ ⁾ (see Figure 2). Analogous t-SNE plots for the alignment metric and Zernike metric are reported in Appendix B.2. We find that $ {d}_{\mathrm{vKam}} $ provides interpretable results in identifying similar molecules from their moments without the need for alignment. In particular, we observe that both homologous (i.e., structures with similar sequences) and similar-shaped structures are shown to be clustered together.

Figure 2. 2D embedding of protein structures based on their similarity using $ {d}_{\mathrm{vKam}} $ . The analytical moments of 1420 proteins were computed and compared using (6), and t-SNE was applied for visualization. Each node represents a single structure and is colored by the number of atoms. Distinct clusters containing homologous or similarly shaped structures suggest that $ {d}_{\mathrm{vKam}} $ provides interpretable results.

4.3. Database search using $ {d}_{\mathrm{iKam}} $ with synthetic cryo-EM data

We next demonstrate the ability of $ {d}_{\mathrm{iKam}} $ to accurately find a match for the moments computed from projection images to a database of analytical moments computed from the atomic coordinates of known structures. To test our metric, we use the same dataset as the previous section, selecting the protein structure of a Mas-related G-protein-coupled receptor (available as entry PDB-7VV3⁽Reference Yang, Guo, Li, Wang, Wang, Zhang, Fang, Chen, Liu, Yan and Liu ²⁶ ⁾) from our database described in Section 4.2. We use this entry because our database includes several similarly shaped yet nonidentical structures, on which we examine our metric’s performance.

We generate a synthetic cryo-EM dataset as illustrated in Figure 3: We take $ 25000 $ clean projection images from a nonuniform distribution over $ \mathrm{SO}(3) $ at viewing angles given by a mixture of three von Mises–Fisher distributions.⁽Reference Watson ²⁷ ⁾ To simulate cryo-EM data, the images are then corrupted with one of 100 unique radial CTFs, after which we add white noise with a signal-to-noise ratio (SNR) of $ 0.1 $ . We define the SNR by taking the signal as the average squared intensity over each pixel in all the clean images and setting the noise variance to the appropriate ratio of the signal. These simulated images are generated using the ASPIRE software package⁽Reference Wright, Andén, Bansal, Xia, Langfield, Carmichael, Brook, Shi, Heimowitz, Pragier, Sason, Moscovich, Shkolnisky and and Singer ²⁸ ⁾ and have parameters consistent with many experimental datasets.

Figure 3. Visualization of the generation of simulated images. (a) Protein structure of PDB-7VV3. (b) Clean projection images from PDB-7VV3 generated with a nonuniform viewing angle distribution. (c) Projection images corrupted with a CTF and white noise with $ SNR=0.1 $ . (d) Distribution of nonuniform viewing angles.

We then compute the moments of the simulated images as will be shown in (12) and (13) and compare them to the database of moments using the image-to-volume metric described in (8). We also report the effect of varying the number of images on the metric’s performance in Appendix B.3. Using our metric, we can rank the similarity of the image’s moments to our database as shown in Figure 4. We show that the most similar score (i.e., the smallest value in image Kam’s metric) corresponds to the ground truth structure used to generate the images. Furthermore, based on our results, the next top 116 structures correspond to structures with similar volumes and sequences. These results demonstrate that we are able to compare directly between noisy, CTF-corrupted images and known structures. This approach could be especially valuable if there is no known model for initialization in 3D reconstruction or if the molecule generating the images is unknown.⁽Reference Verbeke, Mallam, Drew, Marcotte and Taylor ²⁹ ⁾

Figure 4. Histogram ranking of dissimilarities computed using $ {d}_{\mathrm{iKam}} $ on simulated noisy projection images generated from PDB-7VV3.

We report alignment scores between molecules in our database to PDB entry 7VV3, compare these to our metric’s scores, and plot the results in Figure 5. Most notably, when the protein structure becomes less similar to the ground truth (7VV3), the alignment metric begins to lose discriminative power. Figure 5 shows structures with varying degrees of dissimilarity as having the same score ( $ \sim $ 100). In contrast, our metric retains discriminative power, ranking structures with similar sequences/functions before structures with similar shapes.

Figure 5. Comparison between the rankings given by $ {d}_{\mathrm{iKam}} $ (computed from simulated images) and the minimum Euclidean distance after alignment (computed from volumes). The structures shown are superimposed with the ground truth after alignment in panels (a)–(d). The points on the graph that correspond to these structures are colored and labeled. The ground truth corresponds to the green cross in the lower left.

Note that the Euclidean alignment metric shows stagnation whereas Kam’s metric does not.

Alignment via Bayesian optimization between one structure and the 1420 structures in the database took 95 minutes using the hardware described in Section 4.2. Aside from the computational cost, the interpretation of the optimal rotation returned by alignment becomes unclear when comparing two structures that are not volumetrically similar. On the other hand, our metric does not return an alignment between two structures, which could render it less useful when an explicit alignment must be computed. Without this alignment, it may become harder to visually compare their volumes.

It is computationally costly to generate and perform moment estimation on synthetic images for every molecule in the database. As such, to compare the performance of our metric against the Zernike metric, we select from our database a random subset of 100 structures. For each structure, we repeat the process we perform on PDB-7VV3: First, we generate a nonuniform distribution over $ {\unicode{x1D54A}}^2 $ as a mixture of three von Mises–Fisher distributions with random means, weights, and covariance matrices. We then generate 25000 images, corrupt with SNR = 0.1 and radial CTFs, compute the moments, and search across the database.

For every structure, we recover the ground truth as one of the first six lowest-scoring molecules. Moreover, 88 of the 100 tests recovered the ground truth as the lowest-scoring molecule. To evaluate how well the metrics agree on structural similarity, we compute the size of the intersection between the top ten structures returned by our metric and those returned by the Zernike metric. As shown in Figure 6, we find that the metrics agree on two to three structures, and a large number of structures agree only on the ground truth structure. When they occur, disagreements between the metrics are likely due to the presence of near-identical molecules in the database.

Figure 6. Comparison between $ {d}_{\mathrm{iKam}} $ _, computed from simulated images, and the Zernike metric, computed from volumes. Here, we repeat simulated experiments 100 times. Then, the size of the intersection of the top ten structures returned by $ {d}_{\mathrm{iKam}} $ and the Zernike metric is plotted as a histogram.

4.4. Toward matching experimental datasets by $ {d}_{\mathrm{iKam}} $

While our simulated result shows success in matching a synthetic cryo-EM dataset to PDB structures, many experimental cryo-EM datasets are corrupted by a large number of unmodeled effects that we have not considered. Among the real-data effects are scattering potential’s corruption by a solvent effect,⁽Reference Shang and Sigworth ³⁰ ⁾ the B-factor,⁽Reference Rosenthal and Henderson ³¹ ⁾ a global scaling ambiguity, imperfect centering, junk particles, non-radial CTF, and imperfect noise model. Our simulation falls short on these counts.

In a first step toward applying $ {d}_{\mathrm{iKam}} $ to real experimental datasets, we compare the moments of a stack of images deposited in the Electron Microscopy Public Image ARchive (EMPIAR)⁽Reference Iudin, Korir, Somasundharam, Weyand, Cattavitello, Fonseca, Salih, Kleywegt and Patwardhan ³² ⁾ to the moments of its preprocessed 3D reconstructions given by the program CryoSPARC⁽Reference Punjani, Rubinstein, Fleet and Brubaker ³³ ⁾. We select the dataset EMPIAR-10076,⁽Reference Davis, Tan, Carragher, Potter, Lyumkis and Williamson ³⁴ ⁾ a heterogeneous dataset containing five major structures. The dataset is well characterized, and each image in the dataset has been classified into one of the five major states⁽Reference Davis, Tan, Carragher, Potter, Lyumkis and Williamson ³⁴ ⁾ or “junk” particles, which we discard. We use the classification to generate five separate datasets, allowing us to compute five different moments, one for each of the major states. This test case allows us to examine our metric matching on a real dataset, while bypassing some of the issues associated with comparing datasets and volumes obtained in different experimental conditions.

We downsample the image stack to $ 64\times 64 $ , center using the deposited shift, and mask the images with a circular binary mask of radius $ 0.8 $ times half the side length of the image. We then estimate the moments for each structure and compare them to moments computed analytically from preprocessed volume reconstructions of the five major structures, as well as two other distinctly shaped ribosomes from the Electron Microscopy Data Bank⁽Reference Turner ³⁵ ⁾ (EMDB), EMD-8457 and EMD-2660, used as a baseline. Scaling issues between the moment computed from the images and the moment computed from the volume are resolved by examining the diagonal entries of the second moments. Specifically, we find a multiplicative scaling factor that best matches the diagonal of the image-computed second moment and those of the volume-computed second moment under a uniform distribution with respect to the $ {l}^2 $ norm.

As shown in Figure 7, it is observed that Kam’s metric recovers the ground truth structure at the lowest distance for the experimental images corresponding to structure 001. We note that the scores for molecules 001 and 002, as well as molecules 003 and 004, are almost identical in value. Also, we find that the analytical moments are closer to each other than to the experimentally determined moments. Finally, the metric reports the baseline structures, which are very different in shape and size, at the largest distances.

Figure 7. $ {d}_{\mathrm{iKam}} $ visualization and ranking results for experimental data corresponding to structure 001 (a) Experimental images from EMPIAR-10076 corresponding to structure 001 downsampled to $ 64\times 64 $ pixels, centered, and with binary mask applied. (b) Comparison between diagonal entries of the second moment computed from the reconstructed volumes 001 and 002 and the moment estimated from experimental images corresponding to structure 001. (c) Comparison between diagonal entries of the second moment computed from the reconstructed volumes 003 and 004 and the moment estimated from experimental images corresponding to structure 001. (d) The five reconstructions (000–004) and two baseline structures (EMD-8457 and EMD-2600) ranked using $ {d}_{\mathrm{iKam}} $ , ordered from left to right.

In Figure 8, we plot the distances between the five reconstructions (or in the case of $ {d}_{\mathrm{iKam}} $ , their experimental images) and the seven candidate structures given by both of our metrics. The exact values for $ {d}_{\mathrm{iKam}} $ are given in Appendix B.4. There is also scaling ambiguity in $ {d}_{\mathrm{vKam}} $ since our reconstructions are preprocessed; hence, we use the same approach as above: We scale each candidate structure’s moment by a multiplicative scaling factor that best matches the candidate structure’s diagonal entries of the second moment with those of the ground truth structure. Analyzing the trends in each row, we observe that the metrics seem to agree on the general ranking of the molecules. While the structures 001, 002 and 003, 004 are very similar, $ {d}_{\mathrm{vKam}} $ shows that the metric distinguishes between them given accurate moment estimation, whereas $ {d}_{\mathrm{iKam}} $ loses some discriminative power. However, when it comes to distinct molecules such as EMD-8457 and EMD-2660, both metrics agree on their rankings.

Figure 8. Visualization of $ \log \left({d}_{\mathrm{vKam}}\right) $ and $ \log \left({d}_{\mathrm{iKam}}\right) $ values on the seven candidate structures. Here, EMD-8457 and EMD-2660 are listed as 8457 and 2660 for brevity. Note that there are five ground truth structures but seven candidate structures since EMD-2660 and EMD-8457 are baseline structures for which there are no images in the experimental dataset.

5. Limitations and future work

$ {d}_{\mathrm{iKam}} $ currently falls short of being directly applicable to experimental datasets. As stated in Section 4.4, there are several unmodeled effects not considered in this work that could lead to unexpected results for real data. The net effect of ignoring these experimental considerations is to bias our moment estimator, which may explain the inability of $ {d}_{\mathrm{iKam}} $ to detect the smaller differences between structures 001 and 002, as well as 003 and 004. Developing an estimator that is robust to outliers (such as junk particles) could help alleviate this.

While we address a few of these parameters, we do so with prior knowledge. For example, the shifts used to center images are a byproduct of the reconstruction process. In future work, we aim to develop methods to correct these effects directly from the raw images. Likewise, here we have controlled for experiment-specific artifacts by using images and structures resolved from the same experiment, whereas in the future we wish to compare across all structures. Furthermore, in the future we seek to compare moments computed from real data directly to the PDB, by appropriately correcting for the discrepancies between PDB and reconstructed structures.

Even with our current mitigations, issues such as the B-factor and inaccuracies in the noise model remain completely unmodeled. Further studies will be required to investigate which of these omissions is important and which can safely be made. Then, our method could be modified to account for the important effects.

6. Discussion

We introduced structural similarity metrics for proteins based on moments, inspired by the moment computation in Kam’s method. $ {d}_{\mathrm{vKam}} $ compares known 3D structures according to the difference between the moments of their potentials. We showed that the metric accurately captures similarity according to the rotationally aligned Euclidean metric, an interpretable but expensive-to-compute molecular similarity metric. Therefore, $ {d}_{\mathrm{vKam}} $ allows for the efficient comparison of large number of known structures. A potential application is to improve the similarity search presently in the PDB, which uses the Zernike metric – a fast but less principled metric that involves learning weights and which our results suggest performs worse than ours.

A second metric, termed $ {d}_{\mathrm{iKam}} $ , allows for the computation of a similarity score between an unknown structure present in a large cryo-EM dataset and a solved structure. The computation of this metric does not require a 3D reconstruction process for the image stack and therefore is very efficient. We showed on simulated projection images that our method could discriminate between even very similar proteins with reasonably sized datasets. If it were to work on experimental datasets, $ {d}_{\mathrm{iKam}} $ could become a versatile tool for 3D reconstruction. Typical reconstruction algorithms used in practice are only locally optimal and thus require good initialization, which $ {d}_{\mathrm{iKam}} $ could provide by returning the homologous structures present in the PDB. By extending the database to the entirety of the PDB and including structure predictions, both solved and predicted structures could be quickly compared against.

Beyond its application to experiments, $ {d}_{\mathrm{iKam}} $ demonstrates that Kam’s method is a feasible strategy for high-resolution reconstruction. Recent works have improved the viability of Kam’s method by using sparsity⁽Reference Bendory, Khoo, Kileel, Mickelin and Singer ⁷ ⁾ or neural network⁽Reference Khoo, Paul and Sharon ³⁶ ⁾ priors; likewise, the search over the PDB using Kam’s metric can be interpreted as simply running Kam’s method under a very strong prior, where only a finite number of structures appear with nonzero probability. Our results suggest that, if one could formulate a tractable prior over the manifold of proteins, Kam’s method could yield high-resolution reconstructions.

Acknowledgments

J.K. and M.A.G. thank Bronson Zhou for helpful conversations.

Data availability statement

Replication code can be found at https://github.com/aszhang107/moment-based-metrics/.

Author contribution

All authors conceived of the project and designed the algorithms. A.Z. wrote the software and performed the experiments. All authors wrote the manuscript and approved its submission.

Funding statement

Part of this research was performed while authors J.K., E.J.V., N.F.M., M.A.G., and A.S. were visiting the long program on Computational Microscopy at the Institute for Pure and Applied Mathematics, which is supported by NSF DMS 1925919. A.Z., O.M., E.J.V., M.A.G., and A.S. are supported in part by AFOSR FA9550-20-1-0266, the Simons Foundation Math+X Investigator Award, NSF DMS 2009753, and NIH/NIGMS R01GM136780–01. J.K. is supported in part by NSF DMS 2309782, NSF CISE-IIS 2312746, and start-up grants from the College of Natural Science and Oden Institute at the University of Texas at Austin. N.F.M. is supported in part by a start-up grant from Oregon State University.

Competing interest

The authors declare none.

Appendix

A. Methodology

In this section, we describe the computational details of the method.

A.1. Moment derivation

Prior work⁽Reference Sharon, Kileel, Khoo, Landa and Singer ⁸ ⁾ has shown that the analytical first and second moments of cryo-EM images generated by $ \hat{\Phi} $ and $ \rho $ equal

(12)

$$ {\mathbf{m}}_1\left(\hat{\Phi},\rho \right)=\sum \limits_{\mathrm{\ell},m}{B}_{\mathrm{\ell},-m}{A}_{\mathrm{\ell},m}(r){N}_{\mathrm{\ell}}^0\frac{1}{2\mathrm{\ell}+1}{\left(-1\right)}^m, $$

where the sum ranges over $ \left(\mathrm{\ell},m\right) $ such that $ 0\le \mathrm{\ell}\le \min \left(L,P\right) $ , $ \mathrm{\ell} $ is even, and $ -\mathrm{\ell}\le m\le \mathrm{\ell} $ , and

(13)

$$ {\displaystyle \begin{array}{l}{\mathbf{m}}_2\left(\hat{\Phi},\rho \right)=\sum \limits_n{e}^{in\left({\phi}^{\prime }-\phi \right)}\\ {}\sum \limits_{\mathrm{\ell},m,{\mathrm{\ell}}^{\prime },{m}^{\prime },{{\mathrm{\ell}}^{\prime}}^{\prime }}{A}_{\mathrm{\ell},m}(r){A}_{{\mathrm{\ell}}^{\prime },{m}^{\prime }}\left({r}^{\prime}\right){N}_{\mathrm{\ell}}^n{N}_{{\mathrm{\ell}}^{\prime}}^{-n}{B}_{{{\mathrm{\ell}}^{\prime}}^{\prime },-m-{m}^{\prime }}{C}_{{{\mathrm{\ell}}^{\prime}}^{\prime }}\left(\mathrm{\ell},{\mathrm{\ell}}^{\prime },m,{m}^{\prime },n,-n\right)\frac{{\left(-1\right)}^{m+{m}^{\prime }}}{2{{\mathrm{\ell}}^{\prime}}^{\prime }+1},\end{array}} $$

where

(14)

$$ {N}_{\mathrm{\ell}}^n:= \left(\begin{array}{ll}\sqrt{\frac{2\mathrm{\ell}+1}{4\pi}\frac{\left(\mathrm{\ell}-n\right)!}{\left(\mathrm{\ell}+n\right)!}}\frac{{\left(-1\right)}^n}{2^{\mathrm{\ell}}\mathrm{\ell}!}\left(\mathrm{\ell}+n\right)!\left(\begin{array}{c}\mathrm{\ell}\\ {}\left(\mathrm{\ell}+n\right)/2\end{array}\right){\left(-1\right)}^{\left(\mathrm{\ell}-n\right)/2}& \mathrm{if}\;n\equiv \mathrm{\ell}\hskip0.1em \left(\operatorname{mod}\;2\right)\\ {}0& \mathrm{if}\hskip0.35em n\;\not\equiv \hskip0.35em \mathrm{\ell}\hskip0.1em \left(\operatorname{mod}\;2\right)\end{array}\right. $$

is an explicitly calculated constant,

(15)

$$ {C}_{{{\mathrm{\ell}}^{\prime}}^{\prime }}\left(\mathrm{\ell},{\mathrm{\ell}}^{\prime },m,{m}^{\prime },n,{n}^{\prime}\right):= C\left(\mathrm{\ell},m;{\mathrm{\ell}}^{\prime },{m}^{\prime }|{{\mathrm{\ell}}^{\prime}}^{\prime },m+{m}^{\prime}\right)C\left(\mathrm{\ell},n;{\mathrm{\ell}}^{\prime },{n}^{\prime }|{{\mathrm{\ell}}^{\prime}}^{\prime },n+{n}^{\prime}\right) $$

is a product of Clebsch–Gordan coefficients,⁽Reference Biedenharn and Louck ³⁷ ⁾ and the sum ranges over those indices $ n,\mathrm{\ell},m,{\mathrm{\ell}}^{\prime },{m}^{\prime },{{\mathrm{\ell}}^{\prime}}^{\prime } $ that satisfy

(16)

$$ {\displaystyle \begin{array}{l}-L\le n\le L,\hskip1em |n|\le \ell \le L,\hskip1em |n|\le {\ell}^{\mathrm{\prime}}\le L,\hskip1em -\ell \le m\le \ell, \hskip1em -{\ell}^{\mathrm{\prime}}\le {m}^{\mathrm{\prime}}\le {\ell}^{\mathrm{\prime}},\\ {}\ell \equiv {\ell}^{\mathrm{\prime}}\equiv n\mathrm{mod}2,\hskip1em \max (|\ell -{\ell}^{\mathrm{\prime}}|,|m+{m}^{\mathrm{\prime}}|)\le {\ell^{\mathrm{\prime}}}^{\mathrm{\prime}}\le \min (\ell +{\ell}^{\mathrm{\prime}},P).\end{array}} $$

See Sections 2.3.1 and 2.3.2 in,⁽Reference Sharon, Kileel, Khoo, Landa and Singer ⁸ ⁾ respectively, for the derivations of (12) and (13). In the case of the uniform density on $ \mathrm{SO}(3) $ , we note that $ {N}_0^0=\frac{1}{\sqrt{4\pi }} $ so (12) and (13) simplify to the following:

(17)

$$ {\mathbf{m}}_1(r)=\sqrt{\frac{1}{4\pi }}{A}_{0,0}(r), $$

(18)

$$ {\displaystyle \begin{array}{c}{\mathbf{m}}_2\left(\Delta \phi, r,{r}^{\prime}\right)=\frac{1}{4\pi}\sum \limits_{\mathrm{\ell},m}{A}_{\mathrm{\ell},m}(r){A}_{\mathrm{\ell},-m}\left({r}^{\prime}\right){P}_{\mathrm{\ell}}\left(\cos \left(\Delta \phi +\pi \right)\right)\\ {}=\frac{1}{4\pi}\sum \limits_{\mathrm{\ell},m}{A}_{\mathrm{\ell},m}(r)\overline{A_{\mathrm{\ell},m}\left({r}^{\prime}\right)}{P}_{\mathrm{\ell}}\left(\cos \left(\Delta \phi \right)\right),\end{array}} $$

where $ {P}_{\mathrm{\ell}} $ is the Legendre polynomial of degree $ \mathrm{\ell} $ and $ \overline{z} $ denotes the complex conjugate of a complex number $ z\in \mathrm{\mathbb{C}} $ . The simplification of (12) to (17) is immediate, whereas the simplification of (13) to (18) uses the sum rule for spherical harmonics; see (10) in the study by Kam.⁽Reference Kam ¹ ⁾

A.2. Uniform case

This section details the method to compute $ {d}_{\mathrm{vKam}} $ . Our algorithm takes as input a PDB identifier (a list of atomic coordinates), on which we center the atomic positions by subtracting the molecule’s center of mass. Then, we use the three-dimensional nonuniform fast Fourier transform (NUFFT)⁽Reference Barnett, Magland and Klinteberg ³⁸ ^, Reference Barnett ³⁹ ⁾ to compute the discrete Fourier transform evaluated on a grid in spherical coordinates, that is, to compute

(19)

$$ {a}_{kjl}=\sum \limits_{i=1}^q{\hat{f}}_i\left({r}_k{\alpha}_{jl}\right){e}^{ix_i\cdot {r}_k{\alpha}_{jl}},\hskip0.35em \mathrm{where}\hskip0.35em {\alpha}_{jl}=\left(\sin {\theta}_j\cos {\varphi}_l,\sin {\theta}_j\sin {\varphi}_l,\cos {\theta}_j\right), $$

where $ {x}_i $ denotes the coordinates of the $ {i}^{\mathrm{th}} $ atom from the PDB identifier and $ q $ is the total number of atoms. The function $ {\hat{f}}_i $ is the Fourier transform of the scattering potential of the $ {i}^{\mathrm{th}} $ atom as reported in the study by Peng et al. and Singer.⁽Reference Peng, Ren, Dudarev and Whelan ⁴⁰ ^, Reference Singer ⁴¹ ⁾ In real space, this corresponds to convolving a Gaussian mixture with a delta function, in other words, adding a Gaussian blob around the atom coordinate. Here, $ {r}_k=\frac{k\delta}{N} $ , $ {\theta}_j=\frac{\pi j}{N} $ _, and $ {\varphi}_l=\frac{2\pi l}{N} $ for $ k=0,\dots, N/2 $ and $ j,l=0,\dots, N-1 $ and $ \delta $ is the side length of the volume grid in angstroms.

Lastly, we apply the spherical harmonic transform to $ {a}_{kjl} $ defined on the spherical coordinate grid $ \left({r}_k,{\theta}_j,{\varphi}_l\right) $ in (19) using SHTools.⁽Reference Driscoll and Healy ⁴² ^, Reference Wieczorek and Meschede ⁴³ ⁾ This gives us coefficients

$$ {A}_{\mathrm{\ell},m}\left({r}_k\right)=\sum \limits_{\begin{array}{c}0\le \mathrm{\ell}\le L\\ {}0\le k\le n-1\end{array}}{a}_{k,j,l}\overline{Y_{\mathrm{\ell}}^m\left({\alpha}_{jl}\right)}. $$

Let $ {\rho}_u $ denote the uniform density on the sphere. In the discrete case, we sample each image as a 2D polar grid at $ N/2 $ radial points $ r $ and $ N $ angular points $ \phi $ , where $ N $ is the number of pixels of one side of the projection images. In (12), the first moment $ {\mathbf{m}}_1 $ is indexed by $ r $ and is thus an $ N/2 $ length vector. Note that in (5), $ \Delta {\phi}_j=2\pi j/N-2\pi $ for $ j=1,\dots, 2N $ , but since $ {e}^{in\Delta \phi } $ in (13) is $ 2\pi /n $ periodic, we have that $ {e}^{in\left(2\pi -\Delta \phi \right)}={e}^{- in\Delta \phi } $ _, and hence, $ {\mathbf{m}}_2\left(r,{r}^{\prime },\Delta {\phi}_j\right)={\mathbf{m}}_2\left(r,{r}^{\prime },\Delta {\phi}_{j+N}\right) $ . Thus, $ \Delta {\phi}_j $ for $ j=1,\dots, N $ is redundant and we consider only $ \Delta {\phi}_j $ for $ j=N+1,\dots, 2N $ , which enumerates $ \left[0,2\pi \right] $ . Thus, $ {\mathbf{m}}_2 $ is a three-dimensional tensor of size $ N\times N/2\times N/2 $ , since there are $ N $ values for $ \Delta \phi $ and $ N/2 $ values each for $ r $ and $ {r}^{\prime } $ , where $ \Delta \phi $ are points uniformly spaced between 0 and $ 2\pi $ and $ r $ and $ {r}^{\prime } $ are $ N $ uniformly spaced points between 0 and $ \delta $ . Equations (17) and (18) give

(20)

$$ {\mathbf{m}}_1(k)={A}_{0,0}\left({r}_k\right), $$

(21)

$$ {\displaystyle \begin{array}{c}{\mathbf{m}}_2\left(j,{k}_1,{k}_2\right)=\frac{1}{4\pi}\sum \limits_{\mathrm{\ell},m}{A}_{\mathrm{\ell},m}\left({r}_{k_1}\right){A}_{\mathrm{\ell},-m}\left({r}_{k_2}\right){P}_{\mathrm{\ell}}\left(\cos \left(\Delta {\phi}_j+\pi \right)\right)\\ {}=\frac{1}{4\pi}\sum \limits_{\mathrm{\ell},m}{A}_{\mathrm{\ell},m}\left({r}_{k_1}\right)\overline{A_{\mathrm{\ell},m}\left({r}_{k_2}\right)}{P}_{\mathrm{\ell}}\left(\cos \left(\Delta {\phi}_j\right)\right).\end{array}} $$

We then compute the metric given in (10). To better approximate the $ {L}_2 $ norm in the continuous case, we scale the difference of each entry $ {\mathbf{m}}_2\left(j,{k}_1,{k}_2\right) $ by $ \sqrt{r_{k_1}{r}_{k_2}} $ so that the squared norm is scaled by $ {r}_{k_1}{r}_{k_2} $ . More precisely, we define weighted $ {\mathrm{\ell}}^2 $ norms $ \parallel \cdot {\parallel}_{w_1} $ and $ \parallel \cdot {\parallel}_{w_2} $ on $ {\mathrm{\mathbb{R}}}^{N/2} $ and $ {\mathrm{\mathbb{R}}}^{N\times N/2\times N/2} $ , as described in (5). Let $ \Phi $ and $ {\Phi}^{\prime } $ be two different molecules, and $ {\mathbf{m}}_1,{\mathbf{m}}_2 $ and $ {\mathbf{m}}_{1^{\prime }},{\mathbf{m}}_{2^{\prime }} $ be the first and second moment tensors, respectively, from two different molecules. We define the distance between the moments as in (6).

A.3. Least squares for the nonuniform case

This section describes the process for generating and solving the least-squares system for $ B $ , the matrix encoding the viewing angle distribution. We use the following convention for the vectorization operator $ \mathrm{vec}\left(\cdot \right) $ : If $ \mathrm{\mathcal{M}}\in {\mathrm{\mathbb{C}}}^{i\times j} $ , $ \mathrm{vec}\left(\mathrm{\mathcal{M}}\right) $ returns a vector of dimension $ ij $ obtained by stacking the columns of $ \mathrm{\mathcal{M}} $ , that is,

(22)

$$ \mathrm{vec}\left(\left[\begin{array}{ccc}{a}_{1,1}& \cdots & {a}_{1,j}\\ {}\vdots & \ddots & \vdots \\ {}{a}_{i,1}& \cdots & {a}_{i,j}\end{array}\right]\right):= {\left[\begin{array}{cccccc}{a}_{1,1}& \cdots & {a}_{i,1}& {a}_{1,2}& \cdots & {a}_{i,j}\end{array}\right]}^T. $$

The first moment is linear in $ B $ as shown in (12), so fitting a viewing distribution to observed moments can be solved through a least-squares problem. We detail this procedure in Algorithm 1.

Algorithm 1: Computation of least-squares matrix $ V $ for $ {\mathbf{m}}_1 $ .

1 initialize $ V\left[i=1,\dots, N/2\right]\left[\left(p=1,\dots, P;m=-p,\dots, p\right)\right]\leftarrow 0 $ .
2 for $ i=1,\dots N/2 $ do.
3 for $ p=1,\dots, P $ do.
4 for $ m=-p,\dots, p $ do.
5 $ V\left[i\right]\left[\left(p;-m\right)\right]\leftarrow {A}_{\mathrm{\ell},m}\left({r}_i\right){N}_{\mathrm{\ell}}^0\frac{{\left(-1\right)}^m}{2\mathrm{\ell}+1} $
6 end.
7 end.
8 end.
9 return $ V $ .

Algorithm 2: Computation of least-squares matrix $ {\mathcal{U}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n $ for $ {\mathrm{\mathcal{B}}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n $ .

1 Initialize $ {\mathcal{U}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n\left[i=1,\dots, \left(2\mathrm{\ell}+1\right)\left(2{\mathrm{\ell}}^{\prime }+1\right)\right]\left[\left({{\mathrm{\ell}}^{\prime}}^{\prime }=1,\dots, P;m=-{{\mathrm{\ell}}^{\prime}}^{\prime },\dots, {{\mathrm{\ell}}^{\prime}}^{\prime}\right)\right]\leftarrow 0 $ .
2 for $ i=1,\dots \left(2\mathrm{\ell}+1\right)\left(2{\mathrm{\ell}}^{\prime }+1\right) $ do.
3 for $ m=-\mathrm{\ell},\dots, \mathrm{\ell} $ do.
4 for $ {m}^{\prime }=-{\mathrm{\ell}}^{\prime },\dots, \mathrm{\ell} $ .
5 for $ {{\mathrm{\ell}}^{\prime}}^{\prime }=\max \left(|\mathrm{\ell}-{\mathrm{\ell}}^{\prime }|,|m+{m}^{\prime }|\right),\dots, \min \left({\mathrm{\ell}}^{\prime }+\mathrm{\ell},P\right) $ .
6 $ {\mathcal{U}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n\left[i\right]\left[\left({{\mathrm{\ell}}^{\prime}}^{\prime },-m-{m}^{\prime}\right)\right]\leftarrow {\mathcal{U}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n\left[i\right]\left[\left({{\mathrm{\ell}}^{\prime}}^{\prime },-m-{m}^{\prime}\right)\right]+{C}_{{{\mathrm{\ell}}^{\prime}}^{\prime }}\left(\mathrm{\ell},{\mathrm{\ell}}^{\prime },m,{m}^{\prime },n,-n\right)\frac{{\left(-1\right)}^{m+{m}^{\prime }}}{2{{\mathrm{\ell}}^{\prime}}^{\prime }+1} $
7 end.
8 end.
9 end.
10 end.
11 return $ {\mathcal{U}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n $ .

For the second moment, we rewrite (13) more compactly:

(23)

$$ {\mathbf{m}}_2\left(\Delta \phi \right)=\sum \limits_n{e}^{in\Delta \phi}\sum \limits_{\mathrm{\ell},{\mathrm{\ell}}^{\prime }}{\mathcal{A}}_{\mathrm{\ell}}{\mathrm{\mathcal{B}}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n{\mathcal{A}}_{{\mathrm{\ell}}^{\prime}}^{\ast }, $$

where

(24)

$$ {\left({\mathcal{B}}_{\ell, {\ell}^{\mathrm{\prime}}}^n\right)}_{m,{m}^{\mathrm{\prime}}}=\sum \limits_{\ell^{\mathrm{\prime}\mathrm{\prime }}}{N}_{\ell}^n{N}_{\ell^{\mathrm{\prime}}}^{-n}{B}_{\ell^{\mathrm{\prime}\mathrm{\prime }},-m-{m}^{\mathrm{\prime}}}{\mathcal{C}}_{\ell^{\mathrm{\prime}\mathrm{\prime }}}(\ell, {\ell}^{\mathrm{\prime}},m,{m}^{\mathrm{\prime}},n,-n)\frac{(-1)^{m+{m}^{\mathrm{\prime}}}}{2{\ell}^{\mathrm{\prime}\mathrm{\prime }}+1} $$

is a matrix of size $ \left(2\mathrm{\ell}+1\right)\times \left(2{\mathrm{\ell}}^{\prime }+1\right) $ indexed by $ {m}_1=-\mathrm{\ell}\dots \mathrm{\ell} $ and $ {m}_2=-{\mathrm{\ell}}^{\prime}\dots {\mathrm{\ell}}^{\prime } $ , and $ {\left({\mathcal{A}}_{\mathrm{\ell}}\right)}_{m,r}={A}_{\mathrm{\ell},m}(r) $ is a matrix of spherical harmonic coefficients indexed by $ m,r $ . Here, the sum ranges are detailed in (16). Since $ \mathrm{\mathcal{B}} $ is linear in $ B $ , we use Algorithm 2 to construct many linear systems $ {\mathcal{U}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n $ such that

$$ {\mathcal{U}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n\mathrm{vec}(B)=\mathrm{vec}\left({\mathrm{\mathcal{B}}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n\right), $$

Using the Kronecker product, (23) can be written as

$$ \mathrm{vec}\left({\mathbf{m}}_2\left(\Delta \phi \right)\right)=\sum \limits_n{e}^{in\Delta \phi}\sum \limits_{\mathrm{\ell},{\mathrm{\ell}}^{\prime }}\left({\mathcal{A}}_{{\mathrm{\ell}}^{\prime }}\otimes {\mathcal{A}}_{\mathrm{\ell}}\right){\mathcal{U}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n\mathrm{vec}(B). $$

This, too, is linear in $ B $ :

(25)

$$ \mathrm{vec}\left({\mathbf{m}}_2\left(\Delta \phi \right)\right)=U\left(\Delta \phi \right)\mathrm{vec}(B),\hskip1em \mathrm{where}\hskip1em U\left(\Delta \phi \right)=\sum \limits_n{e}^{in\Delta \phi}\sum \limits_{\mathrm{\ell},{\mathrm{\ell}}^{\prime }}\left({\mathcal{A}}_{{\mathrm{\ell}}^{\prime }}\otimes {\mathcal{A}}_{\mathrm{\ell}}\right){\mathcal{U}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n. $$

By vertically appending $ V $ and copies of $ U\left(\Delta \phi \right) $ for all values of $ \Delta \phi $ in Section 3.1, we obtain the least-squares formulation

$$ \underset{x}{\min}\parallel Ax-b\parallel, $$

where

(26)

$$ A=\left(\begin{array}{c}V\\ {}U\left(\Delta {\phi}_1\right)\\ {}\vdots \\ {}U\left(\Delta {\phi}_n\right)\end{array}\right),\hskip1em x=\mathrm{vec}(B),\hskip1em \mathrm{and}\hskip1em b=\left(\begin{array}{c}\mathrm{vec}\left({\mathbf{m}}_1\right)\\ {}\mathrm{vec}\left({\mathbf{m}}_2\right)\end{array}\right). $$

To solve this, we perform $ QR $ decomposition $ A= QR $ and then solve the normal equations

$$ {A}^{\ast } Ax={R}^{\ast } Rx={A}^{\ast }b={R}^{\ast }{Q}^{\ast }b, $$

that is, we solve $ Rx={Q}^{\ast }b $ . Since $ R $ is a square upper triangular matrix, we solve this using back substitution.

A.4. Change of bases for moment comparison

We compute moments from images using the fast method⁽Reference Marshall, Mickelin, Shi and Singer ⁴⁴ ⁾ that produces the moments expanded in the Fourier Bessel basis. Thus, a change of bases is required for moment comparison. The Fourier Bessel basis has several nice properties that make it advantageous to use when computing the moment from images; it is orthonormal, frequency-ordered, steerable, provides fast radial convolutions, and has a fast transform.⁽Reference Marshall, Mickelin and Singer ⁴⁵ ⁾ The Fourier Bessel basis functions can be written in polar coordinates $ \left(r,\theta \right) $ as

(27)

$$ {\psi}_{n,k}\left(r,\theta \right)={c}_{n,k}{J}_n\left({\lambda}_{nk}r\right){e}^{in\theta}, $$

where $ {J}_n $ is a Bessel function of the first kind of order $ n $ , and $ {\lambda}_{nk} $ is the $ k $ -h smallest positive zero of $ {J}_n $ , and $ {c}_{n,k} $ is a normalization constant.

We create a change of basis matrix $ {(B)}_{\left(r,\theta \right),\left(n,k\right)}={\psi}_{n,k}\left(r,\theta \right) $ by sampling on a Cartesian grid $ \left(x,y\right)\in \left\{{r}_i\right\}\times \left\{{r}_i\right\} $ with the $ \left\{{r}_i\right\} $ grid defined as in Section 3.1, where $ \left(r,\theta \right) $ are the grid points $ \left(x,y\right) $ in polar coordinates. This yields the moments in real space

$$ {\displaystyle \begin{array}{c}{\mathbf{m}}_1\left(\Phi, \rho \right)\left(x,y\right)=B{\mathbf{m}}_1\left(\Phi, \rho \right)\left(n,k\right),\\ {}{\mathbf{m}}_2\left(\Phi, \rho \right)\left({x}_1,{y}_1,{x}_2,{y}_2\right)=B{\mathbf{m}}_2\left(\Phi, \rho \right)\left({n}_1,{k}_1,{n}_2,{k}_2\right){B}^{\ast }.\end{array}} $$

Now, we compute the NUFFT to convert the moments into radially sampled polar coordinates in Fourier space as in (19). In practice, we do this for $ {\mathbf{m}}_2 $ by taking each row (which is indexed by $ \left({x}_1,{y}_1\right) $ ), reshaping it into an image, and applying the transform. We then apply the same process to the columns indexed by $ \left({x}_2,{y}_2\right) $ .

A.5. Computational complexity

In the following, $ L $ is the molecule bandlimit, and see (1); $ P $ is the distribution bandlimit, and see (2); $ M $ is the number of projection images; and $ N $ is the image side length of the $ N\times N $ pixel images. We assume that $ P\le L $ .

There are three main steps for calculating the least-squares matrix for each structure in our database. We first calculate the least-squares matrices $ {\mathcal{U}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n $ for $ \mathrm{\mathcal{B}} $ as described in alg:BLSalg. This needs only to be done once and does not need to be recomputed for each molecule. Calculating this matrix takes $ O\left({PL}^5\right) $ time and uses $ O\left({PL}^5\right) $ space. For the calculation of the least-squares matrix itself, we precompute $ \left({\mathcal{A}}_{{\mathrm{\ell}}^{\prime }}\otimes {\mathcal{A}}_{\mathrm{\ell}}\right){\mathcal{U}}_{\mathrm{\ell},{\mathrm{\ell}}^{\prime}}^n, $ for $ \mathrm{\ell},{\mathrm{\ell}}^{\prime },n $ as described in (25). These intermediate steps take $ O\left({P}^2{L}^2{N}^2+{L}^3{N}^2\right) $ time and use $ O\left({L}^3{N}^2\right) $ space for forming the Kronecker product and subsequent matrix multiplication. Finally, the construction of the least-squares matrix $ A $ in (26) takes $ O\left({L}^3{N}^3\right) $ time for the scalar multiplication of a matrix for each $ n,\mathrm{\ell},{\mathrm{\ell}}^{\prime } $ , and the least squares use $ O\left({P}^2{N}^3\right) $ space. As such, the total computational complexity for calculating a least-squares matrix is $ O\left({P}^2{L}^5+{P}^2{L}^2{N}^2+{L}^3{N}^3\right) $ time and $ O\left({P}^2{N}^3+{L}^3{N}^2+{PL}^5\right) $ space.

The computation of moments from noisy projection images in the Fourier–Bessel basis takes $ O\left({MN}^3+{N}^4\right) $ time and uses $ O\left({MN}^2+{N}^3\right) $ space. To convert this to polar coordinates in Fourier space, we must first evaluate the moments in the Fourier–Bessel basis. This takes $ O\left({N}^2\log N\right) $ time for each expansion, and we require $ 2{N}^2 $ such expansions (see app:basis). Hence, in total, this step takes $ O\left({N}^4\log N\right) $ time and uses $ O\left({MN}^2+{N}^4\right) $ space. Converting into Fourier space using the NUFFT takes $ O\left({N}^4\log N\right) $ time and uses $ O\left({N}^4\log N\right) $ space, and we do this $ 2{N}^2 $ times for a total of $ O\left({N}^6\log N\right) $ time and space complexity. Storing the final moment uses $ O\left({N}^3\right) $ space (since the resulting matrix from the NUFFT is block circulant). Overall, computing moments from images and converting them to polar coordinates in Fourier space take $ O\left({N}^6\log N+{MN}^3\right) $ time and use $ O\left({MN}^2+{N}^6\log N\right) $ space.

A.6. NDCG score

The NDCG⁽Reference Järvelin and Kekäläinen ²⁴ ⁾ is calculated by taking the discounted cumulative gain (DCG) and normalizing by ideal discounted cumulative gain (IDCG):

(28)

$$ {\displaystyle \begin{array}{c} DCG=\sum \limits_i\frac{\mathrm{true}\ \mathrm{score}\ \mathrm{of}\ \mathrm{item}\hskip0.35em i}{\log \left(i+1\right)},\\ {} IDCG=\sum \limits_i\frac{i^{th}\hskip0.35em \mathrm{highest}\ \mathrm{true}\ \mathrm{score}}{\log \left(i+1\right)},\\ {} NDCG=\frac{DCG}{IDCG},\end{array}} $$

where $ i $ is enumerated in the order induced by the predicted scores.

The NDCG puts weight on scores that are agreed to be high by both metrics. However, our metric and the metrics we compare to are dissimilarity scores, so we prefer weight on scores that are considered low by both metrics. To remedy this, we use the reverse of the order enumerated by the predicted scores. For the true scores, we first normalize the scores to the range $ \left[0,1\right] $ and then take the exponential $ {e}^{-s} $ for each true score $ s $ .

B. Additional Results

B.1. Parameter selection

In the experiments, we set the bandlimit parameters to $ P=6 $ and $ L=25 $ . Note that this value of $ P $ is comparable to previous work as described in the study by Sharon et al.⁽Reference Sharon, Kileel, Khoo, Landa and Singer ⁸ ⁾, whereas the higher value of $ L $ allows for a more accurate representation of the molecule in spherical harmonics. Furthermore, the hyperparameter $ \lambda $ was set to be 1. As shown in tab:lambda below, varying $ \lambda $ does not greatly impact the performance of the metric.

Table B1. Effect of the value of the hyperparameter $ \lambda $ on the ranking induced by $ {d}_{\mathrm{iKam}} $

Note: $ {A}_i $ denotes the structure with the $ i $ th lowest value of $ {d}_{\mathrm{iKam}} $ In each row, the entry shaded green indicates the ground truth structure.

B.2. Additional t-SNE plots

This appendix includes t-SNE visualizations of the Zernike metric on our database and the alignment metric on a subset of our database of size 100. The alignment metric is restricted to a subset of size 100 since calculating pairwise distances for a 1420 is computationally taxing; see Section 4.2. Visually, the Zernike t-SNE seems to have fewer distinct clusters than the t-SNE plot generated using $ {d}_{\mathrm{vKam}} $ and also groups molecules with different numbers of atoms together. It seems possible that Zernike metric is less discriminative, although this may also be an artifact of t-SNE’s dimensionality reduction.

Figure B1. Additional t-SNE plots. (a) t-SNE plot of pairwise Euclidean alignment distances on a subset of size 100. (b) t-SNE plot of pairwise Zernike distances on the entire database. (c) t-SNE plot of pairwise $ {d}_{\mathrm{vKam}} $ distances on the entire database.

B.3. Robustness to number of images

Here, we examine the robustness of our metric to inaccuracies of moment estimation. Specifically, we vary the number of noisy synthetic projection images that the metric has access to and record the highest-ranking structures.

Table B2. Effect of the number of projection images used for moment estimation on the ranking induced by $ {d}_{\mathrm{iKam}} $

Note: $ {A}_i $ denotes the structure with the $ i $ th lowest value of $ {d}_{\mathrm{iKam}} $ . In each row, the entry shaded green indicates the ground truth structure. The relative error in each moment is between the moment estimated from noisy projection images and the moment calculated from their clean counterparts.

B.4. Additional experimental results

Table B3 reports the metric’s rankings using experimental images corresponding to the five structures resolved from EMPIAR-10076.

Table B3. Performance of $ {d}_{\mathrm{iKam}} $ on structures 001, 002, 003, 004, and 005 of EMPIAR-10076

Note: $ {A}_i $ denotes the structure with the $ i $ th lowest value of $ {d}_{\mathrm{iKam}} $ . The value of $ \log \left({d}_{\mathrm{iKam}}\right) $ is reported next to each structure in parenthesis. In each row, the entry shaded green indicates the ground truth structure.

References

Kam, Z (1980) The reconstruction of structure from electron micrographs of randomly oriented particles. Journal of Theoretical Biology 82(1), 15–39.CrossRef Google Scholar PubMed

Bhamre, T, Zhang, T and Singer, A (2017) Anisotropic twicing for single particle reconstruction using autocorrelation analysis. Preprint, arXiv:1704.07969.Google Scholar

Bhamre, T, Zhang, T and Singer, A (2015) Orthogonal matrix retrieval in cryo-electron microscopy. In 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI). Brooklyn, NY, USA: IEEE, pp. 1048–1052.CrossRef Google Scholar

Levin, E, Bendory, T, Boumal, N, Kileel, J and Singer, A (2018) 3D ab initio modeling in cryo-EM by autocorrelation analysis. In 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). Washington, DC, USA: IEEE, pp. 1569–1573.CrossRef Google Scholar

Bandeira, AS, Blum-Smith, B, Kileel, J, Niles-Weed, J, Perry, A and Wein, AS (2023) Estimation under group actions: Recovering orbits from invariants. Applied and Computational Harmonic Analysis, 66, 236–319.CrossRef Google Scholar

Huang, S, Zehni, M, Dokmanić, I and Zhao, Z (2023) Orthogonal matrix retrieval with spatial consensus for 3D unknown view tomography. SIAM Journal on Imaging Sciences 16(3), 1398–1439.CrossRef Google Scholar

Bendory, T, Khoo, Y, Kileel, J, Mickelin, O and Singer, A (2023) Autocorrelation analysis for cryo-EM with sparsity constraints: improved sample complexity and projection-based algorithms. Proceedings of the National Academy of Sciences 120(18), e2216507120.CrossRef Google Scholar PubMed

Sharon, N, Kileel, J, Khoo, Y, Landa, B and Singer, A (2020) Method of moments for 3-D single particle ab initio modeling with non-uniform distribution of viewing angles. Inverse Problems 36(4), 044003.CrossRef Google Scholar

Bendory, T, Hadi, I and Sharon, N (2022) Compactification of the rigid motions group in image processing. SIAM Journal on Imaging Sciences 15(3), 1041–1078.CrossRef Google Scholar

Berman, HM (2000) The protein data bank. Nucleic Acids Research 28(1):235–242.CrossRef Google Scholar PubMed

NIST Digital Library of Mathematical Functions. Release 1.1.11 of 2023-09-15. In Olver, FWJ, Olde Daalhuis, AB, Lozier, DW, Schneider, BI, Boisvert, RF, Clark, CW, Miller, BR, Saunders, BV, Cohl, HS and McClain, MA (eds.). Available at https://dlmf.nist.gov/.Google Scholar

Ponce, C and Singer, A (2011) Computing steerable principal components of a large set of images and their rotations. IEEE Transactions on Image Processing 20(11), 3051–3062.CrossRef Google Scholar PubMed

Bredon, GE (1993) Topology and geometry (Vol. 139). Springer New York, NY: Springer Science & Business Media.CrossRef Google Scholar

Scheres, SHW (2012) RELION: implementation of a Bayesian approach to cryo-EM structure determination. Journal of Structural Biology 180(3), 519–530.CrossRef Google Scholar PubMed

Scheres, SHW (2012) A Bayesian view on cryo-EM structure determination. Journal of Molecular Biology 415(2), 406–418.CrossRef Google Scholar PubMed

Pettersen, EF, Goddard, TD, Huang, CC, Meng, EC, Couch, GS, Croll, TI, Morris, JH and Ferrin, TEF (2020) UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Science 30(1), 70–82.CrossRef Google Scholar PubMed

Rangan, AV (2022) Radial recombination for rigid rotational alignment of images and volumes. Inverse Problems 39(1), 015003.CrossRef Google Scholar

Harpaz, Y and Shkolnisky, Y (2023) Three-dimensional alignment of density maps in cryo-electron microscopy. Biological Imaging 3, e8.CrossRef Google Scholar PubMed

Bartesaghi, A, Sprechmann, P, Liu, J, Randall, G, Sapiro, G and Subramaniam, S (2008) Classification and 3d averaging with missing wedge correction in biological electron tomography. Journal of Structural Biology 162(3):436–450.CrossRef Google Scholar PubMed

Xu, M, Beck, M and Alber, F (2012) High-throughput subtomogram alignment and classification by fourier space constrained fast volumetric matching. Journal of Structural Biology 178(2), 152–164.CrossRef Google Scholar PubMed

Singer, A and Yang, R (2023) Alignment of density maps in Wasserstein distance. Preprint, arXiv:2305.12310.Google Scholar

Guzenko, D, Burley, SK and Duarte, JM (2020) Real time structural search of the Protein Data Bank. PLoS Computational Biology 16(7), e1007970.CrossRef Google Scholar PubMed

Cianfrocco, MA and Kellogg, EH (2020) What could go wrong? A practical guide to single-particle cryo-EM: From biochemistry to atomic models. Journal of Chemical Information and Modeling 60(5), 2458–2469.CrossRef Google Scholar

Järvelin, K and Kekäläinen, J (2002) Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems (TOIS) 20(4), 422–446.CrossRef Google Scholar

Van der Maaten, L and Hinton, G (2008) Visualizing data using t-SNE. Journal of Machine Learning Research 9(86), 2579−2605.Google Scholar

Yang, F, Guo, L, Li, Y, Wang, G, Wang, J, Zhang, C, Fang, GX, Chen, X, Liu, L, Yan, X and Liu, Q (2021) Structure, function and pharmacology of human itch receptor complexes. Nature 600(7887), 164–169.CrossRef Google Scholar PubMed

Watson, GS (1982) Distributions on the circle and sphere. Journal of Applied Probability 19(A), 265–280.CrossRef Google Scholar

Wright, G, Andén, J, Bansal, V, Xia, J, Langfield, C, Carmichael, J, Brook, R, Shi, Y, Heimowitz, A, Pragier, G, Sason, I, Moscovich, A, Shkolnisky, Y and and Singer, A (2023) ASPIRE – Algorithms for single particle reconstruction software package.Google Scholar

Verbeke, EJ, Mallam, AL, Drew, K, Marcotte, EM and Taylor, DW (2018) Classification of single particles from human cell extract reveals distinct structures. Cell Reports, 24(1), 259–268.e3.CrossRef Google Scholar PubMed

Shang, Z and Sigworth, FJ (2012) Hydration-layer models for cryo-em image simulation. Journal of Structural Biology 180(1), 10–16.CrossRef Google Scholar PubMed

Rosenthal, PB and Henderson, R (2003) Optimal determination of particle orientation, absolute hand, and contrast loss in single-particle electron cryomicroscopy. Journal of Molecular Biology 333(4), 721–745.CrossRef Google Scholar PubMed

Iudin, A, Korir, PK, Somasundharam, S, Weyand, S, Cattavitello, C, Fonseca, N, Salih, O, Kleywegt, GJ and Patwardhan, A (2022) EMPIAR: The electron microscopy public image archive. Nucleic Acids Research 51(D1), D1503–D1511.CrossRef Google Scholar

Punjani, A, Rubinstein, JL, Fleet, DJ and Brubaker, MA (2017) cryoSPARC: Algorithms for rapid unsupervised cryo-EM structure determination. Nature Methods 14(3), 290–296.CrossRef Google Scholar PubMed

Davis, JH, Tan, YZ, Carragher, B, Potter, CS, Lyumkis, D and Williamson, JR (2016) Modular assembly of the bacterial large ribosomal subunit. Cell 167(6), 1610–1622.e15.CrossRef Google Scholar PubMed

Turner, J and The wwPDB Consortium (2023) EMDB - The Electron Microscopy Data Bank.CrossRef Google Scholar

Khoo, Y, Paul, S and Sharon, N (2023) Deep neural-network prior for orbit recovery from method of moments. Journal of Computational and Applied Mathematics, 444, 115782.CrossRef Google Scholar

Biedenharn, LC and Louck, JD (1981) Angular Momentum in Quantum Physics: Theory and Application, Volume 8 of Encyclopedia of Mathematics and Its Applications. Reading, MA: Addison-Wesley Publishing Co..Google Scholar

Barnett, AH, Magland, J and Klinteberg, L (2019) A parallel nonuniform fast Fourier transform library based on an “exponential of semicircle”’ kernel. SIAM Journal on Scientific Computing 41(5), C479–C504.CrossRef Google Scholar

Barnett, AH (2021) Aliasing error of the

$ \exp \left(\beta \sqrt{1-{z}^2}\right) $ kernel in the nonuniform fast Fourier transform. Applied and Computational Harmonic Analysis 51, 1–16.CrossRef Google Scholar

Peng, L-M, Ren, G, Dudarev, S and Whelan, M (1996) Robust parameterization of elastic and absorptive electron atomic scattering factors. Acta Crystallographica Section A, Foundations of Crystallography 52, 257–276.CrossRef Google Scholar

Singer, A (2021) Wilson statistics: Derivation, generalization and applications to electron cryomicroscopy. Acta Crystallographica Section A, Foundations and Advances 77(Pt 5), 472–479.CrossRef Google Scholar

Driscoll, JR and Healy, DM (1994) Computing Fourier transforms and convolutions on the 2-sphere. Advances in Applied Mathematics 15(2), 202–250.CrossRef Google Scholar

Wieczorek, MA and Meschede, M (2018) SHTools: Tools for working with spherical harmonics. Geochemistry, Geophysics, Geosystems 19(8), 2574–2592.CrossRef Google Scholar

Marshall, NF, Mickelin, O, Shi, Y and Singer, A (2023) Fast principal component analysis for cryo-electron microscopy images. Biological Imaging 3, e2.CrossRef Google Scholar PubMed

Marshall, NF, Mickelin, O and Singer, A (2023) Fast expansion into harmonics on the disk: A steerable basis with fast radial convolutions. SIAM Journal on Scientific Computing 45(5), A2431–A2457.CrossRef Google Scholar

Figure 1. Comparison between $ {d}_{\mathrm{vKam}} $, the Zernike metric, and Euclidean alignment. a-c) A random size 100 subset of the database is selected. Then, pairwise similarity metrics are calculated and plotted, where each point represents a pair of structures. The NDCG score is calculated using the metric on the $ y $-axis as the predicted metric, and the metric on the $ x $-axis as the true metric. d) The procedure is repeated with 10 randomly selected size 100 subsets, and the mean ($ \mu $) and standard deviation ($ \sigma $) of the NDCG scores are calculated. The error bars and points visualize $ \mu \pm \sigma $.

Figure 2. 2D embedding of protein structures based on their similarity using $ {d}_{\mathrm{vKam}} $. The analytical moments of 1420 proteins were computed and compared using (6), and t-SNE was applied for visualization. Each node represents a single structure and is colored by the number of atoms. Distinct clusters containing homologous or similarly shaped structures suggest that $ {d}_{\mathrm{vKam}} $ provides interpretable results.

Figure 4. Histogram ranking of dissimilarities computed using $ {d}_{\mathrm{iKam}} $ on simulated noisy projection images generated from PDB-7VV3.

Figure 6. Comparison between $ {d}_{\mathrm{iKam}} $, computed from simulated images, and the Zernike metric, computed from volumes. Here, we repeat simulated experiments 100 times. Then, the size of the intersection of the top ten structures returned by $ {d}_{\mathrm{iKam}} $ and the Zernike metric is plotted as a histogram.