short communications


Rapid calculation of RMSDs using a quaternion-based characteristic polynomial


Douglas L. Theobald
Department of Chemistry and Biochemistry, University of Colorado at Boulder, Boulder, CO 80309-0215, USA
Correspondence e-mail: theobal@colorado.edu

(Received 15 April 2005; accepted 13 May 2005)

A common measure of conformational similarity in structural bioinformatics is the minimum root mean square deviation (RMSD) between the coordinates of two macromolecules. In many applications, the rotations relating the structures are not needed. Several common algorithms for calculating RMSDs require the computationally costly procedures of determining either the eigen decomposition or matrix inversion of a [3\times3] or [4\times4] matrix. Using a quaternion-based method, a simple algorithm is developed here that rapidly and stably determines RMSDs by circumventing the decomposition and inversion problems.

1. Introduction

Orthogonal rotations are commonly used for comparing macromolecular structures, and the root mean square deviation (RMSD) is a natural metric for quantitating the similarity of two optimally rotated structures (Flower, 1999). Several least-squares algorithms have been proposed for finding the rotation that minimizes the squared distances between corresponding atoms in the structures. The most efficient of these methods require the eigen decomposition (or inversion) of a key matrix constructed from the sums and squares of the atomic coordinates in the structures being compared. The popular method of Kabsch (1978) involves a [3\times3] key matrix, whereas the methods of Diamond (1988), Horn (1987) and Kearsley (1989) use quaternion representations of the rotations with [4\times4] key matrices. In each of these methods, the minimizing rotation is formed from the eigenvectors of the key matrix, and the minimizing RMSD is found from its eigenvalues.

In Kabsch's method, improper rotation matrices (rotoinversions) may arise and one must verify that the determinant of the rotation matrix is positive (Kabsch, 1978). Quaternion-based methods lack this complication since quaternion rotations are always proper. Thus, with quaternion-based methods the minimum RMSD can be calculated using the eigenvalues of the key matrix alone, without knowing the eigenvectors. If only RMSDs are desired, the quaternion-based methods can save a considerable amount of computation by avoiding a complete eigen decomposition of the key matrix.

Diamond (1988) proposed the fastest method known for determining minimum RMSDs using an iterative algorithm. Diamond's method requires one [3\times3] matrix inversion per cycle in order to find the largest eigenvalue of a quaternion-based [4\times4] matrix. While matrix inversions are generally expensive, the inverse of a [3\times3] matrix can be calculated analytically and relatively quickly. Diamond's method, however, is unstable when the minimizing rotation is near 180°. This problem is especially grave when the coordinate differences are of a magnitude similar to the coordinates themselves, as is common when superimposing relatively small fragments of proteins or nucleic acids.

Based on Horn's quaternion superposition method (Horn, 1987), which is relatively unknown in structural biology, an extremely fast, numerically stable and accurate algorithm is presented here for determining minimum RMSDs. This algorithm does not require separate consideration of special problematic cases, and it bypasses the computationally costly diagonalizations and inversions commonly used to find the needed eigenvalues. Rather, a quick and simple Newton–Raphson root-finding method is used to determine the appropriate eigenvalue from the characteristic polynomial of the key matrix. This quaternion-based characteristic polynomial (QCP) algorithm requires significantly less computation than the alternative methods, including Diamond's.

2. The superposition problem

A molecular structure consisting of N atoms can be represented as an N×3 matrix, where the three columns correspond to the x, y and z coordinates. The superposition objective is to find the orthogonal rotation and translation that minimizes the squared Euclidean distance between the rows of two matrices corresponding to the two structures being compared. The translational component of the problem can be removed from the outset by translating each structure so that its respective centroid is at the origin. When superimposing structure [{\bf B}] onto structure [{\bf A}], in matrix notation the problem is to minimize the sum of squared errors E with respect to the orthogonal rotation [{\bf R}]:

[E = \Vert {\bf B}{\bf R} - {\bf A} \Vert_F^2,\eqno(1)]

where [\Vert{\bf X}\Vert_F] denotes the Frobenius (or Euclidean) norm of the matrix [{\bf X}],

[\Vert{\bf X}\Vert_F = [{\rm tr}({\bf X}'{\bf X})]^{1/2},\eqno(2)]

and [{\bf X}'] and [{\rm tr}\,{\bf X}] denote the transpose and the trace of [{\bf X}], respectively. Upon expansion of equation (1), it can be shown that

[E = G_A + G_B - 2\,{\rm tr}({\bf M}{\bf R}) = N\,{\rm RMSD}^2,\eqno(3)]

where [G_A] is the inner product of structure [{\bf A}],

[G_A = {\rm tr}({\bf A}'{\bf A}) = \textstyle\sum\limits_{i=1}^{N} (x_{A,i}^2 + y_{A,i}^2 + z_{A,i}^2),]

[G_B] is the analogous inner product of structure [{\bf B}], and [{\bf M}] is the inner product of the matrices [{\bf A}] and [{\bf B}], given by

[{\bf M} = {{\bf B}'} {\bf A} = \left[\matrix{ S_{xx} & S_{xy} & S_{xz} \cr S_{yx} & S_{yy} & S_{yz} \cr S_{zx} & S_{zy} & S_{zz} }\right],\eqno(4)]

where

[S_{xy} = \textstyle\sum\limits_{i=1}^{N} x_{B,i}\, y_{A,i}.\eqno(5)]
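
For concreteness, the centering step and the single-pass accumulation of the sums of equations (4) and (5), together with the inner products [G_A] and [G_B], might be coded as follows. This is a minimal C sketch, not the author's published library; the function and struct names and the separate x/y/z coordinate arrays are assumptions.

#include <stddef.h>

/* Translate a structure so its centroid lies at the origin, removing the
 * translational component of the superposition problem. (Hypothetical
 * helper; coordinates are stored in separate x/y/z arrays.) */
void center_coords(double *x, double *y, double *z, size_t n)
{
    double cx = 0.0, cy = 0.0, cz = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        cx += x[i];  cy += y[i];  cz += z[i];
    }
    cx /= (double) n;  cy /= (double) n;  cz /= (double) n;

    for (i = 0; i < n; i++) {
        x[i] -= cx;  y[i] -= cy;  z[i] -= cz;
    }
}

/* Hypothetical container for the nine sums of equation (4) and the inner
 * products G_A and G_B. */
typedef struct {
    double Sxx, Sxy, Sxz, Syx, Syy, Syz, Szx, Szy, Szz;
    double GA, GB;
} InnerProducts;

/* Accumulate M = B'A and the inner products for two pre-centered N-atom
 * structures A and B in a single pass. */
InnerProducts inner_products(const double *xa, const double *ya, const double *za,
                             const double *xb, const double *yb, const double *zb,
                             size_t n)
{
    InnerProducts p = {0.0};
    size_t i;

    for (i = 0; i < n; i++) {
        p.GA += xa[i]*xa[i] + ya[i]*ya[i] + za[i]*za[i];
        p.GB += xb[i]*xb[i] + yb[i]*yb[i] + zb[i]*zb[i];

        p.Sxx += xb[i]*xa[i];  p.Sxy += xb[i]*ya[i];  p.Sxz += xb[i]*za[i];
        p.Syx += yb[i]*xa[i];  p.Syy += yb[i]*ya[i];  p.Syz += yb[i]*za[i];
        p.Szx += zb[i]*xa[i];  p.Szy += zb[i]*ya[i];  p.Szz += zb[i]*za[i];
    }
    return p;
}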

3. A quaternion-based characteristic quartic polynomial for calculating RMSDs

Horn (1987) has shown that the sum of squared errors E can be minimized by solving for the largest positive eigenvalue [\lambda_{\max}] of a [4\times4] symmetric key matrix [{\bf K}]:

[\left[\matrix{ (S_{xx} + S_{yy} + S_{zz}) & S_{yz} - S_{zy} & S_{zx} - S_{xz} & S_{xy} - S_{yx} \cr S_{yz} - S_{zy} & (S_{xx} - S_{yy} - S_{zz}) & S_{xy} + S_{yx} & S_{zx} + S_{xz} \cr S_{zx} - S_{xz} & S_{xy} + S_{yx} & (-S_{xx} + S_{yy} - S_{zz}) & S_{yz} + S_{zy} \cr S_{xy} - S_{yx} & S_{zx} + S_{xz} & S_{yz} + S_{zy} & (-S_{xx} - S_{yy} + S_{zz}) }\right].]

The eigenvector corresponding to the largest eigenvalue of K is a quaternion equivalent to the optimal rotation, where

[{\rm RMSD} = [(G_A + G_B - 2\lambda_{\max})/N]^{1/2}.\eqno(6)]

Eigenvalues are usually determined by diagonalization of the matrix. Alternatively, one can find the eigenvalues of a symmetric matrix by locating the roots of the characteristic polynomial of the matrix. With [|{{\bf X}}|] denoting the determinant of matrix [{\bf X}], the characteristic equation for the key matrix [{\bf K}] is a quartic given by

[C_4 \lambda^4 + C_3 \lambda^3 + C_2 \lambda^2 + C_1 \lambda + C_0 = 0,]

where

[\eqalign{ C_4 & = 1\cr C_3 & = -{\rm tr}\,{\bf K} = 0\cr C_2 & = -2\,{\rm tr}({\bf M}'{\bf M})\cr & = -2 (S_{xx}^2 + S_{xy}^2 + S_{xz}^2 + S_{yx}^2 + S_{yy}^2 + S_{yz}^2 + S_{zx}^2 + S_{zy}^2 + S_{zz}^2) \cr C_1 & = -8 |{\bf M}| = 8 (S_{xx} S_{yz} S_{zy} + S_{yy} S_{zx} S_{xz} + S_{zz} S_{xy} S_{yx}) \cr &\quad -8 (S_{xx} S_{yy} S_{zz} + S_{yz} S_{zx} S_{xy} + S_{zy} S_{yx} S_{xz}) \cr C_0 & = |{\bf K}| = D + E + F + G + H + I,}\eqno(7)]

where

[\eqalign{ D & = (S_{xy}^2 + S_{xz}^2 - S_{yx}^2 - S_{zx}^2)^2\cr E & = [-S_{xx}^2 + S_{yy}^2 + S_{zz}^2 + S_{yz}^2 + S_{zy}^2 - 2 (S_{yy} S_{zz} - S_{yz} S_{zy})] \cr &\quad\times [-S_{xx}^2 + S_{yy}^2 + S_{zz}^2 + S_{yz}^2 + S_{zy}^2 + 2 (S_{yy} S_{zz} - S_{yz} S_{zy})] \cr F & = [-(S_{xz} + S_{zx}) (S_{yz} - S_{zy}) + (S_{xy} - S_{yx}) (S_{xx} - S_{yy} - S_{zz})] \cr &\quad\times [-(S_{xz} - S_{zx}) (S_{yz} + S_{zy}) + (S_{xy} - S_{yx}) (S_{xx} - S_{yy} + S_{zz})] \cr G & = [-(S_{xz} + S_{zx}) (S_{yz} + S_{zy}) - (S_{xy} + S_{yx}) (S_{xx} + S_{yy} - S_{zz})]\cr &\quad \times [-(S_{xz} - S_{zx}) (S_{yz} - S_{zy}) - (S_{xy} + S_{yx}) (S_{xx} + S_{yy} + S_{zz})] \cr H & = [(S_{xy} + S_{yx}) (S_{yz} + S_{zy}) + (S_{xz} + S_{zx}) (S_{xx} - S_{yy} + S_{zz})]\cr &\quad \times [-(S_{xy} - S_{yx}) (S_{yz} - S_{zy}) + (S_{xz} + S_{zx}) (S_{xx} + S_{yy} + S_{zz})] \cr I & = [(S_{xy} + S_{yx}) (S_{yz} - S_{zy}) + (S_{xz} - S_{zx}) (S_{xx} - S_{yy} - S_{zz})]\cr & \quad\times [-(S_{xy} - S_{yx}) (S_{yz} + S_{zy}) + (S_{xz} - S_{zx}) (S_{xx} + S_{yy} - S_{zz})]. }]

Using these coefficients, the characteristic equation can be reduced to the following simple form:

[P(\lambda) = \lambda^4 + C_2 \lambda^2 + C_1 \lambda + C_0 = 0.\eqno(8)]

Although the coefficient equations may appear somewhat horrendous, coding them is straightforward. With careful programming that avoids duplicate evaluation of repeated terms (such as [S_{yz} - S_{zy}], which appears four times), calculating the coefficients from the sums found in equation (4) involves at most only 66 floating point operations (FLOPs).
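
For illustration, the coefficients might be evaluated as in the following C sketch, which reuses the hypothetical InnerProducts struct from the earlier sketch and computes the repeated sums and differences only once; the function and variable names are assumptions.

/* Evaluate C2, C1 and C0 of equation (8) from the sums of equation (4).
 * (Hypothetical helper; InnerProducts is the struct sketched earlier.) */
void quartic_coeffs(const InnerProducts *p, double *C2, double *C1, double *C0)
{
    double Sxx = p->Sxx, Sxy = p->Sxy, Sxz = p->Sxz;
    double Syx = p->Syx, Syy = p->Syy, Syz = p->Syz;
    double Szx = p->Szx, Szy = p->Szy, Szz = p->Szz;

    /* sums and differences that appear repeatedly in D through I */
    double SyzmSzy = Syz - Szy, SyzpSzy = Syz + Szy;
    double SxzmSzx = Sxz - Szx, SxzpSzx = Sxz + Szx;
    double SxymSyx = Sxy - Syx, SxypSyx = Sxy + Syx;

    double Sxx2 = Sxx*Sxx, Syy2 = Syy*Syy, Szz2 = Szz*Szz;
    double Syz2 = Syz*Syz, Szy2 = Szy*Szy;
    double D, E, F, G, H, I;

    *C2 = -2.0 * (Sxx2 + Sxy*Sxy + Sxz*Sxz + Syx*Syx + Syy2 + Syz2
                  + Szx*Szx + Szy2 + Szz2);

    *C1 = 8.0 * (Sxx*Syz*Szy + Syy*Szx*Sxz + Szz*Sxy*Syx)
        - 8.0 * (Sxx*Syy*Szz + Syz*Szx*Sxy + Szy*Syx*Sxz);

    D = Sxy*Sxy + Sxz*Sxz - Syx*Syx - Szx*Szx;
    D *= D;
    E = (-Sxx2 + Syy2 + Szz2 + Syz2 + Szy2 - 2.0*(Syy*Szz - Syz*Szy))
      * (-Sxx2 + Syy2 + Szz2 + Syz2 + Szy2 + 2.0*(Syy*Szz - Syz*Szy));
    F = (-SxzpSzx*SyzmSzy + SxymSyx*(Sxx - Syy - Szz))
      * (-SxzmSzx*SyzpSzy + SxymSyx*(Sxx - Syy + Szz));
    G = (-SxzpSzx*SyzpSzy - SxypSyx*(Sxx + Syy - Szz))
      * (-SxzmSzx*SyzmSzy - SxypSyx*(Sxx + Syy + Szz));
    H = ( SxypSyx*SyzpSzy + SxzpSzx*(Sxx - Syy + Szz))
      * (-SxymSyx*SyzmSzy + SxzpSzx*(Sxx + Syy + Szz));
    I = ( SxypSyx*SyzmSzy + SxzmSzx*(Sxx - Syy - Szz))
      * (-SxymSyx*SyzpSzy + SxzmSzx*(Sxx + Syy - Szz));

    *C0 = D + E + F + G + H + I;
}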

Horn (1987) proposed solving this characteristic equation using Ferrari's method for quartics (Abramowitz & Stegun, 1972). While analytically exact, Ferrari's method has seen little use, as it is computationally slow and notoriously numerically unstable. Additionally, Horn (1987) omits an expression for [C_0], and the equation given for [C_2] contains an important typographical error.

Fortunately, in this case the largest positive root of the characteristic polynomial given in equation (8) can be found very quickly and with excellent numerical stability using the Newton–Raphson algorithm. Successful use of the Newton–Raphson method requires a good initial guess for the desired root. From equation (3), and noting that the largest positive eigenvalue of the key matrix [{\bf K}] equals [{\rm tr}({\bf M}{\bf R})] at the optimal rotation, the largest positive root of the characteristic polynomial attains its maximum possible value when the two structures are in identical conformations (i.e. when [{\rm RMSD} = 0]). Thus, the largest positive root is constrained to be less than or equal to the average of the inner products of the two structures, [(G_A + G_B)/2]. Using this value as the initial guess is essentially fool-proof, since the Newton–Raphson algorithm will then converge first on the largest root of the polynomial. The algorithm can be described by the following simple pseudocode:

λ = (G_A + G_B)/2;
do {
    λ_old = λ;
    λ = λ - P(λ) / [dP(λ)/dλ];
} while (|λ - λ_old| > precision);

where the first derivative of the polynomial is given by [{\rm d}P(\lambda)/{\rm d}\lambda = 4 \lambda^3 + 2 C_2 \lambda + C_1].
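
Putting the pieces together, a minimal C sketch of the Newton–Raphson iteration and the RMSD of equation (6) might read as follows; the function names, iteration cap and tolerance are assumptions rather than the author's published code.

#include <math.h>
#include <stddef.h>

/* Find the largest root of P(lambda) by Newton-Raphson, starting from the
 * upper bound (GA + GB)/2, then return the RMSD of equation (6).
 * (Hypothetical driver; inner_products() and quartic_coeffs() are the
 * sketches given earlier.) */
double qcp_rmsd(const InnerProducts *ip, size_t n)
{
    double C2, C1, C0;
    double lambda, lambda_old, P, dP;
    int iter;

    quartic_coeffs(ip, &C2, &C1, &C0);

    lambda = 0.5 * (ip->GA + ip->GB);   /* maximum possible root (RMSD = 0) */
    for (iter = 0; iter < 50; iter++) {
        lambda_old = lambda;
        P  = ((lambda*lambda + C2)*lambda + C1)*lambda + C0; /* P(lambda)  */
        dP = (4.0*lambda*lambda + 2.0*C2)*lambda + C1;       /* P'(lambda) */
        lambda -= P / dP;
        if (fabs(lambda - lambda_old) < 1e-6 * fabs(lambda))
            break;   /* converged to a relative precision of 1e-6 */
    }

    /* fabs() guards against a tiny negative from rounding when RMSD ~ 0 */
    return sqrt(fabs(ip->GA + ip->GB - 2.0*lambda) / (double) n);
}

In use, one would center both structures with center_coords(), accumulate the sums with inner_products() and pass the result to qcp_rmsd(); only the first two steps depend on the number of atoms.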

The Newton–Raphson QCP algorithm given above is much faster than other common methods that require determination of eigenvectors, and it is also faster than methods using eigen decomposition for eigenvalues only. The efficiency and speed of the QCP Newton–Raphson algorithm were compared with the popular method using Householder reduction to tridiagonal form followed by QL decomposition with implicit shifts (the H–QL method; Golub & Van Loan, 1996). Each iteration of the QCP algorithm requires merely 11 FLOPs, none of which are square roots. In contrast, the H–QL algorithm requires a comparable number of total FLOPs but over two dozen costly square roots. Diamond's (1988) method is likewise much more efficient than the eigen decomposition methods and also requires no square roots. Each cycle of the Diamond iteration uses [\approx 70] FLOPs (accounting for symmetry considerations), plus 15 FLOPs for set-up. Since both the QCP and Diamond's methods take an average of about five iterations to converge to a relative precision of [10^{-6}], the QCP method uses only about one third of the operations of Diamond's method for eigenvalue determination. Thus, essentially all of the computational effort in the QCP method is expended in calculating the summations of the inner product matrix [{\bf M}] given in equation (4), a step that is common to all methods and dependent on the number of atoms compared.

The QCP algorithm has been benchmarked on several computing platforms (OSX, Linux and Irix) over [\gt 10^{8}] independent superposition calculations with protein fragments ranging from 5 to 500 residues. In all cases, the same RMSD solution was found, within error, as that given by the Kabsch and Kearsley methods. In contrast, especially with smaller fragments, Diamond's method converged on an incorrect solution (corresponding to the second largest eigenvalue) in a significant fraction of cases. In the author's experience, if the construction of the key matrices is ignored, the QCP algorithm is over four times as fast as Diamond's method and 30 to 70 times faster than the eigen decomposition methods. An ANSI C library implementing the QCP method is available at https://monkshood.colorado.edu/QCP/ or by request from the author.

Acknowledgements

I thank B. K. P. Horn for helpful discussion, an anonymous referee for alerting me to Diamond's scalar method and the American Cancer Society for funding.

References

Abramowitz, M. & Stegun, I. A. (1972). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th ed., pp. 17–18. New York: Dover.
Diamond, R. (1988). Acta Cryst. A44, 211–216.
Flower, D. R. (1999). J. Mol. Graph. Model. 17, 238–244.
Golub, G. H. & Van Loan, C. F. (1996). Matrix Computations, 3rd ed. Baltimore: Johns Hopkins University Press.
Horn, B. K. P. (1987). J. Opt. Soc. Am. A Opt. Image Sci. Vis. 4, 629–642.
Kabsch, W. (1978). Acta Cryst. A34, 827–828.
Kearsley, S. K. (1989). Acta Cryst. A45, 208–210.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
