teaching and education

JOURNAL OF APPLIED CRYSTALLOGRAPHY
ISSN: 1600-5767

From atoms to bonds, angles and torsions: molecular metrics from crystal space, and two Excel implementations


Curtin Institute for Computation, Discipline of Chemistry, Curtin University, GPO Box U1987, Perth, WA 6845, Australia
*Correspondence e-mail: l.glasser@curtin.edu.au

Edited by J. M. García-Ruiz, Instituto Andaluz de Ciencias de la Tierra, Granada, Spain (Received 11 March 2020; accepted 1 June 2020; online 16 July 2020)

Values of molecular bond lengths, bond angles and (less frequently) bond torsion angles are readily available from databases, from crystallographic software, and/or from interactive molecular and crystal visualization programs such as Jmol. However, the methods used to calculate these values are less well known. In this paper, the computational methods are described in detail, and live Excel implementations, which permit readers to readily perform the calculations for their own molecular systems, are provided. The methods described apply to both fractional coordinates in crystal space and Cartesian coordinates in Euclidean space (space in which the geometric postulates of Euclid are valid) and are vector/matrix based. In their simplest computational form, they are applied as algebraic expansions which are summed. They are also available in matrix formulations, which are readily manipulated and calculated using the matrix functions of Excel. In particular, their general formulation as metric matrices is introduced. The methods in use are illustrated by a detailed example of the calculations. This contribution provides a significant practical application which can also act as motivation for the study of matrix mathematics with respect to its many uses in chemistry.

1. Introduction

Students and teachers of chemistry are familiar with the lengths of and angles between chemical bonds, and even with torsion angles defining molecular conformations. They are also familiar with the source of these data, which generally arise from diffraction experiments, principally with X-ray sources but also with electron or neutron sources. However, the methods of calculation which yield these values are less familiar, often hidden in crystallographic computational programs, and not generally accessible to the non-expert. The difficulties of these calculations are compounded by the fact that the experimental results are generally presented in crystal spaces, which best represent the crystalline symmetry but which are not readily manipulated by those only familiar with everyday orthogonal Euclidean space described using Cartesian coordinates. Even for students who have the opportunity to perform a full X-ray structure determination (Kantardjieff, 2010[Kantardjieff, K. (2010). J. Appl. Cryst. 43, 1276-1282.]; Chapuis, 2011[Chapuis, G. (2011). Crystallogr. Rev. 17, 187-204.]; Aldeborgh et al., 2014[Aldeborgh, H., George, K., Howe, M., Lowman, H., Moustakas, H., Strunsky, N. & Tanski, J. M. (2014). J. Chem. Crystallogr. 44, 70-81.]; Gražulis et al., 2015[Gražulis, S., Sarjeant, A. A., Moeck, P., Stone-Sundberg, J., Snyder, T. J., Kaminsky, W., Oliver, A. G., Stern, C. L., Dawe, L. N., Rychkov, D. A., Losev, E. A., Boldyreva, E. V., Tanski, J. M., Bernstein, J., Rabeh, W. M. & Kantardjieff, K. A. (2015). J. Appl. Cryst. 48, 1964-1975.]), it appears that the structural details are obtained from the computational software, without students having the opportunity of acquainting themselves with the calculation methods.

The fundamental data of crystallography, that is, chemical composition, unit-cell dimensions, space-group symmetry and fractional atom coordinates, are reported as the results of the diffraction experiments. These data are made widely available by publication both in the primary literature and, since the 1990s, via deposition as crystallographic information files (Hall et al., 1991[Hall, S. R., Allen, F. H. & Brown, I. D. (1991). Acta Cryst. A47, 655-685.]; https://www.iucr.org/resources/cif) (*.cif), which the major crystallographic journals currently require; the CIFs are subject to extensive checks (https://journals.iucr.org/services/cif/checkcif.html) in order to ensure their integrity. These files are freely available (Glasser, 2016[Glasser, L. (2016). J. Chem. Educ. 93, 542-549.]) and large collections have been made into databases, which may be free or commercial, such as the Cambridge Structural Database (CSD, for organic materials – now exceeding one million entries; Groom et al., 2016[Groom, C. R., Bruno, I. J., Lightfoot, M. P. & Ward, S. C. (2016). Acta Cryst. B72, 171-179.]; https://www.ccdc.cam.ac.uk/products/csd/), the Inorganic Crystal Structure Database (Zagorac et al., 2019[Zagorac, D., Müller, H., Ruehl, S., Zagorac, J. & Rehme, S. (2019). J. Appl. Cryst. 52, 918-925.]), the free Crystallography Open Database (Gražulis et al., 2009[Gražulis, S., Chateigner, D., Downs, R. T., Yokochi, A. F. T., Quirós, M., Lutterotti, L., Manakova, E., Butkus, J., Moeck, P. & Le Bail, A. (2009). J. Appl. Cryst. 42, 726-729.]), the free American Mineralogist Crystal Structure Database (Downs & Hall-Wallace, 2003[Downs, R. T. & Hall-Wallace, M. (2003). Am. Miner. 88, 247-250.]), the free RCSB Protein Data Bank (PDB; Berman et al., 2000[Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N. & Bourne, P. E. (2000). Nucleic Acids Res. 28, 235-242.]) and so forth (Glasser, 2016[Glasser, L. (2016). J. Chem. Educ. 93, 542-549.]). However, the CIF generally provides only the basic information, and the molecular metrics of bond lengths, bond angles and torsion angles around bonds may still need to be generated.

There are many molecular visualization programs which permit the user to import CIFs, display atoms, and select atom pairs to generate bond lengths, atom triplets to generate bond angles and even sequential atom quadruplets to generate bond torsions. Principal among these are the CSD's online WebCSD (Thomas et al., 2010[Thomas, I. R., Bruno, I. J., Cole, J. C., Macrae, C. F., Pidcock, E. & Wood, P. A. (2010). J. Appl. Cryst. 43, 362-366.]) with its free downloadable Mercury program (Macrae et al., 2020[Macrae, C. F., Sovago, I., Cottrell, S. J., Galek, P. T. A., McCabe, P., Pidcock, E., Platings, M., Shields, G. P., Stevens, J. S., Towler, M. & Wood, P. A. (2020). J. Appl. Cryst. 53, 226-235.]; https://www.ccdc.cam.ac.uk/Community/csd-community/freemercury/), and the free downloadable Jmol (https://www.jmol.org/), Avogadro (https://avogadro.cc/), VESTA (Momma & Izumi, 2011[Momma, K. & Izumi, F. (2011). J. Appl. Cryst. 44, 1272-1276.]) and CrystalOgraph (https://www.epfl.ch/schools/sb/research/iphys/teaching/crystallography/crystalograph/) programs (but the last provides only bond lengths).

It is the purpose of the current paper to introduce the reader to the methods of calculation of molecular metrics from crystal space (Dunitz, 1995[Dunitz, J. D. (1995). X-ray Analysis and the Structure of Organic Molecules, 2nd ed. Basel: Verlag Helvetica Chimica Acta.]; Julian, 2014[Julian, M. M. (2014). Foundations of Crystallography with Computer Applications, 2nd ed. Abingdon-on-Thames: CRC Press.]). In order to allow these rather complex calculations to be readily performed by the reader, two separate live Excel implementations of the methods are supplied, in which users can easily insert their own data. One is a `black-box' implementation of the BASIC code in Appendix I of Dunitz's book (Dunitz, 1995[Dunitz, J. D. (1995). X-ray Analysis and the Structure of Organic Molecules, 2nd ed. Basel: Verlag Helvetica Chimica Acta.], pp. 495–497), while the second lays out the matrix calculations in detail, demonstrating the mathematical processes involved. These programs parallel an earlier implementation by the author using the proprietary MathCad software (Glasser, 1993[Glasser, L. (1993). Comput. Chem. 17, 107-108.]).

We do not delve into the complexities of reciprocal spaces applicable in crystallographic calculations, since this extends into specialist applications of crystallography. However, it is of interest to note that the direct and reciprocal lattices are mutual Fourier transforms, with the momentum difference between incoming and diffracted X-rays of a crystal being a reciprocal lattice vector (https://www.doitpoms.ac.uk/tlplib/reciprocal_lattice/index.php).

2. Calculations using Cartesian coordinates

The atomic data in CIFs are listed in fractional coordinates, x, y, z, but it may be simpler to calculate the molecular metrics using Cartesian coordinates, X, Y, Z. (It is, however, possible to obtain the same results directly from the fractional coordinates, and this is considered in a subsequent section.)

2.1. Cartesian coordinates, X, Y, Z

The following matrix equation (McRee, 1993a[McRee, D. E. (1993a). Practical Protein Crystallography. San Diego: Academic Press.],b[McRee, D. E. (1993b). Practical Protein Crystallography, https://www.sciencedirect.com/book/9780124860506/practical-protein-crystallography.]; https://www.ruppweb.org/Xray/tutorial/Coordinate%20system%20transformation.htm) (easily applied using the Excel array function MMULT) transforms fractional coordinates, x, y, z, in crystal space into Cartesian coordinates, X, Y, Z, using the crystal cell constants a, b, c, α, β, γ:

[\left(\matrix{X \cr Y \cr Z }\right) = \left[ \matrix { a & b\cos \gamma & c \cos \beta \cr 0 & b\sin \gamma & \displaystyle{{c(\cos \alpha - \cos\beta \cos\gamma)}\over{\sin \gamma}} \cr 0 & 0 & \displaystyle{{V}\over{ab\sin \gamma}} }\right]\cdot \left(\matrix{x \cr y \cr z}\right), \eqno(1a)]

where the volume of the unit cell is

[V = abc(1 - \cos^2\alpha - \cos^2\beta - \cos^2\gamma + 2 \cos\alpha \cos\beta \cos\gamma)^{1/2}. \eqno(1b)]

The OpenBabel Chemical Formatter (O'Boyle et al., 2011[O'Boyle, N. M., Banck, M., James, C. A., Morley, C., Vandermeersch, T. & Hutchison, G. R. (2011). J. Cheminform. 3, 33. ]; https://www.cheminfo.org/Chemistry/Cheminformatics/FormatConverter/index.html) provides a convenient facility to convert CIFs directly to the Cartesian XYZ format.

The volume, V, may equivalently be obtained in matrix terms (using the Excel function MDETERM) as the square root of the determinant of the metric matrix, G, which is introduced in Section 3[link].
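As an illustration only, equations (1a) and (1b) might be coded as in the following Python sketch, which is not part of the supplied Excel workbooks. The function names are hypothetical, angles are assumed to be supplied in degrees, and the cell constants and fractional coordinates are those of atom C1 of L-valine used in the worked example of Section 4.

```python
import math

def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume from equation (1b); angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)

def orthogonalization_matrix(a, b, c, alpha, beta, gamma):
    """The 3x3 matrix of equation (1a), as a list of rows; angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    v = cell_volume(a, b, c, alpha, beta, gamma)
    return [
        [a, b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0, v / (a * b * sg)],
    ]

def frac_to_cart(matrix, frac):
    """Matrix-vector product, the role played by MMULT in Excel."""
    return [sum(m * f for m, f in zip(row, frac)) for row in matrix]

# L-valine cell constants (angstrom, degrees) and fractional coordinates of C1
cell = (9.71, 5.27, 12.06, 90.0, 90.8, 90.0)
M = orthogonalization_matrix(*cell)
c1 = frac_to_cart(M, (-0.2234, -0.123, 0.3635))
print(round(cell_volume(*cell), 2), [round(v, 4) for v in c1])
```

With these inputs the sketch should agree, to rounding, with the cell volume and the Cartesian coordinates of C1 quoted in Section 4.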

2.2. Vectors

We use bold face, e.g. v, to represent a vector, while |v| or italic v represents the magnitude (length – a scalar) of the vector. The italic form is used in algebraic expressions.

Vectors are described in terms of their coordinates along basis axes. For a general vector pi pointing from the coordinate origin to a point i (e.g. an atom centre) along the crystal axes a, b, c,

[{{\bf{p}}_i} = \left({\matrix{ {{x_i}{\bf{a}}} \cr {{y_i}{\bf{b}}} \cr {{z_i}{\bf{c}}} \cr } } \right) \to \left({\matrix{ {{X_i}} \cr {{Y_i}} \cr {{Z_i}} \cr } } \right), \eqno(2a)]

where the arrow represents transformation through equation (1a)[link] from coordinates in crystal space to Cartesian coordinates.

The vector length is calculated using the square root of the dot (or scalar) product function of the vector with itself:

[|{{\bf{p}}_i}{|^2} = {{\bf{p}}_i}\cdot{{\bf{p}}_i} = \left({\matrix{ {{X_i}} \cr {{Y_i}} \cr {{Z_i}} \cr } } \right)\cdot\left({\matrix{ {{X_i}} \cr {{Y_i}} \cr {{Z_i}} \cr } } \right) = X_i^2 + Y_i^2 + Z_i^2. \eqno(2b)]

For a general pair of vectors, p and q at an angle θ, the dot product yields a scalar value:

[\eqalignno{{\bf p}\cdot{\bf q} & = |{\bf p}| |{\bf q}| \cos\theta = \left( \matrix{X_p \cr Y_p \cr Z_p } \right) \cdot\left(\matrix{X_q \cr Y_q \cr Z_q}\right) \cr & = X_p X_q + Y_pY_q + Z_p Z_q , & (2c)}]

where each of these vectors is referenced to the coordinate origin. In Excel, the dot product can be simulated by the function SUMPRODUCT.

In the operation of the dot product function, the cosine first provides a projection of one vector onto the other; then the product function multiplies and sums the components together [equation (2c)[link]]. In physics, for example, this may correspond to work as the product of force times distance in the same direction as determined by the angle θ between the vectors. When the two vectors are parallel (as in the determination of a bond length), θ is zero and the cosine multiplier has the value one.

The vector length rji between points i and j, corresponding to a bond length, is obtained from the coordinate differences with the vector being multiplied by itself:

[\eqalignno{ r_{ji}^2 & = |{\bf r}_{ji}|^2 = {\bf r}_{ji}\cdot{\bf r}_{ji} = (X_j - X_i)^2 + (Y_j-Y_i)^2 + (Z_j - Z_i)^2 \cr & = \Delta X_{ji}^2 + \Delta Y_{ji}^2 + \Delta Z_{ji}^2. & (2d)}]
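Equation (2d) translates almost literally into code. The short Python sketch below is an illustration rather than part of the supplied workbooks; the function name is hypothetical and the coordinates are the Cartesian values for C8 and C9 of L-valine listed in Table 2.

```python
import math

def bond_length(p, q):
    """Distance between two points given as Cartesian (X, Y, Z) tuples, equation (2d)."""
    return math.sqrt(sum((qc - pc) ** 2 for pc, qc in zip(p, q)))

# Cartesian coordinates (angstrom) of C8 and C9 in L-valine, from Table 2
c8 = (-3.9457, 0.2171, 2.6819)
c9 = (-4.4693, -1.0529, 1.9994)
print(f"C8-C9 = {bond_length(c8, c9):.3f} angstrom")
```

The result can be compared with the C8—C9 bond length reported in Section 4.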

Bond angles, θ, can be calculated using either the dot (scalar) product or the cross (vector) product function; the latter generates a pseudovector normal (defined below), n, orthogonal to the vector pair rji and sjk which lie at an angle θ with respect to one another. The normal vector n is only required if a torsion angle is to be calculated, as will be seen in the discussion of torsion angles below.

The bond angle is calculated with respect to the atom sequence i—j—k with each bond vector, rji and sjk, referenced to the coordinates of the central atom j of the triplet; for example, ΔXr = (Xi − Xj) and ΔXs = (Xk − Xj), so that both vectors point away from atom j (as in Table 3[link]).

θ can be determined from the arc cosine of the dot product (Cockcroft, 2006[Cockcroft, J. K. (2006). Molecular Geometry, https://pd.chem.ucl.ac.uk/pdnn/refine2/geometry.htm.]):

[\eqalignno{ rs &= {\bf r}_{ji}\cdot{\bf s}_{jk} = |{\bf r}_{ji}||{\bf s}_{jk}| \cos\theta_{ijk} = \left(\matrix{\Delta X_r \cr \Delta Y_r \cr \Delta Z_r}\right)\cdot\left(\matrix{\Delta X_s \cr \Delta Y_s \cr \Delta Z_s}\right) \cr & = \Delta X_r \Delta X_s + \Delta Y_r \Delta Y_s + \Delta Z_r \Delta Z_s. & (3a)}]

Algebraically (Cockcroft, 2006[Cockcroft, J. K. (2006). Molecular Geometry, https://pd.chem.ucl.ac.uk/pdnn/refine2/geometry.htm.])

[rs = {\bf r}\cdot{\bf s} = |{\bf r}||{\bf s}| \cos\theta, ]

where

[{\bf r} = \Delta x_{ r} {\bf a} + \Delta y_{r}{\bf b} + \Delta z_{r} {\bf c}\quad{\rm and}\quad {\bf s} = \Delta x_s{\bf a} + \Delta y_s {\bf b } + \Delta z_s{\bf c}, ]

[\eqalign{{\bf r }\cdot{\bf s} & = (\Delta x_r{\bf a} + \Delta y_r{\bf b} + \Delta z_r{\bf c} ) \cdot (\Delta x_s{\bf a} + \Delta y_s{\bf b} + \Delta z_s{\bf c}) \cr & = \Delta x_r \Delta x_s({\bf a}\cdot{\bf a}) + \Delta y_r\Delta y_s({\bf b}\cdot{\bf b}) + \Delta z_r \Delta z_s({\bf c}\cdot{\bf c}) \cr & \quad + (\Delta y_r \Delta z_s + \Delta y_s \Delta z_r) ({\bf b}\cdot{\bf c}) \cr & \quad+ (\Delta z_r \Delta x_s + \Delta z_s \Delta x_r)({\bf c}\cdot{\bf a}) \cr & \quad + (\Delta x_r \Delta y_s + \Delta x_s \Delta y_r )({\bf a}\cdot{\bf b}), }]

and

[\eqalignno{\cos\theta & = [a^2\Delta x_r \Delta x_s + b^2 \Delta y_r \Delta y_s + c^2 \Delta z_r \Delta z_s \cr & \quad + bc\cos\alpha(\Delta y_r \Delta z_s + \Delta y_s \Delta z_r) \cr & \quad + ca\cos\beta(\Delta z_r \Delta x_s + \Delta z_s \Delta x_r)\cr & \quad + ab\cos\gamma(\Delta x_r \Delta y_s + \Delta x_s \Delta y_r)]/rs. &(3b)}]
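The six-term expansion leading to equation (3b) can be sketched in Python as follows. This is an illustrative transcription under stated assumptions, not the author's spreadsheet: the function names are hypothetical, angles are taken in degrees, both difference vectors are referenced to the central atom j, and the fractional coordinates are those of O1, C1 and C7 of L-valine from Table 1.

```python
import math

def dot_fractional(dr, ds, cell):
    """r.s for two fractional-coordinate difference vectors (six-term expansion)."""
    a, b, c, alpha, beta, gamma = cell
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    (xr, yr, zr), (xs, ys, zs) = dr, ds
    return (a * a * xr * xs + b * b * yr * ys + c * c * zr * zs
            + b * c * ca * (yr * zs + ys * zr)
            + c * a * cb * (zr * xs + zs * xr)
            + a * b * cg * (xr * ys + xs * yr))

def bond_angle_fractional(frac_i, frac_j, frac_k, cell):
    """Angle i-j-k in degrees, equation (3b), with j as the central atom."""
    dr = [p - q for p, q in zip(frac_i, frac_j)]
    ds = [p - q for p, q in zip(frac_k, frac_j)]
    r = math.sqrt(dot_fractional(dr, dr, cell))
    s = math.sqrt(dot_fractional(ds, ds, cell))
    return math.degrees(math.acos(dot_fractional(dr, ds, cell) / (r * s)))

cell = (9.71, 5.27, 12.06, 90.0, 90.8, 90.0)  # L-valine
o1 = (-0.1295, 0.0167, 0.3996)
c1 = (-0.2234, -0.123, 0.3635)
c7 = (-0.3649, -0.0062, 0.3457)
angle = bond_angle_fractional(o1, c1, c7, cell)
print(f"O1-C1-C7 = {angle:.1f} degrees")
```

The same `dot_fractional` function with identical arguments gives the squared bond length, which is why the six cell-constant products are worth storing once and reusing.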

Alternatively, the cross product, r × s, yields the pseudovector, n, which is normal to the plane defined by r and s. The bond angle, θijk, is determined using the arc sine function:

[\eqalignno{ {{\bf{r}}_{ji}} \times {{\bf{s}}_{jk}} & = |{{\bf{r}}_{ji}}||{{\bf{s}}_{jk}}|\sin {\theta _{ijk}}\,\,{{\bf{n}}_{ijk}} = \left(\matrix{ \Delta {X_r} \hfill \cr \Delta {Y_r} \hfill \cr \Delta {Z_r} \hfill \cr} \right) \times\left(\matrix{ \Delta {X_s} \hfill \cr \Delta {Y_s} \hfill \cr \Delta {Z_s} \hfill \cr} \right) \cr & = \left(\matrix{ \Delta {Y_r}\Delta {Z_s} - \Delta {Z_r}\Delta {Y_s} \hfill \cr \Delta {Z_r}\Delta {X_s} - \Delta {X_r}\Delta {Z_s} \hfill \cr \Delta {X_r}\Delta {Y_s} - \Delta {Y_r}\Delta {X_s} \hfill \cr} \right). & (3c)}]

A pseudovector (or axial vector), being perpendicular to the plane of the three atoms forming the bond, changes sign when converted to its mirror image, so that the cross product is non-commutative:

[{\bf{r \times s}} = - {\bf{s \times r}}. \eqno(3d)]

The standard selection for the sign of the torsion is positive when r rotates to s according to the right-hand rule. The magnitude of the cross product, [|{\bf{r \times s}}| = |{\bf{r}}||{\bf{s}}|\sin \theta], is the area of the parallelogram bounded by the vectors r and s. Similarly, the (scalar) triple product, [{\bf{a}} \cdot {\bf{b \times c}}] (https://en.wikipedia.org/wiki/Triple_product), represents the signed volume, V, of the unit cell with cell constants a, b, c, α, β, γ [for the algebraic form of V, see equation (1b)[link]]. Excel does not have a cross-product function, but a user-defined cross-product function is listed in the supplementary information file.
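For readers working outside Excel, the component form of equation (3c) is easily written by hand. The Python sketch below is an assumption-laden illustration and is not the user-defined function of the supplementary file; the vectors are hypothetical and chosen so that the results can be checked by inspection.

```python
import math

def cross(r, s):
    """Cross product r x s in Cartesian components, equation (3c)."""
    (xr, yr, zr), (xs, ys, zs) = r, s
    return (yr * zs - zr * ys, zr * xs - xr * zs, xr * ys - yr * xs)

def dot(r, s):
    """Dot product in Cartesian components."""
    return sum(p * q for p, q in zip(r, s))

def norm(v):
    """Vector length."""
    return math.sqrt(dot(v, v))

# Hypothetical orthogonal vectors of lengths 2, 3 and 4
a = (2.0, 0.0, 0.0)
b = (0.0, 3.0, 0.0)
c = (0.0, 0.0, 4.0)

n_ab = cross(a, b)          # normal to the plane of a and b
area = norm(n_ab)           # area of the parallelogram bounded by a and b
volume = dot(a, cross(b, c))  # scalar triple product, signed volume
print(n_ab, cross(b, a), area, volume)
```

Swapping the arguments reverses the normal, as stated in equation (3d), while the area is unchanged.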

Torsion angles, ϕ, are calculated as the twist around the central bond jk of the two planes defined by the orthogonal pseudovectors of the angle triplets ijk and jkl (see Figs. 1[link] and 2[link]). Thus, we use the dot product of the pair of plane normals, divided by the product of their lengths, to generate the cosine of the torsion angle, [\cos {\phi _{ijkl}}], from which we evaluate ϕijkl through the arc cosine function:

[\phi_{ijkl} = \cos^{-1}\left[ {{ ({\bf r}_{ji} \times{\bf r}_{jk}) \cdot({\bf r}_{kj} \times {\bf r}_{kl})}\over {|{\bf r}_{ji}\times{\bf r}_{jk}||{\bf r}_{kj}\times{\bf r}_{kl}|}}\right]. \eqno(4a)]

[Figure 1]
Figure 1
The three bond lengths (1.48, 1.51 and 1.44 Å), two bond angles (109.1 and 110.7°) and single torsion angle (+59.8°) for the atom sequence N—C—C—O(H) in L-serine. The diagram was prepared using Jmol and the CIF available as supporting information to the article by Moggach et al. (2005[Moggach, S. A., Allan, D. R., Morrison, C. A., Parsons, S. & Sawyer, L. (2005). Acta Cryst. B61, 58-68.]) (CSD-LSERIN11.cif from the CSD).
[Figure 2]
Figure 2
View down the atom sequence (H)O—C—C—N in L-serine, highlighting the right-hand screw twist around the central C—C bond relating the O atom to the N atom, yielding a torsion angle of +59.8°. The diagram was prepared using Jmol and the CSD-LSERIN11.cif file (Moggach et al., 2005[Moggach, S. A., Allan, D. R., Morrison, C. A., Parsons, S. & Sawyer, L. (2005). Acta Cryst. B61, 58-68.]).

It remains to determine the sign of ϕijkl (between −180 and +180°), which is accomplished in equation (4b)[link] by calculating first the cross product of the orthogonal axial vectors of the two planes defining the bond angles (this cross product generates a new vector, which lies in both planes and is therefore parallel to the central bond) and then the scalar dot product of this vector with the vector representing the central bond jk, directed from atom j to atom k:

[{\rm sign\,\,of}\,\,[({\bf r}_{ji}\times{\bf r}_{jk})\times({\bf r}_{kj}\times{\bf r}_{kl})]\cdot{\bf r}_{jk}. \eqno(4b)]

By the right-hand rule, if these parallel vectors point in the same direction the torsion angle is positive, but it is negative if they point in opposite directions.
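The two-step procedure, magnitude from the arc cosine and sign from a triple product, can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the Dunitz BASIC code or the Torsion worksheet: the function names are hypothetical, every bond vector is taken to point away from the atom named first in its subscript (so that the central bond vector runs from j to k), and the four test points are invented so that the expected angles of +90 and −90° are obvious.

```python
import math

def sub(p, q):
    """Vector from point q to point p."""
    return tuple(pc - qc for pc, qc in zip(p, q))

def cross(r, s):
    (xr, yr, zr), (xs, ys, zs) = r, s
    return (yr * zs - zr * ys, zr * xs - xr * zs, xr * ys - yr * xs)

def dot(r, s):
    return sum(p * q for p, q in zip(r, s))

def torsion(pi, pj, pk, pl):
    """Signed torsion angle i-j-k-l in degrees from Cartesian coordinates."""
    n1 = cross(sub(pi, pj), sub(pk, pj))  # normal to the plane i-j-k
    n2 = cross(sub(pj, pk), sub(pl, pk))  # normal to the plane j-k-l
    cos_phi = dot(n1, n2) / math.sqrt(dot(n1, n1) * dot(n2, n2))
    cos_phi = max(-1.0, min(1.0, cos_phi))  # guard against rounding error
    phi = math.degrees(math.acos(cos_phi))
    # Sign: compare n1 x n2 with the central bond directed from j to k
    sign = dot(cross(n1, n2), sub(pk, pj))
    return -phi if sign < 0 else phi

# Hypothetical test geometry: central bond along +X
i, j, k = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
print(torsion(i, j, k, (1.0, 0.0, 1.0)))   # right-handed twist
print(torsion(i, j, k, (1.0, 0.0, -1.0)))  # mirror image, opposite sign
```

A collinear triplet gives a zero-length normal and a division by zero, so production code would need to test for that case.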

3. The metric matrix and crystal (vector) space

Converting from the fractional coordinates of crystal (vector) space to Cartesian coordinates of standard space is convenient, as shown in Section 4[link], but it takes no account of the symmetry of crystals. It is sometimes appropriate to perform geometry in the crystal space directly (for example, to generate symmetry-related atoms absent from the asymmetric unit data provided in the CIF) (De Graef & McHenry, 2011[De Graef, M. & McHenry, M. E. (2011). Structure of Materials: Additional Material, https://som.web.cmu.edu/frames.html.], 2012[De Graef, M. & McHenry, M. E. (2012). Structure of Materials: An Introduction to Crystallography, Diffraction and Symmetry, 2nd ed. Cambridge University Press.]; https://dictionary.iucr.org/Asymmetric_unit). This introduces the scalar metric matrix, G (also known as the metric tensor, g, when its use extends to nonlinear processes and is no longer a scalar), which is built from the six crystal constants, a, b, c, α, β, γ, through the dot products of the basis vectors a, b, c:

[\eqalignno{ r^2 & = {\bf r}\cdot{\bf r} = (\Delta x{\bf a} + \Delta y{\bf b} + \Delta z{\bf c})\cdot (\Delta x{\bf a} + \Delta y{\bf b} + \Delta z{\bf c} )\cr & = \Delta x^2({\bf a}\cdot{\bf a}) + \Delta y^2({\bf b}\cdot{\bf b}) + \Delta z^2({\bf c}\cdot{\bf c}) + 2\Delta y\Delta z({\bf b}\cdot{\bf c})\cr & \quad + 2\Delta z\Delta x({\bf c}\cdot{\bf a}) + 2\Delta x\Delta y({\bf a}\cdot{\bf b}).&(5a)}]

We collect the dot products of the basis vectors into a metric matrix, G, where

[\eqalignno{ {\bf G} & = \left(\matrix{{\bf a}\cdot{\bf a} & {\bf a}\cdot{\bf b} & {\bf a}\cdot{\bf c} \cr {\bf b}\cdot{\bf a} & {\bf b}\cdot{\bf b} & {\bf b}\cdot{\bf c} \cr {\bf c}\cdot{\bf a} & {\bf c}\cdot{\bf b} & {\bf c}\cdot{\bf c}}\right)\cr & = \left(\matrix{a^2 & ab\cos\gamma & ac\cos\beta \cr ba\cos\gamma & b^2 & bc\cos\alpha \cr ca\cos\beta & cb\cos\alpha & c^2}\right), & (5b)}]

so that

[r^2 = |{\bf r}|^2 = {\bf r}\cdot{\bf r} = (\matrix{\Delta x & \Delta y & \Delta z})\,{\bf G}\left(\matrix{\Delta x \cr \Delta y\cr \Delta z}\right). ]

Defining

[\left(\matrix{\Delta x \cr \Delta y\cr \Delta z}\right) = X,]

so that [(\Delta x \,\, \Delta y \,\,\Delta z) = X^{\rm T}], where the superscript T represents the transpose,

[r^2 = |{\bf r}|^2 = {\bf r}\cdot{\bf r} = X^{\rm T}{\bf G} X,]

[\eqalignno{r^2 & = a^2\Delta x^2 + b^2\Delta y^2+c^2\Delta z^2+2bc\cos\alpha\Delta y\Delta z\cr & \quad + 2ca\cos\beta\Delta z\Delta x + 2ab\cos\gamma\Delta x\Delta y.&(5c)}]

As noted earlier, the volume of the unit cell is equal to the square root of the determinant of G, which can be found in Excel using the function MDETERM.
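The metric-matrix route can be sketched in a few lines of Python, given here as an illustration and not as a copy of the supplied worksheets. The function names are hypothetical, angles are assumed to be in degrees, and the data are the L-valine cell constants and the fractional coordinates of C8 and C9 from Section 4.

```python
import math

def metric_matrix(a, b, c, alpha, beta, gamma):
    """Metric matrix G of equation (5b); angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    return [
        [a * a, a * b * cg, a * c * cb],
        [a * b * cg, b * b, b * c * ca],
        [a * c * cb, b * c * ca, c * c],
    ]

def quadratic_form(u, G, v):
    """u^T G v for 3-component vectors u, v and a 3x3 matrix G."""
    return sum(u[p] * G[p][q] * v[q] for p in range(3) for q in range(3))

def det3(m):
    """Determinant of a 3x3 matrix (the role of MDETERM in Excel)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

G = metric_matrix(9.71, 5.27, 12.06, 90.0, 90.8, 90.0)  # L-valine
volume = math.sqrt(det3(G))

c8 = (-0.4025, 0.0412, 0.2224)   # fractional coordinates, Table 1
c9 = (-0.4574, -0.1998, 0.1658)
delta = [q - p for p, q in zip(c8, c9)]
length = math.sqrt(quadratic_form(delta, G, delta))
print(f"V = {volume:.2f}, C8-C9 = {length:.3f}")
```

No conversion to Cartesian coordinates is needed: the non-orthogonality of the axes is carried entirely by the off-diagonal elements of G.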

In computation, it is convenient to record the six terms in equation (5c)[link], a², b², c², …, ab cos γ, individually, since they might be used repeatedly for both bond length and bond angle calculations (Cockcroft, 2006[Cockcroft, J. K. (2006). Molecular Geometry, https://pd.chem.ucl.ac.uk/pdnn/refine2/geometry.htm.]). In the algebraic expansion, equation (5c)[link] provides the terms which need to be summed in order to calculate a bond length. However, it is simpler in Excel to calculate the equivalent metric matrix expression, [X^{\rm T}{\bf G} X], directly using the MMULT array function, without requiring the six-term expansion.

The metric matrix can be used to calculate bond angles directly, without having to calculate bond lengths |r| and |s| independently, as follows:

[{\bf r}\cdot{\bf s}= |{\bf r }||{\bf s}|\cos\theta = \left(\matrix{\Delta x_r \cr \Delta y_r \cr \Delta z_r}\right)\cdot\left(\matrix{\Delta x_s \cr \Delta y_s\cr\Delta z_s}\right). ]

This relation can also be expressed in terms of the metric matrix G:

[\eqalignno{ \left(\matrix{\Delta x_r & \Delta y_r & \Delta z_r \cr \Delta x_s & \Delta y_s& \Delta z_s}\right)\,{\bf G}\left(\matrix{\Delta x_r & \Delta x_s \cr \Delta y_r & \Delta y_s \cr \Delta z_r & \Delta z_s}\right) & = \left(\matrix{{\bf r}\cdot{\bf r} & {\bf r}\cdot{\bf s} \cr {\bf s}\cdot{\bf r} & {\bf s}\cdot{\bf s}}\right), \cr {\rm with}\quad {\bf r}\cdot{\bf s} & = |{\bf r}||{\bf s}|\cos\theta. & (6)}]

Hence

[\theta = \cos^{-1} \left[{{{\bf r}\cdot{\bf s}}\over{({\bf r}\cdot{\bf r})^{1/2}({\bf s}\cdot{\bf s})^{1/2}}}\right] = \cos^{-1}({\bf r}\cdot{\bf s}/rs).]
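A single matrix product therefore delivers both squared bond lengths and the dot product needed for the angle. The Python sketch below illustrates this under the same assumptions as before (hypothetical function names, angles in degrees, difference vectors referenced to the central atom), using the fractional coordinates of O1, C1 and C7 of L-valine from Table 1; it is not taken from the supplied workbooks.

```python
import math

def metric_matrix(a, b, c, alpha, beta, gamma):
    """Metric matrix G of equation (5b); angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    return [
        [a * a, a * b * cg, a * c * cb],
        [a * b * cg, b * b, b * c * ca],
        [a * c * cb, b * c * ca, c * c],
    ]

def gram_matrix(rows, G):
    """Product of equation (6); entry [p][q] is the dot product of vectors p and q."""
    return [[sum(u[m] * G[m][n] * v[n] for m in range(3) for n in range(3))
             for v in rows] for u in rows]

G = metric_matrix(9.71, 5.27, 12.06, 90.0, 90.8, 90.0)  # L-valine
o1 = (-0.1295, 0.0167, 0.3996)
c1 = (-0.2234, -0.123, 0.3635)   # central atom
c7 = (-0.3649, -0.0062, 0.3457)
dr = [p - q for p, q in zip(o1, c1)]
ds = [p - q for p, q in zip(c7, c1)]

P = gram_matrix([dr, ds], G)
theta = math.degrees(math.acos(P[0][1] / math.sqrt(P[0][0] * P[1][1])))
print(f"r = {math.sqrt(P[0][0]):.3f}, s = {math.sqrt(P[1][1]):.3f}, "
      f"O1-C1-C7 = {theta:.1f} degrees")
```

The diagonal of `P` holds the squared bond lengths and the off-diagonal element holds r·s, so the angle follows without any separate length calculation.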

Torsion angle calculations are generally regarded as being too complex to be formulated with fractional coordinates, and are often performed using Cartesian coordinates, as discussed above. However, the BASIC program provided by Dunitz (1995[Dunitz, J. D. (1995). X-ray Analysis and the Structure of Organic Molecules, 2nd ed. Basel: Verlag Helvetica Chimica Acta.]) does not have this limitation.

4. An example calculation (for L-valine) using both algebraic expansions and vector methods

For this exercise we will demonstrate how the equations introduced in Sections 2[link] and 3[link] are implemented in practice, using the example of L-valine (Torii & Iitaka, 1970[Torii, K. & Iitaka, Y. (1970). Acta Cryst. B26, 1317-1326.]) with the data and results as they appear in the Valine worksheet of the supplementary workbook gj5247sup2.xlsx.

Atom numbering:

[Scheme 1]

Unit cell constants: a = 9.71, b = 5.27, c = 12.06 Å, α = 90, β = 90.8, γ = 90°.

Transformation matrix for conversion from fractional coordinates, x, y, z, to Cartesian coordinates, X, Y, Z:

[\left(\matrix{X \cr Y \cr Z}\right) = \left[\matrix{a & b\cos\gamma & c\cos\beta \cr 0 & b\sin\gamma & \displaystyle{{c(\cos\alpha - \cos\beta\cos\gamma)}\over{\sin\gamma}} \cr 0 & 0 & \displaystyle{{V}\over{ab\sin\gamma}}}\right] \cdot\left(\matrix{x \cr y \cr z}\right),\eqno(1a)]

where the volume of the unit cell

[V = abc(1- \cos^2\alpha - \cos^2\beta - \cos^2\gamma + 2 \cos\alpha \cos\beta \cos\gamma)^{1/2}, \eqno(1b)]

[V_{\rm cell} = 617.071\,\,{\rm \AA}^3. ]

Note: in Excel, angles must be in radians, where rads = degs*pi()/180.

Algebraic result, after matrix multiplication:

[\left({\matrix{ X \cr Y \cr Z \cr } } \right) = \left[{\matrix{ {xa + yb\cos \gamma + zc\cos \beta } \cr {yb\sin \gamma + zc(\cos \alpha - \cos \beta \cos \gamma)/\sin \gamma } \cr {zV/ab\sin \gamma } \cr } } \right]. \eqno(1c)]

The fractional atomic coordinates input from the LVALIN.cif file from the CSD are given in Table 1[link].

Table 1
The input fractional atomic coordinates

  C1 C7 C8 C9 C10 N1 O1 O4
x −0.2234 −0.3649 −0.4025 −0.4574 −0.2757 −0.3753 −0.1295 −0.2084
y −0.123 −0.0062 0.0412 −0.1998 0.1525 0.2332 0.0167 −0.3534
z 0.3635 0.3457 0.2224 0.1658 0.1597 0.4116 0.3996 0.3367

Coordinate transformation, from fractional to Cartesian, using atom C1 as an example:

[\eqalign{\left({\matrix{ X \cr Y \cr Z \cr } } \right) & = \left[{\matrix{ { - 0.2234 \times 9.71 + {\rm{ }}\cdots} \cr { - 0.123 \times 5.27 \sin (90 \times 3.142/180) + {\rm{ }}\cdots} \cr {0.3635 \times 617.1/{\rm{ }}\cdots} \cr } } \right] \cr & = \left({\matrix{ { - 2.2304} \cr { - 0.6482} \cr {4.3834} \cr } } \right).}]

Alternatively, by direct matrix multiplication following substitutions into equation (1a)[link]

[\eqalign{\left({\matrix{ X \cr Y \cr Z \cr } } \right) & = \left({\matrix{ {9.71} & 0 & { - 0.1684} \cr 0 & {5.27} & 0 \cr 0 & 0 & {12.06} \cr } } \right)\left({\matrix{ { - 0.2234} \cr { - 0.123} \cr {0.3635} \cr } } \right) \cr & = \left({\matrix{ { - 2.2304} \cr { - 0.6482} \cr {4.3834} \cr } } \right). }]

The resulting Cartesian atomic coordinates are given in Table 2[link].

Table 2
The resulting Cartesian atomic coordinates

  C1 C7 C8 C9 C10 N1 O1 O4
X −2.2304 −3.6014 −3.9457 −4.4693 −2.7039 −3.7135 −1.3247 −2.0803
Y −0.6482 −0.0327 0.2171 −1.0529 0.8037 1.2290 0.0880 −1.8624
Z 4.3834 4.1687 2.6819 1.9994 1.9258 4.9634 4.8187 4.0602

Bond lengths [equation (2d)[link] or (5c)[link]]:

Using Cartesian coordinates,

[\eqalignno{r^2 & = |{\bf r}_{ji}|^2 = {\bf r}_{ji}\cdot{\bf r}_{ji} = |{\bf r}_{ji}||{\bf r}_{ji}| \cr &= (X_j - X_i)^2 + (Y_j - Y_i)^2 + (Z_j-Z_i)^2 \cr & = \Delta X_{ji}^2 + \Delta Y_{ji}^2 + \Delta Z_{ji}^2.& (2d)}]

Using fractional coordinates,

[\eqalignno{ {r^2} & = {a^2}\Delta {x^2} + {b^2}\Delta {y^2} + {c^2}\Delta {z^2} + 2bc\cos \alpha \Delta y\Delta z \cr & \quad + 2ca\cos \beta \Delta z\Delta x + 2ab\cos \gamma \Delta x\Delta y. & (5c)}]

The resulting bond lengths are C8—C9 = 1.534 Å, C7—C8 = 1.547 Å and C1—O4 = 1.265 Å.

Bond angles [equations (3a)[link] or (6[link])]:

A bond angle is calculated with respect to the atom sequence ijk with each bond vector, rji and sjk, referenced to the coordinates of the central atom j of the triplet.

Consider the Cartesian coordinates for the bond angle O1—C1—C7 (Table 3[link]), extracted from the supplementary Valine spreadsheet.

Table 3
Cartesian coordinates for the O1—C1—C7 bond angle

  Bond r = O1—C1   Bond s = C7—C1
ΔXr 0.9057 ΔXs −1.3710
ΔYr 0.7362 ΔYs 0.6155
ΔZr 0.4353 ΔZs −0.2146
r = |r| 1.246 s = |s| 1.518

Using the scalar dot product with Cartesian coordinates, and expanding algebraically,

[\eqalignno{ rs & = {\bf r}_{ji} \cdot{\bf s}_{jk} = |{\bf r}_{ji}||{\bf s}_{jk}| \cos\theta_{ijk} = \left(\matrix{\Delta X_r \cr \Delta Y_r \cr \Delta Z_r}\right)\cdot \left(\matrix{\Delta X_s \cr \Delta Y_s \cr \Delta Z_s}\right) \cr & = \Delta X_r \Delta X_s + \Delta Y_r \Delta Y _s + \Delta Z_r \Delta Z_s. & (3a)}]

In our Excel Valine worksheet, the bond angle for O1—C1—C7 is calculated in cell J24 using the Excel expression

= ACOS(SUMPRODUCT(M24:M26,O24:O26)/(M27*O27)) = 117.80°

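where M24:M26 and O24:O26 refer to the Cartesian coordinate differences: ΔXr, ΔYr, ΔZr for bond O1—C1 and ΔXs, ΔYs, ΔZs for bond C7—C1, normalized by dividing by the lengths of the respective bonds, M27 and O27. Note that the bond differences are all calculated with respect to the central atom, C1. Note also that ACOS returns the angle in radians, so the result must be converted to degrees (for example with the Excel function DEGREES) to obtain the value quoted.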

Expressing this calculation in matrix terms [equation (6)[link]],

[{\bf r} = \left(\matrix{0.9057 \cr 0.7362 \cr 0.4353}\right), \quad {\bf s} = \left(\matrix{-1.371 \cr 0.6155 \cr -0.2146}\right), \quad \theta = \cos^{-1}({\bf r}\cdot{\bf s}/rs), ]

[\theta= \cos^{-1}[-0.8820/(1.246\times 1.518)] = 117.8^{\circ}, ]

where r·s is determined in Excel as SUMPRODUCT(r, s), applied to the component ranges of the two vectors.

Algebraically, with fractional coordinates,

[\eqalignno{ \cos \theta & = [ {a^2}\Delta {x_r}\Delta {x_s} + {b^2}\Delta {y_r}\Delta {y_s} + {c^2}\Delta {z_r}\Delta {z_s} \cr & \quad + bc\cos \alpha (\Delta {y_r}\Delta {z_s} + \Delta {y_s}\Delta {z_r})\cr & \quad + ca\cos \beta (\Delta {z_r}\Delta {x_s} + \Delta {z_s}\Delta {x_r}) \cr & \quad + ab\cos \gamma (\Delta {x_r}\Delta {y_s} + \Delta {x_s}\Delta {y_r})] /rs.& (3b)}]

Alternatively, a bond angle can be found using the vector cross product:

[\eqalignno{{{\bf{r}}_{ji}}\times {{\bf{s}}_{jk}} & = |{{\bf{r}}_{ji}}||{{\bf{s}}_{jk}}|\sin {\theta _{ijk}}\,\,{{\bf{n}}_{ijk}} = \left(\matrix{ \Delta {X_r} \hfill \cr \Delta {Y_r} \hfill \cr \Delta {Z_r} \hfill \cr} \right){\bf{ \times }}\left(\matrix{ \Delta {X_s} \hfill \cr \Delta {Y_s} \hfill \cr \Delta {Z_s} \hfill \cr} \right) \cr & = \left(\matrix{ \Delta {Y_r}\Delta {Z_s} - \Delta {Z_r}\Delta {Y_s} \hfill \cr \Delta {Z_r}\Delta {X_s} - \Delta {X_r}\Delta {Z_s} \hfill \cr \Delta {X_r}\Delta {Y_s} - \Delta {Y_r}\Delta {X_s} \hfill \cr} \right). &(3c)}]

The cross product can be determined using the user-defined function CVp (listed in the supplementary file) with the pair of bonds in Table 3[link]. This generates a vector normal to the plane of the bond pair:

[\eqalign{{{\bf{r}}_{ji}} \times {{\bf{s}}_{jk}} & = rs\sin {\theta _{ijk}}\,\,{{\bf{n}}_{ijk}} = \left(\matrix{ \Delta {X_r} \hfill \cr \Delta {Y_r} \hfill \cr \Delta {Z_r} \hfill \cr} \right){\bf{ \times }}\left(\matrix{ \Delta {X_s} \hfill \cr \Delta {Y_s} \hfill \cr \Delta {Z_s} \hfill \cr} \right) \cr & = \left({\matrix{ {{X_{rs}}} \cr {{Y_{rs}}} \cr {{Z_{rs}}} \cr } } \right) = \left({\matrix{ { - 0.4259} \cr { - 0.4024} \cr {1.5668} \cr } } \right). }]

The length of this vector is [|{\bf v}_{rs}| = (X_{rs}^2 + Y_{rs}^2 + Z_{rs}^2)^{1/2} = 1.673]. Hence

[\eqalign{{\theta _{rs}} & = {\sin ^{ - 1}}({v_{rs}}/rs) = {\sin ^{ - 1}}[1.673/(1.246 \times 1.518)] \cr & = 1.085\,\,{\rm rad} = 62.2^{\circ}\,\,{\rm or}\,\,117.8^{\circ}. }]

A choice needs to be made between the angle and its supplement. Note that the incorrect acute angle is also returned by the algebraic method unless the bond vectors are first referenced to the central atom j of the sequence ijk.
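One way to avoid choosing by hand between the angle and its supplement, offered here as a sketch rather than as the method used in the workbooks, is to combine the cross-product length (proportional to sin θ) with the dot product (proportional to cos θ) in a two-argument arctangent, which returns an angle in the correct quadrant. The function names are hypothetical and the input values are the Cartesian coordinate differences of Table 3.

```python
import math

def cross(r, s):
    (xr, yr, zr), (xs, ys, zs) = r, s
    return (yr * zs - zr * ys, zr * xs - xr * zs, xr * ys - yr * xs)

def dot(r, s):
    return sum(p * q for p, q in zip(r, s))

def bond_angle(r, s):
    """Angle between r and s in degrees, unambiguous between 0 and 180."""
    n = cross(r, s)
    # |r x s| = rs sin(theta) and r.s = rs cos(theta); atan2 uses both
    return math.degrees(math.atan2(math.sqrt(dot(n, n)), dot(r, s)))

# Cartesian differences from Table 3, both referenced to the central atom C1
r = (0.9057, 0.7362, 0.4353)     # O1 - C1
s = (-1.3710, 0.6155, -0.2146)   # C7 - C1
n = cross(r, s)
arcsine_only = math.degrees(math.asin(math.sqrt(dot(n, n))
                                      / math.sqrt(dot(r, r) * dot(s, s))))
print(f"arcsine alone: {arcsine_only:.1f}, atan2: {bond_angle(r, s):.1f}")
```

The arc sine alone cannot distinguish the two candidates because sin θ = sin(180° − θ); the sign of the dot product supplies the missing information.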

Torsion angles are best found by the complex method of the Torsion worksheet, using equations (4a)[link] and (4b)[link], or by simple substitution into the gj5247sup1.xls workbook.

5. Conclusions

Procedures by which bond lengths, bond angles and torsion angles can be calculated from either Cartesian or fractional crystal coordinates, individually or by use of metric matrices, are illustrated and also demonstrated using live Excel spreadsheets. These procedures exemplify the otherwise hidden methods used in crystallographic and molecular visualization programs.

6. Supplementary files

Supporting information:

(i) An Excel macro-enabled workbook, gj5247sup1.xls, which calculates torsion angles, bond angles and bond lengths using fractional coordinate data inserted by the user into the worksheet. This Excel file contains a macro.

(ii) An Excel workbook, gj5247sup2.xlsx, consisting of four worksheets labelled SF6, Serine, Valine and Torsion, which lays out the matrix calculations involved in calculating molecular geometry from fractional coordinates. All the calculations are performed live, using standard Excel functions.

(iii) A PDF file, gj5247sup3.pdf, which describes the contents and operations within the four worksheets of gj5247sup2.xlsx, and describes a user-defined Excel cross-product function.


Acknowledgements

I thank both referees for their detailed and helpful corrections and comments, which have saved me from much embarrassment. This contribution was inspired by the 'Molecular Geometry' analysis of J. K. Cockcroft (2006[Cockcroft, J. K. (2006). Molecular Geometry, https://pd.chem.ucl.ac.uk/pdnn/refine2/geometry.htm.]).

References

Aldeborgh, H., George, K., Howe, M., Lowman, H., Moustakas, H., Strunsky, N. & Tanski, J. M. (2014). J. Chem. Crystallogr. 44, 70–81.
Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N. & Bourne, P. E. (2000). Nucleic Acids Res. 28, 235–242.
Chapuis, G. (2011). Crystallogr. Rev. 17, 187–204.
Cockcroft, J. K. (2006). Molecular Geometry, https://pd.chem.ucl.ac.uk/pdnn/refine2/geometry.htm.
De Graef, M. & McHenry, M. E. (2011). Structure of Materials: Additional Material, https://som.web.cmu.edu/frames.html.
De Graef, M. & McHenry, M. E. (2012). Structure of Materials: An Introduction to Crystallography, Diffraction and Symmetry, 2nd ed. Cambridge University Press.
Downs, R. T. & Hall-Wallace, M. (2003). Am. Miner. 88, 247–250.
Dunitz, J. D. (1995). X-ray Analysis and the Structure of Organic Molecules, 2nd ed. Basel: Verlag Helvetica Chimica Acta.
Glasser, L. (1993). Comput. Chem. 17, 107–108.
Glasser, L. (2016). J. Chem. Educ. 93, 542–549.
Gražulis, S., Chateigner, D., Downs, R. T., Yokochi, A. F. T., Quirós, M., Lutterotti, L., Manakova, E., Butkus, J., Moeck, P. & Le Bail, A. (2009). J. Appl. Cryst. 42, 726–729.
Gražulis, S., Sarjeant, A. A., Moeck, P., Stone-Sundberg, J., Snyder, T. J., Kaminsky, W., Oliver, A. G., Stern, C. L., Dawe, L. N., Rychkov, D. A., Losev, E. A., Boldyreva, E. V., Tanski, J. M., Bernstein, J., Rabeh, W. M. & Kantardjieff, K. A. (2015). J. Appl. Cryst. 48, 1964–1975.
Groom, C. R., Bruno, I. J., Lightfoot, M. P. & Ward, S. C. (2016). Acta Cryst. B72, 171–179.
Hall, S. R., Allen, F. H. & Brown, I. D. (1991). Acta Cryst. A47, 655–685.
Julian, M. M. (2014). Foundations of Crystallography with Computer Applications, 2nd ed. Abingdon-on-Thames: CRC Press.
Kantardjieff, K. (2010). J. Appl. Cryst. 43, 1276–1282.
Macrae, C. F., Sovago, I., Cottrell, S. J., Galek, P. T. A., McCabe, P., Pidcock, E., Platings, M., Shields, G. P., Stevens, J. S., Towler, M. & Wood, P. A. (2020). J. Appl. Cryst. 53, 226–235.
McRee, D. E. (1993a). Practical Protein Crystallography. San Diego: Academic Press.
McRee, D. E. (1993b). Practical Protein Crystallography, https://www.sciencedirect.com/book/9780124860506/practical-protein-crystallography.
Moggach, S. A., Allan, D. R., Morrison, C. A., Parsons, S. & Sawyer, L. (2005). Acta Cryst. B61, 58–68.
Momma, K. & Izumi, F. (2011). J. Appl. Cryst. 44, 1272–1276.
O'Boyle, N. M., Banck, M., James, C. A., Morley, C., Vandermeersch, T. & Hutchison, G. R. (2011). J. Cheminform. 3, 33.
Thomas, I. R., Bruno, I. J., Cole, J. C., Macrae, C. F., Pidcock, E. & Wood, P. A. (2010). J. Appl. Cryst. 43, 362–366.
Torii, K. & Iitaka, Y. (1970). Acta Cryst. B26, 1317–1326.
Zagorac, D., Müller, H., Ruehl, S., Zagorac, J. & Rehme, S. (2019). J. Appl. Cryst. 52, 918–925.

© International Union of Crystallography. Prior permission is not required to reproduce short quotations, tables and figures from this article, provided the original authors and source are cited.
