research papers

Acta Crystallographica Section A: Foundations and Advances
ISSN: 2053-2733

A method to estimate statistical errors of properties derived from charge-density modelling


aInstitut Galien Paris Sud, UMR CNRS 8612, Université Paris Sud, Faculté de Pharmacie, Université Paris-Saclay, 5 rue Jean-Baptiste Clément, Châtenay-Malabry, 92296, France, bLaboratoire Structures, Propriétés et Modélisation des Solides (SPMS) UMR CNRS 8580, Ecole CentraleSupélec, 3 rue Joliot-Curie, Gif-sur-Yvette Cedex, 91192, France, cCRM2, UMR CNRS 7036, Institut Jean Barriol, Université de Lorraine, Vandoeuvre les Nancy Cedex, France, and dThe Barcelona Institute of Science and Technology, Institute of Chemical Research of Catalonia (ICIQ), Avinguda Països Catalans 16, Tarragona, 43007, Spain
*Correspondence e-mail: christian.jelsch@univ-lorraine.fr

Edited by A. Altomare, Institute of Crystallography - CNR, Bari, Italy (Received 16 January 2018; accepted 13 March 2018; online 3 May 2018)

Estimating uncertainties of property values derived from a charge-density model is not straightforward. A methodology, based on calculation of sample standard deviations (SSD) of properties using randomly deviating charge-density models, is proposed with the MoPro software. The parameter shifts applied in the deviating models are generated in order to respect the variance–covariance matrix issued from the least-squares refinement. This `SSD methodology' procedure can be applied to estimate uncertainties of any property related to a charge-density model obtained by least-squares fitting. This includes topological properties such as critical point coordinates, electron density, Laplacian and ellipticity at critical points and charges integrated over atomic basins. Errors on electrostatic potentials and interaction energies are also available now through this procedure. The method is exemplified with the charge density of compound (E)-5-phenylpent-1-enylboronic acid, refined at 0.45 Å resolution. The procedure is implemented in the freely available MoPro program dedicated to charge-density refinement and modelling.

1. Introduction

Errors on electron-density-derived properties, such as topological characteristics or electrostatic potential, are generally poorly addressed in the relevant literature. To the best of our knowledge, no available computer software designed for charge-density analysis on the basis of multipolar modelling properly computes analytical standard deviations on electron-density-derived properties. For instance, in the XD2006 program (Volkov et al., 2006[Volkov, A., Macchi, P., Farrugia, L. J., Gatti, C., Mallinson, P. R., Richter, T. & Koritsanszky, T. (2006). XD2006. Revision 5.34. University of New York at Buffalo, New York, USA.]), there is a feature that allows one to compute estimated uncertainties of the electron density ρ(r), of the Laplacian ∇2ρ and of dipole moment values using the variance–covariance matrix, but it only accounts for the contributions of some of the parameters used in the Hansen & Coppens (1978[Hansen, N. K. & Coppens, P. (1978). Acta Cryst. A34, 909-921.]) model, i.e. monopole and multipole populations. This implies that the propagation of errors due to the contributions of the atomic coordinates and of the contraction/expansion coefficients κ and κ′ is not taken into account. Consequently, this could lead to an overall underestimation of standard deviations on electron-density-derived properties.

Estimating uncertainties on properties derived from a charge-distribution model is nevertheless essential to avoid any false interpretation or over-interpretation of these properties. When several experimental X-ray diffraction data sets collected during distinct and independent measurements are available for the same compound, it becomes possible to study the reproducibility of the refined charge-density model and to estimate uncertainties of derived properties through the determination of their sample standard deviations (SSDs). Such an approach was followed in a few studies, but often with questionable statistical significance given the sometimes very sparse sampling used [down to two models (Dittrich et al., 2002[Dittrich, B., Koritsánszky, T., Grosche, M., Scherer, W., Flaig, R., Wagner, A., Krane, H. G., Kessler, H., Riemer, C., Schreurs, A. M. M. & Luger, P. (2002). Acta Cryst. B58, 721-727.]; Grabowsky et al., 2008[Grabowsky, S., Pfeuffer, T., Morgenroth, W., Paulmann, C., Schirmeister, T. & Luger, P. (2008). Org. Biomol. Chem. 6, 2295-2307.]), or a larger sample (up to four data sets) but with varying experimental temperatures or setups (Messerschmidt et al., 2005[Messerschmidt, M., Scheins, S. & Luger, P. (2005). Acta Cryst. B61, 115-121.]; Förster et al., 2006[Förster, D., Wagner, A., Hübschle, C. B., Paulmann, C. & Luger, P. (2006). Z. Naturforsch. B, 62, 696-704.])].

Closely related but still different compounds (such as peptide bond properties in different amino acids) were also investigated (Flaig et al., 1999[Flaig, R., Koritsánszky, T., Janczak, J., Krane, H.-G., Morgenroth, W. & Luger, P. (1999). Angew. Chem. Int. Ed. 38, 1397-1400.]). In an article dedicated to the transferability of atomic parameters in alanyl-X-alanine-type tripeptides, Grabowsky et al. (2008[Grabowsky, S., Pfeuffer, T., Morgenroth, W., Paulmann, C., Schirmeister, T. & Luger, P. (2008). Org. Biomol. Chem. 6, 2295-2307.]) computed the global average of the standard deviations (noted experimental reproducibility indices [{\overline \sigma}_{{\rm rep}\semi{\rm exp}}^0]) obtained in those studies, for various electron-density-derived properties of the QTAIM (quantum theory of atoms in molecules; Bader, 1990[Bader, R. F. W. (1990). Atoms in Molecules: a Quantum Theory, 1st ed. International Series of Monographs on Chemistry 22. Oxford: Clarendon Press.]; Bader et al., 1987[Bader, R. F. W., Carroll, M. T., Cheeseman, J. R. & Chang, C. (1987). J. Am. Chem. Soc. 109, 7968-7979.]) framework. For instance, they obtained, this way, average experimental errors [{\overline \sigma}_{{\rm rep}\semi{\rm exp}}^0] (ρ) = 0.07 e Å−3 and [{\overline \sigma}_{{\rm rep}\semi{\rm exp}}^0] (∇2ρ) = 3.3 e Å−5 associated, respectively, with electron density and with Laplacian values at the bond critical points.

The most comprehensive and statistically sound reproducibility study on a wide range of electron-density-derived parameters was undertaken by Kamiński et al. (2014[Kamiński, R., Domagała, S., Jarzembska, K. N., Hoser, A. A., Sanjuan-Szklarz, W. F., Gutmann, M. J., Makal, A., Malińska, M., Bąk, J. M. & Woźniak, K. (2014). Acta Cryst. A70, 72-91.]). They used 13 independently collected high-resolution X-ray diffraction data sets of α-oxalic acid dihydrate. From these data, obtained using similar experimental setups, they derived 13 oxalic acid charge-density models which were refined following identical strategies. This approach allowed them to analyse the normality of the error distribution in experimental data and in residual electron densities using the Shapiro–Wilk statistical test and, more importantly, to obtain very informative results in terms of dispersion of structural/charge-density model parameters and of charge-density-derived property values. They have shown, for instance, that among the multipole model parameters, the valence populations present large reproducibility deviations, reaching up to 40% of the corresponding atomic net charge. Conversely, multipole populations were characterized by moderate dispersions. Thus high reproducibility was achieved among the refined models. The multipole populations expected to be close to zero due to atom local symmetry were indeed statistically negligible. In the same way, concerning charge-density-derived properties, Kamiński et al. (2014[Kamiński, R., Domagała, S., Jarzembska, K. N., Hoser, A. A., Sanjuan-Szklarz, W. F., Gutmann, M. J., Makal, A., Malińska, M., Bąk, J. M. & Woźniak, K. (2014). Acta Cryst. A70, 72-91.]) were able to evidence a significantly smaller dispersion of electron-density values on weak intermolecular (hydrogen bonds) critical points [10−3 < [\sigma ({{\rho _{\rm CP}}} )] < 3 × 10−2 e Å−3] compared with covalent bonds [3 × 10−2 < [\sigma ({{\rho _{\rm BCP}}} )] < 6 × 10−2 e Å−3] (CP = critical point, BCP = bond critical point) and, in any case, lower than the [{\overline \sigma}_{{\rm rep}\semi{{\exp}}}^0] (ρ) value of 0.07 e Å−3 obtained by Grabowsky et al. (2008[Grabowsky, S., Pfeuffer, T., Morgenroth, W., Paulmann, C., Schirmeister, T. & Luger, P. (2008). Org. Biomol. Chem. 6, 2295-2307.]). The methodology proposed by Kamiński et al. (2014[Kamiński, R., Domagała, S., Jarzembska, K. N., Hoser, A. A., Sanjuan-Szklarz, W. F., Gutmann, M. J., Makal, A., Malińska, M., Bąk, J. M. & Woźniak, K. (2014). Acta Cryst. A70, 72-91.]) provides standard deviations on any properties derived from the charge-density model, as well as possible rules of thumb for property uncertainties in any charge-density model of comparable quality. However, this approach is very resource- and time-consuming as it implies the collection of a statistically significant number of diffraction data sets at subatomic resolution. The uncertainties obtained may also not account totally for all systematic errors present in the data measurements.

Krause et al. (2017[Krause, L., Niepötter, B., Schürmann, C. J., Stalke, D. & Herbst-Irmer, R. (2017). IUCrJ, 4, 420-430.]) recently presented a method based on Rfree calculations. Sample standard deviations computed on the relevant models refined on subsets of the measured reflections (for example, 20 subsets of 95% reflections) can yield a rough estimate of the standard deviation on topological properties of the electron density. However, the Rfree method has two drawbacks. Firstly, when strong reflections are omitted (put in the test set), the results of these refinements versus the remaining data are significantly influenced. This effect does not have much impact on the refinement of protein structures (which have poor R factors and a large number of reflections) but is crucial for the refinement of quantitative electron densities.

Secondly, the estimated uncertainty on a derived property obtained using this method depends on the number N of complementary Rfree refinements performed. The discrepancy between the refined models decreases with N, as the number of free reflections omitted in the validation sets decreases proportionally to 1/N.

Here, we present a method allowing the estimation of uncertainties on properties derived from a charge-density model. This method consists of a statistical Monte Carlo random sampling procedure, based on the variance–covariance matrix obtained after the convergence of the least-squares refinement.

The least-squares method is widely used for the structural and charge-density refinement of crystal structures. The optimization procedure that uses the matrix of normal equations has a great power of convergence. The inversion of the full normal matrix also provides the variance–covariance matrix of the refined parameters and permits one to determine the precision of the refined structure model (Hamilton, 1964[Hamilton, W. C. (1964). Statistics in Physical Science. New York, USA: The Ronald Press Company.]).

The current study addresses the uncertainty on properties related to the precision of measurements. The accuracy of properties, which is related to systematic errors in measurements, is however a different issue.

In the present paper, the methodology for estimation of uncertainties is illustrated with the charge-density analysis of an organic compound: (E)-5-phenylpent-1-enylboronic acid (hereafter noted BOH2, Fig. 1[link]). The unique electronic and physicochemical properties of boronic acid make this kind of compound very useful as a pharmaceutical agent. Boronic acids are strong Lewis acids. They can be used as enzyme inhibitors, in Suzuki cross-coupling reactions, Diels–Alder reactions, carboxylic acid activation or selective reduction of aldehydes, among many other uses (Yang et al., 2003[Yang, W. Q., Gao, X. M. & Wang, B. H. (2003). Med. Res. Rev. 23, 346-368.]). In recent years, boronic acids have also been reported as interesting building blocks in covalent organic frameworks (Côté et al., 2007[Côté, A. P., El-Kaderi, H. M., Furukawa, H., Hunt, J. R. & Yaghi, O. M. (2007). J. Am. Chem. Soc. 129, 12914-12915.]; Spitler & Dichtel, 2010[Spitler, E. L. & Dichtel, W. R. (2010). Nat. Chem. 2, 672-677.]; Ding et al., 2011[Ding, X. S., Guo, J., Feng, X. A., Honsho, Y., Guo, J. D., Seki, S., Maitarad, P., Saeki, A., Nagase, S. & Jiang, D. L. (2011). Angew. Chem. Int. Ed. 50, 1289-1293.]). To the best of our knowledge, this article is the first experimental charge-density study of a boronic acid compound.

[Figure 1]
Figure 1
Structure of (E)-5-phenylpent-1-enylboronic acid.

2. Experiment

2.1. Crystallization

For the current experiment, crystals were grown by slow evaporation of an ethanol/water solution of the compound BOH2 in a few days at room temperature. A single, colourless crystal of dimensions 0.34 × 0.18 × 0.10 mm was selected for the diffraction experiment. The compound crystallized in the centrosymmetric space group Pbca. More data on the orthorhombic crystal of BOH2 are given in Table 1[link].

Table 1
Crystal data and diffraction data collection statistics for the BOH2 molecule

Crystal data  
Chemical formula C11H15BO2
Molecular weight 190.039
Crystal system, space group Orthorhombic, Pbca
Temperature (K) 90 (1)
a, b, c (Å) 7.52004 (9), 9.38374 (13), 30.7120 (5)
Volume (Å3), Z 2167.23 (5), 8
Radiation type Mo Kα
λ (Å) 0.71073
F (000) 816
Crystal shape and colour Block and colourless
Crystal dimensions (mm) 0.34 × 0.18 × 0.10
Data collection  
Diffractometer Rigaku MicroMax-007HF
Absorption correction CrysAlisPro 1.171.38.37f
Absorption coefficient μ (mm−1) 0.077
Tmin, Tmax 0.472, 0.999
sinθmax/λ (Å−1) 1.12
No. measured, unique reflections 117 942, 12 055
No. reflections (I > 2σ) 10 680
Completeness (%) at sinθmax/λ 96.5
Rint 3.06%
Refinement  
Weighting scheme Whkl = 3.3/σI2
wR2(I), R(F) 3.62%, 2.70%
Goodness of fit 1.0
†Rigaku Oxford Diffraction (2015[Rigaku Oxford Diffraction (2015). CrysAlisPro 1.171.38.37f. Rigaku Oxford Diffraction, Yarnton, England.]).
‡At sinθmax/λ = 1.22 Å−1.

2.2. Data collection

A single-crystal high-resolution and highly redundant X-ray data collection of the BOH2 compound was performed on a Rigaku MicroMax-007HF rotating-anode diffractometer equipped with a Pilatus 200K hybrid pixel detector using Mo Kα radiation (λ = 0.71073 Å). The crystal was mounted on a Kapton micromount. The data collection was carried out at 90 (1) K under a stream of nitrogen using the Oxford 700 Plus Cryosystems gas-flow apparatus.

The diffraction data were collected using ω scans of 0.5° intervals with the CrystalClear-SM Expert 2.1b29 software (Rigaku, 2013[Rigaku (2013). CrystalClear SM Expert 2.1 b29. Rigaku Corporation, Tokyo, Japan.]) up to a resolution of 0.41 Å (sinθ/λ < 1.22 Å−1). The exposure times were 5 and 40 s per frame for low- and high-resolution data, respectively. Data reduction and absorption correction were performed using the CrysAlisPro 1.171.38.37f package (Rigaku Oxford Diffraction, 2015[Rigaku Oxford Diffraction (2015). CrysAlisPro 1.171.38.37f. Rigaku Oxford Diffraction, Yarnton, England.]); the internal R(I) factor was 3.06% for all reflections (Table 1[link]).

2.3. Structure solution and refinement

The structure of the BOH2 compound has already been determined (Gelbrich et al., 2000[Gelbrich, T., Sampson, D. & Hursthouse, M. B. (2000). University of Southampton, Crystal Structure Report Archive.]). In our study, the structure of BOH2 was solved using the SIR2014 software (Burla et al., 2015[Burla, M. C., Caliandro, R., Carrozzini, B., Cascarano, G. L., Cuocci, C., Giacovazzo, C., Mallamo, M., Mazzone, A. & Polidori, G. (2015). J. Appl. Cryst. 48, 306-309.]). In particular all the H atoms were located in the difference Fourier map. An initial independent atom model (IAM) refinement was undertaken using the SHELXL2014 software (Sheldrick, 2015[Sheldrick, G. M. (2015). Acta Cryst. C71, 3-8.]).

2.4. Multipolar refinement

The charge-density model was refined against diffraction intensities using the program MoPro (Guillot et al., 2001[Guillot, B., Viry, L., Guillot, R., Lecomte, C. & Jelsch, C. (2001). J. Appl. Cryst. 34, 214-223.]; Jelsch et al., 2005[Jelsch, C., Guillot, B., Lagoutte, A. & Lecomte, C. (2005). J. Appl. Cryst. 38, 38-54.]). The program is based on the multipolar scattering factor formalism of Hansen & Coppens (1978[Hansen, N. K. & Coppens, P. (1978). Acta Cryst. A34, 909-921.]) and allows the definition of restraints on stereochemistry, thermal motion and charge-density parameters. Data resolution was truncated at 0.45 Å as the very high resolution reflections showed decreasing values of 〈Fo2〉/〈Fc2〉 well below unity, as verified with the XDRK software (Zhurov et al., 2008[Zhurov, V. V., Zhurova, E. A. & Pinkerton, A. A. (2008). J. Appl. Cryst. 41, 340-349.]). For the same reason, an I/σI > 0.35 cutoff was applied. The evolution of 〈Fo2〉/〈Fc2〉 as a function of reciprocal resolution s is shown in the supporting information.

The multipole expansion was done at the octupolar level for B, C and O atoms and the dipole level for H atoms. The core and valence spherical scattering factors were calculated using the wavefunctions for isolated atoms from Su & Coppens (1998[Su, Z. & Coppens, P. (1998). Acta Cryst. A54, 646-652.]) and the anomalous dispersion coefficients were taken from Kissel et al. (1995[Kissel, C., Speranza, F. & Milicevic, V. (1995). J. Geophys. Res. 100(B8), 14999-15007.]).

The MoPro program has numerous functionalities with respect to constraints, restraints and similarity applying to the stereochemistry and charge density. For the H atoms, the values of anisotropic Uij parameters were fixed to those obtained from the SHADE3 server (Madsen & Hoser, 2014[Madsen, A. Ø. & Hoser, A. A. (2014). J. Appl. Cryst. 47, 2100-2104.]). The H—X distances of H atoms were restrained to the values obtained from neutron diffraction studies (Allen & Bruno, 2010[Allen, F. H. & Bruno, I. J. (2010). Acta Cryst. B66, 380-386.]) with a restraint sigma σrest of 0.01 Å. Distance X—H similarity restraints were also applied to chemically equivalent groups (σrest = 0.01 Å).

The charge-density model was subsequently refined against diffraction intensities. The electron-density maps, local topological properties and intermolecular electrostatic energies were computed using the VMoPro module of the MoPro suite (Guillot et al., 2001[Guillot, B., Viry, L., Guillot, R., Lecomte, C. & Jelsch, C. (2001). J. Appl. Cryst. 34, 214-223.]; Jelsch et al., 2005[Jelsch, C., Guillot, B., Lagoutte, A. & Lecomte, C. (2005). J. Appl. Cryst. 38, 38-54.]), while the molecular view with thermal ellipsoids and the isosurface representations were produced with MoProViewer (Guillot et al., 2014[Guillot, B., Enrique, E., Huder, L. & Jelsch, C. (2014). Acta Cryst. A70, C279.]).

Automatic restraints of chemical equivalence and local symmetry (Domagała & Jelsch, 2008[Domagała, S. & Jelsch, C. (2008). J. Appl. Cryst. 41, 1140-1149.]) were applied to the electron-density parameters such as contraction/expansion κ and κ′, valence and multipole populations Pval and Plm. The optimal weight σopt of the restraints applying to the charge-density parameters (atom equivalence and local symmetry) was set to 0.2, as determined by minimizing the global Rfree factor (Brünger, 1992[Brünger, A. T. (1992). Nature, 355, 472-475.]; Zarychta et al., 2011[Zarychta, B., Zaleski, J., Kyzioł, J., Daszkiewicz, Z. & Jelsch, C. (2011). Acta Cryst. B67, 250-262.]). The parameters κ and κ′ of H atoms were restrained to be similar (σrest = 0.02).

The molecular parameters including scale factor, xyz, Uij, Pval, Plm, κ and κ′ were refined together with the block diagonal option and finally using the full normal matrix until convergence, yielding wR2(I) = 3.6%. The crystallographic details of the refinement are given in Table 1[link].

The topological charges were integrated on atomic basins using the program BADER (Tang et al., 2009[Tang, W., Sanville, E. & Henkelman, G. (2009). J. Phys. Condens. Matter, 21, 084204.]). A parallelepiped embedding the BOH2 molecule extracted from the crystal lattice was defined with a margin of 3 Å around the atomic nuclei. For each deviating model, the total electron density of the molecule inside this parallelepiped was computed using the program VMoPro, with a grid step of 0.05 Å along each direction and then saved as a Gaussian cube file. Then, the program BADER was used for atomic basin definition and charge integration. The sum of the integrated electron charges was smaller than the total number of electrons in the molecule with an average lack of 0.47 e (SSD = 0.0028 e) for a total number of 102 electrons. The unattributed electron charge was evenly redistributed on the 29 integrated atomic basin charges.

3. Methodology

3.1. Least-squares refinement and uncertainties

The least-squares refinement is implemented in MoPro (Guillot et al., 2001[Guillot, B., Viry, L., Guillot, R., Lecomte, C. & Jelsch, C. (2001). J. Appl. Cryst. 34, 214-223.]; Jelsch et al., 2005[Jelsch, C., Guillot, B., Lagoutte, A. & Lecomte, C. (2005). J. Appl. Cryst. 38, 38-54.]), software dedicated to charge-density refinement. A multipolar charge-density model defined according to the formalism of Hansen & Coppens (1978[Hansen, N. K. & Coppens, P. (1978). Acta Cryst. A34, 909-921.]) can be refined for crystal structures when ultra high resolution X-ray diffraction data have been measured. For macromolecular structures, the transferability principle can be used to define a multipolar electron-density model. When the refinement is performed against the reflection intensities, the minimized function E is defined as

[E = \textstyle\sum\limits _{{\bf H}}{ W}_{{\bf H}} [ I_{\bf H}^{\rm obs} -{I}_{\bf H}^{\rm calc}({\bf X}) ]^{2} \eqno (1)]

where [I_{\bf{H}}^{{\rm{calc}}}] and [I_{\bf{H}}^{{\rm{obs}}}] are the calculated and observed reflection intensities, respectively, and [{\bf{X}}] is the vector of the model parameters being considered in the corresponding refinement stage. The factor WH represents a weight for each reflection H. This weight can be taken as the squared inverse estimated error of the measured intensity.
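As a minimal illustration of equation (1), the following Python sketch evaluates E for three made-up reflections, taking the weight as the inverse of the squared estimated error of each intensity. The function name `weighted_residual` and all numerical values are hypothetical and are not part of MoPro.

```python
def weighted_residual(i_obs, i_calc, sigma_i):
    """Equation (1): E = sum_H W_H * (I_obs - I_calc)^2, with W_H = 1/sigma_H^2."""
    total = 0.0
    for obs, calc, sigma in zip(i_obs, i_calc, sigma_i):
        weight = 1.0 / sigma**2          # W_H for this reflection
        total += weight * (obs - calc) ** 2
    return total


# Three made-up reflections (hypothetical values, arbitrary units).
i_obs = [100.0, 50.0, 10.0]
i_calc = [98.0, 51.0, 10.5]
sigma_i = [2.0, 1.0, 0.5]

E = weighted_residual(i_obs, i_calc, sigma_i)
print(E)
```

Each of the three reflections here deviates by exactly one estimated error, so each contributes one unit to E.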

The structure-factor amplitude is obtained by summation over all atoms a in the asymmetric unit and all symmetry operators s of the space group, as follows:

[{F}_{{\bf H}} = \textstyle\sum\limits_{a}\sum\limits_{s}{f}_{a}\left(\parallel {\bf H}\parallel\right)\exp\left[-{{\bf H}}^{\rm t}s({\beta }){\bf H}\right]\exp\left[2i\pi {\bf H}s({X}_{a})\right] \eqno (2)]

where, for each atom a, fa is the atom scattering factor, β is the dimensionless thermal tensor and Xa is the vector of atomic coordinates.

The least-squares refinement is performed iteratively. At each refinement cycle of n parameters in the model and after linearization of the calculated reflection intensities around the current vector [{\bf X}], the minimization of E is performed by solving the matrix system of normal equations:

[{\bf A}\,{\boldDelta }{\bf X} = {\bf V} \eqno (3)]

where [{\bf A}] is the n × n symmetric normal matrix and [\boldDelta {\bf X}] is the unknown shift vector to apply to the n refined variables. V is a vector of dimension n with elements

[{V_i} = \sum_{\bf{H}} W_{\bf{H}}{{\partial I_{\bf{H}}^{{\rm calc}}} \over {\partial {x_i}}}({\bf{X}})\left [{I_{\bf{H}}^{{\rm{obs}}} - I_{\bf{H}}^{{\rm{calc}}}({\bf{X}} )} \right]. \eqno (4)]

The normal matrix element Aij concerning the refined parameters xi and xj is obtained from the summation of the products of the intensity (or structure-factor) derivatives over the reflections H:

[A_{ij} = \sum_{\bf{H}} W_{\bf{H}}{{\partial I_{\bf{H}}^{{\rm{calc}}}} \over {\partial {x_i}}}({\bf{X}} ){{\partial I_{\bf{H}}^{{\rm{calc}}}} \over {\partial {x_j}}}({\bf{X}} ). \eqno (5)]
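Equations (3) to (5) can be sketched on a toy problem. The example below assumes a hypothetical two-parameter linear model, I_calc(h) = x0 + x1·h, which stands in for the real structure-factor calculation; because the model is linear, a single cycle reaches the minimum. The helper names `normal_equations` and `solve_2x2` are illustrative only.

```python
def normal_equations(derivs, weights, residuals):
    """Accumulate A (eq. 5) and V (eq. 4) over reflections.

    derivs[k][i] is dI_calc/dx_i for reflection k,
    residuals[k] is I_obs - I_calc for reflection k.
    """
    n = len(derivs[0])
    A = [[0.0] * n for _ in range(n)]
    V = [0.0] * n
    for row, w, r in zip(derivs, weights, residuals):
        for i in range(n):
            V[i] += w * row[i] * r
            for j in range(n):
                A[i][j] += w * row[i] * row[j]
    return A, V


def solve_2x2(A, V):
    """Solve A dX = V (eq. 3) for two parameters by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * V[0] - A[0][1] * V[1]) / det,
            (A[0][0] * V[1] - A[1][0] * V[0]) / det]


# Toy data generated from I = 1 + 2h (hypothetical, noise free).
h_values = [0.0, 1.0, 2.0, 3.0]
i_obs = [1.0, 3.0, 5.0, 7.0]
x = [0.0, 0.0]                                   # starting parameters
i_calc = [x[0] + x[1] * h for h in h_values]
derivs = [[1.0, h] for h in h_values]            # dI/dx0 = 1, dI/dx1 = h
weights = [1.0] * len(h_values)
residuals = [o - c for o, c in zip(i_obs, i_calc)]

A, V = normal_equations(derivs, weights, residuals)
dx = solve_2x2(A, V)
print(dx)
```

In a real refinement the model is non-linear, so the derivatives are re-evaluated and the system is solved again at every cycle.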

The normal matrix elements [equation (5[link])], through the calculated intensities, implicitly incorporate the contribution of symmetry-related atoms in the unit cell, as can be seen in equation (2[link]). At the refinement convergence, the variance–covariance matrix of the model parameters is obtained from B, the inverse of matrix A (Hamilton, 1964[Hamilton, W. C. (1964). Statistics in Physical Science. New York, USA: The Ronald Press Company.]). The square root of the ith diagonal term of the matrix B provides an estimated standard deviation (e.s.d.), noted σ(xi), of the parameter xi. If the weighting scheme WH used in the least-squares function is not properly scaled (GOF ≠ 1), all e.s.d.'s have to be multiplied by the goodness of fit:

[\sigma ({x}_{i}) = {\rm GOF}\times {B}_{ii}^{1/2}. \eqno (6)]

The correlation coefficient Cij between the parameters xi and xj in the refinement is obtained by the equation

[{C}_{ij} = {B}_{ij}/(B_{ii} B_{jj})^{1/2}. \eqno (7)]
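Equations (6) and (7) translate directly into a few lines of Python. The sketch below assumes that the inverse normal matrix B is already available as a nested list; the 2 × 2 matrix and the function name `esd_and_correlation` are made up for illustration.

```python
import math


def esd_and_correlation(B, gof=1.0):
    """Return e.s.d.'s (eq. 6) and correlation coefficients (eq. 7) from B."""
    n = len(B)
    esd = [gof * math.sqrt(B[i][i]) for i in range(n)]
    corr = [[B[i][j] / math.sqrt(B[i][i] * B[j][j]) for j in range(n)]
            for i in range(n)]
    return esd, corr


# Hypothetical inverse normal matrix for two parameters.
B = [[4.0e-8, 1.0e-8],
     [1.0e-8, 1.0e-8]]

esd, corr = esd_and_correlation(B, gof=1.0)
print(esd)
print(corr[0][1])
```

Note that the goodness of fit scales the e.s.d.'s but cancels out of the correlation coefficients.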

3.2. Generation of randomly deviating charge-density models

The procedure is started from the converged charge-density model at Xmin. The values of the parameter vector X are assumed to be distributed according to a multidimensional Gaussian probability density function with mean [\mu = {{\bf X}}^{\min}] and variance–covariance matrix [\Sigma = {\rm GOF}^{2} \times{\bf B}].

If there were no correlations between parameters, the matrices A and B would be diagonal and the shifts dxi to apply to each parameter to obtain a deviating model would be the e.s.d.'s [\sigma({{x_i}} )] multiplied by a random number:

[{dx}_{i} = \sigma ({x}_{i}) {R}_{i}. \eqno (8)]

In other words,

[{\bf d}{\bf X} = {\rm GOF}\times {\bf B}^{{{1}/{2}}} {\bf R} \eqno (9)]

where R is a vector of random and independent real numbers normally distributed with a zero average and a unitary variance.

In real situations, the variance–covariance matrix B is symmetric positive-definite but not diagonal as parameters show some correlations. The deviating parameter vector X values are generated using the following practical procedure.

Since the normal matrix A is symmetric positive-definite, it can be orthogonally diagonalized at the [{{\bf X}}^{\min}] value where E is minimal, leading to the expression

[{\bf A} = {\bf Q}^{\rm t} {\bf D Q} \eqno (10a)]

where Q is an orthogonal matrix and D is a diagonal matrix containing the strictly positive eigenvalues of A.

Therefore, its inverse matrix B can be written as

[{\bf B} = {\bf Q}^{\rm t} {\bf D}^{-1} {\bf Q} \eqno (10b)]

and the matrix S, a square root matrix of B, can be obtained as

[{\bf S} = {\bf Q}^{\rm t} {\bf D}^{-1/2}{\bf Q}. \eqno (11)]

The deviating parameter vector X is obtained by applying

[{\bf X} = {\bf dX} + {\bf X}^{\min} = {\rm GOF}\times {\bf S} {\bf R} + {\bf X}^{\min}. \eqno (12)]
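The sampling step of equation (12) can be sketched in plain Python under two stated assumptions. First, to stay free of external libraries, the sketch replaces the symmetric square root of equation (11) with a Cholesky factor L of B; this is a different matrix, but it also satisfies L Lᵗ = B, which is the only property the covariance argument relies on. Second, it draws R with the standard-library `random.gauss` rather than the truncated generator used in MoPro. The matrix B, the vector `x_min` and all function names are hypothetical.

```python
import math
import random


def cholesky(B):
    """Lower-triangular L with L L^t = B (B symmetric positive-definite)."""
    n = len(B)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(B[i][i] - s)
            else:
                L[i][j] = (B[i][j] - s) / L[j][j]
    return L


def deviating_model(x_min, S, gof, rng):
    """One draw of X = GOF * S R + X_min, with R ~ N(0, Id)."""
    n = len(x_min)
    R = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [x_min[i] + gof * sum(S[i][k] * R[k] for k in range(n))
            for i in range(n)]


def sample_covariance(models):
    """Sample mean vector and covariance matrix of a list of parameter vectors."""
    n, dim = len(models), len(models[0])
    mean = [sum(m[i] for m in models) / n for i in range(dim)]
    cov = [[sum((m[i] - mean[i]) * (m[j] - mean[j]) for m in models) / (n - 1)
            for j in range(dim)] for i in range(dim)]
    return mean, cov


B = [[4.0, 1.0], [1.0, 1.0]]       # hypothetical inverse normal matrix
x_min = [10.0, -5.0]               # hypothetical converged parameters
S = cholesky(B)
rng = random.Random(0)
models = [deviating_model(x_min, S, 1.0, rng) for _ in range(20000)]
mean, cov = sample_covariance(models)
print(mean, cov)
```

With a large sample, the empirical mean and covariance of the generated vectors should approach X_min and GOF² × B, respectively.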

Each element of the vector dX is a linear combination of the R elements and thus the vector X follows a multivariate Gaussian distribution. The mean vector [{\bf E}({\bf X})] of this distribution is equal to [{\bf X}^{{\min}}]:

[\eqalignno{{\bf E}({\bf X}) &= {\bf E}({\rm GOF}\times{\bf S}{\bf R}) + {\bf E}({{\bf X}}^{\min})&\cr & = {\rm GOF}\times {\bf S}{\bf E}({\bf R}) + ({{\bf X}}^{\min}) = {{\bf X}}^{\min}. & (13)}]

By propagation of uncertainties, the variance–covariance matrix of this multivariate Gaussian distribution is defined, using expression (9)[link], as

[\eqalignno{{\bf cov}({\bf X})& = {\rm GOF}\times {\bf S}\,{\bf cov}({\bf R})\,{\bf S}^{\rm t}\times {\rm GOF}&\cr &= {\rm GOF}\times {\bf S}\,{\bf Id}\, {\bf S}^{\rm t}\times {\rm GOF} ={\rm GOF}^{2}\times{\bf B}. & (14)}]

The events of the normal distribution of the vector R are generated using a random Gaussian number generator with zero mean and unitary sigma. The software MoPro generates random Gaussian numbers using the `ratio of uniform deviates' method introduced by Kinderman & Monahan (1977[Kinderman, A. J. & Monahan, J. F. (1977). ACM Trans. Math. Software (TOMS), 3, 257-260.]) and augmented with quadratic bounding curves by Leva (1992[Leva, J. L. (1992). ACM Trans. Math. Software, 18, 449-453.]). To avoid rare events which would lead to meaningless deviating charge-density models, the algorithm is modified to generate random numbers following a truncated Gaussian function. This modification consists of reducing the infinite support of the Gaussian probability density function to a [−4; 4] interval and of normalizing the resulting function in order to obtain a unitary variance.
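One possible reading of the truncation and renormalization step is sketched below. It is not the MoPro implementation: it uses rejection sampling on top of Python's `random.gauss` instead of the ratio-of-uniform-deviates algorithm, and it restores a unitary variance by a simple rescaling, which is an assumption. With this choice the support becomes very slightly wider than [−4; 4]. The function names are hypothetical.

```python
import math
import random


def truncated_variance(a):
    """Variance of a standard normal variable restricted to [-a, a]."""
    pdf_a = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    mass = math.erf(a / math.sqrt(2.0))      # P(|z| <= a)
    return 1.0 - 2.0 * a * pdf_a / mass


def truncated_unit_gauss(rng, a=4.0):
    """Draw from N(0, 1) truncated to [-a, a], rescaled to unit variance."""
    scale = 1.0 / math.sqrt(truncated_variance(a))
    while True:
        z = rng.gauss(0.0, 1.0)
        if abs(z) <= a:                      # reject the rare tail events
            return z * scale


rng = random.Random(1)
draws = [truncated_unit_gauss(rng) for _ in range(5000)]
print(truncated_variance(4.0), max(abs(d) for d in draws))
```

For a cutoff at 4, the variance lost to truncation is close to 0.1%, so the rescaling factor is very near unity.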

Following the Monte Carlo procedure described in this section, several deviating charge-density models are generated using equation (12[link]). The studied properties are computed on all these models and the SSDs are deduced from the sample values. The method is applied in the current study to the BOH2 molecule.

The number of deviating models required depends on the expected precision of SSDs. For any property P, assuming it follows a normal distribution with [({\mu}_{P}\semi {\sigma}_{P})], if a sample of N events, [({p_i})_{i \in [1\semi N]}], is taken from its distribution, the SSD, estimator of [{\sigma _P}], can be defined as

[{\rm SSD} = \left[{{\sum _{i = 1}^{N}{({p}_{i}-\langle p\rangle )}^{2}}\over{N-1}}\right]^{1/2}. \eqno (15)]
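Equation (15) is the usual sample standard deviation with the N − 1 denominator. The short sketch below computes it explicitly for a made-up sample of five property values and compares the result with `statistics.stdev` from the Python standard library, which uses the same definition. The function name `ssd` and the sample values are hypothetical.

```python
import math
import statistics


def ssd(values):
    """Sample standard deviation, eq. (15), with the N - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


# Hypothetical values of one property computed on five deviating models.
sample = [1.52, 1.49, 1.51, 1.50, 1.48]

print(ssd(sample), statistics.stdev(sample))
```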

The quantity [(N-1)^{1/2} {\rm SSD}/{\sigma}_{P}] follows a χ probability distribution with N - 1 degrees of freedom (for more details, see Appendix A[link]). This implies that the expected relative standard deviation of the estimator SSD can be approximated as follows:

[\sigma_{\rm SSD}/\mu_{\rm SSD} \simeq 1/[2(N-1)]^{1/2} \eqno (16)]

where [{\mu _{\rm SSD}}] is the expected uncertainty value and [{\sigma _{\rm SSD}}] the standard deviation of the estimator SSD. We can select a number of events large enough to have an expected relative standard deviation value [equation (16[link])] smaller than a limit value es (see Fig. S1 in the supporting information). This information is relevant to estimate the number of deviating models necessary for a proper estimation of model property uncertainties. Expression (16)[link] is however only strictly valid for a sample of random values from a normally distributed population. It is used in our study to estimate the uncertainty of the SSD for derived properties, assuming that their distributions are normal. For example, using N = 20 deviating models, for any derived property SSD, a relative standard deviation of 16% is expected. This precision is enough to estimate standard deviations of the considered properties with one significant digit.
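The relation in equation (16) can be evaluated, and inverted to obtain the smallest N that reaches a target relative precision, with the short sketch below. The function names are hypothetical, and the result holds only under the normality assumption stated above.

```python
import math


def relative_sd_of_ssd(n_models):
    """Eq. (16): expected relative standard deviation of the SSD estimator."""
    return 1.0 / math.sqrt(2.0 * (n_models - 1))


def models_needed(limit):
    """Smallest N whose expected relative standard deviation is <= limit."""
    return math.ceil(1.0 + 1.0 / (2.0 * limit ** 2))


print(relative_sd_of_ssd(20))     # about 0.16 for the 20 models used here
print(models_needed(0.15))
```

Because the precision improves only as the inverse square root of N − 1, halving the relative uncertainty of the SSD requires roughly four times as many deviating models.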

The method is tested on the charge-density model of the BOH2 molecule, by generating a series of 20 randomly deviating models from which various derived properties are calculated, along with their SSDs.

For some examples of derived properties, a larger sample of 500 models has been used to obtain population histograms and to check the nature of population distributions. These histograms (Laplacian and ellipticity at the bond critical point, electrostatic energy) are provided in Fig. S2. It appears that almost all properties have unimodal and Gaussian-like population distributions. The histogram of wR2(I) factors is also shown; the value for the refined model is 3.616%, while for the perturbed structures the R values are always higher and the average wR2(I) is 3.694% with an SSD of 0.007.

4. Results

4.1. Geometry properties: distances and angles

The good agreement between SSD and e.s.d. values has been verified for the parameters used to describe the structure and charge density. For example, Fig. S3 shows the agreement between e.s.d.'s issued from the least-squares normal matrix inversion using equation (6[link]) and the SSD values obtained from 20 deviating structures for the atomic fractional coordinates.

The SSD has been calculated for the bond distances and angles. For comparison, e.s.d.'s have also been retrieved from the error propagation method implemented in the MoPro software, and the relative differences between the SSD and e.s.d., |SSD − e.s.d.|/e.s.d., are calculated to check the reliability of the error propagation method.

In the plots SSD versus e.s.d. for the interatomic distances and angles between non-H atoms (Fig. 2[link]), the points are distributed along the y = x line. Moreover, the maximal value of [| {\rm SSD} - {\rm e.s.d.} |/{\rm e.s.d.}] is 25% for interatomic distances and 30% for interatomic angles, which implies a good agreement between SSD and e.s.d. if only one significant digit is expected.

[Figure 2]
Figure 2
Scatter plot of the sample standard deviations (SSDs) versus the estimated standard deviations (e.s.d.) computed with MoPro for interatomic distances between non-H atoms, and angles between three non-H atoms. The first bisector is plotted as the black dashed line.

4.2. Electron density

The statistical procedure used to estimate standard deviations can be extended to any molecular property, including the static electron density. The static deformation electron density in the (C11, B1, O2) plane is considered as an example in Fig. 3[link](a). The SSD map (Fig. 3[link]b) in the (C11, B1, O2) plane shows significant features near atomic nuclei, which is expected as the electron density takes large values and varies drastically in their vicinity with the nuclei coordinate shifts. These features around nuclei are anisotropic, which can be related to the positive and negative multipolar deformation density in the map (Fig. 3[link]a) due to the formation of covalent bonds or the electron lone pairs in the O-atom case.

[Figure 3]
Figure 3
Static deformation electron density in the (C11, B1, O2) plane. Atoms C9, C10 and O1 are also in the plane. (a) Deformation with contours of ± 0.05 e Å−3. Blue solid line, positive; red dotted lines, negative. (b) Sample standard deviation, SSD, of the deformation density deduced from 20 models with contours of ± 0.005 e Å−3.

The SSD(ρ) level is found to be below 0.015 e Å−3 on the covalent bonds between non-H atoms; for bonds involving H atoms SSD(ρ) is, in comparison, slightly higher, but still below 0.020 e Å−3.

4.3. Topology of covalent bonds

For each deviating model, covalent bond topological analysis is performed using the software VMoPro and the results processed by statistical analysis for SSD estimation. For each covalent bond, the distances between bonded atomic nuclei and the corresponding BCP position are reported, with the topological properties, in Table 2[link].

Table 2
Properties of critical points (CPs) for covalent bonds

Bond lengths and distances of CPs to the two bonded atoms are given. The electron density, Laplacian value and ellipticity at CP positions are also reported. The SSD uncertainties are given in parentheses. Some remarkable values are indicated in bold. For the XY bonds (non-H atoms), the e.s.d.'s obtained by standard propagation error are also reported in brackets.

Atom Distance (Å)      
X Y (X, Y)   (X, CP) (Y, CP) ρ(rCP) (e Å−3) ∇2ρ (e Å−5) ε
C4 C5 1.3935 (3) [4] 0.68 (2) 0.72 (2) 2.184 (8) −20.7 (4) 0.21 (2)
C3 C4 1.3929 (3) [4] 0.683 (9) 0.710 (9) 2.18 (2) −20.3 (6) 0.222 (9)
C5 C6 1.3955 (3) [3] 0.696 (9) 0.700 (9) 2.11 (2) −19.5 (4) 0.25 (1)
C2 C3 1.3959 (4) [4] 0.643 (8) 0.753 (8) 2.11 (2) −19.1 (3) 0.20 (1)
C1 C6 1.3975 (2) [3] 0.685 (8) 0.712 (7) 2.166 (9) −19.8 (4) 0.235 (7)
C1 C2 1.4014 (3) [3] 0.706 (8) 0.695 (8) 2.17 (2) −19.3 (4) 0.20 (1)
C1 C7 1.5079 (2) [3] 0.784 (6) 0.724 (6) 1.70 (2) −11.5 (3) 0.019 (9)
C7 C8 1.5340 (2) [3] 0.776 (5) 0.758 (5) 1.592 (7) −9.9 (2) 0.101 (8)
C8 C9 1.5331 (2) [3] 0.775 (5) 0.758 (5) 1.62 (1) −9.9 (3) 0.15 (2)
C9 C10 1.5001 (2) [2] 0.712 (6) 0.788 (6) 1.73 (1) −13.2 (3) 0.17 (1)
C10 C11 1.3439 (2) [2] 0.682 (8) 0.662 (8) 2.28 (2) −21.6 (4) 0.334 (8)
B1 C11 1.5614 (2) [2] 0.495 (3) 1.066 (3) 1.30 (2) −7 (1) 0.09 (3)
B1 O1 1.3711 (2) [2] 0.4496 (6) 0.9217 (6) 1.35 (1) 19.8 (8) 0.002 (7)
B1 O2 1.3762 (2) [2] 0.4519 (6) 0.9251 (6) 1.33 (2) 17.5 (9) 0.03 (2)
C4 H4 1.066 (8)   0.701 (8) 0.366 (6) 1.78 (2) −16.4 (5) 0.047 (6)
C5 H5 1.059 (7)   0.717 (7) 0.342 (6) 1.80 (2) −17.7 (4) 0.065 (5)
C3 H3 1.060 (9)   0.726 (7) 0.334 (5) 1.80 (2) −16.7 (4) 0.056 (6)
C6 H6 1.064 (7)   0.719 (9) 0.345 (5) 1.81 (2) −17.2 (5) 0.056 (6)
C2 H2 1.063 (9)   0.726 (8) 0.337 (6) 1.80 (2) −15.5 (3) 0.053 (6)
C7 H71 1.056 (6)   0.676 (6) 0.380 (5) 1.745 (8) −13.5 (3) 0.031 (5)
C7 H72 1.063 (6)   0.687 (6) 0.376 (6) 1.74 (2) −14.3 (4) 0.049 (4)
C8 H81 1.063 (7)   0.689 (5) 0.374 (6) 1.72 (2) −14.6 (4) 0.048 (9)
C8 H82 1.046 (8)   0.679 (8) 0.367 (6) 1.758 (8) −14.9 (4) 0.03 (2)
C9 H91 1.037 (8)   0.65 (1) 0.383 (8) 1.691 (9) −14.0 (3) 0.09 (2)
C9 H92 1.050 (7)   0.674 (8) 0.376 (6) 1.65 (2) −11.8 (4) 0.024 (8)
C10 H10 1.055 (7)   0.682 (7) 0.373 (5) 1.75 (1) −16.8 (3) 0.084 (7)
C11 H11 1.045 (7)   0.679 (5) 0.366 (5) 1.748 (9) −14.7 (4) 0.031 (9)
O1 H1 0.932 (9)   0.719 (5) 0.214 (6) 2.36 (4) −31 (1) 0.006 (2)
O2 H2 0.936 (8)   0.724 (4) 0.211 (5) 2.39 (3) −31 (1) 0.022 (2)

The intramolecular bonds involving the B atom have the largest uncertainties on the Laplacian values. The B—O bonds in particular show positive ∇2ρCP Laplacian values and the largest uncertainties among covalent bonds between non-H atoms, as boron is a very light element with respect to oxygen. The B—O bonds also have the most accurate distances X—CP and Y—CP with uncertainties below 10−3 Å (Table 2[link]). Among the X—H bonds, the ones with O atoms have the largest (in magnitude) Laplacian values and SSDs; the relative uncertainties are however similar, around 2.4% for all X—H bonds (Table 2[link]).

Uncertainties of electron densities ρ and Laplacian values ∇2ρ at the CPs show, in the case of XY bonds (hereafter, X and Y stand for non-H atoms), average values around 0.010 e Å−3 and 0.42 e Å−5 while their maxima reach, respectively, 0.014 e Å−3 and 0.93 e Å−5. In the case of X—H bonds, the average uncertainties of ρ and ∇2ρ are quite comparable with the previous ones, with, respectively, 0.014 e Å−3 and 0.42 e Å−5 and maximal values of 0.031 e Å−3 and 0.98 e Å−5. It must be noted that, in both cases, uncertainties are dramatically below the root mean square discrepancies reported by Grabowsky et al. (2008[Grabowsky, S., Pfeuffer, T., Morgenroth, W., Paulmann, C., Schirmeister, T. & Luger, P. (2008). Org. Biomol. Chem. 6, 2295-2307.]) in a study of the charge densities of peptides. The SSD(ρCP) values are in accordance with those found in the SSD map of the static deformation density (Fig. 3[link]b). The ρ electron density and its relative SSD are shown in Fig. 4[link] along the B1—O1 bond path and the SSD(ρ) error is two orders of magnitude smaller than ρ. The errors on ρ on the B1—O1 bond are comparatively lower than those on the C—O bond of oxalic acid exemplified in the Kamiński et al. (2014[Kamiński, R., Domagała, S., Jarzembska, K. N., Hoser, A. A., Sanjuan-Szklarz, W. F., Gutmann, M. J., Makal, A., Malińska, M., Bąk, J. M. & Woźniak, K. (2014). Acta Cryst. A70, 72-91.]) study. The mean error over density 〈SSD(ρ)/ρ〉 is 1% while for the Laplacian 〈SSD(∇2ρ)/∇2ρ〉 reaches 3%.

[Figure 4]
Figure 4
Plots of the electron density ρ and its relative SSD along the B1—O1 bond path. The ρ plot is shown in logarithmic scale.

The SSD values of the eigenvalues (λ1, λ2, λ3) of the Hessian matrix ∂2ρ/∂xi∂xj at the bond CPs are shown in Table S1. Examples of population histograms for the Laplacian and ellipticity at the CP of the bond B1—O1 are shown in Figs. S2(a), S2(b). It is relevant to note that the ellipticity at an electron-density CP, which is, by definition, positive, can have a drastically asymmetric statistical density distribution when its reference value derived from the converged model is small relative to its SSD (Fig. S2b).

The plot of SSD values of distances X⋯CP versus Y⋯CP for the XY covalent bonds (non-H atoms) is illustrated in Fig. 5[link]. A remarkable equality between uncertainties in the distances X⋯CP and Y⋯CP can be observed, the SSDs being generally in the 2 × 10−4–4 × 10−4 Å range. This result can be simply explained by the high accuracy of heavy-atom nucleus positions relative to BCP positions, making the uncertainty of the CP position the predominant cause of error. This is confirmed by the lower order of magnitude of the XY distance uncertainties compared with the ones on distances X⋯CP and Y⋯CP (Fig. 5[link], Table 2[link]).

[Figure 5]
Figure 5
Plot of SSD values of X⋯CP and Y⋯CP distances for all XY covalent bonds between non-H atoms. The SSDs of the XY bond distances are also shown.

The X⋯CP and H⋯CP distance SSDs involving X—H bonds are higher, mostly in the 4 × 10−4–8 × 10−4 Å range (Fig. S4). The observed SSD values of X⋯CP and H⋯CP distances are, in this case, more dissimilar, but of the same order of magnitude as the d(X, H) SSD. It has to be recalled here that H-atom positions were restrained during the model refinement (§2.4[link]); therefore the d(X⋯H) values and their uncertainties obtained depend partly on the distance restraints used.

The knowledge of uncertainties is crucial to assess the pertinence of discussions on the property values. For instance, the histogram of ellipticities with SSDs on the C—C bond CPs allows one to compare the values visually (Fig. 6[link]). With respect to SSD values, the formally double bond C10=C11 clearly has a higher ellipticity than all other bonds. Among the four formally single bonds, the differences between values are generally significant as the standard deviation between values (0.067) is 5.7 times larger than the average SSD uncertainty (0.012). The discrepancies among the aromatic bonds are less meaningful with a standard deviation between values of 0.020, which is only two times larger than the average SSD uncertainty (0.011).
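The comparison made above between the spread of the ellipticity values and their average uncertainty can be sketched as follows. The values are those printed in Table 2, so the results are limited by the rounding of the table; the function name is an illustrative choice, and the use of the sample (N − 1) standard deviation for the spread is an assumption that happens to match the quoted figures.

```python
import statistics

# Ellipticity and its SSD at the C-C bond critical points, copied from Table 2.
single_bonds = {
    "C1-C7": (0.019, 0.009),
    "C7-C8": (0.101, 0.008),
    "C8-C9": (0.15, 0.02),
    "C9-C10": (0.17, 0.01),
}
aromatic_bonds = {
    "C4-C5": (0.21, 0.02),
    "C3-C4": (0.222, 0.009),
    "C5-C6": (0.25, 0.01),
    "C2-C3": (0.20, 0.01),
    "C1-C6": (0.235, 0.007),
    "C1-C2": (0.20, 0.01),
}


def spread_over_uncertainty(bonds):
    """Return (standard deviation between values, mean SSD, their ratio)."""
    values = [value for value, _ in bonds.values()]
    ssds = [ssd for _, ssd in bonds.values()]
    spread = statistics.stdev(values)  # sample standard deviation (N - 1)
    mean_ssd = statistics.mean(ssds)
    return spread, mean_ssd, spread / mean_ssd


if __name__ == "__main__":
    for label, bonds in (("single", single_bonds), ("aromatic", aromatic_bonds)):
        spread, mean_ssd, ratio = spread_over_uncertainty(bonds)
        print(f"{label}: spread={spread:.3f} mean SSD={mean_ssd:.3f} ratio={ratio:.1f}")
```

A ratio well above unity, as for the single bonds, suggests that the differences between bonds exceed what the statistical uncertainty alone would produce; a ratio close to two, as for the aromatic bonds, leaves much less room for interpretation.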

[Figure 6]
Figure 6
Histogram showing the ellipticity of the C—C bonds. Error bars correspond to the SSD values. The average and root mean square deviation (r.m.s.d.) for the six C—C bonds in the aromatic cycle are also shown. Bonds are distinguished by type: aromatic in red, double in green and single in blue.

4.4. Topology of intermolecular interactions

Intermolecular interactions play a key role in crystal engineering which is an important field in chemical crystallography; therefore estimation of errors on their properties is extremely timely. In the BOH2 crystal packing, 17 unique interatomic contacts shorter than 3 Å were identified between the reference molecule and its environment, involving eight distinct neighbour molecules (Table 3[link]). The intermolecular (3,−1) CP search has been done using the software VMoPro on the 20 deviating models. All the O⋯H hydrogen bonds show non-ambiguous bond paths between the two atoms. Two of the intermolecular contact CPs have unstable bond paths, in the sense that they lead to different linked atoms within the deviating models (Table 3[link]). The first non-stable bond path involves the phenyl H4 atom of the molecule (−x + [{5\over 2}], y − ½, z) which is connected to the phenyl C atoms of the reference molecule, C1 in 13 deviating models and C6 in the seven others. The second ambiguous bond path involves another weak phenyl⋯phenyl interaction between the H3 atom of the reference molecule and either C4 (three in 20 cases) or H4 of the molecule (x − ½, y, −z + ½) (17 in 20 cases). The C⋯H contacts can be considered as very weak hydrogen bonds [respectively, ρ = 0.0364 (8) and ρ = 0.0432 (9) e Å−3] with the phenyl moiety as acceptor. Moreover, two reported van der Waals contacts between H atoms, at d(H⋯H) > 2.7 Å, yield a CP and bond path detected only in some of the deviating models and are reported in italics in Table 3[link]. Globally, the bond paths and CPs are found to be stable in the models perturbed at standard deviation in all the strongest interactions and most of the weaker ones.

Table 3
Properties of critical points (CPs) of intermolecular interactions involving molecules at distance shorter than 3 Å from any atom of the reference BOH2 molecule

Each CP is identified by its two major contributing atoms A1 and A2 which are linked by the corresponding bond path. Two CPs of weak interatomic contacts with ambiguous bond path are in bold (linked atoms are not stable). Some CPs are not detected with all deviating models (occurrence < 20) and are shown in italics. For each pair of major contributing atoms, their occurrence number, interatomic distance and distances between CP and atoms are given. The electron density, Laplacian ∇2ρ and ellipticity values at CP position are also reported. The SSD uncertainties are given in parentheses.

        Distance (Å)    
Symmetry code A1 A2 Frequency (A1, A2) (A1, CP) (A2, CP) ρCP (e Å−3) ∇2ρCP (e Å−5)
(i) O1 H2O 20 1.824 (7) 1.193 (4) 0.631 (9) 0.207 (5) 3.41 (8)
                 
(ii) O2 H1O 20 1.767 (9) 1.171 (4) 0.60 (1) 0.221 (7) 4.0 (2)
H10 O1 20 2.580 (6) 1.086 (6) 1.521 (4) 0.055 (2) 0.801 (6)
                 
(iii) O1 H92 20 2.801 (5) 1.592 (3) 1.226 (4) 0.0345 (6) 0.489 (5)
                 
(iv) H81 H91 13 2.711 (9) 1.43 (2) 1.32 (2) 0.0185 (8) 0.256 (4)
H81 C4 20 2.887 (4) 1.170 (5) 1.719 (7) 0.0504 (8) 0.580 (6)
H92 H11 17 2.775 (9) 1.45 (2) 1.36 (2) 0.0109 (6) 0.176 (4)
H71 H11 20 2.777 (7) 1.327 (9) 1.479 (5) 0.0153 (6) 0.227 (3)
H71 H91 20 2.663 (6) 1.351 (7) 1.319 (8) 0.0230 (6) 0.307 (4)
                 
(v) C11 H5 20 2.939 (6) 1.768 (5) 1.217 (6) 0.0363 (7) 0.417 (3)
O2 H6 20 2.906 (7) 1.737 (8) 1.210 (7) 0.026 (2) 0.368 (5)
                 
(vi) C1 H4 13 2.916 (9) 1.82 (1) 1.179 (7) 0.0364 (8) 0.452 (5)
C6 H4 7 2.957 (6) 1.816 (4)
H71 H5 20 2.262 (8) 1.177 (8) 1.121 (7) 0.040 (2) 0.564 (7)
                 
(vii) H72 H3 20 2.376 (8) 1.194 (6) 1.182 (8) 0.035 (2) 0.483 (4)
H2 H4 20 2.793 (8) 1.45 (2) 1.45 (2) 0.0146 (5) 0.198 (6)
                 
(viii) H2 C3 20 2.972 (7) 1.240 (7) 1.777 (7) 0.0353 (6) 0.408 (5)
H2 H4 17 2.425 (5) 1.153 (6) 1.352 (7) 0.0432 (9) 0.528 (5)
H2 C3 3 2.972 (7)
Symmetry operators: (i) −x, −y + 2, −z; (ii) −x + ½, y − ½, z; (iii) −x + 1, −y + 2, −z; (iv) −x + [{3\over 2}], y − ½, z; (v) x − 1, y, z; (vi) −x + [{5\over 2}], y − ½, z; (vii) −x + 2, y − ½, −z + ½; (viii) x − ½, y, −z + ½.

For the properties at the intermolecular bond CPs which are systematically detected in all models, the uncertainties of ρCP and Laplacian ∇2ρCP values are of the same magnitude as those shown by Kamiński et al. (2014[Kamiński, R., Domagała, S., Jarzembska, K. N., Hoser, A. A., Sanjuan-Szklarz, W. F., Gutmann, M. J., Makal, A., Malińska, M., Bąk, J. M. & Woźniak, K. (2014). Acta Cryst. A70, 72-91.]) and do not exceed 6 and 4% in relative values, respectively. Similar uncertainty could also be observed in the intermolecular area from the static deformation density SSD map (Fig. 3[link]b), where SSD values tend to be lower than the 0.005 e Å−3 contour level outside of the molecule.

The mean error over density at the intermolecular CPs 〈SSD(ρ)/ρ〉 is 3%. For the Laplacian, the mean 〈SSD(∇2ρ)/|∇2ρ|〉 is 3.7% on the two hydrogen bonds while it reaches only 1.3% on the weaker interactions. The relative errors are similar for ρCP on the covalent bonds and non-bonded interactions. Conversely, Laplacian values generally have a lower relative error on weak interactions compared with strong hydrogen bonds or covalent bonds.

The SSD of GCP and VCP, the kinetic and potential energy density (Espinosa et al., 1998[Espinosa, E., Molins, E. & Lecomte, C. (1998). Chem. Phys. Lett. 285, 170-173.]), respectively, derived from ρCP and ∇2ρCP, can also be computed. The dissociation energy EHB = −VCP of the two O⋯H—O hydrogen bonds present in the BOH2 crystal packing was estimated. For O1⋯H2O, EHB = 37.9 (9) and for O2⋯H1O, EHB = 41 (2) kJ mol−1; the relative errors are therefore 2.3 and 5.6%, respectively.

4.5. Atomic charges

The atoms in molecules (AIM) topological analysis is extended to the integrated topological properties. The series of topological analysis results is used to estimate the uncertainties of atomic basin charges (Table 4[link]). The integrated charge SSDs are found to be higher for C atoms (0.03 to 0.06 e) than for H, B and O atoms in the current structure (below 0.02 e in general). The average SSD value over all atomic charges is 0.024 e and the maximal SSD is obtained for the C5 atom of the phenyl moiety with 0.058 e. Such values are smaller but of the same order of magnitude as the typical uncertainties of atomic valence populations obtained from the variance–covariance matrix at the end of the multipolar refinement (Table 4[link]). The SSDs of atomic basin electronic charges Qtopo are plotted against the e.s.d. of atomic valence populations Pval (Fig. 7[link]). The valence population e.s.d. and SSD values are in good agreement (Table 4[link], Fig. S9). For most of the atoms, the valence population e.s.d. and SSD values are larger than the SSD of the corresponding integrated atomic basin charge. However, as SSD values of topological charges are consistently around a few hundredths of an electron while the charges themselves can vary by several orders of magnitude (between 3 × 10−3 e for H72 and 2.4 e for the B atom), the corresponding relative uncertainties of Qtopo atomic charges can reach high values, especially for atoms bearing low integrated charges. For some H atoms, the uncertainty is larger than their weak charge (H71, H72, H82, H91) (Table 4[link]). For the O atoms which bear a negative Qtopo charge of about −1.3 e, the relative error [{\rm SSD}({{Q_{{\rm{topo}}}}} )/|{Q_{{\rm{topo}}}}|] reaches on the other hand only 1.4%. The strongly positive atomic charge Qtopo of the B atom bonded with these two O atoms leads to a low relative error of 0.6%.

Table 4
Atomic charges in electrons along with their SSD values

Qtopo (e) charges are integrated over the atomic basins of the molecule isolated from the crystal. Qval = NvalPval are the atomic charges derived from the valence populations. The e.s.d. values are the estimated standard deviations of Pval directly derived from the full normal matrix inversion.

Atom Qtopo SSD Qval e.s.d. SSD
C1 −0.093 0.035 −0.017 0.054 0.067
C2 −0.048 0.038 −0.208 0.057 0.054
C3 −0.262 0.035 −0.128 0.063 0.036
C4 −0.116 0.048 −0.112 0.062 0.047
C5 −0.197 0.058 −0.227 0.059 0.059
C6 −0.160 0.045 −0.175 0.054 0.059
C7 0.020 0.034 −0.112 0.047 0.057
C8 −0.012 0.031 −0.25 0.048 0.051
C9 0.026 0.027 −0.241 0.048 0.051
C10 −0.024 0.028 −0.238 0.043 0.052
C11 −0.823 0.029 0.228 0.055 0.059
H2 0.094 0.015 0.129 0.024 0.019
H3 0.126 0.016 0.090 0.024 0.021
H4 0.068 0.017 0.064 0.025 0.024
H5 0.136 0.017 0.144 0.023 0.019
H6 0.110 0.025 0.131 0.022 0.027
H71 −0.005 0.010 0.069 0.023 0.021
H72 0.003 0.014 0.054 0.023 0.024
H81 0.060 0.014 0.159 0.019 0.020
H82 0.005 0.017 0.136 0.021 0.025
H91 −0.009 0.021 0.126 0.024 0.027
H92 0.043 0.013 0.163 0.021 0.026
H10 0.054 0.017 0.157 0.021 0.020
H11 0.067 0.013 0.022 0.021 0.023
H1O 0.584 0.012 0.355 0.015 0.014
H2O 0.561 0.008 0.325 0.015 0.010
O1 −1.326 0.019 −0.241 0.024 0.025
O2 −1.291 0.018 −0.237 0.024 0.020
B1 2.409 0.013 −0.171 0.057 0.048
[Figure 7]
Figure 7
Sample standard deviation (SSD) of atomic basin electronic charge Qelec plotted versus the estimated standard deviation (e.s.d.) of atomic valence population Pval.
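
The relative uncertainty SSD(Qtopo)/|Qtopo| discussed above can be evaluated directly from the entries of Table 4. The following sketch uses a few rows copied from that table; because the tabulated values are rounded, the last digit of a computed ratio may differ slightly from the figures quoted in the text. The variable and function names are illustrative only.

```python
# (Qtopo, SSD) in electrons for a few atoms, copied from Table 4.
charges = {
    "O1": (-1.326, 0.019),
    "O2": (-1.291, 0.018),
    "B1": (2.409, 0.013),
    "H71": (-0.005, 0.010),
    "H72": (0.003, 0.014),
    "H82": (0.005, 0.017),
    "H91": (-0.009, 0.021),
}


def relative_uncertainty(charge: float, ssd: float) -> float:
    """SSD divided by the absolute value of the charge."""
    if charge == 0.0:
        raise ValueError("relative uncertainty undefined for a zero charge")
    return ssd / abs(charge)


# Atoms whose uncertainty exceeds the magnitude of their charge.
not_significant = [atom for atom, (q, ssd) in charges.items() if ssd > abs(q)]

if __name__ == "__main__":
    for atom, (q, ssd) in charges.items():
        print(f"{atom}: {100.0 * relative_uncertainty(q, ssd):.1f}%")
    print("SSD larger than |Q|:", not_significant)
```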

The two definitions of charges Qtopo and Pval derived are generally in good agreement, except for the B and O atoms which show much larger Qtopo charges. When the charge integration is carried out on the pro-molecule with spherical neutral atoms (IAM), the B atom turns out to have Qtopo = +1.60 e charge, while for the O1 atom Qtopo is −1.04 e, values which are far from the zero charge of a neutral atom. Therefore, the raw topological charges are not always to be compared with the Pval derived charges when atoms with very dissimilar atomic numbers, such as B and O, form a covalent bond. Non-zero Qtopo charges were recently reported by Stachowicz et al. (2017[Stachowicz, M., Malinska, M., Parafiniuk, J. & Woźniak, K. (2017). Acta Cryst. B73, 643-653.]) for a CaF2 crystal when using the IAM model.

4.6. Electrostatic potential

The 0.001 a.u. (a.u. = atomic units) electron-density isosurface of the isolated molecule was chosen to map the molecular electrostatic potential φ and its sample standard deviation [{\rm SSD}(\varphi )] (Fig. 8[link]a). On this surface, the SSD of the electrostatic potential lies between 5 × 10−3 and 2 × 10−2 e Å−1 and the average `signal over uncertainty' ratio [\langle | \varphi |/{\rm SSD}(\varphi)\rangle] reaches 4.8. As depicted in Fig. 8[link] and Fig. S5, there is no clear correlation between the electrostatic potential SSD and the absolute value of the potential on the electron-density isosurface (correlation coefficient = 17%). Regions of highest [{\rm SSD}(\varphi )] can be seen near the H2, H3 and H4 phenyl ring H atoms and close to the B atom (blue patches on Fig. 8[link]b). These locally large [{\rm SSD}(\varphi )] values can be explained by the fact that these H atoms present the largest thermal displacement parameters (2.9 < Beq < 3.2 Å2) in the BOH2 compound, leading to larger uncertainties on their positions. Similarly, the high [{\rm SSD}(\varphi )] values observed in the vicinity of the B atom can be explained by an e.s.d. on its valence population that is twice as large as those of its neighbouring O atoms (Table 4[link]), locally increasing the SSD of the molecular electrostatic potential. Molecular surface points which are mostly under the electrostatic influence of these atoms consequently show particularly large [{\rm SSD}(\varphi )] values. Nearly 90% of the considered surface points present [{\rm SSD}(\varphi )] values lying between 0.008 and 0.016 e Å−1, distributed around the 0.012 e Å−1 average value and spanning the whole range of electrostatic potential values (−0.16 to +0.32 e Å−1). The three-dimensional distribution of [{\rm SSD}(\varphi )] values is presented in Fig. S6 by means of three superimposed 0.04, 0.02 and 0.01 e Å−1 isosurfaces.

As expected, the [{\rm SSD}(\varphi )] increases strongly in close vicinity to the atomic nuclei, where electrostatic potential variations become large due to the perturbed nuclei positions and valence populations in the 20 considered models contributing to the statistics. The volume of space located between the 0.01 and 0.02 e Å−1 isosurfaces of [{\rm SSD}(\varphi )] encompasses typical intermolecular interaction distances, i.e. regions where electrostatic potential is usually interpreted. The [\varphi /{\rm SSD}(\varphi )] ratio is useful to estimate the electrostatic potential statistical significance on various regions of the electron-density surface (Kamiński et al., 2014[Kamiński, R., Domagała, S., Jarzembska, K. N., Hoser, A. A., Sanjuan-Szklarz, W. F., Gutmann, M. J., Makal, A., Malińska, M., Bąk, J. M. & Woźniak, K. (2014). Acta Cryst. A70, 72-91.]). This property, mapped on the electron-density surface, is represented in Fig. 8[link](c). The electrostatic potential is therefore statistically very significant in regions of strong values, with [\varphi /{\rm SSD}(\varphi )] reaching 16 in our case. Conversely, [\varphi /{\rm SSD}(\varphi )] becomes lower than unity when the electrostatic potential falls below ∼0.02 e Å−1, in absolute value, which can be interpreted as a broadening of the zero potential contour regions on the molecular surface, as represented in white in Fig. 8[link](c) using a significance criterion of [| \varphi |/{\rm SSD}(\varphi ) \,\gt\, 2]. Regions located either side of this low-potential stripe can then be considered as either electropositive or electronegative with a high degree of confidence.
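
The significance criterion used above can be expressed as a simple point-by-point classification. The sketch below is only an illustration of that criterion: the threshold of 2 is taken from the text, whereas the function name and the sample (φ, SSD) pairs are hypothetical values chosen within the ranges quoted for the molecular surface.

```python
def classify_potential(phi: float, ssd: float, threshold: float = 2.0) -> str:
    """Classify a surface point from its potential and the SSD of the potential."""
    if ssd <= 0.0:
        raise ValueError("SSD must be strictly positive")
    ratio = phi / ssd
    if abs(ratio) <= threshold:
        return "not significant"
    return "electropositive" if ratio > 0.0 else "electronegative"


if __name__ == "__main__":
    # Hypothetical (phi, SSD) pairs in e/Angstrom, for illustration only.
    samples = [(0.32, 0.020), (-0.16, 0.012), (0.015, 0.012)]
    for phi, ssd in samples:
        print(phi, ssd, classify_potential(phi, ssd))
```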

[Figure 8]
Figure 8
Electrostatic properties mapped on the 0.001 a.u. electron-density surface of the BOH2 compound. (a) Electrostatic potential φ, (b) SSD(φ) and (c) electrostatic potential divided by the SSD value [φ/SSD(φ)].

4.7. Electrostatic energy

Eight unique dimers of molecules in contact have been identified in the crystal packing. Considering each dimer, the intermolecular electrostatic energy is computed for the 20 perturbed models using the EP/MM method (Volkov et al., 2004[Volkov, A., Koritsanszky, T. & Coppens, P. (2004). Chem. Phys. Lett. 391, 170-175.]) as implemented in the software VMoPro, and the SSD is calculated as the uncertainty estimator (Table 5[link]).

Table 5
Total electrostatic interaction energy between interacting dimers in the crystal and the standard deviation in the sample

The energy summation was performed with a unitary coefficient for all dimers except for the involutional symmetry operators (−x, −y + 2, −z and −x + 1, −y + 2, −z) which were counted as half. Non-involutional symmetry operators f form two equivalent dimers around the reference molecule, with operators f and f−1. The SSDs were computed on 20 deviating models generated using the full least-squares normal matrix (`SSD all parameters') and using the reduced normal matrix obtained excluding the contributions of Uij parameters (`SSD no Uij').

Symmetry Eelec (kJ mol−1) SSD all parameters SSD no Uij
−x, −y + 2, −z −62.2 4.2 5.1
−x + ½, y − ½, z −37.2 3.2 2.7
−x + 1, −y + 2, −z −16.5 1.7 1.7
−x + [{3\over 2}], y − ½, z −9.1 3.0 2.0
x − 1, y, z −1.1 2.0 1.7
−x + [{5\over 2}], y − ½, z 0.5 2.4 2.3
−x + 2, y − ½, −z + ½ 2.4 1.8 1.2
x − ½, y, −z + ½ 6.2 3.1 2.3
Sum −77.7 14.8 8.9

Absolute errors of the intermolecular electrostatic interaction energies in dimers of BOH2 molecules are consistently of the order of a few kJ mol−1. The SSD relative error is therefore as low as 7% for the largest value, Eelec = −62.2 kJ mol−1, but for the weakest interactions the SSD is larger than |Eelec| itself. Such large relative errors clearly confirm that weak electrostatic interaction energies of a few kJ mol−1 cannot be interpreted as either stabilizing or destabilizing. This is in line with the chemical accuracy in computational chemistry, generally considered to be around 5 kJ mol−1 (or 1 kcal mol−1) (Perdew et al., 1999[Perdew, J. P., Kurth, S., Zupan, A. & Blaha, P. (1999). Phys. Rev. Lett. 82, 2544-2547.]). For the energy summed over all dimers, the error reaches 19%. As the energy value results from the integration of the product of the electron density and the electrostatic potential, the relative errors of the two factors accumulate to yield a larger relative error.
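
The weighted summation and the relative errors quoted above can be checked with a few lines of code. The energies and SSDs are those of Table 5, the symmetry operators are written as in the footnote of Table 3, and the half weight for the two involutional operators follows the table caption; the names used in the sketch are illustrative. Note that the SSD of the sum is taken from the table rather than recomputed, since the dimer energies of a given deviating model are correlated.

```python
# (symmetry operator, Eelec in kJ/mol, SSD with all parameters, involutional?)
dimers = [
    ("-x, -y+2, -z", -62.2, 4.2, True),
    ("-x+1/2, y-1/2, z", -37.2, 3.2, False),
    ("-x+1, -y+2, -z", -16.5, 1.7, True),
    ("-x+3/2, y-1/2, z", -9.1, 3.0, False),
    ("x-1, y, z", -1.1, 2.0, False),
    ("-x+5/2, y-1/2, z", 0.5, 2.4, False),
    ("-x+2, y-1/2, -z+1/2", 2.4, 1.8, False),
    ("x-1/2, y, -z+1/2", 6.2, 3.1, False),
]


def weighted_sum(entries) -> float:
    """Sum of dimer energies, involutional operators counted as one half."""
    return sum((0.5 if involutional else 1.0) * energy
               for _, energy, _, involutional in entries)


def relative_error(energy: float, ssd: float) -> float:
    """SSD divided by the magnitude of the energy."""
    return ssd / abs(energy)


# Dimers whose uncertainty exceeds the magnitude of the energy itself.
uninterpretable = [op for op, e, ssd, _ in dimers if ssd > abs(e)]

if __name__ == "__main__":
    print("sum:", round(weighted_sum(dimers), 2))
    print("largest dimer:", round(relative_error(-62.2, 4.2), 3))
    print("sum (SSD from Table 5):", round(relative_error(-77.7, 14.8), 3))
    print("SSD larger than |Eelec|:", uninterpretable)
```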

Examples of Gaussian-like population histograms for electrostatic interaction energies are shown in the supporting information for symmetry operations (x − 1, y, z) and (−x, −y + 2, −z) (Fig. S2).

4.8. Parameters to take into account

In the method proposed, the generation of a series of deviating models is done by the calculation of the square root matrix S [equation (11[link])] which is obtained after diagonalization of the full normal matrix A whatever the derived property of interest. In practice, the procedure bears some similarity to a refinement step [equation (3[link])], but the inverted normal matrix B is replaced by its square root S [equations (11[link]) and (12[link])] and the vector V is replaced by random numbers R.
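
The generation step described above can be sketched with NumPy as follows. This is a minimal illustration and not the MoPro implementation: it assumes that S is taken as the symmetric square root of B = A−1 obtained from the eigen-decomposition of A (any matrix S with S Sᵀ = B would reproduce the same variance–covariance matrix), that R contains independent standard normal numbers, and that any scale factor applied to B in the refinement is already included in A. The function names are hypothetical.

```python
import numpy as np


def sqrt_of_inverse(A: np.ndarray) -> np.ndarray:
    """Symmetric square root S of B = A^-1, obtained by diagonalizing A."""
    eigval, eigvec = np.linalg.eigh(A)  # A = V diag(eigval) V^T
    if np.any(eigval <= 0.0):
        raise ValueError("the normal matrix must be positive definite")
    return eigvec @ np.diag(eigval ** -0.5) @ eigvec.T


def deviating_models(p_refined, A, n_models, rng):
    """Return n_models parameter vectors p_refined + S R, one per row."""
    p_refined = np.asarray(p_refined, dtype=float)
    S = sqrt_of_inverse(np.asarray(A, dtype=float))
    R = rng.standard_normal((n_models, p_refined.size))  # one random vector per model
    return p_refined + R @ S.T


if __name__ == "__main__":
    A = np.array([[4.0, 1.0], [1.0, 3.0]])  # toy normal matrix, illustrative only
    models = deviating_models([0.0, 0.0], A, 20, np.random.default_rng(0))
    print(models.std(axis=0, ddof=1))          # SSDs of the two parameters
    print(np.sqrt(np.diag(np.linalg.inv(A))))  # corresponding e.s.d.'s
```

With only 20 models, the SSDs printed by this toy example are expected to agree with the e.s.d.'s to about one significant digit, in line with the precision discussed for the method.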

The SSDs of the dimers' electrostatic energy obtained from 20 deviating models generated starting with the reduced least-squares normal matrix obtained excluding the contributions of the Uij thermal displacement parameters are also shown in Table 5[link]. Nearly all these SSDs are smaller compared with the standard procedure where the full normal matrix is used. The SSD of the total Eelec value is significantly reduced from 14.8 to 8.9 kJ mol−1 when thermal displacement parameters are excluded from the normal matrix. Although the Uij parameters are not directly involved in the equation describing the static electron density and the electrostatic potential, they do have an impact on the magnitude of SSD values.

This is due to the properties of the inversion of the symmetric positive-definite matrix. It is demonstrated in Appendix B[link] that, when more parameters are refined, the diagonal elements of the inverted normal matrix B = A−1 take larger values. Consequently, when the number of refined parameters is increased, the e.s.d.'s of parameters become larger and the SSDs of derived properties also tend to increase. This is especially the case when there are significant correlations between parameters. Obtaining very high resolution in the diffraction data set tends actually to globally diminish the correlations between parameters (Jelsch et al., 2000[Jelsch, C., Teeter, M. M., Lamzin, V., Pichon-Pesme, V., Blessing, R. H. & Lecomte, C. (2000). Proc. Natl Acad. Sci. USA, 97, 3171-3176.]) and helps in the deconvolution between thermal displacement and charge-density parameters.
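
The effect can be illustrated with the smallest possible case, a symmetric positive-definite 2 × 2 normal matrix with diagonal elements a and b and off-diagonal element c. If only the first parameter is refined its variance is 1/a, whereas refining both gives b/(ab − c2), which is never smaller and grows with the correlation. The sketch below is a toy example with illustrative numbers, not a substitute for the general demonstration of Appendix B.

```python
def variance_reduced(a: float) -> float:
    """Variance of parameter 1 when it is the only refined parameter."""
    if a <= 0.0:
        raise ValueError("a must be positive")
    return 1.0 / a


def variance_full(a: float, b: float, c: float) -> float:
    """Variance of parameter 1 when both parameters are refined,
    i.e. the (1,1) element of the inverse of [[a, c], [c, b]]."""
    det = a * b - c * c
    if a <= 0.0 or det <= 0.0:
        raise ValueError("the matrix must be positive definite")
    return b / det


if __name__ == "__main__":
    a, b = 4.0, 3.0
    for c in (0.0, 1.0, 3.0):  # increasing correlation between the parameters
        print(c, variance_reduced(a), round(variance_full(a, b, c), 4))
```

Without correlation (c = 0) the two variances coincide, which is why the increase in e.s.d.'s is most pronounced when parameters are strongly correlated.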

Some properties may involve only part of the parameters, such as electrostatic interaction energy between molecular fragments. If this property depends only on a few atom parameters, the `square root matrix S calculation' step [equation (11[link])] could in principle be performed considering the reduced normal matrix corresponding only to these specific atomic parameters. This will however lead to an underestimation of SSD values. It is therefore recommended that SSDs are obtained using a full normal matrix issued from all parameters. For this reason, thermal parameters should be taken into account in the normal matrix calculation when generating the perturbed structures, although they do not have a direct impact on the charge density.

5. Conclusion

At the convergence of a least-squares crystallographic refinement against diffraction data, the e.s.d.'s of the parameters used to model the molecular structure and electron density can be directly retrieved. However, the uncertainties on derived molecular properties are not readily available. To estimate the errors of properties, series of models at `standard deviation' from the final refined model can be easily generated by using vectors of random numbers and a square root of the inverted normal matrix. The SSDs obtained for the properties derived from a sample of such deviating structures can be used as estimated values of their uncertainties. For instance, samples of 20 perturbed structures yield SSD values with an expected relative precision of 16%. The average value of properties P in the perturbed models appears to be generally within one SSD from the final refined value; in the case of topological integrated charges and electrostatic energies, it was, for instance, found that [| \langle P\rangle - P_{{\rm refined}}|/{\rm SSD}(P ) \,\lt\, 1.1].

In the BOH2 structure, the SSD of the electron density at the XY bond CPs is in the range 0.01 to 0.04 e Å−3, which represents 0.5 to 2% in relative value. The average SSD on the corresponding Laplacian values is 0.42 e Å−5 and the average relative error SSD(∇2ρ)/|∇2ρ| is 3%. The average uncertainty on the ellipticity ε at the XY bond CPs is found to be around 0.01 and is usually not dependent on the ellipticity value, which ranges here from 0.002 to 0.33. For X—H bonds, the average SSD(ε) is 0.007, while the maximal value is 0.012. For interacting molecular dimers of the BOH2 molecule in the crystal, the error on the electrostatic energy is typically in the 2 to 4 kJ mol−1 range. Intermolecular topological bond paths were found to be stable and preserved in most of the 17 interactions, except for four weak contacts. The SSD of the electrostatic potential on the molecular surface lies between 5 × 10−3 and 2 × 10−2 e Å−1. High absolute values of electrostatic potential, which are usually interpreted as electronegative or electropositive sites, are shown to be significant with high signal-over-noise ratios.

The availability of estimated errors is important for the proper interpretation of experimental charge-density results, for instance, in the comparison of properties among similar chemical groups in a molecule, or of independent molecules in the asymmetric unit. Discrepancies found in the properties of chemically equivalent atoms or of covalent bonds are physically meaningful only if they are significantly larger than the estimated error.

The comparison of closely related but different compounds, such as the comparison of topological properties in different peptides investigated by Flaig et al. (1999[Flaig, R., Koritsánszky, T., Janczak, J., Krane, H.-G., Morgenroth, W. & Luger, P. (1999). Angew. Chem. Int. Ed. 38, 1397-1400.]) and Grabowsky et al. (2008[Grabowsky, S., Pfeuffer, T., Morgenroth, W., Paulmann, C., Schirmeister, T. & Luger, P. (2008). Org. Biomol. Chem. 6, 2295-2307.]), is also more pertinent when an estimation of errors is available.

One should also recall that the actual errors obtained by the SSD method give information about the precision but may not take into account the effects of systematic errors on model accuracy. The structural and charge-density parameters may be driven away from their `true' values to compensate for the systematic errors, while the crystallographic R factors may not be significantly worsened.

APPENDIX A

For any parameter Y following a normal distribution with mean [{\mu _y}] and standard deviation [{\sigma _y}], if a sample of N events, [{({{y_i}} )_{i \in [1; N]}}], is drawn from this distribution, the sample standard deviation s, a biased estimator of [{\sigma _y}], can be defined as

[s = \left[{{\sum _{i = 1}^{N}{({y}_{i}-\langle y\rangle )}^{2}}\over{N-1}}\right]^{1/2}. \eqno (17)]

The quantity [ (N-1){s}^{2}/{\sigma }_{y}^{2}], where [s^2] is the sample variance, follows a [{\chi^2}] probability distribution with [N - 1] degrees of freedom. The standard deviation [{\sigma _{{\chi ^2}}}] of a [{\chi^2}] probability distribution function with k degrees of freedom is

[{\sigma _{{\chi ^2}}} = (2k)^{1/2}. \eqno (18)]

Using this expression with [k = N - 1], and since [s^2 = \sigma_y^2 \chi^2/(N-1)], the standard deviation [{\sigma _{{s^2}}}] of the distribution of the sample variance [s^2] can be derived,

[{\sigma _{{s^2}}} = \left ({2 \over {N - 1}}\right)^{1/2} \sigma _y^2, \eqno (19)]

and, by propagation of the uncertainty, the standard deviation [{\sigma _s}] of the sample standard deviation s distribution is approximated by

[{\sigma }_{s} \simeq {{{\sigma }_{{s}^{2}} }\over{{ 2 \mu }_{s}}} \simeq {{{\sigma }_{y}^{2}}\over{[2(N-1)]^{1/2}{ \mu }_{s}}} \eqno (20)]

where [{\mu _s}] is the mean of the sample standard deviation s distribution.

As an estimator of [{\sigma _y}], the expected value [{\mu _s}] of the sample standard deviation s is well known to be approximately equal to [{\sigma _y}], and thus the relative standard deviation of s becomes

[{{{\sigma }_{s}}\over{{ \mu }_{s}}} \simeq {{1}\over{[2(N-1)]^{1/2}}}. \eqno (21)]
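
Equation (21) can be checked numerically. The sketch below is only an illustration: the function name, the random seed and the number of trials are arbitrary choices. It draws many samples of N = 20 normally distributed values and compares the observed relative spread of s with the predicted value [1/[2(N-1)]^{1/2}], which is about 0.162 for N = 20.

```python
import numpy as np

def relative_precision_of_ssd(n):
    """Approximate relative standard deviation of s for a sample of size n, equation (21)."""
    return 1.0 / np.sqrt(2.0 * (n - 1))

rng = np.random.default_rng(1)
N, trials = 20, 100_000

# Each row is one sample of N events drawn from a standard normal distribution
samples = rng.normal(loc=0.0, scale=1.0, size=(trials, N))
s = samples.std(axis=1, ddof=1)            # sample standard deviation, equation (17)

empirical = s.std(ddof=1) / s.mean()       # observed sigma_s / mu_s
predicted = relative_precision_of_ssd(N)
print(predicted, empirical)
```

The two values agree to within the accuracy expected from the first-order propagation of uncertainty used in equation (20).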

APPENDIX B

The goal of this appendix is to discuss the way, after the convergence of an m+n parameter model refinement, to calculate the e.s.d.'s of only m parameters. The standard way is to derive the e.s.d.'s from the diagonal terms of the inverse matrix of the full (m+n, m+n) normal matrix A. To reduce computational resources, one could deduce e.s.d.'s starting from a principal (m, m) submatrix U of the normal matrix A corresponding to the m parameters considered. This second method will lead to underestimated e.s.d.'s as explained below. This demonstration uses the properties of symmetric positive-definite matrices.

In the simple case of a refinement with only two parameters (x1 and x2), the normal matrix can be written as:

[{\bf A} = \left(\matrix{u^2 & ruw\cr ruw &w^2}\right)\eqno (22)]

where r is the ratio of the weighted scalar product between the two sets of intensity partial derivatives [{{\partial I_{\bf H}^{\rm calc}}/{\partial x_1}}] and [{{\partial I_{\bf H}^{\rm calc}}/{\partial x_2}}] [see equation (4)] to the product of their weighted norms. r can be considered as the cosine of the angle between two vectors and therefore −1 ≤ r ≤ 1.

The inverted normal matrix is then

[{\bf A}^{-1} = {{1}\over{1 - r^2}} \left(\matrix { u^{-2} & -ru^{-1} w^{-1} \cr -ru^{-1} w^{-1} & w^{-2}}\right). \eqno (23)]

One can note that −r is equal to the correlation coefficient between the two parameters. Compared with a refinement of each parameter alone, the variances are divided by [1 - r^2] when the two parameters are refined together, so that the e.s.d.'s are increased by a factor [(1 - r^2)^{-1/2}], illustrating the strong impact of a large parameter correlation.
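
The two-parameter case can be illustrated numerically. In the sketch below, the function name and the values of u, w and r are hypothetical; the code builds the normal matrix of equation (22), inverts it and compares the resulting e.s.d.'s with those obtained when each parameter is treated alone (1/u and 1/w).

```python
import numpy as np

def two_parameter_esds(u, w, r):
    """E.s.d.'s from the 2x2 normal matrix of equation (22), assuming GOF = 1."""
    normal = np.array([[u * u, r * u * w],
                       [r * u * w, w * w]])
    cov = np.linalg.inv(normal)                 # equation (23)
    esd_joint = np.sqrt(np.diag(cov))           # both parameters refined together
    esd_alone = 1.0 / np.array([u, w])          # from the diagonal terms only
    corr = cov[0, 1] / (esd_joint[0] * esd_joint[1])
    return esd_joint, esd_alone, corr

for r in (0.0, 0.5, 0.9, 0.99):
    esd_joint, esd_alone, corr = two_parameter_esds(u=10.0, w=4.0, r=r)
    # The ratio equals (1 - r^2)^(-1/2) and the correlation coefficient equals -r
    print(r, esd_joint / esd_alone, corr)
```

For r = 0.9, for example, the e.s.d.'s are inflated by a factor of about 2.3 with respect to the values deduced from the diagonal terms alone.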

In the general case, let us suppose that the full normal matrix A is invertible and positive-definite, and decompose it into

[{\bf A} = \left(\matrix { {\bf U} & {\bf V} \cr {\bf V}^{\rm t} & {\bf W}}\right) \eqno (24)]

with U and W principal (m,m) and (n,n) submatrices of A corresponding, respectively, to the m parameters of interest and the n extra ones, and V its off-diagonal (m,n) submatrix.

As principal submatrices of the positive-definite matrix A, U and W are positive-definite matrices. The inverse matrix of A, noted [{\bf A}^{-1}], is also positive-definite and can be partitioned into four blocks as

[{\bf A}^{-1} = \left(\matrix { {\bf C} & {\bf D} \cr {\bf D}^{\rm t} & {\bf E}}\right) \eqno (25)]

where C and E are the positive-definite principal (m,m) and (n,n) submatrices, respectively, and D is the off-diagonal (m,n) submatrix.

The product [{\bf A A}^{-1}] yields the identity matrix [{\bf I}_{m+n}]:

[\left(\matrix { {\bf U C} + {\bf V D}^{\rm t} & {\bf U D} + {\bf V E} \cr {\bf V}^{\rm t} {\bf C} + {\bf W D}^{\rm t} & {\bf V}^{\rm t} {\bf D} + {\bf W E}}\right) = {\bf I}_{m+n}. \eqno (26)]

By identification, the following relations apply:

[{\bf V}^{\rm t} {\bf C} + {\bf W D}^{\rm t} = {\bf 0} \eqno (27a)]

[{\bf U D} + {\bf V E} = {\bf 0} \eqno (27b)]

[{\bf U C} + {\bf V D}^{\rm t} = {\bf I}_{m} \eqno (27c)]

[{\bf V}^{\rm t} {\bf D} + {\bf W E} = {\bf I}_{n}. \eqno (27d)]

As mentioned previously, the submatrix U is positive-definite and therefore invertible. Thus, equations (27b) and (27c) imply, respectively, that the matrices D and C satisfy

[{\bf D} = -{\bf U}^{-1} {\bf V E} \eqno (28)]

[{\bf C} = {\bf U}^{-1} - {\bf U}^{-1} {\bf V D}^{\rm t}. \eqno (29)]

Then, the combination of equations (28) and (29), together with the symmetry of U and E, yields

[{\bf C} = {\bf U}^{-1} + {\bf U}^{-1} {\bf V E V}^{\rm t} {\bf U}^{-1}. \eqno (30)]

As a principal submatrix of the positive-definite matrix [{\bf A}^{-1}], E is also a positive-definite matrix and its Cholesky decomposition is written, with L a lower triangular matrix, as follows:

[{\bf E} = {\bf L L}^{\rm t}. \eqno (31)]

Then, the combination of equations (30) and (31) gives

[{\bf C} = {\bf U}^{-1} + {\bf U}^{-1} {\bf V L L}^{\rm t} {\bf V}^{\rm t} {\bf U}^{-1}. \eqno (32)]

Let us define the (m,n) matrix M as [{\bf M} = {\bf U}^{-1} {\bf V L}]; equation (32) becomes

[{\bf C} = {\bf U}^{-1} + {\bf M M}^{\rm t}. \eqno (33)]

The diagonal elements of the product matrix [{\bf M M}^{\rm t}], noted T, are non-negative numbers:

[T_{ii} = \textstyle\sum\limits_{j = 1,n} M^{2}_{ij}\ge 0. \eqno (34)]

Therefore, the first m diagonal elements of [{\bf A}^{-1}] are larger than or equal to those of matrix [{\bf U}^{-1}]:

[A^{-1}_{ii} = U^{-1}_{ii} + T_{ii} \ge U^{-1}_{ii}. \eqno (35)]

Assuming GOF = 1, equation (35) implies the following inequality between the e.s.d.'s of the m parameters of interest derived from the full normal matrix A, [{\rm e.s.d.}^A_{ii} = (A^{-1}_{ii})^{1/2}], and those derived from the reduced matrix U, [{\rm e.s.d.}^U_{ii} = (U^{-1}_{ii})^{1/2}]:

[{\rm e.s.d.}^A_{ii} \ge {\rm e.s.d.}^U_{ii}. \eqno (36)]

In other words, deducing the e.s.d.'s on the m parameters from a reduced normal matrix U leads to an underestimation of uncertainties. This conclusion is valid whatever the subset of parameters considered. One can note that in the particular case in which there were no correlations between the m parameters and the remaining n others, the matrix V would be equal to zero. As a result, A would be a block diagonal matrix, M = 0 and [A^{-1}_{ii} = U^{-1}_{ii}]: the e.s.d.'s of these parameters would not be underestimated if deduced from the reduced normal matrix U.
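
The general result can be verified on a small numerical example. The sketch below is an illustration under stated assumptions: the normal matrix is built from a hypothetical random derivative matrix J with unit weights, and the sizes m = 3 and n = 4 are arbitrary. It checks equations (30) and (33) and the inequality (36).

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 4                                   # parameters of interest, extra parameters

# Hypothetical derivative matrix (50 observations, unit weights); A = J^t J is then
# symmetric positive-definite as long as J has full column rank.
J = rng.standard_normal((50, m + n))
A = J.T @ J

U, V = A[:m, :m], A[:m, m:]                   # blocks of equation (24)
A_inv = np.linalg.inv(A)
C, E = A_inv[:m, :m], A_inv[m:, m:]           # blocks of equation (25)
U_inv = np.linalg.inv(U)

C_from_blocks = U_inv + U_inv @ V @ E @ V.T @ U_inv   # equation (30)

L = np.linalg.cholesky(E)                     # equation (31)
M = U_inv @ V @ L
T = M @ M.T                                   # equation (33): C = U^-1 + M M^t

esd_full = np.sqrt(np.diag(C))                # from the full normal matrix
esd_reduced = np.sqrt(np.diag(U_inv))         # from the reduced matrix U only
print(esd_full / esd_reduced)                 # every ratio is >= 1, equation (36)
```

Setting the off-diagonal block V to zero in such a test makes the two sets of e.s.d.'s coincide, in agreement with the uncorrelated case discussed above.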

Supporting information


Computing details

Program(s) used to refine structure: MoPro (J. Appl. Cryst. 2005, 38, 38-54).

(E)-5-Phenyl-pent-1-enyl-boronic acid

Crystal data
C11H15BO2
Mr = 190.04
Orthorhombic, Pbca
Hall symbol: -p_2ac_2ab
a = 7.5200 (1) Å
b = 9.3837 (1) Å
c = 30.7120 (5) Å
V = 2167.21 (5) Å³
Z = 8
F(000) = 816
Dx = 1.165 Mg m⁻³
Mo Kα radiation, λ = 0.71073 Å
Cell parameters from 50963 reflections
θ = 2.6–60.0°
µ = 0.08 mm⁻¹
T = 100 K
Block, colourless
0.34 × 0.18 × 0.10 mm

Data collection
Rigaku MicroMax-HF Pilatus 200K diffractometer
Radiation source: fine-focus sealed tube
Confocal Max Flux optic monochromator
Detector resolution: 5.8140 pixels mm⁻¹
Fullsphere data collection, phi and ω scans
Absorption correction: multi-scan. CrysAlisPro 1.171.38.37f (Rigaku Oxford Diffraction, 2015). Empirical absorption correction using spherical harmonics, implemented in SCALE3 ABSPACK scaling algorithm.
Tmin = 0.472, Tmax = 0.999
117942 measured reflections
11803 independent reflections
10496 reflections with I > 2.0σ(I)
Rint = 0.031
θmax = 52.2°, θmin = 2.7°
h = 0→16
k = 0→20
l = 0→68

Refinement
Refinement on F²
Least-squares matrix: full
R[F² > 2σ(F²)] = 0.026
wR(F²) = 0.039
S = 0.95
14958 reflections
574 parameters
0 restraints
Primary atom site location: structure-invariant direct methods
Secondary atom site location: difference Fourier map
Hydrogen site location: difference Fourier map
All H-atom parameters refined
w = 1/[3.3σ²(Fo²)]
(Δ/σ)max < 0.001
Δρmax = 0.32 e Å⁻³
Δρmin = −0.16 e Å⁻³
Special details

Experimental. CrysAlisPro 1.171.38.37f (Rigaku Oxford Diffraction, 2015) Empirical absorption correction using spherical harmonics, implemented in SCALE3 ABSPACK scaling algorithm.

Refinement. Refinement of F² against reflections. The threshold expression of F² > 2σ(F²) is used for calculating R-factors(gt) and is not relevant to the choice of reflections for refinement. R-factors based on F² are statistically about twice as large as those based on F, and R-factors based on ALL data will be even larger.

Fractional atomic coordinates and isotropic or equivalent isotropic displacement parameters (Å²)
Atom  x  y  z  Uiso*/Ueq
B1  0.251627 (19)  0.997843 (15)  0.028837 (5)  0.011002 (12)
C1  0.96386 (2)  0.89957 (2)  0.172477 (6)  0.015083 (13)
C2  0.92648 (3)  0.97351 (3)  0.211051 (7)  0.019388 (17)
C3  1.02734 (3)  1.09177 (3)  0.223317 (8)  0.02290 (2)
C4  1.16784 (3)  1.13760 (3)  0.197292 (8)  0.022720 (19)
C5  1.20751 (3)  1.06375 (3)  0.159143 (8)  0.021551 (17)
C6  1.10594 (3)  0.94602 (2)  0.146808 (7)  0.018120 (16)
C7  0.85128 (3)  0.77401 (2)  0.159105 (8)  0.018915 (15)
C8  0.67225 (3)  0.81543 (2)  0.138779 (7)  0.016543 (14)
C9  0.68873 (2)  0.88966 (2)  0.094495 (6)  0.015373 (13)
C10  0.51313 (2)  0.893953 (19)  0.071361 (5)  0.014376 (13)
C11  0.43514 (2)  1.006991 (19)  0.052461 (6)  0.012834 (13)
H2  0.819 (3)  0.9396 (12)  0.2309 (6)  0.0371 (12)
H3  0.9964 (12)  1.1450 (18)  0.2528 (10)  0.0408 (13)
H4  1.242 (2)  1.230 (3)  0.2063 (3)  0.0396 (12)
H5  1.313 (3)  1.0982 (12)  0.1390 (7)  0.0369 (13)
H6  1.1366 (11)  0.8916 (18)  0.1174 (9)  0.0332 (11)
H10  0.4449 (18)  0.796 (3)  0.07101 (15)  0.0307 (10)
H11  0.5012 (16)  1.105 (2)  0.05400 (12)  0.0276 (8)
H71  0.922 (2)  0.712 (2)  0.1362 (8)  0.0356 (10)
H72  0.8237 (10)  0.711 (2)  0.1870 (9)  0.0378 (11)
H81  0.598 (2)  0.720 (3)  0.13438 (19)  0.0355 (10)
H82  0.602 (2)  0.881 (2)  0.1604 (7)  0.0327 (10)
H91  0.7387 (16)  0.992 (3)  0.09795 (18)  0.0289 (10)
H92  0.778 (3)  0.8308 (18)  0.0752 (6)  0.0320 (10)
H1O  0.226 (2)  1.200 (3)  0.0175 (2)  0.0252 (11)
H2O  0.058 (5)  0.8762 (5)  0.0102 (6)  0.0264 (12)
O1  0.163964 (19)  1.115603 (16)  0.013417 (5)  0.012248 (10)
O2  0.16948 (2)  0.868002 (16)  0.023395 (6)  0.014655 (11)
Atomic displacement parameters (Å²)
Atom  U11  U22  U33  U12  U13  U23
B1  0.00972 (5)  0.00844 (5)  0.01485 (6)  0.00044 (4)  0.00137 (4)  0.00029 (4)
C1  0.01166 (5)  0.01836 (6)  0.01523 (6)  0.00001 (5)  0.00236 (5)  0.00324 (5)
C2  0.01600 (6)  0.02737 (8)  0.01479 (7)  0.00194 (6)  0.00015 (5)  0.00141 (6)
C3  0.02169 (8)  0.03025 (10)  0.01677 (8)  0.00307 (8)  0.00256 (7)  0.00358 (8)
C4  0.01826 (7)  0.02603 (9)  0.02388 (9)  0.00496 (7)  0.00482 (7)  0.00090 (7)
C5  0.01485 (7)  0.02505 (9)  0.02475 (9)  0.00372 (6)  0.00160 (6)  0.00188 (7)
C6  0.01488 (6)  0.02091 (7)  0.01856 (7)  0.00037 (6)  0.00202 (5)  0.00102 (6)
C7  0.01656 (6)  0.01716 (6)  0.02302 (8)  0.00075 (5)  0.00683 (6)  0.00576 (6)
C8  0.01272 (6)  0.02000 (7)  0.01691 (7)  0.00187 (5)  0.00238 (5)  0.00355 (5)
C9  0.01142 (5)  0.01929 (6)  0.01541 (6)  0.00064 (5)  0.00238 (4)  0.00201 (5)
C10  0.01289 (5)  0.01346 (5)  0.01678 (6)  0.00061 (4)  0.00458 (4)  0.00010 (4)
C11  0.01071 (5)  0.01216 (6)  0.01563 (6)  0.00073 (4)  0.00272 (5)  0.00079 (4)
H2  0.030 (5)  0.053 (5)  0.028 (5)  0.008 (2)  0.006 (2)  0.003 (2)
H3  0.040 (5)  0.053 (6)  0.030 (6)  0.000 (2)  0.002 (2)  0.015 (3)
H4  0.036 (5)  0.036 (5)  0.047 (5)  0.011 (3)  0.009 (2)  0.006 (2)
H5  0.025 (5)  0.049 (5)  0.036 (5)  0.011 (2)  0.004 (3)  0.006 (2)
H6  0.029 (5)  0.043 (5)  0.027 (5)  0.0004 (18)  0.002 (2)  0.007 (3)
H10  0.029 (4)  0.019 (4)  0.044 (5)  0.004 (2)  0.0095 (17)  0.0027 (17)
H11  0.025 (3)  0.021 (4)  0.037 (4)  0.007 (2)  0.0035 (14)  0.0028 (15)
H71  0.030 (4)  0.037 (4)  0.040 (5)  0.004 (2)  0.007 (2)  0.007 (3)
H72  0.035 (4)  0.041 (4)  0.037 (5)  0.0087 (18)  0.0088 (19)  0.017 (3)
H81  0.034 (4)  0.027 (4)  0.046 (5)  0.009 (2)  0.0125 (18)  0.0084 (18)
H82  0.030 (4)  0.039 (4)  0.029 (4)  0.004 (2)  0.002 (2)  0.002 (2)
H91  0.030 (4)  0.025 (5)  0.032 (4)  0.006 (2)  0.0037 (15)  0.0017 (16)
H92  0.028 (4)  0.038 (4)  0.031 (4)  0.011 (2)  0.001 (2)  0.005 (2)
H1O  0.023 (4)  0.016 (5)  0.036 (5)  0.004 (3)  0.0049 (17)  0.0012 (18)
H2O  0.021 (5)  0.017 (4)  0.040 (4)  0.0015 (16)  0.010 (3)  0.0012 (15)
O1  0.01025 (4)  0.00765 (4)  0.01884 (6)  0.00014 (4)  0.00214 (4)  0.00036 (3)
O2  0.01175 (5)  0.00762 (4)  0.02459 (7)  0.00039 (4)  0.00509 (5)  0.00004 (4)
Geometric parameters (Å, º)
B1—O1  1.3711 (2)    C7—C8  1.5340 (3)
B1—O2  1.3762 (2)    C7—H71  1.06 (3)
B1—C11  1.5614 (2)    C7—H72  1.06 (3)
C1—C7  1.5079 (3)    C8—C9  1.5331 (3)
C1—C6  1.3975 (3)    C8—H82  1.05 (3)
C1—C2  1.4014 (3)    C8—H81  1.06 (3)
C2—C3  1.3959 (4)    C9—C10  1.5001 (2)
C2—H2  1.06 (3)    C9—H91  1.04 (3)
C3—C4  1.3929 (4)    C9—H92  1.05 (3)
C3—H3  1.06 (4)    C10—C11  1.3439 (2)
C4—C5  1.3935 (4)    C10—H10  1.06 (3)
C4—H4  1.07 (3)    C11—H11  1.04 (2)
C5—C6  1.3955 (3)    H1O—O1  0.93 (4)
C5—H5  1.06 (3)    H2O—O2  0.94 (4)
C6—H6  1.06 (3)
O1—B1—O2  117.113 (12)    C8—C7—H71  108.1 (6)
O1—B1—C11  122.759 (12)    C8—C7—H72  107.3 (6)
O2—B1—C11  120.118 (13)    H71—C7—H72  109.1 (9)
C7—C1—C6  121.287 (16)    C9—C8—C7  113.902 (14)
C7—C1—C2  120.296 (18)    C9—C8—H82  109.7 (7)
C6—C1—C2  118.414 (17)    C9—C8—H81  108.2 (5)
C1—C2—C3  120.846 (19)    C7—C8—H82  109.5 (7)
C1—C2—H2  119.3 (9)    C7—C8—H81  107.6 (5)
C3—C2—H2  119.8 (9)    H82—C8—H81  107.8 (8)
C4—C3—C2  120.18 (2)    C8—C9—C10  111.174 (13)
C4—C3—H3  121 (1)    C8—C9—H91  111.1 (6)
C2—C3—H3  119.1 (9)    C8—C9—H92  108.2 (6)
C3—C4—C5  119.43 (2)    C10—C9—H91  110.0 (6)
C3—C4—H4  120.0 (8)    C10—C9—H92  108.0 (6)
C5—C4—H4  120.6 (8)    H91—C9—H92  108.3 (8)
C4—C5—C6  120.31 (2)    C9—C10—C11  127.586 (13)
C4—C5—H5  120.0 (9)    C9—C10—H10  114.2 (6)
C6—C5—H5  119.7 (9)    C11—C10—H10  118.2 (7)
C1—C6—C5  120.812 (17)    C10—C11—B1  122.888 (13)
C1—C6—H6  119.7 (9)    C10—C11—H11  117.8 (6)
C5—C6—H6  119.5 (9)    B1—C11—H11  119.3 (7)
C1—C7—C8  113.930 (14)    B1—O1—H1O  113.6 (9)
C1—C7—H71  109.2 (7)    B1—O2—H2O  112.5 (9)
C1—C7—H72  109.1 (6)
B1—C11—C10—C9  179.41 (3)    C7—C8—C9—H92  47 (1)
B1—C11—C10—H10  2.0 (4)    C8—C9—C10—C11  131.14 (4)
C1—C7—C8—C9  67.25 (3)    C8—C9—C10—H10  47 (1)
C1—C7—C8—H82  56 (2)    C9—C8—C7—H71  54 (2)
C1—C7—C8—H81  172.9 (4)    C9—C8—C7—H72  171.8 (4)
C1—C6—C5—C4  0.39 (3)    C9—C10—C11—H11  0.9 (3)
C1—C6—C5—H5  179.1 (3)    C10—C9—C8—H82  71 (2)
C1—C2—C3—C4  0.32 (3)    C10—C9—C8—H81  46 (1)
C1—C2—C3—H3  179.6 (3)    C10—C11—B1—O1  173.27 (3)
C2—C1—C7—C8  78.15 (3)    C10—C11—B1—O2  5.59 (3)
C2—C1—C7—H71  161.0 (8)    C11—C10—C9—H91  7.7 (4)
C2—C1—C7—H72  42 (1)    C11—C10—C9—H92  110 (2)
C2—C1—C6—C5  0.40 (3)    C11—B1—O1—H1O  2.1 (4)
C2—C1—C6—H6  179.9 (3)    C11—B1—O2—H2O  177.0 (4)
C2—C3—C4—C5  0.48 (3)    H2—C2—C3—H3  1.2 (5)
C2—C3—C4—H4  178.1 (3)    H3—C3—C4—H4  2.6 (5)
C3—C4—C5—C6  0.83 (3)    H4—C4—C5—H5  0.9 (6)
C3—C4—C5—H5  179.5 (3)    H5—C5—C6—H6  0.4 (5)
C3—C2—C1—C7  178.62 (3)    H10—C10—C9—H91  170.9 (6)
C3—C2—C1—C6  0.75 (3)    H10—C10—C9—H92  71 (2)
C4—C3—C2—H2  179.5 (3)    H10—C10—C11—H11  177.7 (5)
C4—C5—C6—H6  179.1 (3)    H11—C11—B1—O1  6.4 (4)
C5—C4—C3—H3  178.8 (3)    H11—C11—B1—O2  174.7 (4)
C5—C6—C1—C7  178.97 (3)    H71—C7—C8—H82  177.5 (5)
C6—C1—C7—C8  101.20 (3)    H71—C7—C8—H81  66 (2)
C6—C1—C7—H71  19.7 (8)    H72—C7—C8—H82  65 (2)
C6—C1—C7—H72  139 (1)    H72—C7—C8—H81  52 (1)
C6—C1—C2—H2  180.0 (3)    H81—C8—C9—H91  169.0 (6)
C6—C5—C4—H4  177.8 (3)    H81—C8—C9—H92  72 (2)
C7—C1—C6—H6  0.5 (3)    H82—C8—C9—H91  52 (1)
C7—C1—C2—H2  0.6 (3)    H82—C8—C9—H92  170.3 (6)
C7—C8—C9—C10  165.61 (3)    H1O—O1—B1—O2  179.0 (4)
C7—C8—C9—H91  71 (2)    H2O—O2—B1—O1  2.0 (4)
Hydrogen-bond geometry (Å, º)
D—H···A  D—H  H···A  D···A  D—H···A
C10—H10···O1(i)  1.06 (3)  2.58 (2)  3.4297 (2)  137 (1)
O1—H1O···O2(ii)  0.93 (4)  1.77 (4)  2.6967 (2)  176 (1)
O2—H2O···O1(iii)  0.94 (4)  1.82 (4)  2.7549 (2)  177 (1)
Symmetry codes: (i) x+1/2, y−1/2, z; (ii) x+1/2, y+1/2, z; (iii) x, y+2, z.
 

Acknowledgements

We would like to thank Dr Alejandro Díaz-Moscoso for the synthesis of the compound.

References

Allen, F. H. & Bruno, I. J. (2010). Acta Cryst. B66, 380–386.
Bader, R. F. W. (1990). Atoms in Molecules: a Quantum Theory, 1st ed. International Series of Monographs on Chemistry 22. Oxford: Clarendon Press.
Bader, R. F. W., Carroll, M. T., Cheeseman, J. R. & Chang, C. (1987). J. Am. Chem. Soc. 109, 7968–7979.
Brünger, A. T. (1992). Nature, 355, 472–475.
Burla, M. C., Caliandro, R., Carrozzini, B., Cascarano, G. L., Cuocci, C., Giacovazzo, C., Mallamo, M., Mazzone, A. & Polidori, G. (2015). J. Appl. Cryst. 48, 306–309.
Côté, A. P., El-Kaderi, H. M., Furukawa, H., Hunt, J. R. & Yaghi, O. M. (2007). J. Am. Chem. Soc. 129, 12914–12915.
Ding, X. S., Guo, J., Feng, X. A., Honsho, Y., Guo, J. D., Seki, S., Maitarad, P., Saeki, A., Nagase, S. & Jiang, D. L. (2011). Angew. Chem. Int. Ed. 50, 1289–1293.
Dittrich, B., Koritsánszky, T., Grosche, M., Scherer, W., Flaig, R., Wagner, A., Krane, H. G., Kessler, H., Riemer, C., Schreurs, A. M. M. & Luger, P. (2002). Acta Cryst. B58, 721–727.
Domagała, S. & Jelsch, C. (2008). J. Appl. Cryst. 41, 1140–1149.
Espinosa, E., Molins, E. & Lecomte, C. (1998). Chem. Phys. Lett. 285, 170–173.
Flaig, R., Koritsánszky, T., Janczak, J., Krane, H.-G., Morgenroth, W. & Luger, P. (1999). Angew. Chem. Int. Ed. 38, 1397–1400.
Förster, D., Wagner, A., Hübschle, C. B., Paulmann, C. & Luger, P. (2006). Z. Naturforsch. B, 62, 696–704.
Gelbrich, T., Sampson, D. & Hursthouse, M. B. (2000). University of Southampton, Crystal Structure Report Archive.
Grabowsky, S., Pfeuffer, T., Morgenroth, W., Paulmann, C., Schirmeister, T. & Luger, P. (2008). Org. Biomol. Chem. 6, 2295–2307.
Guillot, B., Enrique, E., Huder, L. & Jelsch, C. (2014). Acta Cryst. A70, C279.
Guillot, B., Viry, L., Guillot, R., Lecomte, C. & Jelsch, C. (2001). J. Appl. Cryst. 34, 214–223.
Hamilton, W. C. (1964). Statistics in Physical Science. New York, USA: The Ronald Press Company.
Hansen, N. K. & Coppens, P. (1978). Acta Cryst. A34, 909–921.
Jelsch, C., Guillot, B., Lagoutte, A. & Lecomte, C. (2005). J. Appl. Cryst. 38, 38–54.
Jelsch, C., Teeter, M. M., Lamzin, V., Pichon-Pesme, V., Blessing, R. H. & Lecomte, C. (2000). Proc. Natl Acad. Sci. USA, 97, 3171–3176.
Kamiński, R., Domagała, S., Jarzembska, K. N., Hoser, A. A., Sanjuan-Szklarz, W. F., Gutmann, M. J., Makal, A., Malińska, M., Bąk, J. M. & Woźniak, K. (2014). Acta Cryst. A70, 72–91.
Kinderman, A. J. & Monahan, J. F. (1977). ACM Trans. Math. Software (TOMS), 3, 257–260.
Kissel, C., Speranza, F. & Milicevic, V. (1995). J. Geophys. Res. 100(B8), 14999–15007.
Krause, L., Niepötter, B., Schürmann, C. J., Stalke, D. & Herbst-Irmer, R. (2017). IUCrJ, 4, 420–430.
Leva, J. L. (1992). ACM Trans. Math. Software, 18, 449–453.
Madsen, A. Ø. & Hoser, A. A. (2014). J. Appl. Cryst. 47, 2100–2104.
Messerschmidt, M., Scheins, S. & Luger, P. (2005). Acta Cryst. B61, 115–121.
Perdew, J. P., Kurth, S., Zupan, A. & Blaha, P. (1999). Phys. Rev. Lett. 82, 2544–2547.
Rigaku (2013). CrystalClear SM Expert 2.1 b29. Rigaku Corporation, Tokyo, Japan.
Rigaku Oxford Diffraction (2015). CrysAlisPro 1.171.38.37f. Rigaku Oxford Diffraction, Yarnton, England.
Sheldrick, G. M. (2015). Acta Cryst. C71, 3–8.
Spitler, E. L. & Dichtel, W. R. (2010). Nat. Chem. 2, 672–677.
Stachowicz, M., Malinska, M., Parafiniuk, J. & Woźniak, K. (2017). Acta Cryst. B73, 643–653.
Su, Z. & Coppens, P. (1998). Acta Cryst. A54, 646–652.
Tang, W., Sanville, E. & Henkelman, G. (2009). J. Phys. Condens. Matter, 21, 084204.
Volkov, A., Koritsanszky, T. & Coppens, P. (2004). Chem. Phys. Lett. 391, 170–175.
Volkov, A., Macchi, P., Farrugia, L. J., Gatti, C., Mallinson, P. R., Richter, T. & Koritsanszky, T. (2006). XD2006. Revision 5.34. University of New York at Buffalo, New York, USA.
Yang, W. Q., Gao, X. M. & Wang, B. H. (2003). Med. Res. Rev. 23, 346–368.
Zarychta, B., Zaleski, J., Kyzioł, J., Daszkiewicz, Z. & Jelsch, C. (2011). Acta Cryst. B67, 250–262.
Zhurov, V. V., Zhurova, E. A. & Pinkerton, A. A. (2008). J. Appl. Cryst. 41, 340–349.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
