Cryo-EM single-particle structure refinement and map calculation using Servalcat

Yamashita, K.; Palmer, C.M.; Burnley, T.; Murshudov, G.N.

doi:10.1107/S2059798321009475

research papers

STRUCTURAL BIOLOGY

ISSN: 2059-7983

Volume 77 | Part 10 | October 2021 | Pages 1282-1291


Open access


Keitaro Yamashita,^a ^* Colin M. Palmer,^b Tom Burnley ^b and Garib N. Murshudov ^a ^*

^aMRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom, and ^bScientific Computing Department, UKRI Science and Technology Facilities Council, Rutherford Appleton Laboratory, Harwell Campus, Didcot OX11 0FA, United Kingdom
^*Correspondence e-mail: kyamashita@mrc-lmb.cam.ac.uk, garib@mrc-lmb.cam.ac.uk

Edited by A. Perrakis, Netherlands Cancer Institute, The Netherlands (Received 4 May 2021; accepted 11 September 2021; online 29 September 2021)

In 2020, cryo-EM single-particle analysis achieved true atomic resolution thanks to technological developments in hardware and software. The number of high-resolution reconstructions continues to grow, increasing the importance of the accurate determination of atomic coordinates. Here, a new Python package and program called Servalcat is presented that is designed to facilitate atomic model refinement. Servalcat implements a refinement pipeline using the program REFMAC5 from the CCP4 package. After the refinement, Servalcat calculates a weighted F_o − F_c difference map, which is derived from Bayesian statistics. This map helps manual and automatic model building in real space, as is common practice in crystallography. The F_o − F_c map helps in the visualization of weak features including hydrogen densities. Although hydrogen densities are weak, they are stronger than in the electron-density maps produced by X-ray crystallography, and some H atoms are even visible at ∼1.8 Å resolution. Servalcat also facilitates atomic model refinement under symmetry constraints. If point-group symmetry has been applied to the map during reconstruction, the asymmetric unit model is refined with the appropriate symmetry constraints.

Keywords: cryo-EM; structure refinement; REFMAC5; Servalcat.

1. Notation

F_T: Fourier transform of unknown true map (complex values).

F_n: Fourier transform of noise in the observed map (complex values).

F_o1, F_o2: Fourier transforms of the two unweighted and unsharpened half maps from independent reconstructions (complex values).

F_o: Fourier transform of the observed full map, (F_o1 + F_o2)/2.

F_c: Fourier transform of calculated map from atomic coordinates (complex values).

E: structure factors normalized in resolution bins, F/(〈|F|²〉)^1/2.

k: resolution-dependent scale factor between F_o and F_T.

D: resolution-dependent scale factor between F_o and F_c.

$[\sigma_{\rm T}^{2}]$ : variance of signal, var(F_T).

$[\sigma_{\rm n}^{2}]$ : variance of noise, var(F_n).

$[\sigma_{\rm U,T}^{2}]$ : variance of unexplained signal, var(DF_c − kF_T).

f: atomic scattering factor.

s: column vector of position in reciprocal space.

s^T: row vector of position in reciprocal space.

x: column vector of position in real space.

(R, t): rotation matrix and translation vector that could be an element of a point group.

B: displacement parameter of an atom, or blurring parameter for a local or global region of a map. A real value (isotropic case) or a 3 × 3 symmetric matrix (anisotropic case). Usually B is isotropic and atomic unless otherwise stated. Also called an atomic displacement parameter (ADP) if associated with an atom.

Unless otherwise stated, all quantities in Fourier space are dependent on s.

2. Introduction

Atomic model refinement is the optimization of the model's parameters against the observed data. Atomic parameters typically include coordinates, atomic displacement parameters (ADPs) and occupancies. In crystallography, refinement is crucial because of the phase problem: the accuracy of density maps relies on the accuracy of the phases of the structure factors. Accurate phases are not observed and must be calculated from the model (Tronrud, 2004). More accurate maps may be obtained as the model becomes more accurate through the refinement. In single-particle analysis (SPA) there is no phase problem, although the Fourier coefficients can be noisy, especially at high resolution.

Accurate atomic model determination is becoming more and more important due to the 'resolution revolution' in cryo-EM SPA following the introduction of direct electron detectors and new data-processing methods (Bai et al., 2015). As of April 2021, more than 2500 SPA entries with resolutions better than 3.5 Å have been deposited in the Electron Microscopy Data Bank (EMDB; Tagari et al., 2002). This improvement in resolution has accelerated the development of methods for model building, refinement and validation. Automatic model-building programs that were originally developed for crystallography are now being adapted for cryo-EM SPA maps (Terwilliger, Adams et al., 2018; Hoh et al., 2020; Chojnowski et al., 2021). Density modification and local map sharpening can help to interpret the map (Jakobi et al., 2017; Terwilliger, Sobolev et al., 2018; Ramírez-Aportela et al., 2019; Ramlaul et al., 2019; Terwilliger et al., 2020). In general, care must be exercised when using any techniques based on prior knowledge; bias towards incorrect assumptions might lead to misinterpretation of the maps. Full-atom refinement can be performed either in real space (Afonine et al., 2018) or in reciprocal space (Murshudov, 2016).

After refinement, the model should be validated; the model should have a reasonable geometry and should describe the map well. Due to the low data-to-parameter ratio, all models will exhibit a degree of overfitting; however, the model should not deviate substantially from cross-validation data (Brown et al., 2015). MolProbity is the most widely used geometry validation tool, and includes analyses of clashes, rotamers and the Ramachandran plot (Chen et al., 2010). Map–model quality is assessed using real-space local correlations (Cragnolini et al., 2021), which have commonly been used in crystallography (Tickle, 2012). In reciprocal-space refinement, the R factor can be calculated as in crystallography, but the map–model Fourier shell correlation (FSC) is preferred as it does not depend on resolution-dependent scaling and takes phases into account explicitly. An F_o − F_c map, which highlights unmodelled features and errors in the current model, is almost always used in crystallography, and some similar tools already exist for SPA (Joseph et al., 2020). The σ_A-weighted (m|F_o| − D|F_c|)exp(iφ_c) map as used in crystallography is not directly applicable to SPA, because phases are available for both F_o and F_c and we should model the error of F_o in the complex plane, rather than simply using the estimated phase error as in crystallography (see below).

In 2020, cryo-EM SPA achieved atomic resolution, according to Sheldrick's criterion (Wlodawer & Dauter, 2017), in structural analyses of apoferritin, which were reported by two groups (Nakane et al., 2020; Yip et al., 2020). Nakane et al. (2020) observed H-atom densities at 1.2 and 1.7 Å resolutions using F_o − F_c maps calculated by REFMAC5. There is a higher chance of observing hydrogen density in electron microscopy than in X-ray crystallography because of the increased contrast for the lighter elements (Clabbers & Abrahams, 2018). Nevertheless, hydrogen density is relatively weak and there is always a much higher peak from the parent atom nearby, so the F_o − F_c difference map is essential to see it. In addition, there is complexity in the interpretation of hydrogen peaks in EM. An electron in an H atom is usually shifted towards the parent atom from the nucleus position. In EM, both the electrons and the nucleus contribute to scattering, and this offset results in a shift of hydrogen density peaks beyond the position of the hydrogen nucleus (Nakane et al., 2020).

SPA structures often have point-group symmetries (rather than space-group symmetry as in crystallography). Approximately half of the SPA entries in the EMDB have non-C1 point-group symmetry according to their associated metadata. Such symmetry is advantageous and helps to reach higher resolution because it increases the effective number of particles. If the map is symmetrized, downstream analyses should be aware of it and the structural model must follow the symmetry. As in crystallography, it is natural to work in a single asymmetric unit. The MTRIX records in the PDB format or _struct_ncs_oper in the mmCIF format can be used to encode the symmetry information.¹ Currently, for structures from SPA there are only a few depositions of such asymmetric unit models in the PDB (excepting viruses). We recommend refining and depositing an asymmetric unit model, which makes sure the symmetry copies are truly identical. It should be noted that validation tools must be aware of any applied symmetry operators, but results should be reported for the asymmetric unit only. These considerations are only valid if the map is symmetrized, and we suggest that the point-group information should be required by the deposition system.

Here, we present Servalcat, a Python package and standalone program for the refinement and map calculation of cryo-EM SPA structures. Servalcat takes unsharpened and unweighted half maps of the independent reconstructions as inputs and implements a refinement pipeline using REFMAC5, which uses a dedicated likelihood function for SPA (Murshudov, 2016). After the refinement, Servalcat calculates a sharpened and weighted F_o − F_c map derived from Bayesian statistics as described below. If the map has point-group symmetry, the user can give an asymmetric unit model and a point-group symbol, and the program will output a refined asymmetric unit model with symmetry annotation as well as a symmetry-expanded model. The noncrystallographic symmetry (NCS) constraint function in REFMAC5 has been updated to consider symmetry-related nonbonded interactions and ADP similarity restraints (to ensure the similarity of ADPs of atoms brought into close proximity via symmetry operations).

Servalcat is freely available as a standalone package and also as part of CCP-EM (Burnley et al., 2017), where the REFMAC5 interface has been updated to use Servalcat.

3. Map calculation and sharpening using signal variance

Let us assume that F_o is the result of a position-independent blurring k of the true Fourier coefficients F_T with an independent zero-mean Gaussian noise with variance $[\sigma_{\rm n}^{2}]$ . That is,

$[p(F_{\rm o}\semi F_{\rm T}) = {{1} \over {\pi\sigma_{\rm n}^{2}}} \exp(-|F_{\rm o}-kF_{\rm T}|^{2}/\sigma_{\rm n}^{2}), \eqno (1)]$

$[{\rm var}(F_{\rm o}) = k^{2}\cdot{\rm var}(F_{\rm T})+\sigma_{\rm n}^{2} = k^{2} \sigma_{\rm T}^{2}+\sigma_{\rm n}^{2}. \eqno (2)]$

Note that in this work we treat k as a function of resolution |s|. Multiplication by k in Fourier space is equivalent to isotropic blurring by a convolution in real space. In general, k could take on a different value at each point s in Fourier space, which would produce a position-independent but direction-dependent blurring in real space.

The variance of the noise ( $[\sigma_{\rm n}^{2}]$ ) can be calculated from the half maps in resolution bins (Murshudov, 2016),

$[\sigma_{\rm n}^{2} = {{{\rm var}{(F_{\rm o1}-F_{\rm o2})}} \over {4}}.\eqno (3)]$

We will later use the relationship of $[\sigma_{\rm n}^{2}]$ and $[k^{2}\sigma_{\rm T}^{2}]$ to the FSC, that is, the correlation coefficient calculated in resolution bins (Rosenthal & Henderson, 2003),

$[{\rm FSC}_{\rm half} = {\rm CC}(F_{\rm o1},F_{\rm o2}) = {{k^{2} \sigma_{\rm T}^{2}} \over {k^{2}\sigma_{\rm T}^{2}+2\sigma_{\rm n}^{2}}}, \eqno(4)]$

$[{\rm FSC}_{\rm full} = {{k^{2}\sigma_{\rm T}^{2}} \over {k^{2}\sigma_{\rm T}^{2}+ \sigma_{\rm n}^{2}}} = {{2{\rm FSC}_{\rm half}} \over {{\rm FSC}_{\rm half}+1}}. \eqno (5)]$
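Equations (3)–(5) can be illustrated with a short NumPy sketch. It is not taken from Servalcat: the function name, the flat arrays of complex Fourier coefficients and the integer `bin_index` array that assigns each coefficient to a resolution bin are all assumptions made for illustration, and every bin is assumed to be non-empty.

```python
import numpy as np

def half_map_statistics(f_o1, f_o2, bin_index, n_bins):
    """Per-bin noise variance (3), FSC_half (4) and FSC_full (5).

    f_o1, f_o2 : complex Fourier coefficients of the two half maps
    bin_index  : integer resolution-bin label of each coefficient (hypothetical input)
    """
    sigma_n2 = np.zeros(n_bins)
    fsc_half = np.zeros(n_bins)
    for i in range(n_bins):
        sel = bin_index == i
        a, b = f_o1[sel], f_o2[sel]
        # var(F_o1 - F_o2) / 4, treating the difference as zero-mean
        sigma_n2[i] = np.mean(np.abs(a - b) ** 2) / 4.0
        # correlation coefficient between the half maps in this bin
        num = np.sum(a * np.conj(b)).real
        den = np.sqrt(np.sum(np.abs(a) ** 2) * np.sum(np.abs(b) ** 2))
        fsc_half[i] = num / den
    # conversion from half-map FSC to full-map FSC, equation (5)
    fsc_full = 2.0 * fsc_half / (1.0 + fsc_half)
    return sigma_n2, fsc_half, fsc_full
```

In this sketch the noise in each half map has variance 2σ_n², so that the averaged full map carries variance σ_n², consistent with equations (3) and (4).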

Let us also assume that the errors in the model follow a Gaussian distribution (Luzzati, 1952),

$[p(F_{\rm T} \semi F_{\rm c}) = {{k^{2}} \over {\pi\sigma_{\rm U,T}^{2}}}\exp(-|kF_{\rm T}-DF_{\rm c}|^{2}/\sigma_{\rm U,T}^{2}). \eqno (6)]$

We need two functions: the likelihood p(F_o; F_c) for the estimation of parameters (of the atomic model and of the distribution function) and the posterior distribution p(F_T; F_o, F_c) of the unknown F_T for map calculation.

3.1. Likelihood

As derived in Murshudov (2016),

$[p(F_{\rm o} \semi F_{\rm c}) = {{1} \over {\pi(\sigma_{\rm U,T}^{2}+\sigma_{\rm n}^{ 2})}}\exp[-|F_{\rm o}-DF_{\rm c}|^{2}/(\sigma_{\rm U,T}^{2}+\sigma_{\rm n}^{2})] \eqno (7)]$

is the likelihood function that is optimized during atomic model refinement. D and $[\sigma_{\rm U,T}^{2}]$ are obtained in each resolution bin i by maximizing the joint likelihood (7):

$[D = {{\textstyle\sum \limits_{s\in i}F_{\rm o}(s)F_{\rm c}^{*}(s)} \over {\textstyle\sum\limits_{s\in i}|F_{\rm c}(s)|^{2}}}, \eqno (8)]$

$[\sigma_{\rm U,T}^{2} = {\rm max}\left[0,\textstyle\sum\limits_{s\in i}{{|F_{\rm o}(s)-DF_{\rm c}(s)|^{2}} \over {N_{i}}}-\sigma^{2}_{{\rm n},i}\right], \eqno(9)]$

where N_i is the number of Fourier coefficients in bin i.
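As a minimal sketch of equations (8) and (9) for a single resolution bin, the estimates might be computed as follows. The function name is hypothetical, and taking the real part of the sum in (8) so that D is a real scale factor is an assumption of this sketch rather than something stated above.

```python
import numpy as np

def estimate_d_and_sigma_ut2(f_o, f_c, sigma_n2):
    """Estimate D (8) and sigma_U,T^2 (9) from the coefficients of one bin.

    f_o, f_c : complex arrays holding the coefficients that fall in the bin
    sigma_n2 : noise variance of the bin, e.g. from equation (3)
    """
    # Real part taken so that D is a real scale factor (assumption of this sketch)
    d = np.sum(f_o * np.conj(f_c)).real / np.sum(np.abs(f_c) ** 2)
    # mean squared residual, i.e. sum over the bin divided by N_i
    residual = np.mean(np.abs(f_o - d * f_c) ** 2)
    # the variance of the unexplained signal is clipped at zero
    sigma_ut2 = max(0.0, residual - sigma_n2)
    return d, sigma_ut2
```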

3.2. Posterior distribution and map calculation

The posterior distribution, as derived in Murshudov (2016),

$[p(F_{\rm T}\semi F_{\rm o},F_{\rm c}) = {{p(F_{\rm o}\semi F_{\rm T})p(F_{\rm T}\semi F_{ \rm c})} \over {p(F_{\rm o}\semi F_{\rm c})}} \eqno (10)]$

is a 2D Gaussian distribution with the mean and variance

$[\langle F_{\rm T}\rangle = {{wF_{\rm o}+(1-w)DF_{\rm c}} \over {k}}, \eqno(11)]$

$[{\rm var}(F_{\rm T}) = {{\sigma_{\rm n}^{2}\sigma_{\rm U,T}^{2}} \over {k^{2}(\sigma_{\rm U,T}^{2}+\sigma_{\rm n}^{2})}}, \eqno (12)]$

where

$[w = {{\sigma_{\rm U,T}^{2}} \over {\sigma_{\rm U,T}^{2}+\sigma_{\rm n}^{2}}}. \eqno (13)]$

Coefficients for an F_o − F_c-type difference map can be derived as

$[F_{\rm diff} = \langle F_{\rm T}\rangle-{{DF_{\rm c}} \over {k}} = {{w} \over {k}}(F_{\rm o }-DF_{\rm c}). \eqno (14)]$
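For a given k, equations (11)–(14) translate directly into code. The following is a sketch only; the function name is hypothetical, k is assumed to be known and nonzero, and the two variances are assumed not to vanish simultaneously.

```python
import numpy as np

def posterior_true_map(f_o, f_c, d, k, sigma_ut2, sigma_n2):
    """Posterior mean and variance of F_T and the difference coefficients."""
    w = sigma_ut2 / (sigma_ut2 + sigma_n2)                          # (13)
    mean_ft = (w * f_o + (1.0 - w) * d * f_c) / k                   # (11)
    var_ft = sigma_n2 * sigma_ut2 / (k ** 2 * (sigma_ut2 + sigma_n2))  # (12)
    f_diff = mean_ft - d * f_c / k                                  # (14)
    return mean_ft, var_ft, f_diff
```

When the noise dominates (σ_n² much larger than σ²_U,T) the weight w tends to zero and the posterior mean falls back on the model term DF_c/k; when the noise is negligible w tends to one and the observation F_o/k is used essentially as is.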

The remaining unknown variable is k, which cannot be determined from the data alone. For position-independent isotropic Gaussian blurring, k has the form exp(−B_overall|s|²/4) and B_overall may be estimated from line fitting of a Wilson plot (Wilson, 1942). However, such an estimate is unstable, especially when only low-resolution data are available. Here, we introduce a simple approximation using the variance of the signal. Let us assume that the true map consists of atoms with the same isotropic ADP of 〈B〉, and then

$[\eqalignno {k^{2}(s)\sigma_{\rm T}^{2}(s) & = \exp(-B_{\rm overall}|s|^{2}/2) \cr &\ \quad {\times}\ {\rm var}\left[\textstyle \sum \limits_{j}f_{j}\exp(-\langle B\rangle|s|^{2}/4)\exp(2\pi is^{\rm T}x_{j})\right] \cr & = \exp[-(B_{\rm overall}+\langle B\rangle)|s|^{2}/2]\cr &\ \quad {\times}\ \left\{\textstyle\sum \limits_{j}f_{j}^{2}+ \sum \limits_{j}\sum \limits_{j^{\prime}\neq j}f_{j}f_{j^{\prime}} \exp[2\pi is^{\rm T}(x_{j}-x_{j^{\prime}})]\right\}\cr & \simeq \exp[-(B_{\rm overall}+\langle B\rangle)|s|^{2}/2]\textstyle\sum \limits_{j}f_{j}^{2}. & (15)}]$

We ignored the interference terms $[\exp[2\pi is^{\rm T}(x_{j}-x_{j^{\prime}})]]$ . Further ignoring resolution-dependent terms in $[\textstyle \sum f_{j}^{2}]$ , we can use kσ_T as a proxy for k, which gives the best sharpening for the region, with a local blurring parameter of 〈B〉. kσ_T can be transformed as follows:

$[k\simeq{{k\sigma_{\rm T}} \over {(k^{2}\sigma_{\rm T}^{2}+\sigma_{\rm n}^{2})^{1/2}}} (k^{2}\sigma_{\rm T}^{2}+\sigma_{\rm n}^{2})^{1/2} = ({\rm FSC}_{\rm full})^{1/2} (\langle|F_{\rm o}|^{2}\rangle)^{1/2}. \eqno (16)]$

The F_o − F_c coefficient then finally has the form

$[F_{\rm diff} = {{w} \over {({\rm FSC}_{\rm full}\langle|F_{\rm o }|^{2}\rangle)^{1/2}}}(F_{\rm o}-DF_{\rm c}).\eqno (17)]$

Servalcat calculates an F_o − F_c map using (17). Note that the F_o − F_c map is only sensible when the ADPs are properly refined; otherwise we will see spurious peaks due to incorrect ADPs. For this reason, unsharpened F_o should be used as the input for atomic model refinement (see Section 4.1); the sharpening is then consistent as the same sharpening factor is applied to F_o and F_c. Note also that the sharpening is based on the average B value, so regions having very different B values may show fewer structural features.

The map from the estimated true Fourier coefficients (11) may be useful, but there is a risk of model bias because of the contribution from F_c. In the future, techniques may be available to resolve the issue of model bias. At the moment, Servalcat provides the following as a default map for manual inspection. This is a special case of (11) in the absence of a model, that is with D = 0,

$[\langle F_{\rm T}\rangle = {{k^{2}\sigma_{\rm T}^{2}} \over {k^{2}\sigma_{\rm T}^{2 }+\sigma_{\rm n}^{2}}}{{F_{\rm o}} \over {k}} = ({\rm FSC}_{\rm full})^{1/2}E_{\rm o}.\eqno (18)]$

This is equivalent to EMDA's normalized expected map (Warshamanage et al., 2021).
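The coefficients of equations (17) and (18) can be sketched for one resolution bin as follows. This is an illustration under stated assumptions, not Servalcat's own code: the function name is hypothetical, 〈|F_o|²〉 is taken as the mean over the coefficients passed in, and FSC_full is assumed to be positive in the bin.

```python
import numpy as np

def weighted_map_coefficients(f_o, f_c, d, sigma_ut2, sigma_n2, fsc_full):
    """Sharpened and weighted F_o - F_c (17) and F_o (18) coefficients for one bin."""
    mean_fo2 = np.mean(np.abs(f_o) ** 2)        # <|F_o|^2> in the bin
    k_proxy = np.sqrt(fsc_full * mean_fo2)      # proxy for k, equation (16)
    w = sigma_ut2 / (sigma_ut2 + sigma_n2)      # equation (13)
    f_diff = w / k_proxy * (f_o - d * f_c)      # equation (17)
    e_o = f_o / np.sqrt(mean_fo2)               # normalized structure factors
    f_map = np.sqrt(fsc_full) * e_o             # equation (18)
    return f_diff, f_map
```

Bins in which FSC_full is zero or negative carry no usable signal under this model and would need separate handling, for example by setting the coefficients to zero.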

The approach here should work at any resolution where atomic model refinement is applicable.

3.3. Variance of a masked map

The significance of difference map peaks is usually defined by the r.m.s.d. (sigma) level in crystallography. However, in SPA the box size is arbitrary and the voxels outside the molecular envelope lead to underestimation of the r.m.s.d. value. Here, we demonstrate how a mask inflates sigma-scaled density and show that it is useful to normalize the map using the standard deviation within the mask.

We consider a masked map containing n points in total, where m points are within the mask and thus the values for n − m points are zero. If we calculate the mean value of the whole data,

$[\mu_{\rm total} = {{\textstyle\sum \limits_{i=1}^{n}d_{i}} \over {n}} = {{\textstyle\sum \limits_{i=1}^{m}d_{i}} \over {n}} = {{\textstyle\sum \limits_{i=1}^{m}d_{i}} \over {m}}{{m} \over {n}} = \mu_{\rm mask}{{m} \over {n}}. \eqno (19)]$

Thus, to calculate the mean within the mask we can calculate the total mean and then use the formula for correction:

$[\mu_{\rm mask} = \mu_{\rm total}{{n} \over {m}}.\eqno (20)]$

For the variance,

$[\eqalignno {{\rm var}_{\rm total} = {{\textstyle\sum \limits_{i=1}^{n}d_{i}^{2}} \over {n}}-\mu_{\rm total}^{2} & = {{\textstyle\sum \limits_{i=1}^{m}d_{i}^{2}} \over {m}}{{m} \over {n}}-{{m^{2}} \over {n^{2}}}\mu_{\rm mask }^{2} \cr & = {{m} \over {n}}{\rm var}_{\rm mask}+{{m(n-m)} \over {n^{2}}}\mu_{\rm mask }^{2}.& (21)}]$

From here we can calculate var_mask if we know var_total and μ_total. If we denote f = m/n then we can write

$[{\rm var}_{\rm total} = f{\rm var}_{\rm mask}+f(1-f)\mu_{\rm mask}^{2}. \eqno (22)]$

If the mean inside the mask is zero then there is a simple relationship between the total variance and the variance within the mask. This explains the dependence between the box size and the r.m.s.d. of a cryo-EM SPA map. Servalcat normalizes the F_o − F_c map by (var_mask)^1/2 when a mask file is given. (Otherwise only the F_o − F_c structure factors are written in MTZ format.)
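Equations (20) and (22) can be checked numerically with a short sketch. The function name is hypothetical, and the map is assumed to be exactly zero outside a binary mask.

```python
import numpy as np

def masked_statistics(masked_map, mask):
    """Mean and variance inside the mask from whole-box statistics.

    masked_map : array that is zero outside the mask
    mask       : boolean (or 0/1) array of the same shape
    """
    n = masked_map.size
    m = int(np.count_nonzero(mask))
    f = m / n                                   # fraction of points inside the mask
    mu_total = masked_map.mean()
    var_total = masked_map.var()
    mu_mask = mu_total / f                      # equation (20)
    # equation (22) solved for var_mask
    var_mask = var_total / f - (1.0 - f) * mu_mask ** 2
    return mu_mask, var_mask
```

For example, if the mean within the mask is zero and the mask covers 5% of the box, the whole-box r.m.s.d. is (0.05)^1/2 ≃ 0.22 times the r.m.s.d. within the mask, so a peak at 3σ within the mask would appear at about 13σ if it were scaled by the whole-box value.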

If we assume that the map consists of signal and noise, and there is no correlation between them, then we can claim that var_mask = var_signal + var_noise. Now, in addition, if we assume that we have modelled the map fully with an atomic model (or that two maps have an almost perfect overlap of signals) then the difference maps should consist almost entirely of noise. Therefore, var_diffmap,mask = var_noise. This variance should be calculated within the mask to make sure that we do not have variance reduction because of systematically low values outside the region occupied by the macromolecule. If we want to increase the reliability of these variances for a region of interest then we may also mask out other regions where there might be signal that is not fully accounted for by the current model. This can also be practiced in crystallography.

4. Refinement procedure

In this section the refinement and map-calculation procedures are described. Everything other than REFMAC5 itself is implemented in Servalcat using the GEMMI library (https://github.com/project-gemmi/gemmi). Fig. 1 summarizes the procedure.

Figure 1
The workflow of Servalcat for the refinement of SPA structures.

4.1. Map choice

The optimal map depends on the purpose. For manual inspection, optimally sharpened and weighted maps should be used so that the best visual interpretability is achieved. In general, this does not mean the best signal-to-noise ratio, but it does mean that the details of structural features are visible in the map. On the other hand, unsharpened and unweighted maps are preferred in refinement. If a sharpened map is used, some atoms may need to be refined to have negative B values (or nonpositive definite if anisotropic), but they are constrained to be positive in the refinement, resulting in suboptimal atomic models. On the other hand, blurred maps will just give a shifted distribution of refined B values. An unweighted map is preferred because it enables the calculation of many properties including noise variance and optimally weighted maps after refinement (see Section 3). Users should therefore be aware that the ADPs in the model are not refined against the same map that is used for visual inspection. Cross-validation (Brown et al., 2015) can also be carried out throughout refinement and model building if both half maps are readily available. Therefore, unsharpened and unweighted half maps from two independent reconstructions are considered to be optimal inputs for the Servalcat pipeline, which performs atomic model refinement followed by map calculation.

4.2. Masking and trimming

The box size in SPA is often substantially larger than the molecule, which is unnecessary for atomic model refinement. Therefore the map is masked and trimmed into a smaller box to speed up calculations, as discussed in Nicholls et al. (2018).

Half maps are first sharpened, masked at a radius of 3 Å (default) from the atom positions and then blurred by the same factor. Sharpening before masking is important to avoid masking away any of the signal (the tails of the atomic density distributions), because the raw half maps are blurred and the signal is spread out. The optimal sharpening will differ depending on the region, but here we use an overall isotropic B value estimated by comparing |F_o| with |F_c| calculated from a copy of the initial model with all ADPs set to zero. Alternatively, a user-supplied B value can be used. The sharpened–masked–unsharpened half maps are then averaged to make a full map that is used as the refinement target in REFMAC5. After refinement, the map–model FSC is calculated using a newly created mask based on the refined model.
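The sharpen, mask and blur sequence can be sketched with NumPy FFTs as below. This is only an illustration: the function names are hypothetical, the voxels are assumed to be cubic, the mask is assumed to be supplied as an array (its construction from atom positions is not shown), and a positive `b_sharpen` denotes sharpening by exp(+B|s|²/4).

```python
import numpy as np

def s_squared_grid(shape, voxel_size):
    """|s|^2 (in 1/A^2) on the FFT grid of a map with cubic voxels."""
    axes = [np.fft.fftfreq(n, d=voxel_size) for n in shape]
    sx, sy, sz = np.meshgrid(*axes, indexing="ij")
    return sx ** 2 + sy ** 2 + sz ** 2

def sharpen_mask_blur(half_map, mask, b_sharpen, voxel_size):
    """Sharpen by B, apply the real-space mask, then blur by the same B."""
    s2 = s_squared_grid(half_map.shape, voxel_size)
    f = np.fft.fftn(half_map)
    # sharpening concentrates the atomic density before the mask is applied
    sharpened = np.fft.ifftn(f * np.exp(b_sharpen * s2 / 4.0)).real
    masked = sharpened * mask
    # blurring by the same factor restores the original fall-off
    f_masked = np.fft.fftn(masked)
    return np.fft.ifftn(f_masked * np.exp(-b_sharpen * s2 / 4.0)).real
```

With a mask of ones the function returns the input unchanged, which is a convenient sanity check; the two processed half maps would then be averaged to give the full map used as the refinement target.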

4.3. Point-group symmetry

If the maps are symmetrized, the user can specify a point-group symbol and give the coordinates for just a single asymmetric unit. Symmetry operators are calculated from the symbols (Cn, Dn, O, T and I) following the axis convention in RELION (Scheres, 2012), which follows the common orientation convention (Heymann et al., 2005) except for T. It is also assumed that the centre of the box is the origin of symmetry. This requires translation for each rotation R_j, which can be calculated as c − R_jc = (I − R_j)c, where c is the origin of symmetry. Reconstruction programs such as RELION (Scheres, 2012) usually follow this assumption. However, the rotation of the axes and the position of the origin are arbitrary in general, and in future will be determined automatically using ProSHADE (Nicholls et al., 2018; Tykac, 2018) and EMDA. The model in the asymmetric unit is expanded when creating a mask and performing map trimming. The rotation matrices are invariant to changing the box sizes and shifts of the molecule. The translation vectors in the symmetry operators are recalculated for the shifted model.
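As an illustration of the translation t_j = (I − R_j)c, the operators of a cyclic group Cn can be generated as follows. The function name is hypothetical and the n-fold axis is assumed to lie along z; the other point groups require their own sets of rotation matrices, which are not shown.

```python
import numpy as np

def cn_operators(n, centre):
    """(R_j, t_j) pairs for point group Cn about the z axis through `centre`."""
    c = np.asarray(centre, dtype=float)
    ops = []
    for j in range(n):
        a = 2.0 * np.pi * j / n
        r = np.array([[np.cos(a), -np.sin(a), 0.0],
                      [np.sin(a),  np.cos(a), 0.0],
                      [0.0,        0.0,       1.0]])
        # translation that keeps the symmetry origin c fixed: t_j = (I - R_j) c
        t = (np.eye(3) - r) @ c
        ops.append((r, t))
    return ops
```

If the model is shifted by a vector d when the map is trimmed, the same rotations apply and only the translations change, to (I − R_j)(c + d).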

REFMAC5 internally generates symmetry copies when calculating F_c and restraint terms. For anisotropic ADPs, the B_aniso matrix in the Cartesian basis is transformed by $[R_{j}B_{\rm aniso}R_{j}^{\rm T}]$ . This anisotropic ADP transformation is also implemented in GEMMI.

During the refinement, nonbonded interaction and ADP similarity restraints are evaluated using the symmetry-expanded model, and the gradients are calculated for the model in the asymmetric unit.

If atoms are on special positions (for example on a rotation axis), they are restrained² to sit on the special position and have anisotropic ADPs consistent with symmetry. Firstly, atoms are identified as being on a special position if the following condition is obeyed for any of the symmetry operators j,

$[|x-(R_{j}x+t_{j})|^{2} \, \lt\, \varepsilon^{2}, \eqno (23)]$

where ɛ is a tolerance that can be modified by users. The default value is 0.25 Å. If an atom is on a special position then the program makes sure that the symmetry operators for this position form a group that is a subgroup of the point group of the map. Once the elements of the subgroup for this atom have been identified, the atom is forced to be on that position by simply replacing its coordinates with

$[x_{\rm sym} = {{1} \over {N_{\rm sym}}}\textstyle \sum \limits_{j}^{N_{\rm sym}}(R_{j}x+t_{j}). \eqno (24)]$

In every cycle, the positions of these atoms are restrained to be on their special positions by adding a term to the target function,

$[{{1} \over {\sigma_{x}^{2}}}\left|x-{{1} \over {N_{\rm sym}}}\textstyle \sum \limits_{j}^{N_{\rm sym}}(R_ {j}x+t_{j})\right|^{2}, \eqno (25)]$

where the summation is performed over all subgroup elements of the special position and σ_x is a user-controllable weight parameter for special positions. The occupancy of the atom is adjusted based on the multiplicity of the position.
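The special-position test (23), the averaging (24) and the restraint term (25) can be sketched as below. The function names are hypothetical, the operator list is assumed to contain the identity, and the check that the selected operators form a subgroup of the point group is not reproduced here.

```python
import numpy as np

def symmetrize_special_position(x, ops, tol=0.25):
    """Return (x_sym, n_sym) using the operators that satisfy condition (23).

    ops : list of (R, t) pairs that includes the identity (assumption)
    tol : tolerance epsilon in Angstrom
    """
    x = np.asarray(x, dtype=float)
    images = [r @ x + t for r, t in ops]
    # operators whose image lies within the tolerance of the atom itself
    close = [xi for xi in images if np.sum((x - xi) ** 2) < tol ** 2]
    x_sym = np.mean(close, axis=0)              # equation (24)
    return x_sym, len(close)

def special_position_penalty(x, subgroup_ops, sigma_x):
    """Restraint term (25) for an atom and the operators of its special position."""
    x = np.asarray(x, dtype=float)
    mean_image = np.mean([r @ x + t for r, t in subgroup_ops], axis=0)
    return np.sum((x - mean_image) ** 2) / sigma_x ** 2
```

An atom far from any symmetry element is matched only by the identity, so it is returned unchanged with n_sym = 1.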

If anisotropic ADPs are used, they are also forced to obey symmetry conditions for atoms on special positions by replacing the anisotropic tensor with

$[B_{\rm sym} = {{1} \over {N_{\rm sym}}} \textstyle\sum \limits_{j}^{N_{\rm sym}}R_{j}B_{\rm aniso}R_{j} ^{T}. \eqno (26)]$

After this, similarly to the positional parameters, in every cycle restraints are applied to the anisotropic tensor of the atoms on special positions to avoid violation of the symmetry condition for the ADP,

$[{{1} \over {\sigma_{B}^{2}}}\left|B_{\rm aniso}-{{1} \over {N_{\rm sym}}}\textstyle \sum \limits_{j}^{N_{\rm sym}}R_{j}B_{\rm aniso}R_{j}^{T}\right|^{2}, \eqno(27)]$

where σ_B is a user-controllable weight parameter for B_aniso values on special positions. Here, the distance between anisotropic tensors is a Frobenius distance |B₁ − B₂|² = $[\textstyle \sum_{i,j}|B_{1,i,j}-B_{2,i,j}|^{2}]$ .
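The ADP symmetrization (26) and the restraint term (27) follow the same pattern. In this sketch the function names are hypothetical and `rotations` is assumed to hold the rotation matrices of the subgroup of the special position, including the identity.

```python
import numpy as np

def symmetrize_adp(b_aniso, rotations):
    """Average of R_j B R_j^T over the subgroup rotations, equation (26)."""
    b = np.asarray(b_aniso, dtype=float)
    return np.mean([r @ b @ r.T for r in rotations], axis=0)

def adp_penalty(b_aniso, rotations, sigma_b):
    """Restraint term (27) using the squared Frobenius distance."""
    b = np.asarray(b_aniso, dtype=float)
    diff = b - symmetrize_adp(b, rotations)
    return np.sum(diff ** 2) / sigma_b ** 2
```

For an atom on a fourfold axis along z, a tensor diag(10, 20, 30) is symmetrized to diag(15, 15, 30): the two components perpendicular to the axis are forced to be equal.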

4.4. H atoms

Hydrogen electrons are usually shifted towards the parent atoms by 0.1–0.2 Å (Williams et al., 2018). This must be accounted for when calculating structure factors from the atomic model (F_c). REFMAC5 and Servalcat (GEMMI) use the Mott–Bethe formula (Mott & Bragg, 1930; Bethe, 1930; Murshudov, 2016), which can conveniently take this fact into account.

The atomic scattering factor for an atom with a shifted nucleus is

$[f_{e}(s) = {{me^{2}} \over {8\pi h^{2}\varepsilon_{0}}}{{Z\exp(-2\pi is^{\rm T}\Delta x)-f_{X}(s)} \over {|s|^{2}}}, \eqno (28)]$

where Δx is the positional shift of the nucleus with respect to the centre of the electron density. The hydrogen density peak in real space is shifted beyond the position of the hydrogen nucleus and varies depending on the ADP and resolution cutoff (Nakane et al., 2020). The expected peak position may be calculated by the Fourier transform of (28). The new CCP4 monomer library includes nucleus bond distances (_chem_comp_bond.value_dist_nucleus; Nicholls et al., 2021).
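Equation (28) can be evaluated at a single reciprocal-space point with the sketch below. The function name is hypothetical; the X-ray scattering factor f_X(s) is assumed to be supplied by the caller, and the physical constant in front of the fraction is left as the argument `prefactor` because its numerical value depends on the units and on the convention adopted for s.

```python
import numpy as np

def shifted_nucleus_scattering_factor(s, z, f_x, delta_x, prefactor=1.0):
    """Electron scattering factor for an atom with a shifted nucleus, eq. (28).

    s         : reciprocal-space vector (must be nonzero)
    z         : atomic number
    f_x       : X-ray scattering factor evaluated at s (supplied by the caller)
    delta_x   : shift of the nucleus relative to the electron-density centre
    prefactor : stands in for the constant in (28); hypothetical argument
    """
    s = np.asarray(s, dtype=float)
    delta_x = np.asarray(delta_x, dtype=float)
    # phase factor of the nuclear term caused by the shift
    phase = np.exp(-2j * np.pi * np.dot(s, delta_x))
    return prefactor * (z * phase - f_x) / np.dot(s, s)
```

With Δx = 0 the expression reduces to the usual real-valued Mott–Bethe form, proportional to (Z − f_X)/|s|²; a nonzero shift makes the scattering factor complex.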

4.5. Refinement

REFMAC5 performs a maximum-likelihood refinement against the Fourier transform of a sharpened–masked–unsharpened map (see Section 4.2) using a dedicated likelihood function for SPA (7). The estimated noise $[\sigma_{\rm n}^{2}]$ is not used at the moment. No solvent model is used. The average of map–model FSC weighted by the number of Fourier coefficients in each shell (FSC average) is reported to monitor the refinement. At low resolution the use of jelly-body restraints or external restraints is encouraged to ensure a large radius of convergence and stabilize the refinement (Murshudov et al., 2011; Nicholls et al., 2012). Note that jelly-body restraints are only useful when the initial model geometry is of good quality because they try to keep the model in its current conformation. After the refinement, Servalcat shifts the model back to the original box and adjusts the translation vectors of the symmetry operators if needed. It also generates an MTZ file of map coefficients including the sharpened and weighted F_o − F_c and F_o maps (as calculated by equations 17 and 18).
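The FSC average described above can be sketched as follows. The function name and the integer `bin_index` array assigning each coefficient to a resolution shell are assumptions of this illustration, and every shell is assumed to be non-empty.

```python
import numpy as np

def fsc_average(f_o, f_c, bin_index, n_bins):
    """Map-model FSC per shell and its average weighted by shell population."""
    fsc = np.zeros(n_bins)
    counts = np.zeros(n_bins)
    for i in range(n_bins):
        sel = bin_index == i
        a, b = f_o[sel], f_c[sel]
        counts[i] = a.size
        den = np.sqrt(np.sum(np.abs(a) ** 2) * np.sum(np.abs(b) ** 2))
        fsc[i] = np.sum(a * np.conj(b)).real / den
    # each shell contributes in proportion to its number of Fourier coefficients
    return np.sum(counts * fsc) / np.sum(counts), fsc
```

Because the number of coefficients per shell grows with resolution, this average is dominated by the high-resolution shells.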

4.6. User interface

Servalcat has a command-line interface. A graphical interface will be available in CCP-EM, where the REFMAC5 interface has been updated and is now based on Servalcat.

From the user's point of view, the main difference in setting up a refinement job is that the default input is now a pair of half maps. (Refinement from a single input map is still possible but is no longer the default option.) The user is also offered more control over the options for refinement weight, symmetry and handling of H atoms. At the end of refinement, the F_o − F_c difference map from Servalcat is made available along with the other output files in the CCP-EM launcher.

5. Methods and results

5.1. F_o − F_c map for ligand visualization

F_o − F_c omit maps are widely used to convincingly demonstrate the existence of ligands in crystallography. They are also useful for this purpose in SPA. Fig. 2 shows an example of an F_o − F_c omit map for the ligand density from EMDB entries EMD-22898 (Kern et al., 2021) and EMD-8123 (Murray et al., 2016), clearly showing support for the presence of the ligand. To generate the map from EMD-22898, chain A of the atomic model from PDB entry 7kjr was refined using the half maps under C2 symmetry constraints. For EMD-8123, PDB entry 5it7 was refined using the half maps without symmetry constraints. After the refinement, the ligand and water atoms were omitted and the F_o − F_c maps were calculated. Map values were normalized within a mask. Since a suitable mask for EMD-22898 was not available in the EMDB, one was calculated from half-map correlation using EMDA.

Figure 2
An example of an F_o − F_c omit map for visualization of ligand density. The ligand molecules and ions shown as sticks and spheres, respectively, are omitted in the map calculation. The resolution is (a) 2.08 Å (PDB entry 7kjr/EMDB entry EMD-22898) and (b) 3.6 Å (PDB entry 5it7/EMDB entry EMD-8123). The F_o − F_c omit maps are contoured at 3σ (where σ is the standard deviation within the mask; see Section 3.3). The images were created using PyMOL (Schrödinger, 2020).

The weighting and sharpening scheme in Servalcat was compared with alternatives using no weights or (FSC_full)^1/2 weights (Rosenthal & Henderson, 2003), both with sharpening by the overall B value as determined from Wilson plot fitting by RELION (Supplementary Figs. S1 and S2). Especially in the case of EMDB entry EMD-8123 (Supplementary Fig. S2), sharpening by the overall B value obtained by line fitting gave oversharpened maps.

5.2. F_o − F_c map for detecting model errors

In crystallography, F_o − F_c maps are almost always used for manual and automatic model rebuilding. Strong negative density usually indicates that parts of the model should be moved away or removed, while strong positive density implies that there are unmodelled atoms. The F_o − F_c map is typically updated after every refinement session, and refinement may be stopped when there are no significant strong peaks.

The same refinement practice is possible in SPA. Fig. 3 illustrates the use of the F_o − F_c map for detecting model errors using EMDB entry EMD-0919 and PDB entry 6lmt (Demura et al., 2020). Chain A of the model was refined using the half maps under C8 symmetry constraints. After refinement, the F_o − F_c map was calculated and normalized using the standard deviation of the region within the EMDB-deposited mask. In this example, it is clear from the positive and negative difference peaks that the tryptophan and methionine side chains should be repositioned. The weighting and sharpening schemes are compared in Supplementary Fig. S3, demonstrating that appropriate weighting can increase the interpretability of maps.
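The normalization step used here can be illustrated with a minimal sketch. It assumes that the difference map and the mask are NumPy arrays on the same grid, that voxels with a mask value above 0.5 count as inside, and that the map is divided by the standard deviation of the inside voxels without subtracting a mean; the function name and the threshold are hypothetical, and the exact treatment in Servalcat (Section 3.3) may differ.

```python
import numpy as np

def normalize_within_mask(diff_map, mask, threshold=0.5):
    """Scale a difference map so that sigma = 1 inside the mask.

    Only voxels where mask > threshold contribute to sigma, so the large
    solvent region of the box does not deflate the contour level.
    """
    diff_map = np.asarray(diff_map, dtype=float)
    inside = np.asarray(mask) > threshold
    sigma = diff_map[inside].std()
    return diff_map / sigma
```

After this scaling, contour levels such as ±4σ refer to the variation within the molecular region rather than within the whole box.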

Figure 3
An example of an F_o − F_c map for detecting model error, in this case mispositioned tryptophan and methionine side chains (PDB entry 6lmt/EMDB entry EMD-0919). The resolution is 2.66 Å and the F_o − F_c map is contoured at ±4σ (scaled within the mask). Green and red meshes represent positive and negative maps, respectively. The grey mesh is the weighted and sharpened F_o map. This image was created using PyMOL.

5.3. Hydrogen density analysis

Nakane et al. (2020) reported convincing densities for H atoms in apoferritin and GABA_AR maps by cryo-EM SPA at 1.2 and 1.7 Å resolution, respectively. It is natural to ask what is the lowest resolution at which H atoms can be seen in cryo-EM SPA using currently available computational tools.

Here, we analysed apoferritin maps from the EMDB to see if and when hydrogen densities could be observed. There are 25 mouse or human apoferritin entries at resolutions better than 2.1 Å, of which 19 had half maps and were used in the analysis (Table 1). Chain A of each model was refined using the half maps under O symmetry constraints. If there was no corresponding PDB entry, PDB entry 7a4m or 6z6u was placed in the map using MOLREP (Vagin & Teplyakov, 2010) followed by jiggle fit in Coot (Brown et al., 2015) before full atomic refinement. After ten cycles of refinement with REFMAC5, an F_o − F_c map was calculated and normalized within the mask. Riding H atoms were used in the refinement (so they are not refined, but generated at fixed positions; this is the default in REFMAC5) and they were omitted for F_o − F_c map calculation. Peaks of ≥2σ and ≥3σ were detected using PEAKMAX from the CCP4 package (Winn et al., 2011), and a peak was associated with a hydrogen position if the distance between them was less than 0.3 Å. H atoms having multiple potential minima (such as those in hydroxyl, sulfhydryl or carboxyl groups) were ignored in the analysis. The ratios of the number of hydrogen peaks to the number of H atoms in the model are plotted in Fig. 4(a). The result shows that the 1.25 Å resolution data gave the highest ratio, with ∼70% of the hydrogens detected (Fig. 5a). Even at 1.84 Å resolution approximately 17% of the H atoms may be found (Fig. 5b), while at 2.0 or 2.1 Å resolution only a few H atoms are visible in the map (Fig. 5c). The weighting and sharpening schemes are compared in Supplementary Figs. S4–S6. Note that there may be false positives due to, for example, alternative conformations or inaccuracies in the model.
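The peak-to-hydrogen association can be sketched as below. This is an illustration under stated assumptions rather than the script used for the analysis: peak and hydrogen positions are taken to be Cartesian coordinates in Å that have already been read from the peak-search output and the model, symmetry copies are not considered, and the reported ratio is interpreted as the fraction of H atoms that have at least one peak within the cutoff. The function name is hypothetical.

```python
import numpy as np

def fraction_of_h_detected(h_positions, peak_positions, max_dist=0.3):
    """Fraction of H atoms with a difference-map peak closer than max_dist.

    h_positions: (N, 3) Cartesian coordinates of the omitted H atoms.
    peak_positions: (M, 3) Cartesian coordinates of peaks above the
    chosen sigma level. Distances are plain Euclidean distances.
    """
    h = np.asarray(h_positions, dtype=float).reshape(-1, 3)
    p = np.asarray(peak_positions, dtype=float).reshape(-1, 3)
    if len(h) == 0 or len(p) == 0:
        return 0.0
    # All pairwise H-to-peak distances, shape (N, M).
    dist = np.linalg.norm(h[:, None, :] - p[None, :, :], axis=2)
    detected = dist.min(axis=1) < max_dist
    return detected.sum() / len(h)
```

Running the function once with the ≥2σ peak list and once with the ≥3σ peak list would give the two series of ratios; for models with many atoms a spatial index would be preferable to the full distance matrix.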

Table 1
Test data for hydrogen peak analysis

EMDB code	PDB code	Resolution (Å)	Reference
EMD-11638	7a4m	1.22	Nakane et al. (2020)
EMD-11103	6z6u	1.25	Yip et al. (2020)
EMD-30683	(7a4m)†	1.31	Danev et al. (2021)
EMD-30685	(7a4m)†	1.35	Danev et al. (2021)
EMD-30684	(7a4m)†	1.43	Danev et al. (2021)
EMD-30686	(7a4m)†	1.43	Danev et al. (2021)
EMD-9865	(7a4m)†	1.54	Kato et al. (2019)
EMD-11121	6z9e	1.55	Yip et al. (2020)
EMD-11122	6z9f	1.56	Yip et al. (2020)
EMD-9599	(7a4m)†	1.62	Danev et al. (2019)
EMD-0144	(6z6u)†	1.65	Zivanov et al. (2018)
EMD-20026	(6z6u)†	1.75	Pintilie et al. (2020)
EMD-21024	6v21	1.75	Wu et al. (2020)
EMD-10101	6s61	1.84	No publication
EMD-10675	(7a4m)†	1.86	Fislage et al. (2020)
EMD-21951	6wx6	2.00	Tan & Rubinstein (2020)
EMD-22351	(6z6u)†	2.07	Guo et al. (2020)
EMD-4905	6rjh	2.10	Naydenova et al. (2019)
EMD-20521	6pxm	2.10	No publication

†No PDB entry was assigned and the code in parentheses was used for refinement (PDB entry 7a4m from mouse and PDB entry 6z6u from human).

Figure 4
Detection of H atoms, measured as the number of observed hydrogen density peaks divided by the number of H atoms in the model. (a) Different apoferritin cases by cryo-EM SPA (see Table 1). (b) Different (apo)ferritin cases by X-ray crystallography using PDB entries 2v2p, 2v2s, 6gxj, 5erj, 5mij, 2cih, 2w0o, 7bd7, 3f37, 2v2n, 1h96, 2chi, 2zg8, 2v2m, 2z5p, 3h7g, 3f34, 2zg7, 3f32, 3f33, 3f36, 2gyd, 3o7s, 1xz1, 1xz3, 2cn7, 2zg9, 3f38, 2cei, 2iu2, 3fi6, 6env, 3f39, 5ix6, 2v2o, 2v2l, 2v2r, 3o7r, 3rav, 3u90, 3f35, 1aew, 5mik, 2g4h, 2v2i, 3rd0, 5erk, 6ra8, 1gwg, 2clu and 2z5q. (c, d) Apoferritin cases calculated at different resolutions from the same map and model, PDB entry 7a4m/EMDB entry EMD-11638, determined at 1.22 Å resolution. (c) shows detection of H atoms in F_o − F_c maps and (d) in calculated F_c maps. This figure was prepared using ggplot2 (Wickham, 2016) in R (R Core Team, 2020).

Figure 5
Observation of hydrogen density peaks in F_o − F_c maps with different resolutions, using (a) 1.25 Å resolution data (PDB entry 6z6u/EMDB entry EMD-11103), (b) 1.84 Å resolution data (PDB entry 6s61/EMDB entry EMD-10101) and (c) 2.00 Å resolution data (PDB entry 6wx6/EMDB entry EMD-21951). H atoms are omitted in the map calculation. Green and red meshes represent positive and negative F_o − F_c maps contoured at ±3σ (scaled within the mask), respectively. The images were created using PyMOL.

In addition, F_o − F_c maps were generated from the 1.2 Å resolution data (PDB entry 7a4m; EMDB entry EMD-11638) using several different resolution cutoffs. These were analysed in the same way (Fig. 4c), along with F_c maps calculated from the PDB entry 7a4m model at the same resolutions (Fig. 4d). Figs. 4(c) and 4(d) show that if the cryo-EM experiment and atomic model refinement are carried out carefully, with due attention to ADPs, then some H atoms can be seen even at 2.0 Å resolution.
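One plausible way to impose a resolution cutoff on a map is to discard Fourier coefficients beyond |s| = 1/d_min, as sketched below. How the cutoffs were actually applied in this analysis is not described here, so this is an assumption for illustration only; the function name is hypothetical and the map is assumed to be sampled on a grid with an isotropic voxel size in Å.

```python
import numpy as np

def truncate_resolution(density, voxel_size, d_min):
    """Zero all Fourier coefficients with |s| > 1/d_min and return the map.

    density: 3D real-space map; voxel_size: grid spacing in Angstrom;
    d_min: resolution cutoff in Angstrom.
    """
    density = np.asarray(density, dtype=float)
    coeffs = np.fft.fftn(density)
    # Spatial frequencies (1/Angstrom) along each axis of the grid.
    freqs = [np.fft.fftfreq(n, d=voxel_size) for n in density.shape]
    sx, sy, sz = np.meshgrid(*freqs, indexing="ij")
    s = np.sqrt(sx**2 + sy**2 + sz**2)
    coeffs[s > 1.0 / d_min] = 0.0
    return np.fft.ifftn(coeffs).real
```

A sharp cutoff of this kind introduces series-termination ripples, which is one reason why peaks near atoms should be interpreted with care at lower resolution.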

For comparison, we performed the same analysis using X-ray crystallographic data for (apo)ferritins deposited in the PDB. In total, 51 re-refined atomic models available in the PDB-REDO database (Joosten et al., 2012) were downloaded, crystallographic mF_o − DF_c maps were calculated using REFMAC5 and density peaks for H atoms were analysed as described above. The result (Fig. 4b) confirms that, as expected, H atoms are more visible in EM than using X-rays.

6. Conclusions

A new program, Servalcat, for the refinement and validation of atomic models using cryo-EM SPA maps has been developed. The program controls the refinement flow and performs difference-map calculations. A weighted and sharpened F_o − F_c map was derived as a validation tool, obtained from the posterior distribution of F_T and an approximation of an overall blurring factor calculated from the variance of the signal. We showed that such maps are useful to visualize H atoms and model errors, as in crystallography.

In this work, we assumed the blurring factor k was position-independent (see Section 3). However, in reality, blurring of maps is position- and direction-dependent, for example due to the varying mobility of different domains and/or uncertainty in the particle alignments. For such regions k should ideally be replaced with k_local, derived from a local map blurring parameter B_local according to k_local(s) = exp(−B_local|s|²/4) (if isotropic) or exp(−s^T B_local s/4) (if anisotropic). If we could estimate B_local values, then we would be able to use them for the visual improvement of maps. This is especially important for identifying weak densities. We are working on this subject.
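The two forms of k_local can be evaluated as in the following sketch. It only illustrates the formulas given above; the function names are hypothetical, s is assumed to be a reciprocal-space vector in Å⁻¹, and in the anisotropic case B_local is assumed to be a symmetric 3 × 3 matrix in Å². How B_local would be estimated is not addressed.

```python
import numpy as np

def k_local_iso(s_vectors, b_local):
    """Isotropic blurring factor exp(-B_local * |s|^2 / 4)."""
    s = np.asarray(s_vectors, dtype=float)
    s_sq = np.sum(s**2, axis=-1)
    return np.exp(-b_local * s_sq / 4.0)

def k_local_aniso(s_vectors, b_tensor):
    """Anisotropic blurring factor exp(-s^T B_local s / 4).

    s_vectors has shape (..., 3) and b_tensor is a 3x3 matrix.
    """
    s = np.asarray(s_vectors, dtype=float)
    b = np.asarray(b_tensor, dtype=float)
    # Quadratic form s^T B s evaluated for every vector at once.
    quad = np.einsum("...i,ij,...j->...", s, b, s)
    return np.exp(-quad / 4.0)
```

When the matrix is B_local times the identity, the anisotropic expression reduces to the isotropic one, which provides a simple consistency check.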

We showed that many H atoms may be observed in the difference maps, even up to a resolution of 2 Å. We would expect that they should also be visible in electron diffraction (MicroED) experiments. However, high accuracy would be needed in the experiment, data analysis and model refinement in both MicroED and cryo-EM SPA to achieve this experimentally. For example, the electron dose in cryo-EM experiments is often high enough to cause radiation damage (Hattne et al., 2018); H atoms are known to suffer from radiation damage (Leapman & Sun, 1995) and this would hinder their detection. Lower dose experiments might be needed for more reliable identification of hydrogen, even at the expense of resolution.

Symmetry is widely used in cryo-EM SPA. When symmetry is imposed in the reconstruction, it should be used throughout the downstream analyses, and all software tools should be aware of it and take it into account. The asymmetric unit model should be refined under symmetry constraints, and it should be deposited in the PDB with the correct annotation of the symmetry. The PDB and EMDB deposition system will need to validate the symmetry of both the model and the map. We hope that this will become common practice in the future. The same practice should be established for helical reconstructions, in which symmetry is described by the axial symmetry type (Cn or Dn), twist and rise (He & Scheres, 2017). Servalcat will support helical symmetry in the future.
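To make the helical parameterization concrete, the sketch below generates rotation and translation operators from an axial Cn symmetry, a twist and a rise. It is not Servalcat code (helical symmetry is described above as future work); the function names are hypothetical, the helical axis is assumed to coincide with z and to pass through the origin, and Dn symmetry is not covered.

```python
import numpy as np

def rot_z(angle_deg):
    """Rotation matrix for a rotation by angle_deg about the z axis."""
    a = np.deg2rad(angle_deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def helical_operators(n_axial, twist_deg, rise, k_range):
    """List of (rotation, translation) pairs for Cn helical symmetry.

    For helical step k and axial copy m, a point x maps to
    rot_z(k*twist + 360*m/n) @ x + (0, 0, k*rise).
    """
    ops = []
    for k in k_range:
        for m in range(n_axial):
            rot = rot_z(k * twist_deg + 360.0 * m / n_axial)
            shift = np.array([0.0, 0.0, k * rise])
            ops.append((rot, shift))
    return ops
```

For example, helical_operators(1, 90.0, 2.0, range(-1, 2)) gives three operators, the middle one being the identity; applying each operator to the asymmetric unit model would generate the neighbouring subunits along the helix.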

Servalcat is freely available under an open source (MPL-2.0) licence at https://github.com/keitaroyam/servalcat. The features described in this paper have been implemented in REFMAC 5.8.0291 and Servalcat 0.2.0 (which requires GEMMI 0.4.9). Servalcat is also available in the latest nightly builds of the CCP-EM suite and will be included in the upcoming version 1.6 release.

Supporting information

Supplementary Figures. DOI: https://doi.org/10.1107/S2059798321009475/qt5003sup1.pdf

Footnotes

¹There is a similar record, BIOMT, which encodes the biological assembly. In SPA, the symmetry of the map usually corresponds to the biological assembly, but this is not always the case. Both MTRIX and BIOMT records are generally required during deposition.

²Technically, fixed position constraints would be more appropriate here. We used restraints instead of constraints for simplicity of implementation. In the future, we will implement the use of constraints instead.

Acknowledgements

The authors are grateful to Marcin Wojdyr for the implementation of F_c calculation for EM in the GEMMI library, Takanori Nakane for critical reading of the manuscript, computational structural biology group members for discussion, and Jake Grimmett and Toby Darling from the MRC–LMB Scientific Computing Department for computing support and resources.

Funding information

This work was supported by the Medical Research Council as part of UK Research and Innovation (MC_UP_A025_1012 to KY and GNM; MR/V000403/1 to CMP and TB).

References

Afonine, P. V., Poon, B. K., Read, R. J., Sobolev, O. V., Terwilliger, T. C., Urzhumtsev, A. & Adams, P. D. (2018). Acta Cryst. D74, 531–544.
Bai, X.-C., McMullan, G. & Scheres, S. H. W. (2015). Trends Biochem. Sci. 40, 49–57.
Bethe, H. (1930). Ann. Phys. 397, 325–400.
Brown, A., Long, F., Nicholls, R. A., Toots, J., Emsley, P. & Murshudov, G. (2015). Acta Cryst. D71, 136–153.
Burnley, T., Palmer, C. M. & Winn, M. (2017). Acta Cryst. D73, 469–477.
Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2010). Acta Cryst. D66, 12–21.
Chojnowski, G., Sobolev, E., Heuser, P. & Lamzin, V. S. (2021). Acta Cryst. D77, 142–150.
Clabbers, M. T. B. & Abrahams, J. P. (2018). Crystallogr. Rev. 24, 176–204.
Cragnolini, T., Sahota, H., Joseph, A. P., Sweeney, A., Malhotra, S., Vasishtan, D. & Topf, M. (2021). Acta Cryst. D77, 41–47.
Danev, R., Yanagisawa, H. & Kikkawa, M. (2019). Trends Biochem. Sci. 44, 837–848.
Danev, R., Yanagisawa, H. & Kikkawa, M. (2021). Microscopy, dfab016.
Demura, K., Kusakizako, T., Shihoya, W., Hiraizumi, M., Nomura, K., Shimada, H., Yamashita, K., Nishizawa, T., Taruno, A. & Nureki, O. (2020). Sci. Adv. 6, eaba8105.
Fislage, M., Shkumatov, A. V., Stroobants, A. & Efremov, R. G. (2020). IUCrJ, 7, 707–718.
Guo, H., Franken, E., Deng, Y., Benlekbir, S., Singla Lezcano, G., Janssen, B., Yu, L., Ripstein, Z. A., Tan, Y. Z. & Rubinstein, J. L. (2020). IUCrJ, 7, 860–869.
Hattne, J., Shi, D., Glynn, C., Zee, C.-T., Gallagher-Jones, M., Martynowycz, M. W., Rodriguez, J. A. & Gonen, T. (2018). Structure, 26, 759–766.
He, S. & Scheres, S. H. W. (2017). J. Struct. Biol. 198, 163–176.
Heymann, J. B., Chagoyen, M. & Belnap, D. M. (2005). J. Struct. Biol. 151, 196–207.
Hoh, S. W., Burnley, T. & Cowtan, K. (2020). Acta Cryst. D76, 531–541.
Jakobi, A. J., Wilmanns, M. & Sachse, C. (2017). eLife, 6, e27131.
Joosten, R. P., Joosten, K., Murshudov, G. N. & Perrakis, A. (2012). Acta Cryst. D68, 484–496.
Joseph, A. P., Lagerstedt, I., Jakobi, A., Burnley, T., Patwardhan, A., Topf, M. & Winn, M. (2020). J. Chem. Inf. Model. 60, 2552–2560.
Kato, T., Makino, F., Nakane, T., Terahara, N., Kaneko, T., Shimizu, Y., Motoki, S., Ishikawa, I., Yonekura, K. & Namba, K. (2019). Microsc. Microanal. 25, 998–999.
Kern, D. M., Sorum, B., Mali, S. S., Hoel, C. M., Sridharan, S., Remis, J. P., Toso, D. B., Kotecha, A., Bautista, D. M. & Brohawn, S. G. (2021). Nat. Struct. Mol. Biol. 28, 573–582.
Leapman, R. D. & Sun, S. (1995). Ultramicroscopy, 59, 71–79.
Luzzati, V. (1952). Acta Cryst. 5, 802–810.
Mott, N. F. & Bragg, W. L. (1930). Proc. R. Soc. London A, 127, 658–665.
Murray, J., Savva, C. G., Shin, B.-S., Dever, T. E., Ramakrishnan, V. & Fernández, I. S. (2016). eLife, 5, e13567.
Murshudov, G. N. (2016). Methods Enzymol. 579, 277–305.
Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
Nakane, T., Kotecha, A., Sente, A., McMullan, G., Masiulis, S., Brown, P. M. G. E., Grigoras, I. T., Malinauskaite, L., Malinauskas, T., Miehling, J., Uchański, T., Yu, L., Karia, D., Pechnikova, E. V., de Jong, E., Keizer, J., Bischoff, M., McCormack, J., Tiemeijer, P., Hardwick, S. W., Chirgadze, D. Y., Murshudov, G., Aricescu, A. R. & Scheres, S. H. W. (2020). Nature, 587, 152–156.
Naydenova, K., Peet, M. J. & Russo, C. J. (2019). Proc. Natl Acad. Sci. USA, 116, 11718–11724.
Nicholls, R. A., Long, F. & Murshudov, G. N. (2012). Acta Cryst. D68, 404–417.
Nicholls, R. A., Tykac, M., Kovalevskiy, O. & Murshudov, G. N. (2018). Acta Cryst. D74, 492–505.
Nicholls, R. A., Wojdyr, M., Joosten, R. P., Catapano, L., Long, F., Fischer, M., Emsley, P. & Murshudov, G. N. (2021). Acta Cryst. D77, 727–745.
Pintilie, G., Zhang, K., Su, Z., Li, S., Schmid, M. F. & Chiu, W. (2020). Nat. Methods, 17, 328–334.
Ramírez-Aportela, E., Vilas, J. L., Glukhova, A., Melero, R., Conesa, P., Martínez, M., Maluenda, D., Mota, J., Jiménez, A., Vargas, J., Marabini, R., Sexton, P. M., Carazo, J. M. & Sorzano, C. O. S. (2019). Bioinformatics, 36, 765–772.
Ramlaul, K., Palmer, C. M. & Aylett, C. H. (2019). J. Struct. Biol. 205, 30–40.
R Core Team (2020). R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing.
Rosenthal, P. B. & Henderson, R. (2003). J. Mol. Biol. 333, 721–745.
Scheres, S. H. W. (2012). J. Struct. Biol. 180, 519–530.
Schrödinger, LLC (2020). The PyMOL Molecular Graphics System, Version 2.4.
Tagari, M., Newman, R., Chagoyen, M., Carazo, J.-M. & Henrick, K. (2002). Trends Biochem. Sci. 27, 589.
Tan, Y. Z. & Rubinstein, J. L. (2020). Acta Cryst. D76, 1092–1103.
Terwilliger, T. C., Adams, P. D., Afonine, P. V. & Sobolev, O. V. (2018a). Nat. Methods, 15, 905–908.
Terwilliger, T. C., Sobolev, O. V., Afonine, P. V. & Adams, P. D. (2018b). Acta Cryst. D74, 545–559.
Terwilliger, T. C., Sobolev, O. V., Afonine, P. V., Adams, P. D. & Read, R. J. (2020). Acta Cryst. D76, 912–925.
Tickle, I. J. (2012). Acta Cryst. D68, 454–467.
Tronrud, D. E. (2004). Acta Cryst. D60, 2156–2168.
Tykac, M. (2018). PhD thesis. University of Cambridge. https://doi.org/10.17863/CAM.31783.
Vagin, A. & Teplyakov, A. (2010). Acta Cryst. D66, 22–25.
Warshamanage, R., Yamashita, K. & Murshudov, G. N. (2021). bioRxiv, 2021.07.26.453750.
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. New York: Springer.
Williams, C. J., Headd, J. J., Moriarty, N. W., Prisant, M. G., Videau, L. L., Deis, L. N., Verma, V., Keedy, D. A., Hintze, B. J., Chen, V. B., Jain, S., Lewis, S. M., Arendall, W. B. III, Snoeyink, J., Adams, P. D., Lovell, S. C., Richardson, J. S. & Richardson, D. C. (2018). Protein Sci. 27, 293–315.
Wilson, A. J. C. (1942). Nature, 150, 152.
Winn, M. D., Ballard, C. C., Cowtan, K. D., Dodson, E. J., Emsley, P., Evans, P. R., Keegan, R. M., Krissinel, E. B., Leslie, A. G. W., McCoy, A., McNicholas, S. J., Murshudov, G. N., Pannu, N. S., Potterton, E. A., Powell, H. R., Read, R. J., Vagin, A. & Wilson, K. S. (2011). Acta Cryst. D67, 235–242.
Wlodawer, A. & Dauter, Z. (2017). Acta Cryst. D73, 379–380.
Wu, M., Lander, G. C. & Herzik, M. A. (2020). J. Struct. Biol. X, 4, 100020.
Yip, K. M., Fischer, N., Paknia, E., Chari, A. & Stark, H. (2020). Nature, 587, 157–161.
Zivanov, J., Nakane, T., Forsberg, B. O., Kimanius, D., Hagen, W. J., Lindahl, E. & Scheres, S. H. W. (2018). eLife, 7, e42166.

This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited.
