Introduction to RESOLVE

Why another density modification approach?
Problems with the phase recombination approach to density modification.
The statistical approach to density modification
Using all the available information for density modification
Carrying out density modification with RESOLVE
Removing model bias with prime-and-switch phasing
NCS averaging in RESOLVE
Local pattern matching in RESOLVE
Fragment identification in RESOLVE
Automated model-building and iterative model-building in RESOLVE
Automated fitting of flexible ligands to an electron-density map
Merging NCS-related copies of a model in RESOLVE

Why another density modification approach?

Although density modification (solvent flattening, non-crystallographic symmetry, phase extension, histogram matching, etc.) has been a very powerful tool, its potential is much greater than has been achieved so far. There are two reasons for this:

The statistical basis of density modification has not been well developed.
The range of potential information included in density modification has not been fully utilized.

Problems with the phase recombination approach to density modification.

RESOLVE uses a statistical approach to density modification, while other methods use an approach in which a map is modified to meet expectations and the new phases are recombined with experimental phases. For the mathematical details, see the references for RESOLVE. You might also wish to see the discussion and extensions in Kevin Cowtan's article "Gaussian Likelihoods in real and reciprocal space" in the CCP4 newsletter.

Principal problems with the phase recombination method

What is the optimal relative weighting of modified and experimental phases?
- Incorrect relative weighting means that the final results will not be optimal
- Incorrect weighting terms mean that the final figures of merit are almost always inflated

When do you stop iterating?
- In some approaches the maps initially get better, then get worse unless you stop

Statistical approach to density modification

Density modification can be thought of as a way to adjust crystallographic phases (or amplitudes) to make them simultaneously consistent with the experimental data and with our expectations of what an electron density map should look like. The statistical approach is a mathematical way to formulate this statement. By using this formulation, the weighting factors and problems with convergence are taken care of automatically.

In RESOLVE, any set of structure factor amplitudes and phases has an associated probability composed of two simple parts:

Probability of a set of phases (and amplitudes)

The probability of the experimental phases	This is the probability that you would have observed your experimental data if this set of phases (and amplitudes) were correct
The probability of the map	This is the probability that the electron density map calculated from this set of phases is drawn from the set of plausible electron density maps for this structure

RESOLVE adjusts your crystallographic phases so as to maximize the total (posterior) probability of those phases. The mathematics is a little complicated but the idea is very simple. To see the mathematics in detail, have a look at T. C. Terwilliger (2000) "Maximum likelihood density modification," Acta Cryst. D56, 965-972.
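As a rough illustration of how the two parts combine for a single reflection, the sketch below adds two log-probability curves sampled on a grid of phase angles and reports the most probable phase and the figure of merit of the combined distribution. The cosine-shaped curves, their strengths, and the function names are assumptions made for this example only; the expressions RESOLVE actually uses are given in the paper cited above.

```python
import math

def cosine_log_prob(center_deg, strength, n=360):
    """Hypothetical unimodal log-probability: strength * cos(phi - center)."""
    step = 360.0 / n
    return [strength * math.cos(math.radians(i * step - center_deg))
            for i in range(n)]

def combine_phase_probabilities(log_p_exp, log_p_map):
    """Add two log-probability curves sampled on the same phase grid.

    Returns the normalized combined distribution, the most probable
    phase in degrees, and the figure of merit (length of the mean
    phase vector).
    """
    n = len(log_p_exp)
    total = [a + b for a, b in zip(log_p_exp, log_p_map)]
    peak = max(total)  # subtract the peak before exponentiating, for stability
    weights = [math.exp(t - peak) for t in total]
    norm = sum(weights)
    posterior = [w / norm for w in weights]
    step = 360.0 / n
    best_phase = step * max(range(n), key=lambda i: posterior[i])
    mean_cos = sum(p * math.cos(math.radians(i * step))
                   for i, p in enumerate(posterior))
    mean_sin = sum(p * math.sin(math.radians(i * step))
                   for i, p in enumerate(posterior))
    fom = math.hypot(mean_cos, mean_sin)
    return posterior, best_phase, fom

# Made-up example: the experimental term peaks at 0 degrees and the
# map term peaks at 90 degrees, with equal strength.
log_p_exp = cosine_log_prob(0.0, 2.0)
log_p_map = cosine_log_prob(90.0, 2.0)
posterior, best_phase, fom = combine_phase_probabilities(log_p_exp, log_p_map)
print(best_phase, round(fom, 3))
```

Because the two curves in this example have equal strength, the most probable phase falls midway between their peaks, at 45 degrees. No separate weighting factor has to be chosen; the relative influence of each source follows from how sharply peaked its probability curve is.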

Note on terminology: The approach used by RESOLVE is now called "Statistical density modification," a name suggested by Kevin Cowtan. It used to be called "Maximum-likelihood density modification", using the term "likelihood" in a colloquial sense of probability. The old name (as pointed out by Gerard Bricogne and others) is confusing because the maximum-likelihood method is a specific technique that uses a specific definition of "likelihood" that is not used in this approach. Apologies to all for the confusion; we hope the new name is clearer. The mathematics remains exactly the same.

Using all the available information for density modification

Density modification is usually thought of as a process that is carried out on an experimental electron density map prior to model building, but iterative model-building methods such as ARP/wARP can also be thought of as density modification techniques. With the statistical approach, partial model information can be seamlessly incorporated into the total expression for the probability of the phases. This allows a hierarchical approach to incorporating information about phase probability:

Types of information that can be used in statistical density modification

Experimental phases (if available)
Low-resolution structural information (solvent boundary)
Non-crystallographic symmetry
Partial model information (molecular replacement or model building)
Full atomic model information

The current version of RESOLVE can incorporate all of these types of information.

Carrying out density modification with RESOLVE

RESOLVE carries out density modification on several levels:

Each "mask cycle":

RESOLVE estimates the probability that each point in the map is within the protein or solvent region (a probabilistic "mask")
RESOLVE refines NCS symmetry operators, if present
RESOLVE then carries out one or more minor cycles:
- Fitting of the histogram of density in the protein and solvent regions to model histograms (yielding beta = quality of this fit, and sigma = the overall error in the map)
- Estimation of target density (a probability function) at each point based on these histograms for solvent and protein regions
- Estimation of target density and uncertainty at each point from NCS or a model map, if present
- Calculations of derivatives of map probability with respect to phases
- Estimation of phase probability from experimental phase probabilities and the map probability function

RESOLVE carries out mask cycles (up to 5) until no further changes occur in the phases.
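The outer loop described above can be sketched as follows. This is only a control-flow illustration: `mask_cycle` is a hypothetical callable standing in for everything that happens within one mask cycle, and the numerical tolerance used to decide that "no further changes occur" is an assumption, not a value taken from RESOLVE.

```python
def run_mask_cycles(phases, mask_cycle, max_cycles=5, tolerance=1e-3):
    """Repeat mask_cycle until the phases stop changing or max_cycles is reached.

    phases     -- list of phases in degrees
    mask_cycle -- hypothetical callable: takes phases, returns updated phases
    tolerance  -- assumed threshold (degrees) below which phases count as unchanged
    """
    cycles_run = 0
    for _ in range(max_cycles):
        new_phases = mask_cycle(phases)
        cycles_run += 1
        # Largest change, measured on the circle so that 359 -> 1 counts as 2 degrees.
        change = max(abs((new - old + 180.0) % 360.0 - 180.0)
                     for new, old in zip(new_phases, phases))
        phases = new_phases
        if change < tolerance:
            break
    return phases, cycles_run

# Toy stand-in for a mask cycle: move each phase halfway toward 90 degrees.
def halfway_to_ninety(phases):
    return [p + 0.5 * (90.0 - p) for p in phases]

final_phases, cycles_run = run_mask_cycles([10.0, 50.0, 170.0], halfway_to_ninety)
print(cycles_run, final_phases)
```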

If NCS is present, then RESOLVE carries out an initial mask cycle, not including any NCS, to estimate uncertainties in density estimated from NCS copies. Then RESOLVE carries out another initial mask cycle, using NCS but not solvent flattening, to estimate "sigma", the overall error in the map.

If "use_input_solv" is not set and "hklstart" is not specified, then RESOLVE uses the R factor to estimate the solvent content of the crystal. Solvent contents from 0.1 to 0.9 are tested, and the value leading to the minimum R is chosen. This optimal solvent content is written to the file "resolve.solvent". Note: if "use_input_solv" is specified, then RESOLVE assumes that the solvent content is already known and reads it from "solvent_content" if specified, or else from "resolve.solvent" if present, or else the default (0.40) is used.
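The order of precedence described above can be summarized in a short sketch. The function and argument names are hypothetical, `r_factor_for` stands in for whatever RESOLVE does to obtain an R factor for a trial solvent content, the 0.1 step of the scan is an assumption (the text gives only the range), and the case where "hklstart" is given without "use_input_solv" is not described in the text, so treating it like the known-value case is also an assumption.

```python
def choose_solvent_content(use_input_solv, hklstart_given, r_factor_for,
                           solvent_content=None, saved_value=None,
                           default=0.40, step=0.1):
    """Return the solvent content following the precedence described in the text.

    r_factor_for -- hypothetical callable: trial solvent content -> R factor
    saved_value  -- value previously read from resolve.solvent, or None
    step         -- assumed spacing of the scan between 0.1 and 0.9
    """
    if not use_input_solv and not hklstart_given:
        # Scan trial values from 0.1 to 0.9 and keep the one with the lowest R.
        n_steps = int(round((0.9 - 0.1) / step))
        candidates = [round(0.1 + i * step, 6) for i in range(n_steps + 1)]
        return min(candidates, key=r_factor_for)
    # Solvent content treated as known: keyword first, then saved file, then default.
    if solvent_content is not None:
        return solvent_content
    if saved_value is not None:
        return saved_value
    return default

# Toy R-factor curve with its minimum at 0.5, for illustration only.
print(choose_solvent_content(False, False, lambda s: (s - 0.5) ** 2))
```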

RESOLVE also uses the R-factor to identify which histogram of solvent densities and protein densities to use in density modification. The file "rho.list" in $SOLVEDIR/segments/ contains several histogram profiles, all based on model electron density maps. These are at resolutions from 1.2 A to 4 A. RESOLVE carries out a test of each histogram initially and chooses the one leading to the lowest R factor. The histogram can be set using "database". The optimal database entry is written to "resolve.database".

RESOLVE estimates the optimal smoothing radius using a simple formula. For cycles where no density modification has occurred yet (normally the first cycle, unless "phases_from_resolve" has been set), the smoothing radius R (in A) is set with the equation: R = 2.41 * (dmin)**0.9 * (fom)**(-0.26). For all other cycles (after density modification has begun), the smoothing radius is 4 A. These can also be set with "wang_radius", "wang_radius_cycle", "wang_radius_start", or "wang_radius_finish".
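For a feel for the numbers, the formula can be evaluated directly. The function name and the boolean argument below are illustrative only; the constants are those quoted in the text.

```python
def smoothing_radius(dmin, fom, density_modified=False):
    """Smoothing radius in A.

    dmin -- high-resolution limit of the data, in A
    fom  -- overall figure of merit of the starting phases
    density_modified -- True once density modification has begun
    """
    if density_modified:
        return 4.0
    return 2.41 * dmin ** 0.9 * fom ** -0.26

# Example: 2.5 A data with a starting figure of merit of 0.5.
print(round(smoothing_radius(2.5, 0.5), 2))
print(smoothing_radius(2.5, 0.5, density_modified=True))
```

With these example inputs the starting radius comes out at roughly 6.6 A, so poorer or lower-resolution starting phases lead to a larger initial smoothing radius than the 4 A used in later cycles.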

If "n_restore" is set by the user to be non-zero (default = 0), then after the phases have converged, the whole process is repeated again, starting with the original phases, but using the current probabilistic solvent mask. This allows an optimized mask to be used in the "first" cycle of density modification.

Removing model bias with prime-and-switch phasing

Electron density maps obtained using phases calculated from atomic models often show peaks at the coordinates of atoms in the models, even when those atoms are incorrectly placed. This effect can be reduced by careful weighting, such as that provided by Randy Read's SIGMAA approach, but it cannot be eliminated unless the phases are changed.

Prime-and-switch phasing is a way to remove model bias by using statistical density modification, but without including the phase information coming from the model once an initial map has been calculated.

The basic procedure is simple:

Use the best existing amplitudes, phases, and weights to calculate a map
Identify the probability that each point in the map is in solvent/macromolecule/NC-symmetry regions, etc
Identify the expected distribution of electron density for points in each class (solvent/macromolecule etc)
Calculate the log-probability of this map
Identify how the log-probability of the map would change if the phases were changed
Adjust the phases to maximize the log-probability of the map
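The steps above can be illustrated with a very small one-dimensional toy. It is not RESOLVE's algorithm: the map probability here contains only a solvent-flatness term, the "crystal" is a 1D grid with a handful of reflections, and a brute-force trial of phase values replaces the derivative-based adjustment described above. All names and numbers are made up for the example. What it does show is the key idea: the starting (model) phases are used only to get going, and the quantity being maximized contains no term that refers back to them.

```python
import math

def density(amplitudes, phases, n_grid):
    """1D toy map: rho(x) = sum over h of F_h * cos(2*pi*h*x/n_grid - phi_h)."""
    return [sum(f * math.cos(2.0 * math.pi * (h + 1) * x / n_grid - p)
                for h, (f, p) in enumerate(zip(amplitudes, phases)))
            for x in range(n_grid)]

def map_log_probability(rho, solvent_points, sigma=1.0):
    """Toy log-probability: solvent density is expected to be flat (zero)."""
    return -sum(rho[x] ** 2 for x in solvent_points) / (2.0 * sigma ** 2)

def prime_and_switch(amplitudes, start_phases, solvent_points, n_grid,
                     sweeps=5, n_trial=36):
    """Start from the given phases, then maximize the map term alone."""
    phases = list(start_phases)
    best = map_log_probability(density(amplitudes, phases, n_grid), solvent_points)
    for _ in range(sweeps):
        for h in range(len(phases)):
            for k in range(n_trial):
                # Try a new value for one phase; keep it only if the map improves.
                trial = phases[:h] + [2.0 * math.pi * k / n_trial] + phases[h + 1:]
                score = map_log_probability(density(amplitudes, trial, n_grid),
                                            solvent_points)
                if score > best:
                    best, phases = score, trial
    return phases, best

amplitudes = [3.0, 2.0, 1.5, 1.0]       # made-up amplitudes
model_phases = [0.3, 1.2, 2.0, 4.0]     # made-up "biased" starting phases (radians)
solvent = list(range(16, 32))           # second half of the cell taken as solvent
n_grid = 32

start_score = map_log_probability(density(amplitudes, model_phases, n_grid), solvent)
final_phases, final_score = prime_and_switch(amplitudes, model_phases, solvent, n_grid)
print(start_score, final_score)
```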

The initial biased phase information from the model is required to get the procedure going. The final phases are essentially unbiased by the model because they are based on the features of the map, not on the prior phase probabilities.

The final phases are generally improved the most when:

The starting phases are accurate (even if they are biased!)
There is substantial solvent (25% is enough, the more the better)
The data are accurate and high resolution (3 A is fine, the higher resolution the better provided there is accurate starting phase information at the high resolution limit)

There are some ways that prime-and-switch phasing can have residual bias:

If there is low solvent content and not enough cycles are carried out (prime-and-switch phasing converges more slowly for low solvent content)
If the starting model is highly refined and not enough cycles are carried out (if the model is refined, then the phases have already been adjusted to minimize the density in the solvent region, and prime-and-switch phasing converges more slowly)

There are some cases where prime-and-switch phasing does not yield a nice-looking map:

Usually these are cases where the model-based phases were very inaccurate (though they might have made a nice-looking biased map)
The estimated corrected figure of merit output by RESOLVE will generally be very low in these cases, so you know that there just was not enough phase information

NCS averaging in RESOLVE

Non-crystallographic symmetry is an important source of information about the probability of an electron density map. RESOLVE can begin with transformation matrices and an estimate of the center-of-mass of molecule 1 that you input. RESOLVE can also figure out the transformations and center-of-mass automatically from the NCS in heavy-atom sites in a PDB file (if the default file "ha.pdb" exists and you don't specify NCS transformations, RESOLVE will try to find the NCS in those sites). RESOLVE can figure out the region over which to apply the NCS relationships automatically. You can help it by restricting the region to search for NCS with the keyword "ncs_domain_pdb xxxx.pdb" and supplying a PDB file that contains dummy atoms in the region where NCS exists (all copies must be supplied in the PDB file).

See also the sample script at resolve_sample_scripts

RESOLVE uses NCS information in the following way (see Terwilliger, T. C. (2002). "Statistical density modification with non-crystallographic symmetry". Acta Cryst. D58, 2082-2086 and Terwilliger, T. C. (2002). "Rapid Automatic NCS identification Using Heavy-Atom Substructures". Acta Cryst. D58, 2213-2215):
Use the best existing amplitudes, phases, and weights to calculate a map
Identify the region near the center-of-mass of molecule 1 for which the NCS transformations (on average) result in significant correlation
Define the asymmetric unit of NCS such that all points in it are close together and within the region of significant correlation, and such that no point in it is equivalent to any other point either by crystallographic or non-crystallographic symmetry.
Estimate the overall correlation among NCS-related molecules as a function of position in the NCS asymmetric unit (so that the edges might be allowed to vary more than the centers, for example)
Map all N copies of the NCS asymmetric unit on to identical grids
Generate target density for each molecule using all the other N-1 copies only
Map all the target densities back onto the original asymmetric unit of the crystal and use agreement between target density and the map as part of the probability of the map
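The step in which the target density for each molecule is generated from the other N-1 copies can be sketched as a leave-one-out average. This is a simplified illustration under stated assumptions: the copies are taken to be already mapped onto identical grids, and a plain unweighted mean is used, whereas the text indicates that RESOLVE also takes the position-dependent NCS correlation into account. The function name is hypothetical.

```python
def ncs_target_density(copies):
    """Leave-one-out NCS targets.

    copies -- list of N density lists (N >= 2), one per NCS copy, all
              sampled on identical grids.
    Returns one target list per copy, each the mean of the other N-1 copies.
    """
    n = len(copies)
    if n < 2:
        raise ValueError("need at least two NCS copies")
    # Sum over all copies at each grid point, then remove the copy itself.
    totals = [sum(values) for values in zip(*copies)]
    return [[(total - value) / (n - 1) for total, value in zip(totals, copy)]
            for copy in copies]

# Three copies sampled at two grid points (made-up numbers).
targets = ncs_target_density([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(targets)
```

Leaving each copy out of its own target means that the target at a point never includes the density at that same point.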

NOTE: If there is more than one type of NCS relationship in the crystal, RESOLVE can carry out NCS averaging separately for each group of molecules related by an NCS relationship. If you use this, you should specify the maximum extent of the region over which to apply each NCS relationship with the ncs_domain_pdb command. In that case RESOLVE can refine the NCS operators using only the part of the molecule that you specify (and not be confused by some other region that is part of another NCS group). If the regions for several NCS groups overlap, the NCS group that will be used for the overlapping points in NCS averaging will be whichever NCS group has the higher NCS correlation near those points.

NOTE: If you want to input the NCS operators manually, then the keywords rota_matrix, tran_orth, center_orth (see the list of resolve keywords) are useful. For example:

(Mapping molecule j onto molecule 1)
(As input)

Operator #            1

     New X-prime= 1.0000 X +   0.0000 Y +   0.0000 Z + 0.0000
     New Y-prime= 0.0000 X +   1.0000 Y +   0.0000 Z + 0.0000
     New Z-prime= 0.0000 X +   0.0000 Y +   1.0000 Z + 0.0000

Approximate center_of_mass of this object (from center_of_mass of
object 1 and NC symmetry) is    -28.39   18.81 -16.36

Operator #            2

     New X-prime= -1.0000 X + -0.0035 Y + -0.0040 Z + -4.1335
     New Y-prime= 0.0043 X + -0.1070 Y + -0.9943 Z +-21.6192
     New Z-prime= 0.0031 X + -0.9943 Y +   0.1070 Z +-19.2559

Approximate center_of_mass of this object (from center_of_mass of
object 1 and NC symmetry) is     24.44   -7.12 -39.79

would be input as:

rota_matrix    1.0000   0.0000   0.0000
rota_matrix    0.0000   1.0000   0.0000
rota_matrix    0.0000   0.0000   1.0000
tran_orth     0.0000   0.0000   0.0000
center_orth -28.3915 18.8125 -16.3621

rota_matrix   -1.0000 -0.0035 -0.0040
rota_matrix    0.0043 -0.1070 -0.9943
rota_matrix    0.0031 -0.9943   0.1070
tran_orth    -4.1335 -21.6192 -19.2559
center_orth   24.4419 -7.1171 -39.7930
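A quick way to check operators before typing them in is to apply each one to the center of mass of its own molecule and confirm that the result lands on the center of mass of molecule 1, since each operator maps molecule j onto molecule 1. The sketch below does this for operator 2 using the numbers from the example above; the helper name is just for illustration.

```python
def apply_operator(rotation, translation, point):
    """Return rotation * point + translation for a 3x3 rotation given as rows."""
    return tuple(sum(r * x for r, x in zip(row, point)) + t
                 for row, t in zip(rotation, translation))

# Operator 2 and the two centers of mass, copied from the example above.
rota_matrix_2 = [(-1.0000, -0.0035, -0.0040),
                 ( 0.0043, -0.1070, -0.9943),
                 ( 0.0031, -0.9943,  0.1070)]
tran_orth_2 = (-4.1335, -21.6192, -19.2559)
center_1 = (-28.3915, 18.8125, -16.3621)
center_2 = (24.4419, -7.1171, -39.7930)

mapped = apply_operator(rota_matrix_2, tran_orth_2, center_2)
largest_miss = max(abs(a - b) for a, b in zip(mapped, center_1))
print(mapped, largest_miss)
```

For these values the mapped center agrees with the center of molecule 1 to within a few thousandths of an Angstrom, which is the rounding of the four-decimal matrix elements.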

Local pattern matching in RESOLVE

RESOLVE can use the local patterns of density in your electron density map in statistical density modification to improve crystallographic phases. The basic idea is that on a local level (within a sphere of radius 2 A) there are patterns of electron density that are associated with high density at the center of the pattern, and other patterns associated with low density at the center. RESOLVE goes through your electron density map, and at each point it compares the nearby density with a set of 20 templates (it does not use the density at the point of interest or right around it in this analysis). RESOLVE_PATTERN uses this analysis to come up with a new estimate of the density at each point in the map. This new estimate of density (the "image") has the remarkable property that errors in the image are almost uncorrelated with errors in the map used to create it. This means that phase information from the "image" can be combined with phase information from other sources in a simple way. You can see the details of all this in Terwilliger, T. C. (2003) Statistical density modification using local pattern matching. Acta Cryst. D59, 1688-1701.

The resolve_build script uses image-based phasing with pattern matching and fragment identification, alternating with model-building and standard density modification. Image-based phasing is the use of an electron density map that typically comes from either an atomic model or from pattern-matching or from NCS, along with observed values of FP, to estimate phases. The process results in phases and figures of merit similar to those obtained with Randy Read's SIGMAA, but the values come directly from map-probability phasing. The electron density map provided is used as a target for statistical density modification: crystallographic phases are found that, when combined with observed amplitudes, give a map that is as close as possible to the target map. The figures of merit reflect how precisely each phase can be determined using this approach. The phases from image-based phasing are not the same as those from an FC calculation and they are not always unimodal like FC, SIGMAA or Sim-weighted phases.

The resolve_pattern script also carries out image-based phasing, but it differs from the resolve_build script in that it does not alternate it with building a model, and in that it only uses patterns and not fragment identification.

Fragment identification in RESOLVE

RESOLVE can carry out an FFT-based search for fragments of structure (currently helices and strands), refine the locations of these fragments, and use them in density modification even if a complete model cannot be built. The approach to finding fragments (Terwilliger, T. C. (2001). "Maximum-likelihood density modification with pattern recognition of structural motifs". Acta Cryst. D57, 1755-1762) is very similar to Kevin Cowtan's FFT-based search (Cowtan, K. (1998). Acta Cryst. D54, 750-756). A template consisting of averaged helical density (or strand density) is rotated over a range of orientations designed to cover most possibilities within about 20 degrees and an FFT convolution is carried out for each orientation to find locations where the template and map match. The best matches are identified and the orientations and positions are refined. Then a pseudo-map is constructed consisting of the original templates, oriented based on the refined positions found in the search, and weighted by the local correlation coefficient. This pseudo-map is used as a source of phase information through map-probability phasing (Terwilliger, T. C. (2001). "Map-likelihood phasing". Acta Cryst. D57, 1763-1775). This approach is similar to the one described in the original publication on pattern recognition of structural motifs cited above, but works much better than the original method.

Fragment identification is normally carried out right after model-building because the same FFT search can be used for both. The RESOLVE build script includes it.

Automated model-building and iterative model-building in RESOLVE

After the completion of density modification, RESOLVE builds a model of your structure. For versions 2.02 and higher, model-building needs sequence information from you. You specify a file with the keyword "seq_file" and RESOLVE expects a sequence of amino acids in 1-letter format. If there is more than one type of chain, RESOLVE expects them to be separated by a line containing ">>>". Typically RESOLVE can build 70-90% of the residues for a good map at 2-3 A resolution. You can tell if the model is correct by noting how good the match is to the sequence and by noting the NCS correspondence among chains (if NCS exists). The PDB file that RESOLVE writes out will contain the model and, as HETATM records at the end, the heavy-atom sites from the SOLVE output file ha.pdb.
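As an illustration of the sequence file layout described above, the sketch below assembles the text for a file with more than one type of chain. The helper name and the example sequences are made up, and writing each chain on a single line is an assumption of this sketch; the text specifies only the 1-letter format and the ">>>" separator line.

```python
def format_seq_file(chain_sequences):
    """Join 1-letter sequences, one per chain type, with '>>>' separator lines."""
    # Remove any whitespace inside each sequence so each chain is one clean line.
    cleaned = ["".join(seq.split()) for seq in chain_sequences]
    return "\n>>>\n".join(cleaned) + "\n"

# Two made-up chain types.
text = format_seq_file(["MKVLAAGIVGLSHE", "GSHMTE YKLVV"])
print(text)
```

The resulting text can be saved to a file and named with the "seq_file" keyword.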

You can read all the details about RESOLVE automated model-building in Terwilliger, T. C. (2003). Automated main-chain model-building by template-matching and iterative fragment extension. Acta Cryst. D59, 38-44 and Terwilliger, T. C. (2003). Automated side-chain model-building and sequence assignment by template-matching. Acta Cryst. D59, 45-49.

RESOLVE now has superquick model building! The standard RESOLVE model-building for version 2.05 and higher is about 3 times faster than earlier versions. This is made possible by a more selective choice of which fragments to consider extending (no need to work on a fragment that covers a region that is already built). Versions 2.05 and higher also have the option of "superquick_build" which is about 10 times faster than previous versions of RESOLVE model-building. For a very good map (one where RESOLVE can build >80% of the model) superquick_build typically gives almost the same model as the standard build. For a moderate-quality map, the standard build or even the "thorough_build" may give up to 10% more model built.

RESOLVE versions 2.05 and higher include cycles of model-building in which the thresholds for fit of the model to the map are sequentially lowered. This allows much more of the model to be built, while keeping the accuracy of most of the model high. You can use "aggressive_build" to try and build as much as possible, or "conservative_build" to build only the best parts.

RESOLVE versions 2.06 and higher include the capability of identifying fragments (helices, strands) in a map and including them in density modification.

RESOLVE builds a model in the following way.

Use the best existing amplitudes, phases, and weights to calculate a map
Identify locations of helices and strands using an FFT-based correlation search with a standard set of helix/strand templates
Using a library of actual helical/beta templates, find the best match to density near each helix/strand
Trim the templates down to match the density
Extend the templates using a template library of short fragments
Assemble fragments into longer fragments
Match side-chain density to library of side-chain densities, get probability of each possible sequence alignment, choose those with very high probability
Map all the fragments to one asymmetric unit so that they are as close together as possible
Write out PDB file with the fragments as a main-chain model.
Note: RESOLVE writes out "resolve.mtz" before it starts building the model, so if you don't want to wait, your phases are already ready for you.

Pattern identification, fragment identification, and density modification in iterative model-building

RESOLVE (versions 2.06 and higher) can carry out pattern identification, fragment identification, density modification, and iterative model-building and refinement in combination with refmac5 (versions 5.1.24 and higher only!).

In the first cycle, RESOLVE carries out density modification and builds a model as usual. The model is refined and extended as much as possible. Then the current electron density map is searched for local patterns and for fragments (helices/strands), and maps are created based just on these features.
On the next cycles, RESOLVE uses the pattern and fragment maps, along with a map created from all models built so far, to generate model density for the asymmetric unit. This density is used along with solvent flattening, NCS, and histogram matching in the next cycle of density modification.
Additionally, a prime-and-switch composite omit map is created in which all the above information is used except that in each "omit" region of the map, the model-based information is left out; and the omit regions are then spliced together to form a composite omit map. This omit map is used in identifying the "patterns" for the next cycle so as to minimize bias in this step.
RESOLVE then builds a new model based on the combined phase information. It also uses fragments from the last model as candidates for parts of the new model.
The process is iterated as long as desired (typically 5-10 cycles are plenty). If the model is not as complete as desired after about 10 cycles (i.e., R-factor > 0.40) then the model is used in model rebuilding with phase information. This is just like model rebuilding (below) except that experimental phase information is included throughout.
You can use a standard script "resolve_build.csh" to carry this out.

Iterative model-REbuilding

RESOLVE (versions 2.03 and higher) can also carry out iterative model-rebuilding. This is like model-building except that you start with just a model of some kind and measured amplitudes and RESOLVE does everything from there. This works much more slowly than model-building with experimental phases.

Rebuilding can be carried out either with or without composite omit maps. Omit maps are recommended for rebuilding of a model with possible model errors.
Each cycle, RESOLVE starts with the input model and creates an electron density map (image). Then RESOLVE uses this map as a target in statistical density modification along with the measured FP to estimate phases. Then these phases are used as the starting phases (but not probabilities) in a cycle of density modification including (1) any experimental phase probabilities, (2) solvent flattening, histogram matching, NCS, and (3) model density based on a composite of the last 20 models.
RESOLVE then builds a model, the model is refined with refmac5, and the model is then extended and rebuilt. On each cycle that is not an omit cycle, RESOLVE uses fragments from the previous model along with fragments identified from the map itself as possibilities for constructing the new model.
This process is repeated (typically 100 cycles).
You can use a standard script "resolve_build.csh" to carry this out (it is the same script as for autobuilding). It can take a long time for rebuilding!

Evaluating a model based on a map

RESOLVE_BUILD (versions 2.06 and higher) can automatically evaluate a model, given a set of amplitudes FP (and phases PHIB and FOM if available). First RESOLVE will rebuild the model (to reduce any bias due to refinement). Then RESOLVE will calculate a prime-and-switch composite omit map (as used in rebuilding) based on the rebuilt model and any phase information you give it. Then RESOLVE will compare the original model to this map and summarize the fit for you.

Ligand fitting

RESOLVE (versions 2.08 and higher) can carry out fitting of FLEXIBLE LIGANDS to an electron density map. The only inputs needed are an electron density map (or difference map), and either just one (recommended) or else 5-10 copies (ok also) of the ligand in random but stereochemically ideal conformations in a PDB format file. The routine will figure out the allowed bond rotations from the copies of the ligand, and then will fit the ligand into the density starting with the biggest rigid part of the ligand. Parts of the ligand that do not fit are built as reasonably as possible, but may be built out of density or may be left off.

You can use the sample script resolve_ligand_fit.com, which allows you to find one or more copies of a ligand in a map.

You can even take a list of PDB files containing different ligands, fit each one to your map, and score them to identify which ligand may be bound, using the sample script resolve_ligand_id.com.

See the additional descriptions in resolve_sample_scripts too.

Also see the list of resolve keywords for additional options.

Thanks to Herb Klei for emphasizing the need for ligand fitting and for suggesting the idea of first finding the biggest fixed part of the ligand and then building the rest from this core!

Merging NCS copies

RESOLVE (versions 2.08 and higher) will automatically merge NCS-related copies of your model during iterative model-building and refinement. The merging is done in the "extend_only" mode of model-building. An mtz file with FP PHIB FOM, a model (with >1 NCS copy) and a coordinate file with positions of atoms or pseudo-atoms (ha_file) used to deduce the NCS relationships are read in. The coordinates of each NCS-related copy are placed at all NCS-related positions, merged (if possible) and then extended if possible into the density. If you do not want this to be done, use the flag no_merge_ncs_copies. You can merge models yourself with RESOLVE too: use the extend_only flag for model-building and include your model with the keyword "pdb_in your-current-model". Note that you need to supply an mtz file with a map to do this. You can specify the keywords trim or no_trim to tell RESOLVE to trim the resulting model back to the density or not.