 

MADBST: Calculation of Bayesian Fa values from MAD data

[How MADBST works | Output | Keywords]

 

MADBST reads MAD data from a dorgbn file. It estimates the components of the structure factor of the heavy (anomalously scattering) atoms (Fa) parallel and perpendicular to the structure factor for all non-anomalously scattering atoms. It also estimates the magnitude of Fa and its uncertainty.

Generally you will want to use "ANALYZE_MAD" to run "MADMRG" and "MADBST" for you. Ordinarily, you will take your scaled MAD data (the exact same data you used for MADMRG) and use it as input to MADBST. The output of MADBST can be used in (1) Patterson maps, (2) difference Fouriers, (3) direct methods, and (4) automated structure determination with SOLVE in this package.

 

How MADBST works

MADBST first estimates the rms heavy atom structure factor for the anomalously scattering atoms at each wavelength used, from form factor tables and the number of atoms. It then goes through all the reflections and estimates probability distributions for the components of Fh parallel (Fha) and perpendicular (Fhb) to the structure factor corresponding to the non-anomalously scattering atoms. Possible values of Fha and Fhb are deduced from the rms value of Fh and Wilson statistics. The range of Fh tested is from -3 sigma to +3 sigma. The relative probability of each possible (Fha,Fhb) is calculated from Wilson statistics.
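The construction of the prior can be sketched in Python. This is a minimal illustration and not MADBST's actual code: the function names, the grid size, and the choice of a Gaussian prior on each component with sigma = rms/sqrt(2) (the acentric Wilson distribution) are assumptions made for the example.

```python
import math

def rms_heavy_atom_f(n_atoms, f_atom):
    """rms |Fh| expected for n_atoms identical atoms of scattering factor f_atom.

    Uses the Wilson-statistics result <|F|^2> = sum of f_j^2 (acentric case).
    """
    return math.sqrt(n_atoms) * abs(f_atom)

def prior_grid(rms_fh, n_steps=13, n_sigma=3.0):
    """Return [(fha, fhb, prior)] on a square grid from -n_sigma to +n_sigma.

    Assumption: each component of Fh is Gaussian with sigma = rms_fh/sqrt(2).
    The priors are normalized to sum to 1 over the grid.
    """
    sigma = rms_fh / math.sqrt(2.0)
    step = 2.0 * n_sigma * sigma / (n_steps - 1)
    values = [-n_sigma * sigma + i * step for i in range(n_steps)]
    grid = [(a, b, math.exp(-(a * a + b * b) / (2.0 * sigma * sigma)))
            for a in values for b in values]
    total = sum(w for _, _, w in grid)
    return [(a, b, w / total) for a, b, w in grid]
```

For example, prior_grid(rms_heavy_atom_f(4, 30.0)) gives a 13 x 13 grid of trial (Fha,Fhb) values for four atoms with a scattering factor of 30 electrons each (illustrative numbers only).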

For each reflection, all of the above possibilities for (Fha,Fhb) are tested. For each (Fha,Fhb), a range of values of Fp (the structure factor for protein atoms only, assumed to lie along the x-axis as we have no information on its phase) is tested. For any set of values of (Fha,Fhb) and Fp, it is easy to calculate values of Fbar and DelAno at each wavelength. The relative likelihood that a particular set of values of (Fha,Fhb) and Fp is correct is estimated (from Bayes' Rule) from the weighted residual: exp( - sum of (calc - obs)**2/sigma**2) for Fbar and DelAno at all wavelengths. The a priori probability of (Fha,Fhb) is also included (Wilson statistics). The final "best" value of (Fha,Fhb) is just the weighted average over all possibilities. Similarly, the "best" FH**2 is the weighted average of FH**2 over all possibilities (note that the best (FH**2) is not the (best FH)**2). Reflections for which no values of (Fha,Fhb) are likely (P<0.001) are rejected.
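The weighted averaging can be sketched as follows. This is an illustration under stated assumptions, not the MADBST implementation: the scattering model in calc_fbar_delano (Fh taken as the normal f0 scattering of the anomalous atoms, scaled by (f0 + f' +/- i f")/f0), the conventional factor of 1/2 in the Gaussian exponent, and all function and key names are hypothetical. The P<0.001 rejection test is not reproduced because its normalization is not described here; the sketch only returns None when every trial has zero weight.

```python
import math

def calc_fbar_delano(fp, fha, fhb, f0, fprime, fdprime):
    """Calculated (Fbar, DelAno) at one wavelength for a trial (Fha, Fhb, Fp).

    Assumed model: Fp is real (along x) and Fh = Fha + i*Fhb is measured
    relative to it.
    """
    fh = complex(fha, fhb)
    f_plus = fp + fh * complex(f0 + fprime, fdprime) / f0
    f_minus = fp + fh * complex(f0 + fprime, -fdprime) / f0  # conjugate of F(-h)
    a_plus, a_minus = abs(f_plus), abs(f_minus)
    return 0.5 * (a_plus + a_minus), a_plus - a_minus

def bayes_fh(observations, grid, fp_values, f0):
    """Posterior-weighted averages over all trial (Fha, Fhb, Fp).

    observations: one dict per wavelength with keys fbar, sig_fbar, delano,
                  sig_delano, fprime, fdprime.
    grid:         list of (fha, fhb, prior) tuples.
    Returns None if every trial has zero weight.
    """
    sum_w = sum_a = sum_b = sum_f2 = 0.0
    for fha, fhb, prior in grid:
        for fp in fp_values:
            chi2 = 0.0
            for o in observations:
                fbar, delano = calc_fbar_delano(
                    fp, fha, fhb, f0, o["fprime"], o["fdprime"])
                chi2 += ((fbar - o["fbar"]) / o["sig_fbar"]) ** 2
                chi2 += ((delano - o["delano"]) / o["sig_delano"]) ** 2
            w = prior * math.exp(-0.5 * chi2)  # prior times likelihood
            sum_w += w
            sum_a += w * fha
            sum_b += w * fhb
            sum_f2 += w * (fha * fha + fhb * fhb)
    if sum_w <= 0.0:
        return None
    return {"fha": sum_a / sum_w, "fhb": sum_b / sum_w, "fh2": sum_f2 / sum_w}
```

Because "fh2" is an average of squares, it is never smaller than fha**2 + fhb**2 computed from the averaged components, which is why the best FH**2 and the square of the best FH differ.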

 

MADBST output

The output file is a copy of the input file, with 6 data columns appended to it, and with all reflections out of the resolution range or with no data tossed. The extra 6 data columns are:

ncol+1 -- <Fa cos theta> = Fh component along Fo weighted by figure of merit
ncol+2 -- <Fa sin theta> = weighted Fh component perpendicular to Fo
ncol+3 -- <Fa> = best estimate of Fa
ncol+4 -- sigma of <Fa>
ncol+5 -- sqrt(<Fa**2>) = sqrt of best estimate of Fa**2
ncol+6 -- sigma of sqrt(<Fa**2>)

The first two of these data columns can be used in difference Fourier maps to show the positions of the anomalously scattering atoms. For example, if you have an estimate of the phases for Fo (the non-anomalously scattering part of the structure) from some partial solution, then you can add the phase angle for Fo to the phase angle of Fa relative to Fo and calculate a Fourier map for Fa, the anomalously scattering atoms. This is done in SOLVE.
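The phase addition can be sketched in Python. The function name is hypothetical, and the sign convention (the perpendicular component taken at +90 degrees from Fo) is an assumption, not something stated in this writeup.

```python
import cmath
import math

def fa_map_coefficient(fa_cos, fa_sin, phi_o_deg):
    """Return (amplitude, phase in degrees) of Fa for a Fourier map.

    fa_cos, fa_sin: columns ncol+1 and ncol+2 of the MADBST output.
    phi_o_deg:      estimated phase of Fo in degrees.
    """
    relative = complex(fa_cos, fa_sin)  # Fa expressed relative to Fo
    total = relative * cmath.exp(1j * math.radians(phi_o_deg))
    return abs(total), math.degrees(cmath.phase(total)) % 360.0
```

The multiplication by exp(i phi_o) rotates Fa by the phase of Fo, which is the same as adding the two phase angles.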

The third or fifth data column can be used in a Patterson map (after squaring Fa) or in direct methods. SOLVE can use a Patterson map that you calculate this way.
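As a small sketch of the squaring step, the Patterson coefficient and an approximate uncertainty can be obtained from a value and its sigma (columns ncol+3 and ncol+4, or ncol+5 and ncol+6). The function name is hypothetical, and the first-order error propagation is an assumption of this example rather than something MADBST is documented to do.

```python
def patterson_coefficient(fa, sig_fa):
    """Return (Fa**2, sigma of Fa**2) using first-order error propagation."""
    return fa * fa, 2.0 * abs(fa) * sig_fa
```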

 

Keywords for MADBST

(many are the same as for MADMRG):

nshells xx              # of shells of resolution to use (usu. 5 to max of 10)
nres  xx                # of protein residues in the asymmetric unit
nanomalous  xx          # of anomalously scattering atoms in the a.u.

mad_atom xx              Name of anomalously scattering atom.  This generates
                         aval_mad, bval_mad and cval_mad for you.  

lambda n                wavelength # n (n=1,2,3..) for data to follow
ncolfbar n              column # of Fbar at wavelength 1
ncolsfbar n             column # of sigma of Fbar, wavelength 1
ncoldelf  n             column # of DelAno at wavelength 1
ncolsdelf n             column # of sigma of DelAno at wavelength 1
                        ... (same for wavelength 2 etc...)
fprimv_mad  xx          1 real number for f' value for anomalously scattering
                        atom at the current wavelength.  Wavelength is defined
                        by the most recent value of the keyword "LAMBDA".
                        Note that f' and f" are required regardless of whether
                        you input aval_mad... or mad_atom
fprprv_mad  xx          1 real number for f" value for anomalously scattering
                        atom at the current wavelength.  Wavelength is defined
                        by the most recent value of the keyword "LAMBDA"

infile xxxx             Input data file (.drg file)
outfile xxxx            Output data file (.drg file, same as input except
                        reflections out of range are tossed and there are
                        6 additional columns of data)
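
A keyword block can be assembled programmatically, as in the Python sketch below. It uses only the keywords listed above. The helper name, the ordering of the lines, and every file name, column number and f'/f" value in the usage note are hypothetical placeholders. How the resulting text is passed to MADBST is not covered here.

```python
def madbst_keywords(infile, outfile, nres, nanomalous, mad_atom,
                    wavelengths, nshells=5):
    """Build MADBST keyword lines as a single string.

    wavelengths: list of dicts, one per wavelength, with keys ncolfbar,
                 ncolsfbar, ncoldelf, ncolsdelf, fprime, fdprime.
    """
    lines = [
        f"nshells {nshells}",
        f"nres {nres}",
        f"nanomalous {nanomalous}",
        f"mad_atom {mad_atom}",
    ]
    for n, w in enumerate(wavelengths, start=1):
        lines += [
            f"lambda {n}",  # keywords below apply to wavelength n
            f"ncolfbar {w['ncolfbar']}",
            f"ncolsfbar {w['ncolsfbar']}",
            f"ncoldelf {w['ncoldelf']}",
            f"ncolsdelf {w['ncolsdelf']}",
            f"fprimv_mad {w['fprime']}",
            f"fprprv_mad {w['fdprime']}",
        ]
    lines += [f"infile {infile}", f"outfile {outfile}"]
    return "\n".join(lines)
```

Calling madbst_keywords("mad.drg", "mad_fa.drg", 200, 4, "se", [...]) with one dict per wavelength returns the text of the keyword block. The f' and f" values are written for every wavelength because they are required whether or not mad_atom is given.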

Using SOLVE with MAD data analyzed with MADBST

If you want to phase the difference Fouriers calculated by SOLVE using both the anomalous and dispersive data, then you can specify:

NCOLFHCOS xx

NCOLFHSIN xx

which will, for lambda 1, use the values in columns ncolfhcos and ncolfhsin as estimates of the heavy atom structure factor components parallel to and perpendicular to the native structure factor. This is done automatically by ANALYZE_MAD if you are using automated structure determination. Here ncolfhcos is identical to ncolfhcos(1). If you specify ncolfhcos(2), it refers to "derivative" 2.

The output of MADBST provides ncolfhcos(1) and ncolfhsin(1) as "<Fa cos theta>" and "<Fa sin theta>" (see "MADBST output" above). You might also want to use a "combined" Bayesian Patterson map as output by MADBST, or an optimized difference Patterson map as calculated by MADMRG, for your Patterson searches. If you want to specify a previously calculated Patterson map for lambda 1, use the command

PATTFFTFILE xxxxxx

where xxxxxx is the name of the FFT file containing this Patterson. The FFT must have been calculated with this package using the same grid as currently specified. PATTFFTFILE is equivalent to PATTFFTFILE(1), where the 1 refers to lambda 1. This keyword will result in the use of file xxxxxx as the Patterson for lambda 1.


Operated by Los Alamos National Security, LLC for the U.S. Department of Energy's NNSA

© Copyright 2006 Los Alamos National Security, LLC. All rights reserved.