Solve localscale

Local scaling and merging of data

LOCALSCALE

LOCALSCALE is a routine to scale a "derivative" dataset to a "native" dataset using local scaling. In this method the scale factor for a particular reflection is based on the ratio of derivative:native for reflections surrounding this reflection. This method is useful because the scale factor is not restricted to any particular function of position in reciprocal space.

In this implementation, at least 30 reflections surrounding the reflection to be scaled are used to obtain a scale factor. The reflections used are always chosen so that, as far as possible, they form a complete sphere around the reflection of interest. Initial Wilson scaling is carried out before local scaling.
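The idea can be illustrated with a minimal Python sketch. It is not the code used by LOCALSCALE: the function name, the use of distance in (h,k,l) index space rather than true reciprocal-space distance, and the choice of sum(F native)/sum(F derivative) as the local ratio are all assumptions made for illustration.

```python
import math

def local_scale(target, native, derivative, min_neighbors=30):
    """Return a local scale factor for the derivative reflection `target`.

    native, derivative: dicts mapping (h, k, l) -> F.
    Hypothetical helper; distances are measured in index space (assumption).
    """
    # Reflections measured in both datasets, excluding the target itself.
    common = [hkl for hkl in derivative
              if hkl != target and hkl in native
              and native[hkl] > 0 and derivative[hkl] > 0]
    if len(common) < min_neighbors:
        raise ValueError("too few neighboring reflections")
    # Sort nearest first and find the radius that holds min_neighbors of them.
    common.sort(key=lambda hkl: math.dist(hkl, target))
    radius = math.dist(common[min_neighbors - 1], target)
    # Keep every reflection inside that radius so the selection is a full sphere.
    neighbors = [hkl for hkl in common if math.dist(hkl, target) <= radius]
    sum_nat = sum(native[hkl] for hkl in neighbors)
    sum_der = sum(derivative[hkl] for hkl in neighbors)
    return sum_nat / sum_der
```

Multiplying the derivative values for `target` by the returned factor puts them on the scale of the native data in that neighborhood.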

Data files: The program expects to read in two data files: one for the native dataset and one for the derivative. The two files may in fact be the same if desired. The native dataset is expected to have h,k,l, F and sigma (at least). The derivative dataset is expected to have h,k,l, F, and sigma, and, if desired, del F ano and sigma of del F ano. The scale factor obtained for the derivative F is applied to all of the derivative data.

A dorgbn-style file is written out containing the scaled derivative data. If you wish to have the derivative and native data in the same file, then follow this with the routine "FILEMERGE" and merge the two files. The output data file is NOT mapped to the asymmetric unit. Ordinarily you will want to follow LOCALSCALE with MERGE to merge the symmetry-related reflections and map everything to the asymmetric unit. You may need to run MERGE on your native data as well, to map it to the asymmetric unit.

Sample script to localscale der.drg to nat.drg:

!---------------Script for localscaling of derivative F to native F -----
@solve.setup                  ! standard parameters for this dataset
infile nat.drg                ! native in infile
nnatf 1                       ! column for native F
nnats 2                       ! column for native sigma
infile(2) der.drg             ! derivative in infile(2)
nderf 1                       ! column for deriv F
nders 2                       ! column for deriv sigma
outfile der.scl               ! output file
localscale                    ! do local scaling
!--------------------------------------------------------------------------

Keywords for LOCALSCALE

NSHELLS n         number of shells of resolution used to group data (default=10)
INFILE(1) xx      file with native data, which are not further scaled
INFILE(2) xx      file with Derivative data to scale to Native
OUTFILE xx        output file with scaled derivative data
NNATF n           column # for F of native data
NNATS n           column # of sigma of F of native data
NDERF n           column # for F of deriv data
NDERS n           column # for sigma of F of deriv data
NANOF n           column # of anomalous difference (Fplus-Fminus) of deriv data
NANOS n           column # of sigma of anomalous difference 
Note: set the column number to 0 for any column you do not want to use.

FILETITLE         optional title for output file

KEEPALL           keep reflections even with high differences 
TOSSBAD           (default) Toss reflections if differences between native and
                  derivative are more than 3 * the rms found for other
                  reflections.
                  Note: KEEPALL and TOSSBAD apply to MERGE, LOCALSCALE,
                  SCALE_MAD, SCALE_MIR, SCALE_NATIVE. This is the
                  place to reject derivative reflections with very large del F
                  if you want to reject them at all.
ANCUT             minimum # of reflections to use to scale a reflection (default=30)
RATMIN            minimum ratio of F/sigma to include (default=2)
NOBFACTOR         if specified, do not apply overall Wilson scaling before doing 
                  local scaling. Generally used only along with DAMPING=0.
BFACTOR           undoes NOBFACTOR. Do apply Wilson scaling before local scaling
DAMPING xx        scale factor (after Wilson scaling) is damped by taking it
                  to the power xx. Generally used with NOBFACTOR and a value of 
                  0 to not do any scaling at all.
NODAMPING         undoes DAMPING by resetting damping factor to 1.0
OVERALLSCALE      just get 1 scale factor for the whole dataset. No local
                  scaling, no Wilson scaling. Same as NOBFACTOR + DAMPING 0.0
NOOVERALLSCALE    undoes OVERALLSCALE. Same as BFACTOR + DAMPING 1.0
RATIO_OUT 3.0     Reject reflections with iso or ano diff > ratio_out*rms diff in shell
REQUIRE_NAT       If native is missing, toss derivative too

More on localscale:

1. A value of 0 or less for fnat or fder is assumed to mean that the data were not measured. Likewise, a value of 0.0 or -1.0 for del F ano is assumed to mean that the anomalous difference was not measured.

2. If sigmas are not supplied at all, then a value of 1.0000 will be assumed. This can affect which data are read in if you specify a minimum F/sigma > 0.0.

3. If a particular (h,k,l) is found more than once, only the first is used. This is because localscale uses neighboring reflections to scale each (h,k,l) and if it is found more than once there is no way to know which observations are really its neighbors in both time and position.
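
The three conventions above can be summarized in a small Python sketch. The function name, the row layout, and the use of None for "not measured" are assumptions made for illustration, not part of SOLVE.

```python
def clean_reflections(rows):
    """Apply the reading conventions to rows of (hkl, F, sigma, del_ano).

    sigma and del_ano may be None when the column is absent.
    Returns a dict hkl -> (F, sigma, del_ano), with None for unmeasured values.
    """
    cleaned = {}
    for hkl, f, sigma, del_ano in rows:
        if hkl in cleaned:
            continue                    # note 3: only the first occurrence is used
        if sigma is None:
            sigma = 1.0                 # note 2: missing sigma defaults to 1.0
        if f is not None and f <= 0:
            f = None                    # note 1: F of 0 or less means not measured
        if del_ano is not None and del_ano in (0.0, -1.0):
            del_ano = None              # note 1: 0.0 or -1.0 means not measured
        cleaned[hkl] = (f, sigma, del_ano)
    return cleaned
```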

MERGE

MERGE is a routine that merges measurements of structure factor amplitudes and rejects outliers. It summarizes the quality of the dataset in a listing of R-factors on I and on F.

Sample script file for MERGE

!-------------Script file for merging of native F from 2 data files------
@solve.setup              ! standard data for this dataset
nset 2                    ! number of input files to follow
infile(1) nat1.drg        ! input data file with F's unmerged
infile(2) nat2.drg        ! another input data file
ncolf_merge 1             ! get native F from column 1 in each data file
ncolsig_merge 2           ! get sigma from column 2
outfile native.mrg        ! output file with cols 1, 2 of "Favg" and sigma
merge
!-------------------------------------------------------------------------

The method followed by the program is:

1. group equivalent reflections together, analyze 1 group at a time.

2. get mean, sd for this group

3. reject observations differing from mean by >4 sigma

4. reject reflection outright if Chi-squared is greater than 20 and ikeepflag=0

5. calculate stats based on what's left

6. figure out the relationship between sigmas in the input files and reasonable estimates of the true sigmas by assuming that the reduced chi-square would equal 1.0 if the correct sigmas were present. The data are fit to the equation,

Sig**2(I) = Sig**2(Poisson) + (A*I)**2

and all sigmas are corrected with this factor.

7. write out mean, SEM for the reflection
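
The steps above can be sketched in Python for one group of equivalent observations. This is an illustration under stated assumptions, not the MERGE source: the function names are hypothetical, observations are removed one at a time starting with the worst, at least two observations are always retained, and "sigma" in the rejection test is taken to be the sigma of the individual observation.

```python
import math

def merge_group(obs, sigma_cut=4.0, chisq_max=20.0, keep_all=False):
    """Merge equivalent observations [(F, sigma), ...] of one reflection.

    Returns (mean, sem, n_rejected), or None if the reflection is rejected.
    """
    kept = list(obs)
    # Step 3: drop the worst observation while it lies beyond the cutoff
    # (iterative removal is an assumption of this sketch).
    while len(kept) > 2:
        mean = sum(f for f, _ in kept) / len(kept)
        worst = max(kept, key=lambda o: abs(o[0] - mean) / o[1])
        if abs(worst[0] - mean) <= sigma_cut * worst[1]:
            break
        kept.remove(worst)
    n = len(kept)
    mean = sum(f for f, _ in kept) / n
    if n == 1:
        return mean, kept[0][1], len(obs) - n   # nothing to average
    # Step 4: reject the whole reflection if the reduced chi-squared is too large.
    chisq = sum(((f - mean) / s) ** 2 for f, s in kept) / (n - 1)
    if chisq > chisq_max and not keep_all:
        return None
    # Step 5 and final step: statistics on what is left, reported as mean and SEM.
    sd = math.sqrt(sum((f - mean) ** 2 for f, _ in kept) / (n - 1))
    return mean, sd / math.sqrt(n), len(obs) - n

def corrected_sigma(sig_poisson, a, intensity):
    """Evaluate the sigma equation: Sig**2(I) = Sig**2(Poisson) + (A*I)**2."""
    return math.sqrt(sig_poisson ** 2 + (a * intensity) ** 2)
```

Here `keep_all=True` plays the role of KEEPALL or a non-zero ikeepflag. The fitting of A itself is not shown.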

Keywords for MERGE

NSHELLS n        number of shells of resolution used to group data (default=10)
NFILES n         # of input files (1 to 4)
INFILE(1) xx     input file 1
INFILE(2) xx     input file 2 (up to 4 files)
NCOLF_MERGE n    column number in input file for F (default = 1)
NCOLSIG_MERGE n  column number in input file for sigma of F (default=2)
KEEPALL          keep all reflections, regardless of merging chisqr
TOSSBAD          toss reflections with merging chisqr > 20 (default)
                 Note: KEEPALL and TOSSBAD also apply to LOCALSCALE
OUTFILE xx       output file with 2 columns (F,sig)
IKEEPFLAG 1      keep reflections even if large deviations from expected (default=0; reject them)

More on MERGE:

It is ASSUMED that columns 1,2 are your values of F and sigma. (If this is not true, you need to run FILEMERGE first to create such a file.) If your data are I and sigma of I, then run MATH with I_TO_F to convert from I to F.

The input data files do not need to have data in any particular order or to have complete datasets.

The data are written out starting with minimum H,K,L and incrementing L fastest, then K, then H.
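
As a small illustration of this ordering, sorting (h, k, l) tuples in Python gives the same sequence, with H varying slowest and L fastest. The function name is hypothetical.

```python
def output_order(reflections):
    """Return (h, k, l) indices sorted by H, then K, then L (L varies fastest)."""
    return sorted(reflections)
```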

The routine reports the number of rejects as NNN + MMM, where NNN is the number of observations rejected as being too far from the mean for that reflection and MMM is the number of reflections rejected completely with chisqr > 20.

Estimating completeness of a dataset

COMPLETE

COMPLETE is a routine to determine the completeness of a dataset. It maps input data to the asymmetric unit of the space group and calculates the percentage of data that is present.

Sample script file for COMPLETE:

!----------------Script to estimate completeness of a dataset ---------------
@solve.setup        !  standard information about this dataset
infile  data.drg    !  input dorgbn file with data to be examined
nnatf 1             !  column for F
nnats 2             !  column for sigma
ratmin 2.0          !  only use data with F/sigma > 2.0
complete            !  figure out completeness of this dataset
!-----------------------------------------------------------------------------
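
The percentage that COMPLETE reports can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the reflections have already been mapped to the asymmetric unit and that the set of expected unique reflections is supplied by the caller; deriving that set from the cell and space group is not shown.

```python
def completeness(observed, expected, ratmin=2.0):
    """Percentage of expected unique reflections present with F/sigma > ratmin.

    observed: dict (h, k, l) -> (F, sigma), already in the asymmetric unit.
    expected: set of unique (h, k, l) expected in the resolution range.
    """
    present = {hkl for hkl, (f, sigma) in observed.items()
               if hkl in expected and f > 0 and sigma > 0 and f / sigma > ratmin}
    return 100.0 * len(present) / len(expected)
```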