Local scaling and merging of data
LOCALSCALE
LOCALSCALE is a routine to scale a "derivative" dataset to a "native" dataset using local scaling. In this method the scale factor for a particular reflection is based on the ratio of derivative:native for reflections surrounding this reflection. This method is useful because the scale factor is not restricted to any particular function of position in reciprocal space.
In this implementation, at least 30 reflections surrounding the reflection to be scaled are used to obtain a scale factor. Additionally, the reflections used in obtaining a scale factor are always chosen so that they form a complete sphere around the reflection of interest (inasmuch as possible). Initial Wilson scaling is carried out before local scaling.
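The idea of a local scale factor can be sketched in a few lines of Python. This is only an illustration, not the code used by LOCALSCALE: the function name `local_scale`, the use of distance in (h,k,l) index space as a stand-in for a sphere in reciprocal space, and the choice of sum(Fnat)/sum(Fder) as the scale are all assumptions made for the example.

```python
def local_scale(target, reflections, min_neighbors=30):
    """Estimate a local scale factor for the derivative F at `target`.

    reflections: list of (h, k, l, f_nat, f_der) tuples with both F measured.
    The neighbors are the `min_neighbors` reflections closest to `target`
    in index space (an assumed approximation to a sphere around it).
    """
    th, tk, tl = target
    # Rank all other reflections by squared index distance from the target.
    ranked = sorted(
        (r for r in reflections if (r[0], r[1], r[2]) != target),
        key=lambda r: (r[0] - th) ** 2 + (r[1] - tk) ** 2 + (r[2] - tl) ** 2,
    )
    neighbors = ranked[:min_neighbors]
    if len(neighbors) < min_neighbors:
        raise ValueError("not enough neighboring reflections")
    sum_nat = sum(r[3] for r in neighbors)
    sum_der = sum(r[4] for r in neighbors)
    # Assumed form of the scale: multiply derivative F by sum(Fnat)/sum(Fder).
    return sum_nat / sum_der
```

Because the scale is computed separately for every reflection from its own neighborhood, it can vary freely across reciprocal space rather than following a fixed functional form.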
Data files: The program expects to read in two data files: one for the native dataset and one for the derivative. The two files may in fact be the same if desired. The native dataset is expected to have h,k,l, F and sigma (at least). The derivative dataset is expected to have h,k,l, F, and sigma, and, if desired, del F ano and sigma of del F ano. The scale factor obtained for the derivative F is applied to all of the derivative data.
A dorgbn-style file is written out containing the scaled derivative data. If you wish to have the derivative and native data in the same file, then follow this with the routine "FILEMERGE" and merge the two files. The output data file is NOT mapped to the asymmetric unit. Ordinarily you will want to follow LOCALSCALE with MERGE to merge the symmetry-related reflections and map everything to the asymmetric unit. You may need to run MERGE on your native data as well, to map it to the asymmetric unit.
Sample script to localscale der.drg to nat.drg:
!---------------Script for localscaling of derivative F to native F -----
@solve.setup            ! standard parameters for this dataset
infile nat.drg          ! native in infile
nnatf 1                 ! column for native F
nnats 2                 ! column for native sigma
infile(2) der.drg       ! derivative in infile(2)
nderf 1                 ! column for deriv F
nders 2                 ! column for deriv sigma
outfile der.scl         ! output file
localscale              ! do local scaling
!--------------------------------------------------------------------------
Keywords for LOCALSCALE:
NSHELLS n        number of shells of resolution used to group data (default=10)
INFILE(1) xx     file with Native which is not further scaled
INFILE(2) xx     file with Derivative data to scale to Native
OUTFILE xx       output file with scaled derivative data
NNATF n          column # for F of native data
NNATS n          column # of sigma of F of native data
NDERF n          column # for F of deriv data
NDERS n          column # for sigma of F of deriv data
NANOF n          column # of anomalous difference (Fplus-Fminus) of deriv data
NANOS n          column # of sigma of anomalous difference
                 note: be sure to set those columns you don't want to 0
FILETITLE        optional title for output file
KEEPALL          keep reflections even with high differences
TOSSBAD          (default) Toss reflections if differences between native and
                 derivative are more than 3 * the rms found for other reflections.
                 Note: KEEPALL and TOSSBAD apply to MERGE, LOCALSCALE, SCALE_MAD,
                 SCALE_MIR, SCALE_NATIVE. This is the place to reject derivative
                 reflections with very large del F if you want to reject them at all.
ANCUT            minimum # of reflections to use to scale a reflection (30.)
RATMIN           minimum ratio of F/sigma to include (default=2)
NOBFACTOR        if specified, do not apply overall Wilson scaling before doing
                 local scaling. Generally used only along with DAMPING=0.
BFACTOR          undoes NOBFACTOR. Do apply Wilson scaling before local scaling
DAMPING xx       scale factor (after Wilson scaling) is damped by taking it to the
                 power xx. Generally used with NOBFACTOR and a value of 0 to not
                 do any scaling at all.
NODAMPING        undoes DAMPING by resetting damping factor to 1.0
OVERALLSCALE     just get 1 scale factor for the whole dataset. No local scaling,
                 no Wilson scaling. Same as NOBFACTOR + DAMPING 0.0
NOOVERALLSCALE   undoes OVERALLSCALE. Same as BFACTOR + DAMPING 1.0
RATIO_OUT 3.0    Reject reflections with iso or ano diff > ratio_out*rms diff in shell
REQUIRE_NAT      If native is missing, toss derivative too
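The effect of DAMPING and RATIO_OUT can be illustrated with a short Python sketch. The function names are hypothetical, and the rms here is taken over all differences passed in (including the one being tested), which may differ from how the program computes the rms within a resolution shell.

```python
def damped_scale(scale, damping=1.0):
    """Raise a local scale factor to the power `damping`.

    damping=1.0 leaves the scale unchanged; damping=0.0 gives 1.0,
    which amounts to applying no local scaling at all.
    """
    return scale ** damping


def reject_outliers(diffs, ratio_out=3.0):
    """Return indices of differences larger than ratio_out * rms(diffs)."""
    rms = (sum(d * d for d in diffs) / len(diffs)) ** 0.5
    return [i for i, d in enumerate(diffs) if abs(d) > ratio_out * rms]
```

With a damping value between 0 and 1 the scale factor is pulled toward 1.0, which is why a value of 0 switches local scaling off entirely.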
1. A value of 0 or less for fnat or fder is assumed to mean that the data are not measured. A value of 0.0 or -1.0 for del F ano is likewise assumed to mean that the data are not measured.
2. If sigmas are not supplied at all, then a value of 1.0000 will be assumed. This can affect what data are read in if you specify a minimum F/sig >0.0
3. If a particular (h,k,l) is found more than once, only the first is used. This is because localscale uses neighboring reflections to scale each (h,k,l) and if it is found more than once there is no way to know which observations are really its neighbors in both time and position.
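The three notes above can be read as input-filtering rules. The following Python sketch shows one way they might fit together. The function name `clean_reflections`, the record layout, and the order in which the rules are applied are assumptions for illustration only.

```python
def clean_reflections(records, ratmin=2.0):
    """Filter raw records following the notes above.

    records: iterable of (h, k, l, f, sigma) where sigma may be None.
    Returns a dict keyed by (h, k, l) holding (f, sigma).
    """
    kept = {}
    for h, k, l, f, sigma in records:
        if f <= 0.0:
            continue  # F of 0 or less: treated as not measured
        if sigma is None:
            sigma = 1.0  # no sigma supplied: a value of 1.0 is assumed
        if f < ratmin * sigma:
            continue  # below the minimum F/sigma cutoff
        key = (h, k, l)
        if key not in kept:
            kept[key] = (f, sigma)  # only the first occurrence is used
    return kept
```

Note how the assumed sigma of 1.0 interacts with the F/sigma cutoff: a reflection with F = 1.0 and no sigma would be dropped at ratmin = 2.0.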
MERGE
MERGE is a routine that merges measurements of structure factor amplitudes and rejects outliers. It summarizes the quality of the dataset in a listing of R-factors on I and on F.
Sample script file for MERGE:
!-------------Script file for merging of native F from 2 data files------
@solve.setup            ! standard data for this dataset
nset 2                  ! number of input files to follow
infile(1) nat1.drg      ! input data file with F's unmerged
infile(2) nat2.drg      ! another input data file
ncolf_merge 1           ! get native F from column 1 in each data file
ncolsig_merge 2         ! get sigma from column 2
outfile native.mrg      ! output file with cols 1, 2 of "Favg" and sigma
merge
!-------------------------------------------------------------------------
The method followed by the program is:
1. group equivalent reflections together, analyze 1 group at a time.
2. get mean, sd for this group
3. reject observations differing from mean by >4 sigma
4. reject reflection outright if Chi-squared is greater than 20 and ikeepflag=0
5. calculate stats based on what's left
6. figure out the relationship between sigmas in the input files and reasonable estimates of the true sigmas by assuming that the reduced chi-square would equal 1.0 if the correct sigmas were present. The data are fit to the equation,
Sig**2(I) = Sig**2(Poisson) + (A*I)**2
and all sigmas are corrected with this factor.
7. write out mean, SEM for the reflection
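The steps above can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the MERGE source: the mean is unweighted, the "4 sigma" test uses each observation's own input sigma, a single surviving observation keeps its input sigma as the error estimate, all sigmas are positive, and the fit that determines A is not shown. All function names are hypothetical.

```python
import math
from collections import defaultdict


def merge_group(obs, nsigma=4.0, chi_max=20.0, ikeepflag=0):
    """Merge one group of equivalent observations.

    obs: list of (f, sigma) pairs with sigma > 0.
    Returns (mean, sem, n_rejected), or None if the reflection is rejected.
    """
    mean = sum(f for f, _ in obs) / len(obs)
    # Reject observations more than nsigma * their own sigma from the mean.
    kept = [(f, s) for f, s in obs if abs(f - mean) <= nsigma * s]
    n_rejected = len(obs) - len(kept)
    if not kept:
        return None
    mean = sum(f for f, _ in kept) / len(kept)
    if len(kept) == 1:
        return mean, kept[0][1], n_rejected  # fall back to the input sigma
    dof = len(kept) - 1
    chi2 = sum(((f - mean) / s) ** 2 for f, s in kept) / dof
    if chi2 > chi_max and ikeepflag == 0:
        return None  # reject the reflection outright
    var = sum((f - mean) ** 2 for f, _ in kept) / dof
    return mean, math.sqrt(var / len(kept)), n_rejected


def merge(observations):
    """Group (hkl, f, sigma) observations by hkl and merge each group."""
    groups = defaultdict(list)
    for hkl, f, sigma in observations:
        groups[hkl].append((f, sigma))
    return {hkl: merge_group(obs) for hkl, obs in groups.items()}


def corrected_sigma(sig_poisson, intensity, a):
    """Sig**2(I) = Sig**2(Poisson) + (A*I)**2, solved for Sig(I)."""
    return math.sqrt(sig_poisson ** 2 + (a * intensity) ** 2)
```

In this sketch the observations are assumed to be indexed already so that equivalent reflections share the same hkl key. Mapping symmetry-related reflections to a common index is a separate step.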
Keywords for MERGE:
NSHELLS n          number of shells of resolution used to group data (default=10)
NFILES n           # of input files (1 to 4)
INFILE(1) xx       input file 1
INFILE(2) xx       input file 2 (up to 4 files)
NCOLF_MERGE n      column number in input file for F (default = 1)
NCOLSIG_MERGE n    column number in input file for sigma of F (default = 2)
KEEPALL            keep all reflections, regardless of merging chisqr
TOSSBAD            toss reflections with merging chisqr > 20 (default)
                   Note: KEEPALL and TOSSBAD also apply to LOCALSCALE
OUTFILE xx         output file with 2 columns (F,sig)
IKEEPFLAG 1        Keep reflections even if large deviations from expected
                   (default=0; reject them)
It is ASSUMED that columns 1,2 are your values of F and sigma. (If this is not true, you need to run FILEMERGE first to create such a file). If your data are I and sigma of I, then run MATH with I_TO_F to convert from I to F.
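For orientation, a conversion from I to F is commonly written as F = sqrt(I) with sigma(F) = sigma(I) / (2 F), which follows from first-order error propagation. The Python sketch below shows that textbook form only. It is not a description of what MATH with I_TO_F actually does, and the treatment of non-positive intensities as unmeasured is an assumption.

```python
import math


def i_to_f(intensity, sigma_i):
    """Convert I, sigma(I) to F, sigma(F) by first-order error propagation."""
    if intensity <= 0.0:
        return None  # assumption: non-positive I is treated as unmeasured
    f = math.sqrt(intensity)
    sigma_f = sigma_i / (2.0 * f)
    return f, sigma_f
```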
The input data files do not need to have data in any particular order or to have complete datasets.
The data are written out starting with minimum H,K,L and incrementing L fastest, then K, then H.
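The output ordering described above corresponds to an ordinary lexicographic sort on (H, K, L). A minimal Python sketch, assuming each record is a tuple beginning with h, k, l (the record layout and function name are hypothetical):

```python
def output_order(records):
    """Sort records so that L varies fastest, then K, then H."""
    return sorted(records, key=lambda r: (r[0], r[1], r[2]))
```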
The routine reports the number of rejects as NNN + MMM where NNN = the number rejected as being too far from the mean for that reflection and MMM is the number of reflections rejected completely with chisqr > 20.
Estimating completeness of a dataset
COMPLETE is a routine to determine the completeness of a dataset. It maps input data to the asymmetric unit of the space group and calculates the percentage of data that is present.
Sample script file for COMPLETE:
!----------------Script to estimate completeness of a dataset ---------------
@solve.setup            ! standard information about this dataset
infile data.drg         ! input dorgbn file with data to be examined
nnatf 1                 ! column for F
nnats 2                 ! column for sigma
ratmin 2.0              ! only use data with F/sigma > 2.0
complete                ! figure out completeness of this dataset
!-----------------------------------------------------------------------------
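The completeness figure itself is a simple ratio. The Python sketch below assumes that both the observed reflections and the list of expected unique reflections have already been mapped to the asymmetric unit. Generating the expected list from the cell, space group and resolution limits is not shown, and the function name is hypothetical.

```python
def completeness(observed, expected):
    """Percentage of expected unique reflections that are present.

    observed, expected: iterables of (h, k, l) tuples already mapped
    to the asymmetric unit. Duplicates in `observed` count once.
    """
    expected = set(expected)
    if not expected:
        raise ValueError("no expected reflections")
    present = expected & set(observed)
    return 100.0 * len(present) / len(expected)
```

Observed reflections that fall outside the expected list (for example, beyond the resolution limit) are ignored rather than counted.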