Armadillo repeat region of beta-catenin (data courtesy of Andy Huber and Bill Weis)
Summary of this structure solution:
This is a dataset with 4 wavelengths of MAD data, 17000 reflections to 2.7 A, 537 amino acids, and 15 selenium sites. SOLVE found 14 selenium sites in 2 hours on a DEC Alpha 500 MHz workstation. The remaining 2 sites (one selenium has 2 positions) are very weak and were not included by SOLVE.
Solve.setup file listing basic information about the beta-catenin crystals:
resolution 2.7 20
symfile /usr/local/lib/solve/c2221.sym
cell 64.1 102.0 187.0 90 90 90

Input script file used to run SOLVE on beta-catenin:
#!/bin/csh
#
# set CCP4 and SOLVETMPDIR variables:
#
setenv CCP4_OPEN UNKNOWN
setenv SOLVETMPDIR /var/tmp
setenv SYMOP /usr/local/lib/solve/symop.lib
setenv SYMINFO /usr/local/lib/solve/syminfo.lib
#
#
solve <<EOD > solve.log
!command file to read in raw MAD data, scale, analyze and solve it----
title armadillo repeat of beta catenin 4-wavelength MAD data
logfile mad.logfile        ! write out most information to this file.
                           ! summary info will be written to "solve.prt"
@solve.setup               ! get our standard information read in
readformatted              ! or: readdenzo, readtrek, readccp4_unmerged
unmerged                   ! or; premerged
mad_atom se
refscattfactors            ! do not refine scattering factors (you can if
                           ! you want though)
! Comment out next line if you don't know any sites
checksolve                 ! compare solutions to the one input below
! Comment out next lines if you don't know the structure
! native.fft is fft calculated from catenin_y.pdb (offset +0.5 in y)
comparisonfile native.fft
lambda 1                   ! info on wavelength #1 follows
label Wavelength # 1       ! a label for this wavelength
rawmadfile l1.int
wavelength 0.9000          ! wavelength value
fprimv_mad -1.6            ! f' value at this wavelength
fprprv_mad 3.4             ! f" value at this wavelength
! list of all SE positions in refined beta-catenin
! structure (offset by 0.5 in y from PDB file). Only if you know them
atomname se
xyz 0.2631041 0.6633824 2.8978506E-02
xyz 0.4166300 0.6113137 7.7325497E-03
xyz 0.4765674 0.7249608 2.5320712E-02
xyz 0.4591554 0.7427059 0.4719517
xyz 0.4083922 0.7455686 0.1403100
xyz 0.4416372 0.8393628 7.6342024E-02
xyz 0.1327285 0.4970000 0.4364171
xyz 9.6379094E-02 0.5855882 0.3802352
xyz 7.6066948E-02 0.6245000 0.3974865
xyz 0.1150683 0.7795883 0.3715025
xyz 0.1385160 0.7238529 0.4098982
xyz 9.2073016E-02 0.7063529 0.4022779
xyz 0.2152710 0.8265882 0.3764597
xyz 0.3304202 0.6161765 0.2311389
xyz 0.1806852 0.8512745 0.1618233
lambda 2
rawmadfile l2.int
wavelength 0.9794
fprimv_mad -11.44
fprprv_mad 8.74
lambda 3
rawmadfile l3.int
wavelength 0.9797
fprimv_mad -12.83
fprprv_mad 2.56
lambda 4
rawmadfile l4.int
wavelength 0.9897
fprimv_mad -2.42
fprprv_mad 1.13
nres 700                   [approx # of residues in protein molecule]
nanomalous 15              [approx # of anomalously scattering atoms per protein]
acceptance 0.10
SCALE_MAD                  ! read in and localscale the data
ANALYZE_MAD                ! run MADMRG and MADBST and analyze all the Pattersons
SOLVE                      ! Solve the structure
EOD

Summary information from the "solve.prt" output file produced after completion of the automated structure determination of beta-catenin
Correlation of anomalous differences. These indicate that the data all the way to about 2.7 A are contributing to the phasing, as the correlation for wavelengths 1 vs 2 is >0.3 in every resolution shell.
CORRELATION FOR
WAVELENGTH PAIRS
DMIN 1 VS 2 1 VS 3 1 VS 4 2 VS 3 2 VS 4 3 VS 4
5.40 0.84 0.66 0.42 0.79 0.41 0.35
4.05 0.75 0.53 0.36 0.69 0.35 0.33
3.78 0.65 0.43 0.21 0.60 0.23 0.19
3.58 0.67 0.38 0.24 0.58 0.27 0.22
3.38 0.56 0.31 0.19 0.50 0.19 0.17
3.24 0.53 0.28 0.12 0.40 0.14 0.14
3.11 0.48 0.21 0.14 0.36 0.18 0.16
2.97 0.44 0.25 0.11 0.38 0.18 0.11
2.84 0.41 0.21 0.08 0.32 0.13 0.06
2.70 0.33 0.11 0.10 0.25 0.13 0.11
ALL 0.63 0.37 0.22 0.52 0.24 0.19
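The reading of this table can be sketched in a few lines of Python. This is only an illustration, not part of SOLVE: the values are copied from the table above, the 0.3 cutoff is the one quoted in the text, and the helper name resolution_limit is made up for this example. For each wavelength pair it walks the shells from low to high resolution and reports the last DMIN reached before the correlation first falls below the cutoff.

```python
# Correlation of anomalous differences per resolution shell (copied from the table above).
PAIRS = ["1 vs 2", "1 vs 3", "1 vs 4", "2 vs 3", "2 vs 4", "3 vs 4"]
SHELLS = [
    (5.40, [0.84, 0.66, 0.42, 0.79, 0.41, 0.35]),
    (4.05, [0.75, 0.53, 0.36, 0.69, 0.35, 0.33]),
    (3.78, [0.65, 0.43, 0.21, 0.60, 0.23, 0.19]),
    (3.58, [0.67, 0.38, 0.24, 0.58, 0.27, 0.22]),
    (3.38, [0.56, 0.31, 0.19, 0.50, 0.19, 0.17]),
    (3.24, [0.53, 0.28, 0.12, 0.40, 0.14, 0.14]),
    (3.11, [0.48, 0.21, 0.14, 0.36, 0.18, 0.16]),
    (2.97, [0.44, 0.25, 0.11, 0.38, 0.18, 0.11]),
    (2.84, [0.41, 0.21, 0.08, 0.32, 0.13, 0.06]),
    (2.70, [0.33, 0.11, 0.10, 0.25, 0.13, 0.11]),
]


def resolution_limit(pair_index, threshold=0.3):
    """Return the last DMIN reached before the correlation first drops below threshold."""
    limit = None
    for dmin, corrs in SHELLS:  # shells are ordered from low to high resolution
        if corrs[pair_index] < threshold:
            break
        limit = dmin
    return limit


limits = {pair: resolution_limit(i) for i, pair in enumerate(PAIRS)}
for pair, limit in limits.items():
    print(f"{pair}: correlation stays at or above 0.3 out to {limit} A")
```

Only the 1 vs 2 pair stays above the cutoff all the way to 2.7 A; the pairs involving wavelength 4 drop below it much earlier.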
List of sites analyzed for compatibility with difference Patterson
PEAK      X       Y       Z     OPTIMIZED RELATIVE OCCUPANCY
  1     0.833   0.115   0.231     66.078
  2     0.424   0.125   0.102     60.406
  3     0.944   0.337   0.076     55.988
  4     0.368   0.997   0.062     56.693
  5     0.681   0.354   0.162     49.508
  6     0.403   0.083   0.118     54.275
  7     0.049   0.243   0.028     41.397
  8     0.972   0.226   0.025     52.006
  9     0.389   0.281   0.127     42.309
 10     0.292   0.326   0.125     31.510
 11     0.410   0.212   0.095     27.341
 12     0.361   0.226   0.090     39.093
 13     0.910   0.250   0.144     22.225
 14     0.889   0.111   0.012     24.421

Evaluation of this test soln with 14 sites after optimizing occupancy of each site

Cross-vectors for sites 1 and 1 (excluding origin; 1000 = 1 sigma):
  #      U        V        W       HEIGHT    PRED HEIGHT   SYMM#
  1   -1.667   -0.229    0.500    7041.00     8732.56        2
  2   -1.667    0.000    0.037    6518.80     8732.56        2
  3    0.000   -0.229   -0.463    6185.17     8732.56        2

Cross-vectors for sites 2 and 1 (excluding origin; 1000 = 1 sigma):
  #      U        V        W       HEIGHT    PRED HEIGHT   SYMM#
  1   -0.410    0.010   -0.130    6028.85     3991.49        1
  2   -1.257   -0.240    0.370    6627.21     3991.49        1
  3   -1.257    0.010    0.167    7920.79     3991.49        1
  4   -0.410   -0.240   -0.333    5973.46     3991.49        1

Cross-vectors for sites 2 and 2 (excluding origin; 1000 = 1 sigma):
  #      U        V        W       HEIGHT    PRED HEIGHT   SYMM#
  1   -0.847    0.000    0.296    4925.09     7297.73        2
  2    0.000   -0.250   -0.204    5302.47     7297.73        2
(etc. for many many more cross-vectors)
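The "sites 1 and 1" vectors above can be reproduced by hand, which is a useful check on how the list is built. The sketch below is an illustration, not SOLVE's own code. It assumes the four general-position operators of C222(1) in the standard setting (these are not given in this document) and subtracts site 1 from each of its symmetry mates. The names OPS, apply_op and self_vectors are made up for this example.

```python
# Symmetry operators of C222(1) in the standard setting, written as
# (sign per axis, translation per axis). ASSUMPTION: taken from the usual
# tabulation of this space group, not from the SOLVE output. The C-centering
# translation is left out because it only adds lattice-equivalent vectors.
OPS = [
    ((1, 1, 1), (0.0, 0.0, 0.0)),     # x, y, z
    ((-1, -1, 1), (0.0, 0.0, 0.5)),   # -x, -y, z+1/2
    ((-1, 1, -1), (0.0, 0.0, 0.5)),   # -x, y, -z+1/2
    ((1, -1, -1), (0.0, 0.0, 0.0)),   # x, -y, -z
]


def apply_op(op, site):
    """Apply one symmetry operator to fractional coordinates."""
    signs, shifts = op
    return tuple(s * c + t for s, c, t in zip(signs, site, shifts))


def self_vectors(site):
    """Vectors from a site to each of its symmetry mates (identity excluded)."""
    vectors = []
    for op in OPS[1:]:
        mate = apply_op(op, site)
        vectors.append(tuple(round(m - c, 3) for m, c in zip(mate, site)))
    return vectors


site1 = (0.833, 0.115, 0.231)  # peak 1 from the list of sites above
vectors = self_vectors(site1)
for u, v, w in vectors:
    print(f"{u:8.3f} {v:8.3f} {w:8.3f}")
```

Under these assumptions the three vectors agree with the U, V, W values listed for sites 1 and 1 to within 0.001, the rounding of the printed coordinates.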
Selenium atom occupancy, coordinates, and thermal factors, and Cross-validation fouriers calculated with all heavy atoms in all derivs except the site being evaluated and any sites equivalent to it.
(Peak height is height of peak at this position/rms of map)
Site     x       y       z      occ       B     -- PEAK HEIGHT --
  1    0.830   0.116   0.231   0.691   38.214       30.54
  2    0.422   0.124   0.103   0.671   44.433       25.98
  3    0.943   0.338   0.076   0.706   42.818       25.65
  4    0.367   0.996   0.063   0.527   15.000       24.49
  5    0.679   0.353   0.162   0.641   60.000       20.66
  6    0.406   0.084   0.119   0.574   32.160       22.10
  7    0.045   0.243   0.028   0.539   48.318       17.88
  8    0.970   0.225   0.026   0.764   60.000       18.08
  9    0.386   0.281   0.128   0.306   15.000       15.63
 10    0.289   0.326   0.125   0.303   33.767       12.07
 11    0.409   0.211   0.095   0.310   36.022       10.80
 12    0.362   0.225   0.091   0.192   15.000       12.00
 13    0.910   0.250   0.145   0.289   31.524        8.22
 14    0.891   0.110   0.011   0.456   60.000        8.36

Re-refinement of f' and f" values:
Final refined values of f-prime and f"
Wavelength   ------- f-prime --------     --------f"--------------
             last refinement  Refined     last refinement  Refined
    1            -2.206       -2.206           5.365        4.357
    2           -10.957      -11.069          11.971        7.525
    3           -12.631      -12.740           3.032        2.000
    4            -2.714       -2.507           1.232        0.563
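The dispersive signal between two wavelengths scales with the difference in their f' values, so the refined values above already show which wavelength pairs should carry most of the dispersive information. The sketch below is only an illustration: it takes the "Refined" f-prime column as input and lists the absolute differences for every pair. The name dispersive_differences is made up for this example.

```python
from itertools import combinations

# Refined f' per wavelength, copied from the "Refined" f-prime column above.
F_PRIME = {1: -2.206, 2: -11.069, 3: -12.740, 4: -2.507}


def dispersive_differences(f_prime):
    """Absolute f' difference for every pair of wavelengths."""
    return {
        (a, b): abs(f_prime[a] - f_prime[b])
        for a, b in combinations(sorted(f_prime), 2)
    }


diffs = dispersive_differences(F_PRIME)
for (a, b), delta in sorted(diffs.items(), key=lambda kv: -kv[1]):
    print(f"L{a} vs L{b}: |delta f'| = {delta:.3f}")
```

Wavelengths 1 and 4 have nearly the same f', so that pair contributes almost no dispersive signal, while pairs combining wavelength 1 or 4 with wavelength 2 or 3 give differences of roughly 8.5 to 10.5 electrons.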
Figure of merit versus resolution, and anomalous and dispersive FH/E versus resolution
FIGURE OF MERIT WITH RESOLUTION
DMIN:             TOTAL   9.09   5.96   4.72   4.03   3.57   3.24   2.99   2.79
N:                17155    946   1466   1815   2122   2386   2623   2798   2999
MEAN FIG MERIT:    0.55   0.70   0.74   0.67   0.59   0.56   0.52   0.47   0.41

RMS ANOMALOUS FH/E [f" PART OF FH / RMS ANO ERROR]:
LAMBDA: 1           0.6    1.1    1.2    0.9    0.7    0.6    0.5    0.4    0.4
LAMBDA: 2           1.0    1.2    1.3    1.2    1.1    1.0    0.9    0.8    0.6
LAMBDA: 3           0.3    0.4    0.6    0.4    0.3    0.3    0.3    0.2    0.2
LAMBDA: 4           0.1    0.1    0.2    0.1    0.1    0.1    0.1    0.1    0.1

RMS DISPERSIVE FH/E [Delta-f-prime PART OF FH / RMS DISPERSIVE ERROR]:
L1 VS L2:           1.0    1.4    1.5    1.3    1.1    1.0    0.8    0.7    0.6
L1 VS L3:           1.1    1.5    1.6    1.4    1.2    1.1    0.9    0.8    0.7
L1 VS L4:           0.0    0.1    0.1    0.1    0.1    0.0    0.0    0.0    0.0
L2 VS L3:           0.3    0.6    0.5    0.4    0.4    0.3    0.2    0.2    0.1
L2 VS L4:           1.0    1.3    1.5    1.3    1.1    1.0    0.9    0.8    0.6
L3 VS L4:           1.2    1.4    1.7    1.4    1.2    1.1    1.0    0.9    0.8

The summary of scoring for this solution
Summary of scoring for this solution:
                              -- over many solutions--    -- this solution --
Criteria                          MEAN          SD          VALUE     Z-SCORE
Pattersons:                       4.89         0.745         14.2      12.5
Cross-validation Fourier:         21.7         4.61          185.      35.5
NatFourier CCx100:                11.0         5.29          58.9      9.05
Mean figure of meritx100:         0.000E+00    5.00          69.3      13.9
Correction for Z-scores:                                              -12.1
Overall Z-score value:                                                 58.8

Note that the Z-score for this solution (59) is much higher than for the gene 5 protein example, even though the phasing is about the same. This is because SOLVE scoring gets higher for datasets with more sites.
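The numbers in the scoring table are consistent with each Z-score being (VALUE - MEAN) / SD and with the overall value being the sum of the four Z-scores plus the correction. That relationship is inferred here from the printed numbers, not taken from the SOLVE source, so the sketch below should be read as a consistency check on the table. The names CRITERIA, CORRECTION and z_score are made up for this example.

```python
# (mean, sd, value) for each criterion, copied from the scoring table above.
CRITERIA = {
    "Pattersons": (4.89, 0.745, 14.2),
    "Cross-validation Fourier": (21.7, 4.61, 185.0),
    "NatFourier CCx100": (11.0, 5.29, 58.9),
    "Mean figure of merit x100": (0.0, 5.00, 69.3),
}
CORRECTION = -12.1  # "Correction for Z-scores" line of the table


def z_score(mean, sd, value):
    """Number of standard deviations by which value exceeds the mean."""
    return (value - mean) / sd


z_scores = {name: z_score(*row) for name, row in CRITERIA.items()}
# ASSUMPTION: the overall score is the plain sum of Z-scores plus the correction.
overall = sum(z_scores.values()) + CORRECTION

for name, z in z_scores.items():
    print(f"{name:28s} Z = {z:6.2f}")
print(f"{'Overall':28s} Z = {overall:6.2f}")
```

The recomputed values match the printed Z-SCORE column to within the rounding of the three-digit inputs, and the total lands within about 0.1 of the reported 58.8.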
The end of the solve.status file:
***************************************************************************
SOLVE STATUS 07-oct-00 11:46:31
DATASET TITLE: armadillo repeat of beta catenin 4-wavelength MAD data
TIME ELAPSED: 2 HR
---------------------------------------------------------------------------
CURRENT STEP:SOLVE MAIN PROGRAM
STATUS: DONE
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---TOP SOLUTION FOUND BY SOLVE ( <m> = 0.69; score = 58.80) ---
X Y Z OCCUP B HEIGHT/SIGMA
2 0.830 0.116 0.231 0.691 38.2 30.5
2 0.422 0.124 0.103 0.671 44.4 26.0
2 0.943 0.338 0.076 0.706 42.8 25.7
2 0.367 0.996 0.063 0.527 15.0 24.5
2 0.679 0.353 0.162 0.641 60.0 20.7
2 0.406 0.084 0.119 0.574 32.2 22.1
2 0.045 0.243 0.028 0.539 48.3 17.9
2 0.970 0.225 0.026 0.764 60.0 18.1
2 0.386 0.281 0.128 0.306 15.0 15.6
2 0.289 0.326 0.125 0.303 33.8 12.1
2 0.409 0.211 0.095 0.310 36.0 10.8
2 0.362 0.225 0.091 0.192 15.0 12.0
2 0.910 0.250 0.145 0.289 31.5 8.2
2 0.891 0.110 0.011 0.456 60.0 8.4
TIME REQUIRED TO OBTAIN THIS SOLUTION: 38 MIN
---------------------------------------------------------------------------
CURRENT RESOLUTION: 2.7 A. FINAL RESOLUTION: 2.7 A.