Overview of SnB Operation

  1. Initiate the SnB program:

    Run SnB from the directory containing the reflection file for the structure you wish to consider.

  2. Enter general information:

    On the General Information page, enter the fundamental information that is requested about the structure. This information includes most of the parameters (e.g., cell constants, space group) for which it is difficult to supply default values. At this time, it is also necessary to specify the type of data to be used. A single file of high-resolution (1.1-1.2 Angstroms or better) intensity data with Bijvoet-related reflections merged is required to look for a complete structure. This type of data is referred to as Basic data. SnB can also be applied to isomorphous substructures or anomalously scattering substructures using SIR or SAS data, respectively, at 3-4 Angstroms resolution. SIR data consist of reflection files for a pair of isomorphous structures. SAS data consist of a single reflection file with Bijvoet-related reflections unmerged.

  3. Normalize the data:

    Before direct methods can be applied to a data set, normalized structure-factor magnitudes (|E|s) or difference magnitudes (in the case of SIR or SAS substructure data) must be computed from the usual structure-factor magnitudes (|F|s). The SnB GUI provides convenient access to the DREAR suite of data reduction and error analysis routines in order to compute these quantities. Simply use the Create Es page to execute DREAR and generate an SnB input reflection file. DREAR can accept SCALEPACK and d*TREK output files as well as a simple free-format ASCII file consisting of the fields H, K, L, |F| & Sig(|F|) separated by one or more spaces.

    If you have |E| values that were computed by some other program, you can supply them to the main SnB program in the form of a free-format ASCII file containing H, K, L, |F|, Sig(|F|), |E| & Sig(|E|).
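    The free-format layout described above (H, K, L, |F| and Sig(|F|) separated by one or more spaces) is easy to check before handing a file to DREAR. The following is a minimal Python sketch, not part of SnB or DREAR; the names Reflection and parse_reflections and the sample values are hypothetical and only illustrate the field order.

```python
from typing import List, NamedTuple


class Reflection(NamedTuple):
    """One record of the free-format file: H, K, L, |F|, Sig(|F|)."""
    h: int
    k: int
    l: int
    f: float
    sig_f: float


def parse_reflections(text: str) -> List[Reflection]:
    """Parse whitespace-separated H K L |F| Sig(|F|) records.

    Blank lines are skipped; a line with fewer than five fields is an error.
    """
    reflections = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # splits on one or more spaces
        if not fields:
            continue
        if len(fields) < 5:
            raise ValueError(f"line {line_number}: expected 5 fields, got {len(fields)}")
        h, k, l = (int(value) for value in fields[:3])
        f, sig_f = float(fields[3]), float(fields[4])
        reflections.append(Reflection(h, k, l, f, sig_f))
    return reflections


# Illustrative (made-up) records with irregular spacing.
sample = "1 0 0  125.3  2.1\n  0   2 -1   48.70    1.35\n"
reflections = parse_reflections(sample)
for r in reflections:
    print(r.h, r.k, r.l, r.f, r.sig_f)
```

    A file that already carries |E| values would have two further fields, |E| and Sig(|E|), after the five read here.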

  4. Check remaining parameters:

    Once the General Information page has been completed and the data normalized by DREAR, the program will supply default values for all of the remaining parameters. It is recommended that the user explore the other pages and decide whether any parameters need to be modified. However, it is possible to proceed directly to the Submit Jobs page and initiate a batch job (see #7) to process trial structures (default: 1000 trials) after choosing an appropriate name for the set of output files (see #5).

  5. Specify output files:

    Choose a filename prefix (e.g., "structure-name") to be used for the output information (see the Submit Jobs page). Output file names are of the form prefix_#.SnB_output. SnB can be run in multiprocessor mode, and the symbol # stands for the digit(s) used to denote the processor number. If only a single processor is being used, the processor number will be 0. A description of the available output files is given elsewhere in this document.

  6. Save the current parameters:

    At any time, you can save the information contained on the screens for future use by clicking on the Save As button. The screens should always be saved after all the information necessary to run a Shake-and-Bake job has been entered. The screen information is stored in a so-called "configuration" file. Use of a filename such as "structure-name.config" or "job#.config" is recommended. Save can be used later if you want to update an existing file with a modified set of parameter values. Open can be used to restore a previously saved set of values.

  7. Submit a Shake-and-Bake job:

    Once you are satisfied with the parameter settings and other values that are entered, execute the main phasing program in batch mode by clicking the Process Jobs button on the Submit Jobs page. The trial structures will then be processed according to the dual-space Shake-and-Bake phasing protocol. If you expect the processing time to be long, you can now log out and return later to check your results.

  8. Check a submitted job for possible solutions using the Rmin histogram:

    Trial structures with relatively low values of Rmin (the minimal function) are most likely to be solutions. To see a histogram of the final Rmin values, go to the Evaluate Trials page and choose a previously submitted job to be reviewed (click Update List, select the desired result files, and then click on the View Histogram button). A clear bimodal distribution of Rmin values is a strong indication that a solution has been found. However, users are hereby warned that solutions are sometimes present even when the histogram appears to be unimodal. R(true) gives some indication of what Rmin values to expect for solutions, and R(random) indicates values expected for completely random phase sets.
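    To make the idea of a bimodal Rmin distribution concrete, here is a minimal Python sketch that bins a list of final Rmin values and prints a text histogram. It is not the SnB histogram code; the function name histogram, the bin count and the Rmin values are assumptions chosen for illustration.

```python
from typing import List, Sequence


def histogram(values: Sequence[float], n_bins: int = 5) -> List[int]:
    """Count values in n_bins equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width when all values are equal
    counts = [0] * n_bins
    for value in values:
        # The largest value would fall just past the last bin, so clamp it.
        index = min(int((value - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Made-up final Rmin values: three low trials and a larger high-Rmin group.
rmin_values = [0.30, 0.31, 0.32, 0.47, 0.48, 0.51, 0.52, 0.53, 0.54, 0.55]
counts = histogram(rmin_values, n_bins=5)
for bin_index, count in enumerate(counts):
    print(f"bin {bin_index}: {'*' * count}")
```

    In this made-up example the low-Rmin trials sit in a bin of their own, separated from the rest by empty bins, which is the kind of separation the View Histogram display is meant to reveal.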

  9. Check other figures of merit:

    Confirmation that a solution has likely been obtained should be sought by checking for consistency with other figures of merit that are stored, along with Rmin, in an output file called the "trace" file. The View Sorted Trials option shows the "trace" file sorted in increasing order according to Rmin values. A crystallographic R value based on |E| values and the Eobs/Ecalc correlation coefficient, CC [Fujinaga, M. & Read, R.J. (1987). J. Appl. Cryst. 20, 517-521], are available as additional figures of merit. True solutions should always have a correlation between Rmin and the crystallographic R value. Ideally, there should be a bimodal distribution of the crystallographic R as well as Rmin.

    If the best trials have been subjected to Fourier refinement using a large fraction of the data, a correlation coefficient of 0.7 or more is a very strong indication that a solution has been obtained. As it is implemented in SnB, CC is often not reliable for isomorphous or anomalously scattering substructures because the weaker difference data, on which CC depends, are themselves unreliable.
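    As a rough illustration of what a correlation coefficient between observed and calculated |E| values measures, the sketch below computes the ordinary (Pearson) linear correlation in Python. This is an assumption made for illustration: the exact expression used by SnB follows the reference cited above and may differ in detail, and the |E| values shown are made up.

```python
import math
from typing import Sequence


def correlation(observed: Sequence[float], calculated: Sequence[float]) -> float:
    """Ordinary Pearson correlation coefficient between two equal-length lists."""
    n = len(observed)
    if n != len(calculated) or n < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_obs = sum(observed) / n
    mean_calc = sum(calculated) / n
    cov = sum((o - mean_obs) * (c - mean_calc) for o, c in zip(observed, calculated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_calc = sum((c - mean_calc) ** 2 for c in calculated)
    if var_obs == 0 or var_calc == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_obs * var_calc)


# Made-up |E| values; the calculated set tracks the observed set closely.
e_obs = [2.4, 1.9, 1.6, 1.3, 1.1]
e_calc = [2.2, 2.0, 1.5, 1.4, 1.0]
cc = correlation(e_obs, e_calc)
print(f"CC = {cc:.3f}")
```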

  10. Check the false minimum indicators:

    False minima, typically having a single large "uranium" peak, do occasionally occur, especially in space group P1. Users should be suspicious of trials having large values of the R-Ratio (>0.2) or Peak-Ratio (>5). The R-Ratio is a function of the Rmin values before and after the imposition of real-space constraints (peak picking). Peak-Ratio is the density ratio for the largest and second largest peaks. If the trial with the best figures of merit is suspect because of either of these criteria, it is wise to look further and inspect the best trial that lacks any indication that a false minimum exists.
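    The screening rule above can be sketched in Python as follows. The thresholds (R-Ratio > 0.2, Peak-Ratio > 5) come from the text; everything else is hypothetical, including the Trial record, the function names and the numbers, and ranking by Rmin alone is a simplification of "best figures of merit".

```python
from dataclasses import dataclass
from typing import List, Optional

R_RATIO_LIMIT = 0.2     # threshold quoted in the text
PEAK_RATIO_LIMIT = 5.0  # threshold quoted in the text


@dataclass
class Trial:
    """Hypothetical record holding the figures of merit for one trial."""
    number: int
    rmin: float
    r_ratio: float
    peak_ratio: float


def is_suspect(trial: Trial) -> bool:
    """True if either false-minimum indicator exceeds its threshold."""
    return trial.r_ratio > R_RATIO_LIMIT or trial.peak_ratio > PEAK_RATIO_LIMIT


def best_unsuspected_trial(trials: List[Trial]) -> Optional[Trial]:
    """Return the lowest-Rmin trial that shows no false-minimum indicator."""
    for trial in sorted(trials, key=lambda t: t.rmin):
        if not is_suspect(trial):
            return trial
    return None


# Made-up values: trial 12 has the lowest Rmin but looks like a false minimum.
trials = [
    Trial(number=12, rmin=0.35, r_ratio=0.31, peak_ratio=9.4),
    Trial(number=517, rmin=0.38, r_ratio=0.05, peak_ratio=1.3),
    Trial(number=88, rmin=0.52, r_ratio=0.04, peak_ratio=1.1),
]
choice = best_unsuspected_trial(trials)
print(choice.number if choice else "no trial passes both checks")
```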

  11. View the Rmin trace as a function of cycle:

    If complete traces have been stored for all the refinement cycles, then Trace Rmin will show the course of Rmin values over all cycles for the best trial. Typically, Rmin values will drop slightly during the first few cycles, and then reach a plateau. A sudden significant drop in Rmin value followed by stabilization at a lower plateau is another indication that a solution has been found.
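    The pattern described above (a plateau, then a sudden drop to a lower plateau) can be located in a list of per-cycle Rmin values with a few lines of Python. This is a minimal sketch, not part of SnB; the function name largest_drop and the trace values are made up for illustration, and what counts as a "significant" drop is left to the user.

```python
from typing import Sequence, Tuple


def largest_drop(trace: Sequence[float]) -> Tuple[int, float]:
    """Return (cycle index, size) of the largest one-cycle decrease in Rmin.

    The index is the 0-based position of the cycle at which the lower value
    first appears.
    """
    if len(trace) < 2:
        raise ValueError("need at least two cycles")
    drops = [trace[i - 1] - trace[i] for i in range(1, len(trace))]
    best = max(range(len(drops)), key=drops.__getitem__)
    return best + 1, drops[best]


# Made-up trace: a slight early decline, a plateau, then a sudden drop.
rmin_trace = [0.56, 0.55, 0.54, 0.54, 0.54, 0.41, 0.40, 0.40]
cycle, drop = largest_drop(rmin_trace)
print(f"largest drop of {drop:.2f} at cycle index {cycle}")
```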

  12. Visualize and edit the structure:

    To view and manipulate a ball-and-stick representation of the best trial structure, go to the Evaluate Trials page and enter the bond distance information. When View Structure is selected, the model will be displayed. It can be edited to remove obviously incorrect peaks before saving it as atoms in the .SnB_atom, .SnB_ins, and .SnB_pdb files.

    The visualization feature can also be useful for substructures. When viewing substructures with a "bond distance" of 4-5 Angstroms, one hopes NOT to see many "bonds", and those that do show up should have relatively large distances. In favorable cases, the visualization window may even reveal the presence of NCS in multi-site selenomethionine derivatives, and this is a strong indication that the peaks involved are correct.

    Selecting Check Geometry will display a complete listing of bond distances and angles.

  13. Look for more atoms:

    In order to search for additional atoms, an edited "atom" file can be recycled in SnB as a single trial by using it as a model structure (see Trials & Cycles screen). Use either an edited file that has been saved from a visualization session ("prefix_#.SnB_atom") or a peak file ("prefix_#.SnB_peak") that has been manually edited. Unwanted low-density peaks must be removed from the end of the peak file since the model structure file will be read until an end of file is encountered. For example, suppose that the edited atom file from job1 is to be used as a starting point for 10 more cycles of Shake-and-Bake refinement followed by 20 cycles of Fourier refinement. Note that the current version of SnB must always do at least one Shake-and-Bake cycle. The GUI fields which should be changed are:

    • Trials to process:
      • Starting phases from: Model Structure Atoms
      • Number of trials: 1
      • Number of Shake-and-Bake cycles: 10
      • Input atom file: job1.SnB_atom
    • File name prefix for results: job2
    • Twice Baking:
      • Trials for E-Fourier filtering: All or Best
      • Number of cycles: 20
      • Number of peaks: as desired
      • Minimum |E|: as desired

  14. Examine other trials:

    If the "best" peak file does not yield a chemically sensible structure or there are indications that false minima are present, then other trials can be rerun by specifying individual desired trials using the parameters on the Trials & Cycles screen. A different descriptive name should be chosen for the output prefix each time a phasing run is submitted. For example, suppose that a user wished to obtain the peak file for trial 517 which had the second lowest value of Rmin. Then the following fields

    • Number of trials: 1
    • Start at trial: 517
    • File name prefix for results: tr517

    should be changed to reprocess trial 517. It is also possible to perform extra Shake-and-Bake cycles with more peaks or to add or change the twice baking (E-Fourier recycling) cycles. However, DO NOT CHANGE ANY OTHER PARAMETERS or the desired trial will not be reproduced. In order to ensure that other parameters do not change, it is wise to begin by restoring the "configuration" file for the original job. Also, run the new job on the same machine as the original.

  15. What do I do next?

    SnB outputs coordinate information in a variety of ways including both fractional (peak and atom files) and orthogonal forms (short pdb files). An "ins" file is produced that allows the coordinates for complete structures to be input, with minimal editing, to SHELXL for least-squares refinement. Substructure sites can be put into PHASES, MLPHARE, CNS, or SOLVE for heavy-atom refinement.