The "Run SnB" screen allows you to start an SnB job on the local computer or submit it to a batch processing system, such as PBS or LoadLeveler, if one is available. It also supports submission to Condor, a system that scavenges unused computing time on a network of workstations (for more information on Condor, see http://www.cs.wisc.edu/condor/ ). These options provide you with convenient ways to take maximum advantage of the inherently parallel nature of the Shake-and-Bake algorithm by dividing the trial structures among as many processors as possible. Thus, jobs can be run in several parts with each subjob creating its own set of output files. The results, however, are combined for inspection using the tools provided by the Evaluate Trials screen.
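
As a rough illustration of how trial structures could be divided among processors, the following Python sketch splits a range of trial numbers into nearly equal contiguous parts. The function name `split_trials` and the contiguous partitioning scheme are assumptions made for illustration only; this manual does not describe how SnB itself assigns trials to subjobs.

```python
def split_trials(n_trials, n_processes):
    """Divide trial numbers 1..n_trials into n_processes contiguous parts.

    Hypothetical helper for illustration only; SnB's own partitioning
    scheme is not described in this manual.
    """
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")
    base, extra = divmod(n_trials, n_processes)
    parts = []
    start = 1
    for i in range(n_processes):
        # The first `extra` parts receive one additional trial each.
        size = base + (1 if i < extra else 0)
        parts.append(range(start, start + size))
        start += size
    return parts


if __name__ == "__main__":
    # Print part number, first trial, last trial, and trial count.
    for part_number, trials in enumerate(split_trials(1000, 8)):
        print(part_number, trials.start, trials.stop - 1, len(trials))
```

With 1000 trials and 8 processes, each part holds 125 trials. When the division is not exact, the leftover trials are spread over the first few parts so that no part differs from another by more than one trial.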

We (the SnB developers) have a limited number of platforms available for development and testing. Your system configuration may differ from ours, and the batch submission options may not work as expected. In that case, please contact us at snbhelp@hwi.buffalo.edu so that we can work with you to support your configuration.

There are three sections on this screen: Required Information, Local Options, and Batch Options. The required information must be supplied. Whether or not the other sections need to be completed depends on the choices you make in the required information section.

Required Information
- Queueing System: Select the queueing system you would like to use.
  
  None (local machine) runs the job on the machine where the GUI is running. If you are using the X Window System, note that this is not necessarily the same as the machine where the GUI is being displayed.
  
  PBS will submit the job to a PBS queue. The 'qsub' program must be installed and configured on your local machine, even if PBS is actually submitting jobs to a remote machine.
  
  LoadLeveler submits a job to a LoadLeveler queue on an IBM SP system.
  
  Condor allows submission to a Condor flock.
  
  Clicking Custom generates the dat files required to run SnB without actually starting the job. This is useful if you want to run SnB via a batch queueing system that is not supported directly by SnB. Given the dat files, you can write a script that will submit the job to the batch queueing system that you are using at your site.
- File name prefix for results: All files generated by the SnB run will start with the prefix entered here. An underscore and a number are appended to this prefix; the number ranges from zero to one less than the number of SnB processes you request (see the next item). Do NOT use an underscore in the prefix itself (hyphens are OK).
- Number of SnB processes to run: If the local run method is selected, the GUI will initiate this many processes on the local machine. If you select one of the batch methods (PBS, LoadLeveler, Condor), this variable indicates the number of nodes to be requested from the batch queueing system.
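
The naming rule for result files can be illustrated with a short Python sketch. The helper name `result_prefixes` and the example prefix `my-run` are hypothetical, and the check simply mirrors the rule stated above (no underscore in the prefix). File extensions are left out because they are not listed in this section.

```python
def result_prefixes(prefix, n_processes):
    """Return the per-process file name stems prefix_0 .. prefix_(n-1).

    Hypothetical helper that mirrors the naming rule in this section.
    """
    if "_" in prefix:
        raise ValueError("prefix must not contain an underscore")
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")
    return [f"{prefix}_{i}" for i in range(n_processes)]


if __name__ == "__main__":
    for stem in result_prefixes("my-run", 4):
        print(stem)
```
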
Local Options
- Priority: Used to choose the "nice" value at the time of job submission. If you are sharing a machine and wish to run a background job, choose "low" priority.
- Process jobs: When you have finished filling in all the required fields, click this button to begin processing the job.
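
For readers unfamiliar with "nice" values, the sketch below shows one way a priority label could be turned into a command line that uses the standard Unix `nice` utility. The label-to-value mapping, the function name `niced_command`, and the program name `my_program` are all assumptions for illustration; the values the SnB GUI actually applies are not documented in this section.

```python
import shlex

# Assumed mapping from priority labels to nice values. Only "low" is named
# in this section; the numeric values and the "normal" label are guesses.
NICE_VALUES = {"normal": 0, "low": 19}


def niced_command(command, priority="low"):
    """Prefix a command (a list of arguments) with the Unix `nice` utility."""
    nice_value = NICE_VALUES[priority]
    return ["nice", "-n", str(nice_value)] + list(command)


if __name__ == "__main__":
    # `my_program` is a placeholder, not the real SnB executable name.
    print(shlex.join(niced_command(["my_program", "input.dat"], "low")))
```

A higher nice value means a lower scheduling priority, which is why a background job on a shared machine is started with a large value.
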
Batch Options
- Queue: Select the queue for PBS and LoadLeveler jobs. Condor does not support different queues.
- Copy input files to remote machine(s): Select "yes" if you want to copy all input files to the machine where the job will be run. When SnB is finished, it will copy the output files back to the working directory on the local machine. Copying the files does not noticeably improve overall performance, because the only significant I/O occurs at the start of the job. It is nevertheless recommended for remote cluster machines: they typically have low disk and network I/O performance, so their network and disk subsystems could become overloaded when a job starts.
- Remote directory: The directory for staging files. You need to supply this information only if you selected "yes" for "Copy input files to remote machine(s)." If your batch environment provides a temporary directory name in an environment variable, you can enter that here.
- Queue type: Your choices are serial, parallel (shared memory), and parallel (cluster). For example, suppose you entered "8" for the number of SnB processes to run (in the required information section). Choosing serial would cause eight single-processor jobs to be submitted to the queue that you selected. Either parallel selection will submit a single eight-processor job. The difference between the two is that the parallel (shared memory) option uses cp to stage files, whereas the parallel (cluster) option uses rcp because a shared file system is not assumed. When running LoadLeveler jobs, you are not prompted for this item.
  
  Shared memory machines include the SGI Origin2000, Sun Enterprise 10000, and any other machine that has multiple processors in the same physical unit. On these machines you should select parallel shared memory as the queue type.
  
  Cluster machines include the IBM SP and Beowulf-style clusters. Clusters consist of two or more distinct computers that are coupled together via software. For these machines you should select parallel cluster as the queue type.
  
  Serial can be chosen for either shared memory or cluster computers. Whether you choose serial or one of the parallel options is a matter of preference. A serial job can start as soon as a single processor is free. A parallel job that requires n processors, on the other hand, must wait until n processors are free. Your computing site will also have limits on how many jobs you can have running and on how many processors you can allocate for a parallel job. These limits will also influence which option you should choose. If you are unsure, contact the administrator of the machine you are using.
- Tasks per node (LoadLeveler only): The number of tasks to start on each SP node. If you are utilizing SMP nodes, you can set this number to the number of processors in each node. Then, the total number of processors that your job will use is equal to (tasks per node)*(number of nodes).
- Number of nodes (LoadLeveler only): The number of nodes to allocate for the job.
- Process jobs: When you have finished filling in all the required fields, click this button to submit your job to the batch system that you have selected.
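
The effect of the queue type and of the LoadLeveler node settings can be summarized in a small Python sketch. The function names, the queue type labels, and the `STAGING_TOOL` table are hypothetical and only restate the rules given above; this is not how the SnB GUI is implemented.

```python
# Copy tool used to stage files for each parallel queue type, as described
# above. The dictionary keys are labels invented for this sketch.
STAGING_TOOL = {"parallel-shared": "cp", "parallel-cluster": "rcp"}


def plan_jobs(n_processes, queue_type):
    """Return one processor count per job that would be submitted."""
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")
    if queue_type == "serial":
        # One single-processor job per SnB process.
        return [1] * n_processes
    if queue_type in STAGING_TOOL:
        # A single job that requests all processors at once.
        return [n_processes]
    raise ValueError(f"unknown queue type: {queue_type}")


def loadleveler_processors(tasks_per_node, n_nodes):
    """Total processors used: (tasks per node) * (number of nodes)."""
    return tasks_per_node * n_nodes


if __name__ == "__main__":
    print(plan_jobs(8, "serial"))
    print(plan_jobs(8, "parallel-cluster"), STAGING_TOOL["parallel-cluster"])
    print(loadleveler_processors(4, 2))
```

For example, requesting 8 processes with the serial type yields eight one-processor jobs, while either parallel type yields a single eight-processor job. A LoadLeveler job with 4 tasks per node on 2 nodes uses 8 processors in total.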