Job Submission Examples (Torque)
Users submit jobs to the server using the qsub command. The current state of the queue can be viewed using qstat. There is a host of other utilities available to Torque users, such as qdel, qalter, qhold, qmove, qrls, qmsg, qrerun, and qselect. qsub can be used for both batch and interactive submission of jobs. Interactive submission should be used only when a user needs to run and debug their code, and for short-duration jobs.
(adapted from M. Petersen's nice documentation)
The basic syntax for qsub is simply
- qsub batchfilename.bat
where batchfilename.bat is a file of shell commands to be executed. The first few lines of the batch file should contain PBS directives (lines starting with #PBS) that specify the resources the job requires (number of nodes, number of processors, memory, and so on).
A simple batch job example
Suppose you have an R program runme.R in your home directory that runs for a long time and that you would like to run on the cluster. It requires a single CPU for, say, no more than 10 hours. Here's a batch file that would do the trick on either the internal or external cluster:
#PBS -S /bin/csh
#PBS -l walltime=10:00:00
#PBS -l nodes=1
#PBS -e ~/pbs_logs/stderr.txt
#PBS -o ~/pbs_logs/stdout.txt
module load R
cd ~
# execute program
R CMD BATCH runme.R
Here's a breakdown of what the lines in this batch file mean:
- #PBS -S /bin/csh tells PBS to use the C shell (/bin/bash or /bin/tcsh are other options)
- #PBS -l walltime=10:00:00 tells PBS that your job will require no more than 10 hours of walltime to complete. The time format is HH:MM:SS. Some schedulers prioritize short jobs over long ones, so the less time you ask for, the more likely it is that your job will be scheduled sooner rather than later. If the actual run time exceeds what you requested, your job will be killed. (This feature is currently not used in our implementation, but a default running time will likely be implemented at some point.)
- #PBS -l nodes=1 asks PBS for one node with one CPU. When your job starts you will have exclusive access to that CPU. If you want 4 nodes each with exactly 2 CPUs (8 CPUs total), use something like -l nodes=4:ppn=2. If instead you just want any 8 CPUs in the cluster, request -l nodes=8.
- #PBS -e ~/pbs_logs/stderr.txt tells PBS to store everything that would normally go to stderr in the named file in your pbs_logs directory. (If you give a directory instead of a file name, the file is named after the PBS job number with the suffix .ER.) This lets you check whether there were any errors while running your R program.
- #PBS -o ~/pbs_logs/stdout.txt similarly redirects standard output; a directory here produces a .OU file named after the job number.
- Comment lines: the other lines in the sample script that begin with '#' are comments. The '#' for comments and PBS directives must be in column one of your script file. The remaining lines in the sample script are executable commands.
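To run this example, save the batch file (here hypothetically named runme.bat) and submit it; you can then monitor or cancel it with the utilities mentioned above. A minimal sketch:

# submit the batch file; qsub prints a job ID such as 123.servername
qsub runme.bat
# show the state of your queued and running jobs (Q = queued, R = running)
qstat -u $USER
# cancel the job if needed, using the numeric ID qsub printed
qdel 123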
A parallel batch job example
Suppose now you have a parallel MPI job that needs 4 processors, and you would like 2 processors on each of 2 nodes. Here's the corresponding batch file you can submit with qsub:
#PBS -S /bin/csh
#PBS -l walltime=10:00:00
#PBS -l nodes=2:ppn=2
#PBS -e ~/pbs_logs/stderr.txt
#PBS -o ~/pbs_logs/stdout.txt
module load mpich1/gnu
cd ~
# execute program
mpiexec -np 4 myprogramname
The line #PBS -l nodes=2:ppn=2 requests 2 nodes with 2 processors per node (ppn=2). You could also have requested -l nodes=4 if you didn't care about their location.
If you know that your job requires a lot (say, 16GB) of memory, then you can request it with the #PBS -l mem=16gb directive.
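Putting these directives together, a sketch of a large-memory parallel request might look like the following (the program name, times, and amounts are placeholders; adjust them to your job):

#PBS -S /bin/csh
#PBS -l walltime=24:00:00
#PBS -l nodes=2:ppn=2
#PBS -l mem=16gb
#PBS -e ~/pbs_logs/stderr.txt
#PBS -o ~/pbs_logs/stdout.txt
module load mpich1/gnu
cd ~
# execute program on the 4 allocated processors
mpiexec -np 4 mybigprogram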
An interactive job example
A form of interactive access is available on the cluster via the qsub -I command. This is useful for small debugging or test runs. To use it, type:
qsub -I -l nodes=N -l walltime=mm:ss
See the qsub man page for additional information on the format of qsub -I. Use this only for short, interactive runs. If there are no free nodes, the qsub command will wait until they become available; this can be a long wait, even hours, depending on the mix of running and queued jobs. Please check the system to be sure that there are available nodes before issuing qsub -I. You can determine whether there are free nodes with the pbsnodes command (see the sketch after the next example). The format of the qsub -I command is:
qsub -I -l nodes=2 -l walltime=30:00 myscript
This requests interactive access to 2 nodes (one processor each) for thirty minutes. Change the number of nodes and processors and the time to suit your needs.
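As noted above, it is worth checking for free nodes before submitting an interactive job. One way to do this (a sketch; the exact pbsnodes output layout may vary with your Torque version) is to filter the full node listing for nodes in the free state:

# print each free node's name (the line preceding its "state = free" attribute)
pbsnodes -a | grep -B1 "state = free"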
X-Forwarding in interactive jobs
Torque also provides X-Forwarding through a special -X switch that you can use in an interactive job. Note that the -X switch must always be combined with -I, and it only works when you are requesting a single node, so your resource request should look like -lnodes=1:ppn=... Torque will complain, correctly so, if your DISPLAY environment variable is not set properly for a job request with the -X switch. An example job request with X-Forwarding looks like this:
qsub -I -X -lnodes=1:ppn=2
Type ^C (Ctrl-C) to exit qsub -I and double-check the availability of free nodes. The -m b option to qsub can be used with qsub -I to have the system send you email when your job starts and you have access to your nodes; the -M option specifies the email addresses to which that mail is sent (see the example at the end of this section). Once nodes are allocated to you, you will receive a command prompt. Even though you are running interactively, you must use an mpirun command to run your executable on your compute nodes. This can either be in a script as specified above, or on the command line, as in:
mpirun -np 2 ./myrunscript
Enter the actual value for the -np option. Stdout and stderr will go to your terminal. Use input redirection to get stdin input to your mpirun executable, as in the following sketch.
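For example, to feed an input file to your program and capture its output (the file names here are placeholders):

mpirun -np 2 ./myrunscript < input.txt > output.txt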
When you are finished with your interactive session, type ^D (Ctrl-D):
^D
qsub: job 349.eady completed
The default wall-clock time for jobs is 1 hour, and this includes jobs submitted using qsub -I. When you use qsub -I you hold your processors whether you are computing or not, so as soon as you are done with your mpirun commands you should type ^D to end your interactive job. If you submit an interactive job and do not specify a wall-clock time, you will hold your processors for 1 hour or until you type ^D.
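If you expect a wait for free nodes, the -m b and -M options mentioned above can be combined with qsub -I so that the system emails you when the job actually starts (the address here is a placeholder):

qsub -I -l nodes=1,walltime=00:30:00 -m b -M user@example.org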
More complex requests
- If all you want is 4 CPUs in the cluster, and it does not matter how those processors are allocated (all in 1 node, 1 CPU per node on 4 nodes, 2 per node on 2 different nodes, etc.), use
prakash@bmiclusterp1:~> qsub -l nodes=4
- If you need a specific number of CPUs per node, then specify it like this:
prakash@bmiclusterp1:~> qsub -l nodes=4:ppn=2
In this case you will get a total of 8 CPUs, but this will NOT guarantee that those CPUs come from 4 DIFFERENT nodes; it only tells the scheduler to allocate at least 2 processors per node. So if you do cat $PBS_NODEFILE from an interactive session, you might find the allocated nodes as below, which can come as a surprise: the scheduler allocated a total of 8 CPUs and made sure there are at least 2 CPUs per allocated node, but used only 2 DIFFERENT nodes.
prakash@bmi-opt2-11:~> cat $PBS_NODEFILE
bmi-opt2-11
bmi-opt2-11
bmi-opt2-11
bmi-opt2-11
bmi-opt2-12
bmi-opt2-12
bmi-opt2-12
bmi-opt2-12
- If you want a specific number of CPUs to be allocated from each DIFFERENT node, then you have to use
qsub -l nodes=20,tpn=2
This will allocate a total of 20 CPUs, but will also guarantee that only 2 CPUs are allocated from each DIFFERENT node, so a total of 10 DIFFERENT nodes will be allocated in this case. The output of cat $PBS_NODEFILE is shown here:
prakash@bmi-opt2-07:~> cat $PBS_NODEFILE
bmi-opt2-07
bmi-opt2-07
bmi-opt2-08
bmi-opt2-08
bmi-opt2-09
bmi-opt2-09
bmi-opt2-10
bmi-opt2-10
bmi-opt2-11
bmi-opt2-11
bmi-opt2-12
bmi-opt2-12
bmi-opt2-13
bmi-opt2-13
bmi-opt2-14
bmi-opt2-14
bmi-opt2-15
bmi-opt2-15
bmiwebd1
bmiwebd1
- When nodes are allocated to jobs, they are assigned on a first-available basis in a predefined order. The current order of systems can be obtained by executing
pbsnodes -a | grep "^[^ ]"
on the cluster login node (currently bmiclusterd). Generally the order is in terms of least memory/CPU, so if you do not specify any memory requirement in your job request, you get the default value, which may not be what you want. If you have specific memory requirements for your job, request them like this:
qsub -l nodes=4,mem=<number><mb/kb/gb>
and that should allocate the correct resources for your job, if they are available.
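For instance, a request for 4 CPUs and 8 GB of memory (the amount and script name here are purely illustrative) would be:

qsub -l nodes=4,mem=8gb myscript.bat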
There are several software licenses managed by the Division of Biomedical Informatics, including Matlab, Totalview, Discovery Studio, and GeneSpring.
Although most of these packages use a floating-license model, some do not. Moab can talk natively to a FlexLM-based license manager to retrieve information on the licenses available for a package. Where this is not possible, either because the license manager is not FlexLM-based or because the licenses are node-locked, Moab can be configured to treat the software as a generic resource rather than as software.
Some of this software can be accessed from the compute nodes if required. Moab has been configured to allocate a node only when licenses for the specific software requested by the user are available. This is how to request software using Torque:
qsub [-I] -lnodes=<number_of_nodes>,software=<software_name>
If you want to request software that has been configured as a generic resource, do it this way:
qsub [-I] -lnodes=<number_of_nodes>,gres=<software_name>
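As a hypothetical illustration (whether a given package is configured under software or gres is site-specific, so check before relying on this), an interactive single-node session that also reserves a Totalview license might be requested as:

qsub -I -lnodes=1,software=totalview

The job will not start until both a node and a license are available.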
Software that can be requested as software:
Software that can be requested as a generic resource:
Matlab and its toolboxes are currently not available on the compute nodes. If you have a strong need for them, please let Prakash Velayutham know.
- We have implemented a set of wrapper scripts for some commonly used interactive applications, which transparently perform job submission.
- The following applications currently have a wrapper script:
- Autodock (3 and 4)
- The scripts accept parameters that are passed through to the Torque batch manager, such as --pbs-mem and --pbs-walltime (as used in the examples below).
Please note that these parameters are not mandatory; if they are not given, a site-defined default will be applied.
- To start any of these applications, invoke it as follows:
<software_name> <parameters for torque> <application parameters>
- plink --pbs-mem=2gb --pbs-walltime=24:00:00 --script _scr
- GeneSpring --pbs-mem=8gb
- R CMD BATCH Rbatch.bat
- R --pbs-mem=12gb --pbs-walltime=10:00:00 CMD BATCH Rbatch.bat
- matlab --pbs-walltime=10:00:00 example1.m
Other useful hints
What if the firewall keeps kicking me out? How can I preserve my session across such disconnects?
- ssh glucose.cchmc.org -l<username>
- ssh bmiclusterd.cchmc.org
- screen (then press Enter when you see the message on your screen).
- In the command prompt, do your usual qsub thing.
- If you lose your network connection after some time, just repeat steps 1 and 2 to get a new session on bmiclusterd.
Once you are there, just type 'screen -d -r' and that should get your interactive Torque session back.
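Put together, recovering from a dropped connection looks roughly like this (the username is a placeholder):

ssh glucose.cchmc.org -l username
ssh bmiclusterd.cchmc.org
# reattach the screen session that kept your interactive Torque session alive
screen -d -r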