Policies

Read about policies and scheduling limits before using the cluster.

The DMZ cluster maintained by the Division of Biomedical Informatics is for authorized users only. If you have any questions regarding usage or would like to submit any feedback, please email the BMI Help Desk.

Also, please make sure you are subscribed to the bmi-cluster-users mailing list so that you receive important announcements regarding downtimes, policy changes, etc. You can subscribe yourself by browsing the following web page: http://mailman.cchmc.org/mailman/listinfo.

By using this cluster, you acknowledge that you understand and accept the policies and scheduling limits outlined below.

Current Limits

  • Currently only users who have a job running on a node can gain SSH access to that node. This access policy prevents users from submitting jobs outside of the batch controller. This necessitates the following changes for you as a user:
    • When you submit a job, please make sure that within your job, all your temporary files are copied back to either your home directory or somewhere where you can access them after your job completes.
  • All users have a default CPU limit of 100. If you have jobs running currently in the cluster that account for 100 procs, any new jobs you submit will go into Q status and not be scheduled to run until one or more of your current jobs complete. For example, if you have five parallel jobs running in the cluster each with 20 CPUs, you have 100 CPUs allocated for your jobs. And if you request a serial job requesting one CPU, your job will not run until one of your parallel jobs completes.
    • Reasoning: This policy is set to prevent a single user from flooding the entire cluster with jobs. Even though 100 CPUs is the limit, please be considerate to others and break down your jobs into smaller (CPU time) jobs.
    • If you plan to use more than 100 CPUs at one time, please contact us to make a reservation.
    • If you want to run jobs on more than 100 CPUs (subject to availability) without going through a reservation policy, follow the procedure below.
    1. First off, know that the following condition applies to these additional jobs:
      1. If you are over your 100 CPU limit AND the cluster is full, AND if another user comes in and wants to run jobs in their original quota of 100 processors, their job will stop and requeue your over-quota job. Now that you know the policy, please read on to understand how to submit additional jobs over your quota.
    2. On the jobs that are over the 100 CPU limit, add a flag "qos=preempt" to your batch file as below:
    prakash@bmiclusterlogin1:~> qsub -l<other resources requests here>,qos=preempt
    The "qos=preempt" above will give you additional processors, but (only) those jobs will be subject to the requeue policy. If the cluster gets full and those processors are being requested by other users who have not yet reached their 100 CPU quota, they will be stopped and requeued (PLEASE READ - started from the beginning all over again, unless your job has an intelligent way of checkpointing itself and starting from where it left off).
  • There is a limit of 256 CPUs for the jobs in the large queue (> 24 hours). This is approximately 2/3 of the total number of CPUs in the cluster. This limit
    • ensures that there are always at least some CPUs available for short (<= 2 hours) and medium-duration (<= 24 hours) jobs.
    • encourages users not to over-estimate their job's walltime requirements.
  • There is a maximum walltime limit of 12960000 seconds (3600 hours or 150 days) per user for jobs to get scheduled to run. For example:
    • User A has 150 jobs, each with a walltime request of 24 hours, so the total walltime is 3600 hours. Suppose these jobs have been running for 2 hours, so the total remaining walltime is 3600 - (2 x 150) = 3300 hours. At this point, user A can only submit new jobs with a cumulative walltime request of less than 3600 - 3300 = 300 hours. If the new jobs request more than 300 hours in total, the jobs that go beyond the limit will stay queued. If they request less than 300 hours, they are eligible to be scheduled, subject to CPU availability.
  • The above policies are ignored if the incoming request is an interactive job request. In such a case, there is a maximum wall time limit of 12 hours. This limit is to prevent users from abusing this policy.
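
Because SSH access to a node ends when the job does, a batch script has to copy its own temporary files back before it exits. The sketch below shows one possible way to do that. The function name stage_back, the scratch directory, the results directory, and my_program are illustrative assumptions, not site conventions.

```shell
#!/bin/sh
# Sketch only: copy a job's scratch files to a directory that remains
# reachable after the job ends. All names here are assumptions.

# stage_back SRC_DIR DEST_DIR
# Copies the contents of SRC_DIR into DEST_DIR, creating DEST_DIR if needed.
stage_back() {
    mkdir -p "$2" || return 1
    # "SRC_DIR/." copies the directory contents, including dotfiles.
    cp -R "$1/." "$2/"
}

# Possible use inside a batch script (commented out so that sourcing this
# file has no side effects; my_program is hypothetical):
#
#   scratch=$(mktemp -d)
#   trap 'stage_back "$scratch" "$HOME/results/$PBS_JOBID"' EXIT
#   cd "$scratch" && my_program > output.log
```

Running the copy from an EXIT trap means it also happens when the program fails, although it cannot run if the job is killed outright.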

Access policies

  • You will need privileges to submit jobs to the cluster. So, if your job submission errors out like
    prakash@bmiclusterlogin1:~> qsub -I -lnodes=1
    qsub: Job rejected by all possible destinations
    it is highly likely that you did not request us for job submission privileges to the cluster. If you need this access, please email us at BMI Help Desk.

Queue policies

  • By default, all requests from users go to a routing queue called "routing" which in turn routes those jobs to execution queues based on the resource requests.
  • Any request from users belonging to 'wwwgroup' system group would then be routed to 'www' Torque queue. Requests from other users would fall into several other queues named 'small,' 'medium,' 'large,' 'interactive,' etc.
  • An upper limit of 256 CPUs has been enforced on the 'large' queue. This is to encourage users to provide accurate walltimes for their jobs instead of overestimating. This number will be changed based on the total number of CPUs available in the cluster.
  • Please remove any reference to 'users' queue from your job batch file (or from your qsub command line if you do this interactively) as soon as convenient. This is no longer required, as the routing queue routes requests to appropriate queues automatically.
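
To estimate which execution queue a walltime request is likely to land in, the durations quoted under Current Limits can be turned into a small helper. This is a sketch under assumptions: it supposes that 'small' corresponds to jobs of at most 2 hours, 'medium' to at most 24 hours, and 'large' to anything longer, and it ignores every other resource the routing queue may consider. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: guess the execution queue from a walltime request.
# The thresholds are assumptions taken from the limits quoted on this page.

# walltime_seconds HH:MM:SS
# Prints the walltime as a number of seconds.
walltime_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# expected_queue HH:MM:SS
# Prints small, medium, or large.
expected_queue() {
    total=$(walltime_seconds "$1")
    if [ "$total" -le 7200 ]; then
        echo small            # at most 2 hours
    elif [ "$total" -le 86400 ]; then
        echo medium           # at most 24 hours
    else
        echo large            # more than 24 hours
    fi
}
```

For instance, `expected_queue 36:00:00` prints `large`, which is the queue subject to the 256 CPU cap.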

Node/Resource requests

  • By default, all jobs are assigned a default resource requirement of 384 MB of memory and one CPU hour. If your job exceeds these limits and you did not ask for different values for these resources explicitly, your job will potentially be terminated (some exceptions apply).
  • We have nodes with different resources (some have faster CPUs, some slower; memory ranges from 4GB in some nodes to 32 GB in some). This has a direct effect in the scheduling of resources to jobs.
  • If all you want is 4 CPUs in the cluster, and if it does not matter how those processors are allocated (all in 1 node, or 1 CPU per node in 4 nodes, or 2 per node in 2 different nodes etc.), please use
    prakash@bmiclusterlogin1:~> qsub -l nodes=4
  • If you need a specific number of CPUs per node, then you need to specify like
    prakash@bmiclusterlogin1:~> qsub -l nodes=4:ppn=2
    In this case, you will get a total of 8 CPUs, but this will NOT guarantee that you get those CPUs from 4 DIFFERENT nodes. It only tells the scheduler to allocate at least 2 processors per node. So, if you run cat $PBS_NODEFILE from an interactive session, you might find the allocated nodes as below, which may be surprising. The scheduler allocated a total of 8 CPUs and made sure there are at least 2 CPUs per allocated node (but there are only 2 DIFFERENT nodes).
    prakash@bmi-opt2-11:~> cat $PBS_NODEFILE
    bmi-opt2-11
    bmi-opt2-11
    bmi-opt2-11
    bmi-opt2-11
    bmi-opt2-12
    bmi-opt2-12
    bmi-opt2-12
    bmi-opt2-12
  • If you want a specific number of CPUs to be allocated from each DIFFERENT node, then you would have to use
    qsub -l nodes=20,tpn=2
    This will allocate a total of 20 CPUs, but will also guarantee that only 2 CPUs are allocated from each DIFFERENT node. So a total of 10 DIFFERENT nodes will be allocated in this case. The output of cat $PBS_NODEFILE is shown here.
    prakash@bmi-opt2-07:~> cat $PBS_NODEFILE
    bmi-opt2-07
    bmi-opt2-07
    bmi-opt2-08
    bmi-opt2-08
    bmi-opt2-09
    bmi-opt2-09
    bmi-opt2-10
    bmi-opt2-10
    bmi-opt2-11
    bmi-opt2-11
    bmi-opt2-12
    bmi-opt2-12
    bmi-opt2-13
    bmi-opt2-13
    bmi-opt2-14
    bmi-opt2-14
    bmi-opt2-15
    bmi-opt2-15
    bmiwebd1
    bmiwebd1
  • And when nodes are allocated to jobs, they are assigned by way of First Available in a predefined order. The current order of systems can be obtained by executing
    pbsnodes -a | grep  "^[^  ]"
    on the cluster login node (currently bmiclusterd). Generally, the nodes with the least memory / CPU come first. So if you do not specify any memory requirement in your job request, you get the default value, which may not be what you want. If your job has specific memory requirements, request them like this:
    qsub -l nodes=4,mem=<number><mb/kb/gb>
    and that should allocate the correct resources for your job, if they are available.
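
Since the node file contains one line per allocated CPU, a quick way to see how a request was actually laid out is to count the lines per host. The helpers below are a minimal sketch; the function names are assumptions, and the only thing taken from the page is the one-host-per-line format of $PBS_NODEFILE.

```shell
#!/bin/sh
# Sketch only: summarize a node file that lists one host name per CPU.

# cpus_per_node FILE
# Prints "<host> <cpu count>" for each distinct host in FILE.
cpus_per_node() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# distinct_nodes FILE
# Prints the number of different hosts in FILE.
distinct_nodes() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Inside a job: cpus_per_node "$PBS_NODEFILE"
```

For the nodes=4:ppn=2 listing shown earlier, this would report two hosts with 4 CPUs each.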

Time-shared node

  • Justification
  • This should be used when all of the following are true.

    • You want to run a quick job for a project.
    • You find that the cluster is 100% reserved (both for jobs and users).
    • You know your job would only take 10 minutes to complete and it is not feasible to wait a long time to run a short job.
    • You are ready to run the job in a node that is overcommitted (you might be fighting for resources on the node with other similar users)
  • Solution
  • We have one node with 8-cores which has been overcommitted to the scheduler as having 24 cores. The idea behind this goes something like this.

  • Usage
    • In the batch script (or in qsub command line), use "-l advres=ts1" and you should get this time-shared node (provided that the overcommitted 24-cores are not already requested by other users).
    • No more than 2 procs can be requested per job from the time-shared node.
    • The walltime of the job cannot exceed 2 days.
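
The usage limits above can be checked before submitting. The following sketch builds a resource list for the time-shared node and refuses requests that break either limit. The function name is hypothetical, and combining advres=ts1 with nodes and walltime in a single -l list is an assumption modeled on the other examples on this page.

```shell
#!/bin/sh
# Sketch only: build a resource list for the time-shared node.

# ts_resource_list PROCS WALLTIME_HOURS
# Prints the resource list, or fails if a limit is exceeded.
ts_resource_list() {
    if [ "$1" -gt 2 ]; then
        echo "time-shared node: at most 2 procs per job" >&2
        return 1
    fi
    if [ "$2" -gt 48 ]; then
        echo "time-shared node: walltime cannot exceed 2 days" >&2
        return 1
    fi
    echo "advres=ts1,nodes=$1,walltime=$2:00:00"
}

# Possible use: qsub -l "$(ts_resource_list 2 1)" myjob.sh
# (myjob.sh is a placeholder for your own batch script)
```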

Software requests

There are several software licenses managed by the Division of Biomedical Informatics including Matlab, Totalview, Discovery Studio, and GeneSpring.

Although most of these packages use a floating license model, some do not. Moab can talk natively to a FlexLM-based license manager to retrieve information on the available licenses for a package. Where this is not possible, either because the license manager is not FlexLM-based or because the licenses are node-locked, Moab can be configured to treat the package as a Generic Resource rather than as software.

Some of this software can be accessed from the compute nodes, if required. Moab has been configured to allocate a node only when a license for the software requested by the user is available. This is how to request software using Torque.

qsub <-I> -lnodes=<number_of_nodes>,software=<software_name>

If you want to request software that has been configured as a generic resource, this is how you can do that.

qsub <-I> -lnodes=<number_of_nodes>,gres=<software_name>

Software that can be requested as software:

Bioinformatics_Toolbox
Compiler
Curve_Fitting_Toolbox
MATLAB
Neural_Network_Toolbox
Optimization_Toolbox
Signal_Toolbox
Spline_Toolbox
Statistics_Toolbox
Wavelet_Toolbox
Biopolymer
CHARMM
CNX_XRAY
DS_Analysis
DS_ModelingVisualizer
DS_ProjectKM
DS_ProteinFamilies
DS_ProteinSimSrch
DS_SequenceAnalysis
DelPhi
LIGANDFIT
LIGSCORE
License_Holder
Ludi
Ludi_Genfra
Ludi_Score
ProteinFamilies_Client
health
modeler
xbuild

Software that can be requested as a generic resource:

genespring
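
The choice between the two request forms can be wrapped in a small helper. This is only a sketch: the function name is hypothetical, and the rule it encodes is simply the two lists above, where genespring is the only generic resource and every other name is requested as software.

```shell
#!/bin/sh
# Sketch only: pick "software=" or "gres=" for a license name,
# following the two lists on this page.

# license_resource NAME
license_resource() {
    case "$1" in
        genespring) echo "gres=$1" ;;       # configured as a generic resource
        *)          echo "software=$1" ;;   # everything else on the list
    esac
}

# Possible use: qsub -I -lnodes=1,"$(license_resource CHARMM)"
```

The helper does not check that a name is actually on either list, so a misspelled name would still be passed through as software.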

Note

Matlab and its toolboxes are currently not available in the compute nodes. If you have a strong need for them, please let Prakash Velayutham know.

NEWSFLASH

  • Currently, there is a processor quota of 60 per user in the cluster. This is the number of processors that a single user can use at any given time in the whole cluster.

    In the past we have had times when the cluster has been mostly free and some users wanted to run jobs in more than 60 processors. Even though we encourage users to request reservations for large submissions like that, there was a consensus that it would be better if there were an allowance, sort of, where users could run in more processors than 60, with the condition that if other users come in and want to run jobs in their original quota of 60 processors, they should be able to grab the processors currently allocated to the over-quota users.

    We have a solution to this situation and I think it should work well for all groups of users.

    If you already have jobs running in 60 processors and want more, and are willing to risk your additional jobs getting requeued (stopped and resubmitted by the system automatically), you can do so by using the syntax below.

    qsub -l<other resources>,qos=preempt

    The "qos=preempt" above will give you additional processors, but (only) those jobs will be subject to the requeue policy, if the cluster gets full and those processors are being requested by other users who have not yet reached their 60 CPU quota.

  • We have implemented a set of scripts for some commonly-used interactive applications, which will transparently perform job submission.
  • The following applications currently have a wrapper script:
    • GeneSpring
    • R
    • Matlab
    • plink
    • Autodock (3 and 4)
  • The scripts accept the following parameters which are for the Torque batch manager:
    • pbs-walltime
    • pbs-mem
    • pbs-pmem
    • pbs-vmem

Please note that these parameters are not mandatory, and if not given, a site-defined default will be applied.

  • To start any of these applications, please do as below:

<software_name> <parameters for torque> <application parameters>

Example

  • plink --pbs-mem=2gb --pbs-walltime=24:00:00 --script _scr
  • GeneSpring --pbs-mem=8gb
  • R CMD BATCH Rbatch.bat
  • R --pbs-mem=12gb --pbs-walltime=10:00:00 CMD BATCH Rbatch.bat
  • matlab --pbs-walltime=10:00:00 example1.m
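
To illustrate how the two kinds of parameters in the examples above can be told apart, here is a sketch that separates --pbs-* options from application parameters. It is not the code of the site's wrapper scripts. The function name is hypothetical, and it assumes that each --pbs-NAME=VALUE option maps to a Torque resource called NAME.

```shell
#!/bin/sh
# Sketch only: split --pbs-NAME=VALUE options from application arguments.
# This is NOT the site's wrapper; it only illustrates the calling convention.

# split_pbs_args ARG...
# Prints "resources: <comma-separated list>" and "application: <other args>".
split_pbs_args() {
    resources=""
    app_args=""
    for arg in "$@"; do
        case "$arg" in
            --pbs-*=*)
                pair=${arg#--pbs-}      # e.g. mem=2gb
                resources="${resources:+$resources,}$pair"
                ;;
            *)
                app_args="${app_args:+$app_args }$arg"
                ;;
        esac
    done
    echo "resources: $resources"
    echo "application: $app_args"
}
```

Joining the application arguments with spaces is enough for display, but a real wrapper would need to preserve quoting when passing them on.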