Zentrum für Datenverarbeitung (ZDV) (data center)

Submit a batch job

The batch system, Torque, provides the following commands for job management.

command   description
qsub      dispatches a job
qstat     shows a job's status
qalter    edits job attributes
qdel      deletes a job
qsig      sends a signal to the job

The most important function of the batch system is dispatching a job to Moab, which is done with qsub. The qsub options are described below:

option             description
-d path            Sets the working directory. If no working directory is specified, the current directory is used.
-q destination     Selects the queue. This is not required because the batch system chooses a suitable queue according to the required resources.
-I                 Starts an interactive job (capital "i"). This is essentially the same as ssh access to any node.
-l resource_list   Specifies (lowercase "L") the resources the job requires, e.g. number of nodes, number of cores, walltime, memory, etc.
-M list            Declares a list of e-mail addresses that receive notifications from the PBS system.
-m mail_options    Specifies when an e-mail is sent: a=abort, b=begin, e=end. To be notified when a job is aborted, begins or ends, enter "-m abe".
-N name            Sets the job name. By default the job is named after the script.
-v variable_list   Extends the list of environment variables that the batch system sets in the job's environment.
-V                 Exports all current environment variables to the job's environment.

A detailed description of the qsub options is available in the qsub man page: "man qsub".
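
As an illustration of how several of these options fit together, the following sketch writes a small job script whose #PBS lines mirror the qsub options, then submits it. The file name example_job.sh, the resource values and the e-mail address are hypothetical placeholders. The script falls back to printing the command when qsub is not installed.

```shell
#!/bin/sh
# Write a hypothetical job script; each #PBS line corresponds to a qsub option.
cat > example_job.sh <<'EOF'
#!/bin/sh
#PBS -N example_job
#PBS -l nodes=1:ppn=4,walltime=00:10:00,mem=1gb
#PBS -M user@example.org
#PBS -m abe
#PBS -V
# PBS_O_WORKDIR is set by Torque to the directory qsub was called from.
cd "$PBS_O_WORKDIR"
echo "Running on $(hostname)"
EOF

# Submit only where Torque is installed; otherwise just show the command.
if command -v qsub >/dev/null 2>&1; then
    qsub example_job.sh
else
    echo "qsub not found; would run: qsub example_job.sh"
fi
```

Options given on the qsub command line can be used in place of the #PBS lines, so the same script can be reused with different resources.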

Dispatch a simple job

How to write a simple job script is described on a separate page.

When you have written your script, dispatch it with "qsub jobscript.sh". It returns a job ID.

Check the status of your job with "qstat".

[<userid>@u-003-scfe03 queuetest]$ qsub jobscript.sh
161716.icmu03
[<userid>@u-003-scfe03 queuetest]$ qstat
Job id           Name           User       Time Use  S  Queue
161716.icmu03    jobscript.sh   <userid>   00:00:00  R  tue-short
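
Because qsub prints the job ID, a script can capture it and query just that job. The following is a minimal sketch. The helper job_number is a hypothetical name, and the sample ID from the session above is used when qsub is not available.

```shell
#!/bin/sh
# Strip the server suffix from a job ID such as 161716.icmu03.
job_number() {
    echo "${1%%.*}"
}

if command -v qsub >/dev/null 2>&1; then
    jobid=$(qsub jobscript.sh)      # qsub prints the new job's ID
else
    jobid="161716.icmu03"           # sample ID from the session above
fi
echo "Job ID: $jobid (number: $(job_number "$jobid"))"

# Query only this job instead of listing all jobs.
if command -v qstat >/dev/null 2>&1; then
    qstat "$jobid"
fi
```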

The status of a job

The most important job states reported by qstat are:

Status   Description
Q        Queued: waiting to run
T        Transfer: the job is being moved to a new location
R        Running
C        Completed: finished, successfully or with an error
H        Held: held back
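
In scripts it can be handy to read the state letter from a qstat line and turn it into a word. The sketch below assumes the default column layout shown in the session above, where the S column is the fifth field. The function names are hypothetical.

```shell
#!/bin/sh
# Translate a qstat state letter into a short word, following the table above.
describe_state() {
    case "$1" in
        Q) echo "queued" ;;
        T) echo "transfer" ;;
        R) echo "running" ;;
        C) echo "complete" ;;
        H) echo "hold" ;;
        *) echo "unknown" ;;
    esac
}

# Extract the S column (fifth field) from one line of default qstat output.
state_from_line() {
    echo "$1" | awk '{print $5}'
}

line='161716.icmu03 jobscript.sh userid 00:00:00 R tue-short'
describe_state "$(state_from_line "$line")"   # prints "running"
```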

How to use hardware resources

The command qsub offers a -W flag, which sets additional job attributes while you are dispatching a job.

The general form is:

qsub -W attr_name=attr_value[,attr_name=attr_value...]

For a detailed description of qsub and the -W option, read "man qsub". This section discusses the attribute "depend" with the values "afterok" and "afterany". A dependency is declared by listing the job IDs it refers to, separated by colons.

depend=afterok

The attribute depend=afterok declares dependencies between multiple jobs. The dispatched job is executed only after all named jobs have finished successfully.

qsub -W depend=afterok:jobid[:jobid]

Example:

qsub -W depend=afterok:123.big.iron.com /tmp/script
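
A common pattern is to capture the ID of a first job and pass it to the dependency of a second one. The sketch below assumes two hypothetical scripts, prepare.sh and analyse.sh, and a hypothetical helper depend_arg that joins job IDs with colons. Without qsub it only prints the command it would run.

```shell
#!/bin/sh
# Join a dependency type and one or more job IDs with colons.
# The function body runs in a subshell, so the IFS change stays local.
depend_arg() (
    dep_type=$1
    shift
    IFS=:
    echo "depend=${dep_type}:$*"
)

if command -v qsub >/dev/null 2>&1; then
    first=$(qsub prepare.sh)                              # hypothetical first job
    qsub -W "$(depend_arg afterok "$first")" analyse.sh   # runs only if the first job succeeds
else
    # Dry run with the sample ID from the example above.
    echo "qsub -W $(depend_arg afterok 123.big.iron.com) analyse.sh"
fi
```

The same helper works for afterany by passing that word as the first argument.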

depend=afterany

Use the attribute depend=afterany to execute your job after another job has ended. The named job may terminate with or without an error.

qsub -W depend=afterany:jobid[:jobid]

Example:

qsub -W depend=afterany:123.big.iron.com /tmp/script

Details about the depend attribute and other optional attributes are available in "man qsub".

Hardware resource: memory

When you dispatch your job, you can declare the required resources with the -l option. A list of the available resources is documented separately. This section uses examples to describe the attributes mem and pmem, which set the memory requirements of the job.

mem=amount

With mem you set the maximum amount of physical memory that your job needs. Torque then chooses an appropriate node.

Example:

qsub -l mem=300mb /home/user/script.sh

pmem=amount

With pmem you set the maximum amount of physical memory that each single process of your job may use.

Example:

qsub -l pmem=50mb /home/user/script.sh
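
Since pmem is a per-process limit, the total memory a job can use grows with the number of processes. The sketch below works through that arithmetic for a hypothetical job with four processes. The nodes=1:ppn value and the helper name total_mem_mb are assumptions for illustration, and the script only prints the resulting command.

```shell
#!/bin/sh
# Upper bound on total memory when every process is limited by pmem.
total_mem_mb() {
    echo $(( $1 * $2 ))
}

procs=4       # hypothetical number of processes
pmem_mb=50    # per-process limit, as in the pmem example above

echo "qsub -l nodes=1:ppn=${procs},pmem=${pmem_mb}mb /home/user/script.sh"
echo "At most $(total_mem_mb "$procs" "$pmem_mb") MB in total"
```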

How can you combine multiple jobs?

Sometimes you need to run multiple jobs from one job script. You do not have to call qsub several times for this. Instead, a job array collects multiple jobs and dispatches them at once.

Job arrays

A "job array" is a collection of multiple jobs. You can reference the whole array or a single job within it.

The command for a job array is "qsub -t" or "#PBS -t" in a batch script.

With "0-4" or "0,1,2,3,4" you put multiple jobs into a job array. With "%3" you limit the number of jobs that run simultaneously.

> qsub -t 0-4%3 jobscript.sh
1098[].hostname
> qstat -t
1098[0].hostname
1098[1].hostname
1098[2].hostname
1098[3].hostname
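
Inside the job script, each job of the array can find out its own index from the environment variable PBS_ARRAYID, which Torque sets. The sketch below uses that index to pick an input file. The naming scheme input_N.dat is a hypothetical example, and the index defaults to 0 so the script can also be tried outside the batch system.

```shell
#!/bin/sh
#PBS -t 0-4%3
# Each job of the array sees its own index in PBS_ARRAYID.
# Default to 0 when the variable is unset (e.g. when run by hand).
index=${PBS_ARRAYID:-0}

# Hypothetical naming scheme: one input file per array index.
input="input_${index}.dat"
echo "Array job ${index} would process ${input}"
```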
