Submit a batch job
The batch system, Torque, offers the following commands for job management.
command | description |
---|---|
qsub | dispatches a job |
qstat | shows a job's status |
qalter | edits job attributes |
qdel | deletes a job |
qsig | sends a signal to the job |
The most important function of the batch system is dispatching a job to the Moab scheduler, which is done with qsub. The qsub options are described below:
option | description |
---|---|
-d path | sets the working directory. If a working directory is not specified, the current directory will be used. |
-q destination | selects the queue. This is not required because the batch system chooses a suitable queue according to the requested resources. |
-I | starts an interactive job (capital "i"). This is essentially the same as logging in to a compute node via ssh. |
-l resource_list | specifies the resources the job requires (lowercase "L"), e.g. number of nodes, number of cores, walltime, memory. |
-M list | declares a list of e-mail addresses that receive notifications from the batch system. |
-m mail_options | specifies when an e-mail is sent: a=abort, b=begin, e=end. To be notified when a job is aborted, begins, or ends, use "-m abe". |
-N name | sets the job name. By default the job is named after the script. |
-v variable_list | passes the listed environment variables (comma-separated, optionally as name=value) to the job's environment. |
-V | passes all environment variables of the current shell to the job's environment. |
A detailed description of the qsub options can be found in the qsub man page: "man qsub".
Dispatch a simple job
To learn how to write a simple job script, click here.
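As a rough orientation, a job script is an ordinary shell script whose leading "#PBS" comment lines carry the same options that qsub accepts on the command line. The following is a minimal sketch only: the job name, the resource values, and the printed message are placeholders, and nodes, ppn and walltime are assumed to be the resource names in use on this cluster.

```shell
#!/bin/bash
# Hypothetical jobscript.sh; all values below are placeholders.
# Job name (same as "qsub -N"):
#PBS -N example_job
# One core on one node, at most ten minutes of wall-clock time (same as "qsub -l"):
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:10:00

# PBS_O_WORKDIR is the directory from which qsub was called. The fallback to
# the current directory only lets the script run outside the batch system too.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

# PBS_JOBNAME and PBS_JOBID are set by the batch system inside a running job.
job_label="${PBS_JOBNAME:-example_job} (${PBS_JOBID:-no job ID})"
echo "Running $job_label on $(uname -n) in $workdir"
```

The "#PBS" lines are plain comments to the shell, so the script can be tested locally before it is dispatched.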
When you have written your script, dispatch it with "qsub jobscript.sh". It returns a job ID.
Check the status of your job with "qstat".
[<userid>@u-003-scfe03 queuetest]$ qsub jobscript.sh
161716.icmu03
[<userid>@u-003-scfe03 queuetest]$ qstat
Job id           Name          User      Time Use  S  Queue
161716.icmu03    jobscript.sh  <userid>  00:00:00  R  tue-short
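Because qsub prints the job ID on standard output, the two steps can be combined in a small helper. This is a sketch under the assumption that it runs on a login node where qsub and qstat are available; the function name submit_and_show is made up for illustration.

```shell
#!/bin/bash
# submit_and_show SCRIPT: dispatch SCRIPT, keep the job ID, and query its status.
submit_and_show() {
    local script="$1" job_id
    # qsub writes the job ID (for example 161716.icmu03) to standard output.
    job_id=$(qsub "$script") || return 1
    echo "Submitted $script as $job_id"
    # Passing the job ID restricts the qstat listing to this one job.
    qstat "$job_id"
}

# Usage on a login node:
#   submit_and_show jobscript.sh
```

Keeping the job ID in a variable is also what makes it possible to pass it on to qdel, qalter or qsig later.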
The status of a job
The most important job states reported by qstat are:
Status | Description |
---|---|
Q | Queued: waiting |
T | Transfer: the job is being moved to a new location |
R | Running |
C | Complete: completed successfully or with error |
H | Held: the job is held back and will not start until it is released |
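In scripts it can be handy to read the state letter directly and translate it. The sketch below assumes that "qstat -f" prints a line of the form "job_state = R" for the given job; the function names job_state and describe_state are hypothetical helpers, not part of Torque.

```shell
#!/bin/bash
# job_state JOB_ID: print the single-letter state of a job.
# Assumes "qstat -f" output contains a line like "    job_state = R".
job_state() {
    qstat -f "$1" | awk '$1 == "job_state" { print $3 }'
}

# describe_state LETTER: translate a state letter from the table above.
describe_state() {
    case "$1" in
        Q) echo "queued" ;;
        T) echo "being transferred" ;;
        R) echo "running" ;;
        C) echo "completed" ;;
        H) echo "held" ;;
        *) echo "unknown state: $1"; return 1 ;;
    esac
}

# Usage on a login node:
#   describe_state "$(job_state 161716.icmu03)"
```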
How to use hardware resources
The command qsub offers a -W flag, which sets additional job attributes while you are dispatching a job.
The general form is:
qsub -W attr_name=attr_value[,attr_name=attr_value...]
For a detailed description of qsub and the -W option, see man qsub. This section discusses the attribute "depend" with the values "afterok" and "afterany". Dependencies are usually declared as colon-separated job IDs.
depend=afterok
With the attribute depend=afterok you can declare dependencies between multiple jobs. The dispatched job is executed only after all named jobs have finished successfully.
qsub -W depend=afterok:jobid[:jobid]
Example:
qsub -W depend=afterok:123.big.iron.com /tmp/script
Here you will find information in detail.
depend=afterany
Use the attribute depend=afterany to execute your job after another job has ended. The named job may terminate with or without an error.
qsub -W depend=afterany:jobid[:jobid]
Example:
qsub -W depend=afterany:123.big.iron.com /tmp/script
Details about the depend attribute and about other optional attributes are available here.
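In practice the job ID of the first job is usually not typed by hand but taken from the output of qsub. The following sketch chains two jobs this way. The helper names depend_arg and run_chain and the script names in the usage comment are hypothetical; only "qsub -W depend=..." comes from the text above.

```shell
#!/bin/bash
# depend_arg TYPE JOB_ID...: build the value for "qsub -W" by joining
# the job IDs with colons, e.g. depend=afterok:123.big.iron.com
depend_arg() {
    local type="$1"
    shift
    local IFS=":"
    # With IFS set to a colon, "$*" joins the remaining arguments with colons.
    echo "depend=${type}:$*"
}

# run_chain FIRST SECOND: dispatch FIRST, then dispatch SECOND so that it
# starts only after FIRST has finished successfully.
run_chain() {
    local first_id
    first_id=$(qsub "$1") || return 1
    qsub -W "$(depend_arg afterok "$first_id")" "$2"
}

# Usage on a login node (script names are placeholders):
#   run_chain prepare.sh analyse.sh
```

Replacing afterok with afterany in run_chain would start the second job regardless of how the first one ended.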
Hardware resource: memory
When you dispatch your job you can declare required resources with the -l option. You can find a list of the available resources here. This section uses examples to describe the attributes mem and pmem, which set the memory requirements of a job.
mem=amount
With mem you set the maximum amount of physical memory that your job as a whole will need. Torque then chooses an appropriate node.
Example:
qsub -l mem=300mb /home/user/script.sh
pmem=amount
With pmem you set the maximum amount of physical memory that each single process of your job will need.
Example:
qsub -l pmem=50mb /home/user/script.sh
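The difference between the two attributes can be illustrated with a small job script. This is only a sketch: the request for four cores (nodes=1:ppn=4) and the assumption that the job starts exactly one process per core are not taken from the text above.

```shell
#!/bin/bash
# Sketch of a job script; ppn=4 and one process per core are assumptions.
#PBS -l nodes=1:ppn=4
#PBS -l pmem=50mb

processes=4     # must match ppn above
pmem_mb=50      # must match pmem above

# pmem limits each process separately, so under the assumptions above
# the job as a whole may use up to processes * pmem:
total_mb=$((processes * pmem_mb))
echo "${processes} processes x ${pmem_mb}mb = ${total_mb}mb in total"
```

Requesting mem=200mb instead would limit the job as a whole rather than each process individually.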
How can you combine multiple jobs?
Sometimes you need to run many similar jobs from one job script.
For that, you do not have to call qsub multiple times. Instead, a job array bundles multiple jobs and dispatches them at once.
Job arrays
A "job array" is a collection of multiple jobs. It is possible to reference the whole array or only one particular job in the array.
The option for a job array is "qsub -t" or "#PBS -t" in a batch script.
With "0-4" or "0,1,2,3,4" you put multiple jobs into a job array. With the suffix "%3" you limit the number of jobs that run simultaneously.
> qsub jobscript.sh -t 0-4%3
1098[].hostname
> qstat -t
1098[0].hostname
1098[1].hostname
1098[2].hostname
1098[3].hostname
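Inside the job script, each job of the array can find out its own index through the environment variable PBS_ARRAYID, which Torque sets for array jobs. The sketch below uses it to pick a separate input file per job. The file naming scheme and the job name are made up for illustration.

```shell
#!/bin/bash
# Sketch of an array job script; job name and file names are placeholders.
#PBS -N array_example
#PBS -t 0-4%3

# Torque sets PBS_ARRAYID to the index of the current job in the array
# (0 to 4 here). The default of 0 only allows a test run outside the batch system.
index="${PBS_ARRAYID:-0}"

# Hypothetical naming scheme: every job of the array reads its own input file.
input_file="input_${index}.dat"
echo "Array job ${index} would process ${input_file}"
```

With the "#PBS -t" line in the script, a plain "qsub jobscript.sh" is enough to dispatch the whole array.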
Here you can read more detailed information.