To submit jobs to the queueing system, log in to zbitmaster via ssh. If your script needs bash syntax, add the following as the first line of the script: #$ -S /bin/bash
Choose the right queue for your job: for long-running jobs (more than 8 hours), choose long.q; for short-running jobs (less than 8 hours), choose the default queue all.q. long.q uses zbitnode05 to zbitnode24, whereas all.q uses zbitnode05 to zbitnode09. zbitnode01 to zbitnode04 are for interactive use only.
Check the status of the desired queue with qstat -f -q <queue-name>.
Submit your job(s) with qsub -q <queue-name> <job-script>.
You can check the status of your job with qstat -f -u <user-name>.
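The steps above can be sketched as a small dry run. This is only an illustration: the function name pick_queue, the file name hello_job.sh and the 12-hour runtime are hypothetical, and sending a job of exactly 8 hours to long.q is an assumption, since the text only covers "more than" and "less than" 8 hours. The commands are printed rather than executed, so the sketch also works off the cluster.

```shell
# Pick a queue from the expected runtime in whole hours.
# Assumption: a job of exactly 8 hours is sent to long.q.
pick_queue() {
    local hours=$1
    if [ "$hours" -ge 8 ]; then
        echo "long.q"
    else
        echo "all.q"
    fi
}

# Write a tiny job script; its first line selects bash as described above.
# JOB_ID is set by grid engine at run time; "unknown" is a fallback elsewhere.
job_script="${TMPDIR:-/tmp}/hello_job.sh"
cat > "$job_script" <<'EOF'
#$ -S /bin/bash
echo "Job ${JOB_ID:-unknown} running on $(uname -n)"
EOF

queue=$(pick_queue 12)

# Print the commands instead of running them.
echo "qstat -f -q $queue"
echo "qsub -q $queue $job_script"
echo "qstat -f -u ${USER:-your-user-name}"
```

On zbitmaster you would run the printed commands directly instead of echoing them.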
We currently have OpenMPI 1.4.2 integrated into grid engine. OpenMPI is installed in /share/opt/x86_64_sl5/openmpi-1.4.2. The mpirun binary is in the default PATH, and the MPI libraries are in the default LD_LIBRARY_PATH. To submit MPI jobs, you have to pass the parallel environment (PE) orte and the number of job slots to the qsub command:
qsub -pe orte <no_of_slots> <your_job_script.sh>
All MPI jobs are automatically submitted to long.q. You can request at most 120 MPI job slots, because that is the total number of slots in long.q. In <your_job_script.sh> you have to call mpirun with the MPI-enabled executable of the program you want to use. You don't have to pass the -np option to mpirun, because the number of program copies to run has already been specified in the qsub command with <no_of_slots>.
To submit multi-threaded jobs that are supposed to run on a single node, you have to specify the parallel environment (PE) smp as well as the number of threads you will use. As each node has 8 cores, you can specify a maximum of 8 threads per job. The PE smp is available for both long.q and all.q. Your qsub command would look something like this:
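A minimal sketch of a guard around the MPI submission, assuming the 120-slot limit stated above. The function name mpi_submit_cmd and the script name my_mpi_job.sh are hypothetical; the function only prints the qsub command, so it can be tried without grid engine.

```shell
# Maximum number of MPI slots, taken from the text above.
MAX_MPI_SLOTS=120

# Print the qsub command for an MPI job, or fail if the slot count is out of range.
mpi_submit_cmd() {
    local slots=$1 script=$2
    if [ "$slots" -lt 1 ] || [ "$slots" -gt "$MAX_MPI_SLOTS" ]; then
        echo "slots must be between 1 and $MAX_MPI_SLOTS" >&2
        return 1
    fi
    echo "qsub -pe orte $slots $script"
}

# Inside my_mpi_job.sh (hypothetical) the call would simply be:
#   mpirun ./my_mpi_program
# without -np, since the slot count comes from the qsub command.
mpi_submit_cmd 16 my_mpi_job.sh
```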
qsub -pe smp <no_of_threads> -q <queue_name> <your_job_script.sh>
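Inside the job script, the granted thread count does not have to be repeated by hand. Grid engine exports it as NSLOTS, so a script might look like the sketch below. It assumes an OpenMP-based program that reads OMP_NUM_THREADS; other threading libraries need their own setting, and my_threaded_program is a hypothetical executable.

```shell
#$ -S /bin/bash
# NSLOTS is set by grid engine to the number of slots granted via -pe smp.
# Fall back to 1 so the script also runs outside the queueing system.
threads=${NSLOTS:-1}

# Assumption: the program is OpenMP-based and honours OMP_NUM_THREADS.
export OMP_NUM_THREADS=$threads
echo "Running with $OMP_NUM_THREADS thread(s)"

# ./my_threaded_program   # hypothetical executable, uncomment on the cluster
```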
Certain computing resources (e.g. amount of RAM, CPU time) can be reserved in advance, and such reservations can be referred to when submitting a job. To reserve all RAM on zbitnode13 or zbitnode14 for a job in long.q, submit an advance reservation (AR) with:
qrsub -u yourUsername -d 10:00:00 -N myARname -q "long.q@zbitnode13,long.q@zbitnode14" -l h_vmem=32G
The above reservation can only be claimed by yourUsername and is valid for 10 hours. To claim the reservation, find your AR-ID with qrstat and pass the ID with the -ar switch when submitting your job with qsub. For a list of the most common resources that can be reserved in advance, see below. More resources can be found in man complex and with qconf -sc.
|-l h_rt||Amount of wall time (hh:mm:ss).|
|-l h_vmem||Amount of memory (in K, M or G units).|
|-l h_fsize||Amount of disk space (in K, M or G units).|
|-l h_stack||Amount of stack memory (in K, M or G units).|
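As a rough illustration of how an AR-ID and the resource switches listed above fit together on one qsub command line, the following sketch prints such a command. The function name ar_submit_cmd, the AR-ID 123, the script name and the requested values are all hypothetical; the real ID has to be taken from qrstat.

```shell
# Print a qsub command that claims an advance reservation and requests resources.
# $1: AR-ID as shown by qrstat, $2: job script, $3: comma-separated resource list.
ar_submit_cmd() {
    local ar_id=$1 script=$2 resources=$3
    echo "qsub -ar $ar_id -l $resources $script"
}

# Hypothetical values: 2 hours of wall time and 4G of memory.
ar_submit_cmd 123 my_job.sh h_rt=02:00:00,h_vmem=4G
```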