Job Execution
Once resources have been allocated through PBS, users can run commands serially on the head node of the allocation or in parallel across all the resources in the allocated pool.
Serial
Within a batch script or interactive batch job, commands not executed through an MPI job launcher (such as mpiexec_mpt) are executed on the batch job’s head node. Examples of serial execution include file system navigation, job set-up, and even submitting other batch jobs.
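As a minimal sketch of the serial case, the script below contains only commands that are not passed to an MPI launcher, so under the behavior described above each would run on the head node. It assumes the standard PBS variable PBS_O_WORKDIR is set inside the job; the fallback to the current directory is only there so the sketch also runs outside a batch job.

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=1

# Serial set-up: none of these commands go through mpiexec_mpt,
# so they execute on the batch job's head node.

# Move to the submission directory (assumed PBS variable; falls back
# to the current directory when run outside a batch job).
cd "${PBS_O_WORKDIR:-$PWD}"

# Record which node the serial commands ran on.
head_node=$(uname -n)
echo "Serial set-up running on head node: $head_node"
```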
Parallel
By default, commands will be executed on the head node. The mpiexec_mpt command is used to execute a job on one or more compute nodes. Frost’s layout should be kept in mind when running a job using mpiexec_mpt. Frost’s current layout consists of four quad-core sockets per node, for 16 cores per node.
The PBS nodes option requests compute nodes; the PBS ppn option requests cores per node. Entire nodes are allocated to the batch job regardless, but mpiexec_mpt uses the ppn value to determine task layout. For example:
- The following will run a job on 8 cores, 4 cores on each of 2 nodes:
    #PBS -l nodes=2:ppn=4
    mpiexec_mpt -n 8 a.out
- The following will run a job on 64 cores, 16 cores on each of 4 nodes:
    #PBS -l nodes=4:ppn=16
    mpiexec_mpt -n 64 a.out
Note: If you do not specify the number of tasks to mpiexec_mpt, the system defaults to using all cores requested through the batch request (nodes * ppn).
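The default task count in the note is simple arithmetic. The sketch below spells it out for the two requests shown above; default_tasks is a hypothetical helper written for illustration, not a PBS or MPT command.

```shell
#!/bin/bash

# Hypothetical helper: number of MPI tasks used when -n is omitted,
# following the nodes * ppn rule from the note above.
default_tasks() {
    local nodes=$1 ppn=$2
    echo $(( nodes * ppn ))
}

default_tasks 2 4     # request nodes=2:ppn=4
default_tasks 4 16    # request nodes=4:ppn=16
```

Under that rule, omitting -n in either example would launch the same number of tasks as the explicit -n 8 and -n 64 values.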
Task Layout
By default the MPI task layout is sequential.
For example,
    #PBS -lnodes=2:ppn=8
    mpiexec_mpt -n 16 a.out
will run the MPI executable a.out on a total of 16 cores, 8 cores on each of 2 compute nodes. The MPI tasks will be allocated in the following sequential fashion:
    Task: 0 on r1i0n2
    Task: 1 on r1i0n2
    Task: 2 on r1i0n2
    Task: 3 on r1i0n2
    Task: 4 on r1i0n2
    Task: 5 on r1i0n2
    Task: 6 on r1i0n2
    Task: 7 on r1i0n2
    Task: 8 on r1i0n3
    Task: 9 on r1i0n3
    Task: 10 on r1i0n3
    Task: 11 on r1i0n3
    Task: 12 on r1i0n3
    Task: 13 on r1i0n3
    Task: 14 on r1i0n3
    Task: 15 on r1i0n3
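To make the sequential rule concrete, the sketch below reproduces the mapping with a small loop: each node is filled with ppn consecutive task ranks before moving to the next node. sequential_layout is a hypothetical helper that only mimics the layout described here; it does not query PBS or MPT, and the node names are the ones from the example.

```shell
#!/bin/bash

# Hypothetical helper that mimics the sequential layout described above.
# Arguments: ppn, followed by the node names in allocation order.
sequential_layout() {
    local ppn=$1
    shift
    local task=0 node i
    for node in "$@"; do
        # Place ppn consecutive ranks on this node before moving on.
        for (( i = 0; i < ppn; i++ )); do
            echo "Task: $task on $node"
            task=$(( task + 1 ))
        done
    done
}

sequential_layout 8 r1i0n2 r1i0n3
```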
Threads
Memory is shared within a node, so threads cannot span nodes. To run a code with 2 MPI tasks and 8 threads per task, you would use the following:
    #PBS -lnodes=2:ppn=1
    export OMP_NUM_THREADS=8
    mpiexec_mpt -n 2 ./a.out
NOTE: csh/tcsh users should replace the export command with setenv OMP_NUM_THREADS 8.
The PBS option #PBS -lnodes=2:ppn=1 allocates 2 nodes and tells mpiexec_mpt to place one MPI task per node. Within the batch script, mpiexec_mpt is given 2 tasks (-n 2). The OMP_NUM_THREADS environment variable tells the system to spawn 8 threads per MPI task. The resulting placement is:
    MPI task 0 (OpenMP thread 0) on r1i0n2
    MPI task 0 (OpenMP thread 1) on r1i0n2
    MPI task 0 (OpenMP thread 2) on r1i0n2
    MPI task 0 (OpenMP thread 3) on r1i0n2
    MPI task 0 (OpenMP thread 4) on r1i0n2
    MPI task 0 (OpenMP thread 5) on r1i0n2
    MPI task 0 (OpenMP thread 6) on r1i0n2
    MPI task 0 (OpenMP thread 7) on r1i0n2
    MPI task 1 (OpenMP thread 0) on r1i0n3
    MPI task 1 (OpenMP thread 1) on r1i0n3
    MPI task 1 (OpenMP thread 2) on r1i0n3
    MPI task 1 (OpenMP thread 3) on r1i0n3
    MPI task 1 (OpenMP thread 4) on r1i0n3
    MPI task 1 (OpenMP thread 5) on r1i0n3
    MPI task 1 (OpenMP thread 6) on r1i0n3
    MPI task 1 (OpenMP thread 7) on r1i0n3
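When combining ppn with OMP_NUM_THREADS, it can help to check that the threads placed on one node do not exceed the node's cores. The sketch below is one such sanity check. It assumes 16 cores per node (four quad-core sockets, as described earlier), and threads_fit is a hypothetical helper, not something PBS or MPT provides or enforces.

```shell
#!/bin/bash

# Assumption: four quad-core sockets per node, i.e. 16 cores.
CORES_PER_NODE=16

# Hypothetical check: succeeds when ppn MPI tasks per node, each running
# the given number of threads, fit within one node's cores.
threads_fit() {
    local ppn=$1 threads=$2
    [ $(( ppn * threads )) -le "$CORES_PER_NODE" ]
}

export OMP_NUM_THREADS=8
if threads_fit 1 "$OMP_NUM_THREADS"; then
    echo "ppn=1 with $OMP_NUM_THREADS threads uses $(( 1 * OMP_NUM_THREADS )) of $CORES_PER_NODE cores per node"
else
    echo "ppn=1 with $OMP_NUM_THREADS threads would oversubscribe a node" >&2
fi
```

With the values from the example (ppn=1, 8 threads), each node runs 8 threads on its 16 cores, so the check passes.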
