Job Execution through PBS

Once resources have been allocated through PBS, users can run commands serially on the head node of the allocated resources or in parallel across all the resources in the allocated resource pool.

Serial

Batch Script

The executable portion of batch scripts is interpreted by the shell specified on the first line of the script. If a shell is not specified, the submitting user’s default shell will be used. This portion of the script may contain comments, shell commands, executable scripts, and compiled executables. These can be used in combination to, for example, navigate file systems, set up job execution, run executables, and even submit other batch jobs.
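
As an illustration of the pieces described above, a minimal batch script might look like the following sketch. The mppe request and the aprun line are taken from the examples later on this page. The use of PBS_O_WORKDIR (the directory from which the job was submitted, set by PBS) and the check for aprun are assumptions added so the script also runs harmlessly outside Phoenix; they are not site requirements.

```shell
#!/bin/sh
# The first line selects the shell that interprets the executable portion.
#PBS -l mppe=256

# Shell command: move to the submission directory.
# Assumption: PBS sets PBS_O_WORKDIR; fall back to the current directory elsewhere.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

# Compiled executable: launch it only where aprun is available (assumed guard).
if command -v aprun >/dev/null 2>&1; then
    aprun -n 256 a.out
else
    echo "aprun not found; skipping launch"
fi
```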

Batch Interactive

While running in interactive mode, the submitting user’s default shell will be used.

Parallel

By default, commands will only be executed on the head processor from the group of allocated resources. To run a command on each resource in the job’s allocated resource pool, the aprun command should be used.

Using MSP mode

For executables compiled in MSP mode, aprun will run the given executable on the given number of MSPs. For example, if a.out was compiled in MSP mode,

aprun -n 256 a.out

will run a.out on 256 MSPs.

Before running this job, at least 256 MSPs should be allocated through PBS. For example, the following line could be used in your PBS batch script:

#PBS -l mppe=256

See the C and Fortran sections of the Phoenix compiling page for more information on compiling a code in MSP mode.

The file command can be used to determine if an executable was compiled in MSP mode. For example,

> file a.out
 a.out: ELF 64-bit MSB executable (not stripped) MSP application NV1 - version 1
>

shows that a.out was compiled in MSP mode.

Please note that the file command must be executed from Phoenix.

Using SSP mode

For executables compiled in SSP mode, aprun will run the given executable on the given number of SSPs. For example, if a.out was compiled in SSP mode,

aprun -n 256 a.out

will run a.out on 256 SSPs.

Before running this job, at least 64 MSPs should be allocated through PBS. For example, the following could be added to your PBS batch script:

#PBS -l mppe=64
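
The figure of 64 follows from each MSP containing four SSPs. The arithmetic can be sketched as a small shell helper. The function name msps_for_ssps is hypothetical (it is not a Phoenix or PBS command), and rounding up for SSP counts that are not a multiple of four is an assumption.

```shell
# Hypothetical helper: number of MSPs to request for a given number of SSP-mode tasks.
# Assumes four SSPs per MSP and rounds up so a partly used MSP is still allocated.
msps_for_ssps() {
    ssps=$1
    echo $(( (ssps + 3) / 4 ))
}

msps=$(msps_for_ssps 256)
echo "#PBS -l mppe=$msps"    # prints: #PBS -l mppe=64
echo "aprun -n 256 a.out"
```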

See the C and Fortran sections of the Phoenix compiling page for more information on compiling a code in SSP mode.

The file command can be used to determine if an executable was compiled in SSP mode. For example,

> file a.out
 a.out: ELF 64-bit MSB executable (not stripped) SSP application NV1 - version 1
>

shows that a.out was compiled in SSP mode.

Please note that the file command must be executed from Phoenix.
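
A script that needs to choose between MSP and SSP settings could inspect the text printed by file. The following is a sketch only: the function name exec_mode is hypothetical, and it assumes the output contains the phrase "MSP application" or "SSP application" exactly as in the sample output on this page.

```shell
# Hypothetical helper: classify an executable from the text printed by `file`.
# Assumes the wording matches the sample output shown on this page.
exec_mode() {
    case "$1" in
        *"MSP application"*) echo "MSP" ;;
        *"SSP application"*) echo "SSP" ;;
        *)                   echo "unknown" ;;
    esac
}

# On Phoenix this would be called as: exec_mode "$(file a.out)"
exec_mode "a.out: ELF 64-bit MSB executable (not stripped) SSP application NV1 - version 1"
```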

Memory

On Phoenix, each node contains 8 GB of memory. A brief overview of a node is provided below.

  • One node contains four MSPs.
  • One MSP contains four SSPs.

                          Node
MSP 0          MSP 1          MSP 2          MSP 3
SSP 0  SSP 1   SSP 0  SSP 1   SSP 0  SSP 1   SSP 0  SSP 1
SSP 2  SSP 3   SSP 2  SSP 3   SSP 2  SSP 3   SSP 2  SSP 3

The four MSPs on a node share the node’s 8 GB of memory. By default, a job runs on every MSP or SSP of its allocated nodes. For example, if a.out was compiled in MSP mode, the following job would run on the four MSPs of a single node, and its four tasks would share that node’s 8 GB of memory:

#PBS -l mppe=4
aprun -n 4 a.out

The following aprun command can be used to run the MSP mode executable a.out on four MSPs spread across four nodes, one MSP per node, allowing each MPI task to use almost 8 GB:

#PBS -l mppe=16
aprun -n 4 -m 8000M a.out

To use all of the memory on a node, you must request the required number of entire nodes through PBS. In this case, that is four nodes, or 16 MSPs.
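
The node arithmetic above can be sketched in shell. The function name mppe_for_whole_nodes is hypothetical, and the sketch assumes one task per node and four MSPs per node, as in the example.

```shell
# Hypothetical helper: MSPs to request so that each task has a whole node to itself.
# Assumes four MSPs per node and one task per node.
mppe_for_whole_nodes() {
    tasks=$1
    echo $(( tasks * 4 ))
}

tasks=4
echo "#PBS -l mppe=$(mppe_for_whole_nodes "$tasks")"    # prints: #PBS -l mppe=16
echo "aprun -n $tasks -m 8000M a.out"
```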

Because the system requires a small amount of memory on each node, you cannot request all of a node’s memory. If a job requested the full 8 GB with the -m option, it would sit in a posted state and never run. Requesting 8,000 MB reserves almost all of the node’s memory that is available to user processes.