Common Login System



The XT common external login nodes provide a single system external to each XT partition that allows users to access data, compile, and submit batch jobs regardless of the partition’s state.

The login system consists of four nodes. Each node is a dual-socket, 4-core Opteron system with 64 GB of memory.

Interaction with the XT partitions through the common external login system is very similar to that of each partition’s internal login nodes. This page lists many of the notable similarities and differences between the common login and each partition’s login nodes.




Access


Users can access the XT common external login system through:

  • jaguar-ext.ccs.ornl.gov

Direct access to individual XT partitions will continue through each partition’s internal login nodes:

  • jaguar
  • jaguarpf

More information on accessing the XTs can be found on the NCCS general access page.




Filesystems


Most filesystems available from each partition’s internal login nodes are also available from the common login nodes. This section lists the availability of common filesystems.

  • home, software, project areas
    The NFS-mounted home (/ccs/home/$USER), software (/sw), and project (/ccs/proj) areas are available from each system as well as the common login nodes.
  • Center Wide Lustre Filesystem
    The center-wide shared Lustre filesystem (spider) is available from each primary NCCS system as well as the common login nodes. Each user’s /tmp/work/$USER link points to the common spider area on Jaguar (xt4), Jaguar (xt5), Lens, Smoky, Ewok, the dtn systems, and the common login system.
  • XT4 local Lustre (Not Available)
    The XT4 local Lustre areas (/lustre/scr72a, /lustre/scr72b, and /lustre/scr144) are visible from the XT4 partition only. The areas are NOT mounted on the common login system.
  • HPSS
    The HPSS can be accessed from the common login system through the hsi and htar utilities.


More information on the available filesystems can be found on the XT filesystems page.








Compilation


Using the Cray-provided compiler wrappers cc, CC, and ftn, codes can be built for the XT compute nodes on the common login system just as they are on the XT4 and XT5 internal login nodes.

Note:
Because hardware differs between the XT4 and XT5 partitions, binaries should be built for a specific partition. To allow users to easily mimic the environment available on each XT partition, the following modules are provided:

  • xt4
  • xt5

The xt5 module is loaded by default for all users upon login. To build for the XT4 partition, you can swap the xt5 module for the xt4 module:

module swap xt5 xt4


More information on building codes for the XTs can be found on the XT compiling page.








Batch System


Each XT partition’s batch queue can be viewed and accessed from the common login system. This section lists some of the notable differences in batch job submission and batch queue interaction.


Batch Jobs

Batch jobs can be submitted onto the XT4 or XT5 partition through batch scripts and interactive batch jobs.

Users must specify on which partition a batch job should run. This should be done using the PBS -l partition flag. For example:

Batch script:
  • #PBS -l partition=jaguar
  • #PBS -l partition=jaguarpf
Interactive:
  • -l partition=jaguar
  • -l partition=jaguarpf

Outside the requirement to specify a partition, batch scripts and submission options used on each partition’s internal login nodes can be used on the common login system.
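
As a minimal sketch of the partition requirement, the shell function below builds the -l partition option and rejects any name other than the two partitions listed on this page. The function name partition_opt and the file name test.pbs are illustrative assumptions, not site-provided tools.

```shell
# partition_opt NAME: print the qsub resource option for a partition,
# accepting only the two partition names listed on this page.
# (Hypothetical helper, not part of the XT software environment.)
partition_opt() {
    case "$1" in
        jaguar|jaguarpf)
            printf '%s\n' "-l partition=$1"
            ;;
        *)
            echo "unknown partition: $1" >&2
            return 1
            ;;
    esac
}

# Possible usage on the common login system (test.pbs is a placeholder;
# -I is assumed to be the standard PBS flag for interactive jobs):
#   qsub $(partition_opt jaguarpf) test.pbs
#   qsub -I $(partition_opt jaguar)
partition_opt jaguarpf   # prints: -l partition=jaguarpf
```

Checking the name before calling qsub is only a convenience. The batch system remains the authority on which partitions exist.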


More information on XT batch jobs can be found on the XT batch scripts and interactive batch pages.






Submitting Batch Jobs

As on each XT partition’s internal login nodes, batch jobs can be submitted to each partition’s batch queue from the common login system using qsub.

Once submitted, the batch job must be migrated from the common login system to the target partition. This process is internal, but it is worth understanding. The following steps briefly explain the migration:

  1. Upon successful submission through qsub, the job enters the common login’s batch system and is given a grid identifier.

    • jaguar-ext4 > qsub test.pbs
      jaguar-grid.368
      jaguar-ext4 >
      
    • At this point, the job will only be visible through showq. The job will not be visible through qstat.
  2. The common login system will then attempt to contact the target partition to migrate the batch job to the partition.
    • The batch job cannot be migrated until the target partition is available.
    • If the target partition is unavailable, the common login will continue to attempt to migrate the job.
  3. Once successfully migrated, a PBS job identifier will be issued to the job.
    • The batch job will not be eligible for execution until a PBS identifier has been issued.
    • Once the job has been successfully migrated to the target partition, it will be visible through both showq and qstat on the common login system and the target partition’s login nodes.
    • The showq utility can be used to see whether a job has been issued a PBS identifier.
    • For example,
      jaguar-ext4 > showq -v
      ...
      JOBID                   USERNAME      STATE PROCS     WCLIMIT            QUEUETIME            PARTITION
      jaguar-grid.343           user1       Idle 12000    00:05:00  Mon Nov  9 09:34:07
      jaguar-grid.357/713253    user1   Migrated     4    00:05:00  Mon Nov  9 12:26:52               jaguar
      ...
      jaguar-ext4>
      

      In the above example,

      jaguar-grid.343
        • The job only has a grid identifier and has not been migrated to the target partition.
        • The job does not have a PBS identifier.
        • The job is not eligible for execution.
      jaguar-grid.357
        • The job has a PBS identifier, 713253.
        • The job has been migrated to the remote partition and is eligible for execution.
        • The job can also be seen through qstat using the PBS identifier 713253.
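
The check described in the example can be scripted. The sketch below assumes that the showq -v layout matches the example, with the first column holding either a grid identifier alone or a grid identifier and a PBS identifier separated by a slash. The function name pbs_id_for is hypothetical, and the sample lines are copied from the example rather than taken from a live system.

```shell
# Sample `showq -v` job lines copied from the example on this page.
sample_showq='jaguar-grid.343           user1       Idle 12000    00:05:00  Mon Nov  9 09:34:07
jaguar-grid.357/713253    user1   Migrated     4    00:05:00  Mon Nov  9 12:26:52               jaguar'

# pbs_id_for GRID_ID: read `showq -v` output on stdin and print the PBS
# identifier for GRID_ID. Prints nothing if the job has not been migrated.
# (Hypothetical helper; assumes the JOBID column is GRID_ID or GRID_ID/PBS_ID.)
pbs_id_for() {
    awk -v id="$1" '{
        n = split($1, parts, "/")
        if (parts[1] == id && n == 2) print parts[2]
    }'
}

printf '%s\n' "$sample_showq" | pbs_id_for jaguar-grid.357   # prints 713253
printf '%s\n' "$sample_showq" | pbs_id_for jaguar-grid.343   # prints nothing
```

On the common login system the same function could read live output, for example showq -v | pbs_id_for jaguar-grid.357. Empty output would suggest that the job is still waiting to be migrated.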






Viewing the Batch Queue

As on each XT's internal login nodes, the showq and qstat utilities can be used to view the batch queue.

In addition to the standard options, the following options help display jobs on the different partitions.

showq
  • Without arguments showq will display all jobs on both partitions.
  • Without arguments showq will not display on which partition each job is running or scheduled to run.
  • Without arguments showq will not display the PBS identifier.
  • Note: The displayed available and used nodes/cores are a combined total for both partitions.
showq -v
  • The -v flag provides the partition on which each job is running or is scheduled to run as well as the PBS identifier if one exists.
  • Note: The displayed available and used nodes/cores are a combined total for both partitions.
showq -p partition
  • Displays jobs running or scheduled to run only on the specified partition.
  • The displayed available and used nodes/cores are for the given partition only.
  • This is similar to the output you would see by running showq on the specified partition's login node.
  • For example:
    • showq -v -p jaguar
    • showq -v -p jaguarpf





qstat
  • Shows only jobs migrated onto the partition that corresponds to the currently loaded xt module (xt4 or xt5).
qstat -p partition
  • Shows jobs on the given partition.
  • For example:
    • qstat -p jaguar
    • qstat -p jaguarpf
qstat jobid
  • Queries the system on which Moab shows the given job running.
  • For example:
    • qstat 713253
    • qstat jaguar-grid.357


More information on monitoring the XT batch queues can be found on the XT monitoring job status page.






Altering Batch Jobs

As on each partition's login nodes, qalter, qdel, qhold, and qrls can be used to alter a submitted batch job or a submitted batch job's state.

Please Note:
If the partition to which the batch job has been migrated is unavailable, the job cannot be altered with the above PBS utilities. The systemstatus utility can be used to determine the remote partition's state. If the partition is unavailable, the Moab utility mjobctl can be used instead. Its syntax differs considerably from that of the PBS commands; more information can be found on the mjobctl man page.


More information on altering XT batch jobs can be found on the XT altering batch jobs page.




Remote Partition's Status


You can check the status of the individual partitions from any NCCS system without having to visit the System Status page. Simply run systemstatus:

jaguar-ext2> systemstatus

The XT4 partition is UP.
The XT5 partition is UP.

jaguar-ext2>
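
A script can use this output to decide whether the PBS utilities are likely to work. The sketch below assumes that systemstatus prints one line per partition in exactly the form shown above. The wording for a partition that is not up is not shown on this page, so the function only recognizes the UP form. The function name partition_is_up is hypothetical.

```shell
# Sample output copied from the systemstatus example on this page.
sample_status='The XT4 partition is UP.
The XT5 partition is UP.'

# partition_is_up NAME: read systemstatus output on stdin and succeed only
# if a line of the form "The NAME partition is UP." is present.
# (Hypothetical helper; any other wording is treated as "not up".)
partition_is_up() {
    grep -q "^The $1 partition is UP\.\$"
}

if printf '%s\n' "$sample_status" | partition_is_up XT5; then
    echo "XT5 is up: qalter, qdel, qhold, and qrls should be usable"
else
    echo "XT5 is not reported as up: see the mjobctl man page"
fi
```

On a login node the sample text would be replaced by the live command, for example systemstatus | partition_is_up XT4.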