Petascale Computing on Jaguar

With a peak speed of 2.33 petaflops (over two thousand trillion calculations per second), "Jaguar," a Cray XT5 supercomputer located at Oak Ridge Leadership Computing Facility (OLCF), is the world's fastest supercomputer for unclassified research. Capable of simulating physical systems with heretofore unfeasible speed and accuracy—from the explosions of stars to the building blocks of matter—Jaguar has led OLCF into the era of petascale computing and beyond.

In the first half of 2009, OLCF explored the uncharted territory of petascale scientific supercomputing by inviting 28 leading research teams from around the world to participate in a six-month program of early petascale science using Jaguar. Using more than 355 million combined processor hours, these research teams delivered breakthrough scientific discoveries in climate science, chemistry, materials science, nuclear energy, physics, bioenergy, astrophysics, geosciences, fusion, and combustion. The research included climate models of unprecedented resolution, calculations of the flux of uranium into the Columbia River from aging underground storage facilities, and in-depth studies about impediments to producing bioethanol from plant material.

The results from these early petascale projects assured OLCF leaders that Jaguar was primed for its next use by the more than 38 research teams awarded computational hours on Jaguar through the Department of Energy's 2009 Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program (http://www.sc.doe.gov/ascr/incite). For more information about the early petascale science projects, visit http://www.nccs.gov/leadership-science/petascale-early-science.

As 2009 draws to a close, OLCF is poised to continue leading the world in computationally intensive research. The upgrade of Jaguar to six-core processors, made possible with funds from the American Recovery and Reinvestment Act of 2009, brings the total number of compute cores in the XT5 component to 224,256 (http://www.nccs.gov/supercomputing-resources/jaguar/#XT5-6-Core-Upgrade). Using an InfiniBand network to unite a Cray XT4 component with the upgraded XT5 component, Jaguar has a total of over 255,000 compute cores. With unmatched speed, memory, and bandwidth to disks and networks, Jaguar is ready once again to provide unparalleled computational ability to researchers, engineers, and computational scientists around the world.

Anatomy of Jaguar

The Jaguar system consists of an 84-cabinet quad-core Cray XT4 system and 200 upgraded Cray XT5 cabinets using six-core processors. The XT4 has 8 gigabytes of memory per node while the XT5 has 16 gigabytes per node, giving users a total of 362 terabytes of high-speed memory in the combined system. The two systems are connected to the Scalable I/O Network (SION), which links them to each other and to the Spider file system. The XT5 system has 256 service and I/O nodes providing up to 240 gigabytes per second of bandwidth to SION and 200 gigabits per second to external networks. The XT4 has 116 service and I/O nodes providing 44 gigabytes per second of bandwidth to SION and 100 gigabits per second to external networks.
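The 362-terabyte memory figure can be checked with a few lines of arithmetic. The sketch below is illustrative only: the per-node memory sizes come from the paragraph above, but the node counts (18,688 XT5 nodes and 7,832 XT4 nodes) are assumptions that are not stated in this text.

```python
# Per-node memory sizes are taken from the text above.
XT5_GB_PER_NODE = 16
XT4_GB_PER_NODE = 8

# ASSUMPTION: node counts are not given in the text; these values are
# supplied only to illustrate the arithmetic.
XT5_NODES = 18_688
XT4_NODES = 7_832


def total_memory_tb(xt5_nodes=XT5_NODES, xt4_nodes=XT4_NODES):
    """Return combined system memory in decimal terabytes (1 TB = 1,000 GB)."""
    total_gb = xt5_nodes * XT5_GB_PER_NODE + xt4_nodes * XT4_GB_PER_NODE
    return total_gb / 1000


if __name__ == "__main__":
    print(f"Combined memory: {total_memory_tb():.0f} TB")
```

Under these assumed node counts, the XT5 segment contributes roughly five sixths of the combined memory.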

There are four nodes on both the XT4 and XT5 boards. The XT4 nodes have a single AMD quad-core Opteron 1354 "Budapest" processor coupled with 8 gigabytes of DDR2-800 memory. The XT5 is a double-density version of the XT4. It has 3.7 times the processing power and twice the memory and memory bandwidth on each node. The XT5 node has two Opteron 2435 "Istanbul" processors linked with dual HyperTransport connections. Each Opteron has 8 gigabytes of directly attached DDR2-800 memory. The result is a dual-socket, twelve-core node with 16 gigabytes of shared memory and a peak processing performance of 125 gigaflops.
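The 125-gigaflop node peak and the 3.7-times ratio follow from a simple product of sockets, cores, clock rate, and floating-point operations per cycle. The sketch below makes that calculation explicit. The clock rates (2.6 GHz for the XT5 processors, 2.1 GHz for the XT4 processors) and the figure of 4 double-precision operations per core per cycle are assumptions, since none of them is stated in the text.

```python
# ASSUMPTION: double-precision floating-point operations per core per clock.
FLOPS_PER_CYCLE = 4


def node_peak_gflops(sockets, cores_per_socket, clock_ghz,
                     flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak of one node in gigaflops."""
    return sockets * cores_per_socket * clock_ghz * flops_per_cycle


# Socket and core counts are from the text; clock rates are ASSUMED.
xt5_peak = node_peak_gflops(sockets=2, cores_per_socket=6, clock_ghz=2.6)
xt4_peak = node_peak_gflops(sockets=1, cores_per_socket=4, clock_ghz=2.1)

if __name__ == "__main__":
    print(f"XT5 node peak: {xt5_peak:.1f} gigaflops")
    print(f"XT4 node peak: {xt4_peak:.1f} gigaflops")
    print(f"XT5/XT4 ratio: {xt5_peak / xt4_peak:.1f}")
```

With these assumed clock rates, the XT5 node comes out just under 125 gigaflops and the ratio rounds to 3.7, which agrees with the figures quoted above.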

Each node runs Cray's version of the SuSE Linux operating system. Cray has tuned the Linux kernel to remove unnecessary services from the compute nodes. The result is that the operating system minimizes interruptions to the application codes running on the system, thus giving predictable, repeatable run times for applications. The SuSE Linux operating system on the nodes joins the system services, networking software, communications, I/O, and mathematical libraries, as well as compilers, debuggers, and performance tools, to form the Cray Linux Environment. Jaguar supports MPI, OpenMP, SHMEM, and PGAS programming models. OLCF supports compilers from Cray, PGI, PathScale, and GNU on Jaguar.

With a peak power density of about 1,750 watts per square foot, Jaguar could not be built without using some form of liquid cooling to manage heat dissipation requirements. At 4,400 square feet, the XT5 segment is larger than an NBA basketball court. Traditional under-floor, forced-air distribution methods would be impractical across such a large area and heat load. Cray solves this problem by using its new ECOphlex™ cooling technology. This technology uses R-134a, a high-temperature refrigerant commonly found in domestic refrigeration and automobile air conditioners, to first condition the inlet air, then remove heat as air enters and exits each cabinet. The result is not only a highly reliable cooling delivery system, but also a 5% reduction, or more than 2.5 million kilowatt-hours annually, in total consumed electricity versus a traditional forced-air cooling system. Further savings are realized by using 480-volt power supplies in each cabinet and by minimizing the distance from the main switchboards to the computer cabinets. These two factors reduced initial materials cost, eliminated an expensive and inefficient step-down to 208 volts, and will reduce operating costs over the life of the system.
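The power figures in this paragraph can be cross-checked against each other. The sketch below is a rough consistency check, not a facility power model: it multiplies the quoted power density by the quoted floor area, and it assumes that the 5% saving applies to total annual consumption and that the system runs year-round (8,760 hours).

```python
HOURS_PER_YEAR = 8760  # ASSUMPTION: continuous operation, non-leap year


def peak_heat_load_mw(watts_per_sqft=1750, area_sqft=4400):
    """Peak heat load implied by the quoted power density and floor area."""
    return watts_per_sqft * area_sqft / 1e6


def implied_baseline_kwh(savings_kwh=2.5e6, savings_fraction=0.05):
    """Annual consumption of the forced-air baseline implied by the savings."""
    return savings_kwh / savings_fraction


def average_power_mw(annual_kwh):
    """Average draw in megawatts for a given annual energy use."""
    return annual_kwh / HOURS_PER_YEAR / 1000


if __name__ == "__main__":
    baseline = implied_baseline_kwh()
    print(f"Peak heat load: {peak_heat_load_mw():.1f} MW")
    print(f"Implied baseline: {baseline / 1e6:.0f} million kWh per year")
    print(f"Implied average draw: {average_power_mw(baseline):.1f} MW")
```

Because the text says "more than" 2.5 million kilowatt-hours, the implied baseline is a lower bound. An average draw somewhat below the peak heat load is what one would expect if the two quoted figures describe the same system.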

Spider File System

A Lustre-based file system dubbed Spider will replace multiple file systems on the NCCS network with a single scalable system. Spider provides centralized access to petascale data sets from all NCCS platforms, thereby eliminating islands of data. File transfers among computers and other systems will be unnecessary. Transferring petascale data sets between Jaguar and the visualization system, for example, could take hours, tying up bandwidth on Jaguar and slowing simulations in progress. Eliminating file transfers will improve performance and convenience and reduce cost. Data analytics platforms will benefit from the high bandwidth of Spider without requiring a large investment in dedicated storage.
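To see why moving a large data set between systems can take hours, consider a back-of-the-envelope estimate. The sketch below is purely illustrative: the 100-terabyte data set and the 10-gigabyte-per-second sustained link rate are hypothetical values, not measurements from Jaguar or the visualization system.

```python
def transfer_hours(dataset_tb, bandwidth_gb_per_s):
    """Hours needed to copy a data set at a sustained transfer rate.

    Uses decimal units (1 TB = 1,000 GB) and ignores protocol overhead.
    """
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    seconds = dataset_tb * 1000 / bandwidth_gb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # HYPOTHETICAL example values: 100 TB copied at a sustained 10 GB/s.
    print(f"{transfer_hours(100, 10):.1f} hours")
```

A shared file system avoids this cost entirely, because every platform reads the data in place.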

To access Spider, each NCCS platform is configured with Lustre routers. These routers allow Lustre clients on the compute nodes to access Spider as if the storage were locally attached. All other Lustre components reside within the Spider infrastructure, providing ease of maintenance, accessibility during service outages on compute platforms, and the ability to expand file system performance and capacity independently of these platforms.

Moving towards a centralized file system required increased redundancy and fault tolerance. Spider was designed to eliminate single points of failure and thereby maximize availability. By using failover pairs, multiple networking paths and the resiliency features of the Lustre file system, Spider provides a reliable centralized storage solution.

Spider File System Specs

Unlike previous storage systems, which were simply high-performance RAID arrays connected directly to the computation platform, Spider is a large-scale storage cluster. Forty-eight DDN S2A9900s provide the back-end object storage, which in aggregate delivers over 240 gigabytes per second of bandwidth and over 10 petabytes of RAID6 capacity from 13,440 1-terabyte SATA drives. This object storage is accessed through 192 Dell dual-socket, quad-core Lustre OSS servers providing over 14 teraflops of performance and 3 terabytes of RAM. Each object storage server can provide in excess of 1.25 gigabytes per second of file-system-level performance. Metadata is stored on two LSI Engenio 3992s and served by three Dell quad-socket, quad-core systems. These systems are interconnected via the Scalable I/O Network (SION), which provides a high-performance backplane for Spider.
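The Spider figures above are mutually consistent, as a short calculation shows. In the sketch below, the drive count, drive size, controller count, server count, and per-server bandwidth come from the text. The RAID6 layout of 8 data drives plus 2 parity drives per group is an assumption, chosen only to illustrate how raw capacity relates to usable capacity.

```python
# Figures taken from the text.
DRIVES = 13_440
DRIVE_TB = 1
CONTROLLERS = 48
OSS_SERVERS = 192
OSS_GB_PER_S = 1.25

# ASSUMPTION: RAID6 groups of 8 data + 2 parity drives (not stated in the text).
RAID6_DATA = 8
RAID6_PARITY = 2


def raw_capacity_pb():
    """Raw capacity in decimal petabytes."""
    return DRIVES * DRIVE_TB / 1000


def usable_capacity_pb():
    """Capacity left after RAID6 parity, under the assumed 8+2 layout."""
    return raw_capacity_pb() * RAID6_DATA / (RAID6_DATA + RAID6_PARITY)


def aggregate_oss_bandwidth_gb_s():
    """Aggregate bandwidth if every OSS sustains its quoted rate."""
    return OSS_SERVERS * OSS_GB_PER_S


if __name__ == "__main__":
    print(f"Drives per controller: {DRIVES // CONTROLLERS}")
    print(f"Raw capacity: {raw_capacity_pb():.2f} PB")
    print(f"Usable capacity (assumed 8+2): {usable_capacity_pb():.2f} PB")
    print(f"Aggregate OSS bandwidth: {aggregate_oss_bandwidth_gb_s():.0f} GB/s")
```

Under the assumed layout, usable capacity lands above 10 petabytes and the servers together reach the quoted 240 gigabytes per second.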

Scalable I/O Network - SION

To provide a truly integrated computing facility, the LCF deployed a system area network (SAN) dubbed SION. SION is a multistage InfiniBand network that connects all NCCS platforms. SION provides a backplane for integration of multiple systems such as Jaguar, Spider, Lens (visualization cluster), Ewok (end-to-end productivity cluster), Smoky (application readiness cluster), HPSS, and GridFTP servers. By providing a high-performance link between multiple systems, SION also allows communication between the two segments of Jaguar. New capabilities such as online visualization are now possible because data from the simulation platform can stream to the visualization platform at extremely high data rates.

As new platforms are deployed at LCF, SION will continue to scale out, providing an integrated backplane of services. Rather than replicating infrastructure services for each new deployment, SION will allow access to existing services, thereby reducing total costs, enhancing usability, and decreasing the time from initial acquisition to production readiness.

SION Specs

SION is a high-performance InfiniBand DDR network providing over 889 gigabytes per second of bisection bandwidth. The core network infrastructure is based on three 288-port Cisco 7024D IB switches. One switch provides an aggregation link while the remaining two switches provide connectivity between the two Jaguar segments and the Spider file system. A fourth 7024D switch provides connectivity to all other LCF platforms and is connected to the single aggregation switch. Spider is connected to the core switches via 48 24-port Flextronics IB switches, which allows storage to be addressed directly from SION. Additional switches provide connectivity for the remaining LCF platforms.

The LCF spans over 40,000 square feet of raised floor space with platforms spread throughout the center. To cover the distances imposed by such a large-scale center, SION uses Zarlink IB optical cables in a number of lengths up to 60 meters. These long cables allow connectivity across the two-story facility, an impossibility with copper cables. In total, SION has over 3,000 InfiniBand ports and over 3 miles of optical cables providing high-performance connectivity.

NCCS Networking

Networking capability at the OLCF is being expanded in parallel with its computing capability to ensure accurate, high-speed data transfer. High-throughput networks among its systems and upgraded connections to ESnet (Energy Sciences Network) and Internet2 have been installed to speed data transfers between the NCCS and other institutions.

OLCF has a direct connection to DOE's ESnet, providing a high-bandwidth pipe that links the center with more than 40 other DOE sites, as well as fast interconnections to more than 100 additional networks.

OLCF is also connected to the Internet2 network and NSF's TeraGrid. Internet2 provides the U.S. research and education community with a network that meets its bandwidth-intensive requirements. The network is a dynamic, robust, and cost-effective hybrid optical and packet network. It furnishes a high-speed network backbone that can handle full-motion video and 3D animations to more than 200 U.S. educational institutions, corporations, and non-profit and government agencies.

The OLCF core LAN consists of two Cisco 6500 series routers along with a Force10 E1200 router. The core network provides over 100 10GE ports for intra-switch connections, as well as directly connected hosts using 10GE. NCCS provides more than 1,200 ports of gigabit Ethernet for machines with lesser data-transfer needs.

Networking Specs

ORNL owns and manages its own single-mode fiber optic network that provides physical connectivity from Oak Ridge, TN, to Chicago, Nashville, and Atlanta. ORNL lights this fiber using Ciena Corporation Wavelength Division Multiplexing (WDM) equipment and provides connectivity to external collaborators and partners using core routers from Cisco Systems and Juniper Networks. This dark fiber infrastructure allows ORNL to quickly and cost-effectively light new waves at 10 Gb/s or higher to new partners at any of these peering points. In addition, ORNL is participating in the Advanced Networking Initiative, an effort to demonstrate 100 Gb/s wide area connectivity among a number of DOE facilities.

Archival Storage - HPSS

The High Performance Storage System (HPSS), OLCF's archival data storage facility, has been significantly upgraded to ensure high-speed, reliable storage and retrieval of petascale data sets, which contain petabytes of data. HPSS currently stores more than 7 petabytes of data, and up to 40 terabytes are added daily. The amount stored has been doubling every year, and the addition of two petascale systems is expected to escalate that rate. To keep pace with the demands of petascale simulation platforms, HPSS is expanded each year. Integration efforts will bring HPSS connectivity to SION, allowing new capabilities such as seamless integration with Spider. This integration will enable extremely high-performance data transfers into and out of HPSS directly from Spider using multiple transfer mechanisms such as the HPSS transfer agent or the local file mover.
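The growth figures above can be turned into a rough projection. The sketch below is a simple extrapolation, not a capacity plan: it assumes the archive keeps doubling at a fixed interval starting from 7 petabytes, and it treats the 40-terabyte daily figure as a peak rate sustained for a full year to obtain an upper bound on annual ingest.

```python
def projected_archive_pb(start_pb, years, doubling_period_years=1.0):
    """Archive size after `years`, ASSUMING steady exponential growth."""
    return start_pb * 2 ** (years / doubling_period_years)


def max_annual_ingest_pb(tb_per_day=40, days=365):
    """Upper bound on yearly ingest if the peak daily rate were sustained."""
    return tb_per_day * days / 1000


if __name__ == "__main__":
    for year in range(4):
        size = projected_archive_pb(7, year)
        print(f"Year {year}: {size:.0f} PB")
    print(f"Peak-rate ingest bound: {max_annual_ingest_pb():.1f} PB per year")
```

One year of doubling from 7 petabytes adds 7 petabytes, which sits comfortably below the peak-rate bound, so the two quoted figures do not conflict. A shorter doubling period, as the text anticipates, can be explored by lowering `doubling_period_years`.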

HPSS Specs

HPSS infrastructure includes 28 production Dell servers used as core, ACSLS, user interface gateway, and movers (disk/tape). Tape storage is made up of two STK PowderHorn robotic libraries and three SUN SL8500 libraries. These libraries contain 14 STK 9840 tape drives, 16 STK 9940 tape drives, 24 SUN T10K-A tape drives, 32 SUN T10K-B tape drives, and a total of 30,000 tapes. Four DDN 9550s with over 1,500 terabytes of storage make up the disk tier of HPSS and provide high-performance access for small and medium files while also acting as a cache mechanism for larger files destined for tape.

Science and Petascale Computing

From probing the potential of new energy sources to dissecting the dynamics of climate change to manipulating protein functions, terascale systems have been an indispensable tool in scientific investigation and problem solving. The capability offered by petascale machines to expand on these advances and address some of humankind's most pressing problems is unprecedented. ORNL provides the scientific community with the most powerful tools on the planet for addressing some of the world's toughest challenges.


Oak Ridge Supercomputers Provide First Simulation of Abrupt Climate Change

At ORNL, the world's fastest supercomputer for unclassified research is simulating abrupt climate change and shedding light on an enigmatic period of natural global warming in Earth's relatively recent history. The work, led by Zhengyu Liu at the University of Wisconsin and Bette Otto-Bliesner of the National Center for Atmospheric Research, is featured in the July 17 issue of the journal Science and provides valuable new data about the causes and effects of global climate change.

ORNL Supercomputers Help Studies of Supernovas, Space

Type Ia supernovas are the largest thermonuclear explosions in nature, expelling mass greater than that of the Sun and many of the basic elements of life. The long-standing mystery of these exploding stars lies in precisely how they explode. Stan Woosley of the University of California, Santa Cruz, and colleagues ran simulations on Jaguar showing that Type Ia supernovas can explode asymmetrically and that this asymmetry would greatly affect their brightness.

Life and its Half-life

Carbon-14 decays far more slowly than most isotopes in its weight class, allowing researchers to date as far back as 60,000 years anything that was once part of a plant or body. A team led by David Dean of ORNL is using Jaguar's unprecedented computing power to examine the carbon-14 nucleus. A simulation that can help us understand why the half-life of this isotope is so long has the potential to illuminate all half-lives, long and short, and help us better understand the makeup of matter.

From Photosynthesis to Fuel: The Next Generation of Ethanol

Jeremy Smith and colleagues use the Jaguar and Kraken supercomputers to reveal the detailed workings of cellulose, a complex carbohydrate that gives leaves, stalks, stems, and trunks their rigidity. Figuring out how to unlock its sugar subunits, which can be fermented to produce ethanol, could enable full use of plants for fuel.

Fusion Gets Faster

Few codes require faster I/O or scale better than today's fusion particle codes. GTC and XGC-1, for instance, are running on more than 120,000 cores on the NCCS's Jaguar Cray XT5 supercomputer, the fastest system in the world for open science. Thanks to ORNL's Scott Klasky and a diverse team of collaborators, GTC recently became twice as fast, not only for an ideal benchmark case but also for a production simulation.

Modeling Volcanic Eruptions Mimics a Stressed Climate

A team led by Kate Evans at ORNL and the National Center for Atmospheric Research is using the Jaguar supercomputer to simulate the climate system's reaction to aerosols from volcanic eruptions. If the model can predict the system's response to the aerosols, which can remain in the atmosphere for several years, confidence will rise for predicting responses to longer-term anthropogenic emissions.

Jaguar XT5 Image Gallery

The new 1.64-petaflop Cray XT Jaguar features more than 180,000 processing cores, each with 2 gigabytes of local memory. The resources of the ORNL computing complex provide scientists with a total performance of 2.5 petaflops. Images of the NCCS petaflop Jaguar system are seen here. For the latest science visualization images, see the NCCS photo gallery.

