NCCS Snapshot November 17, 2008

Oak Ridge Supercomputer Is the World’s Fastest for Science

Jaguar breaks petascale barrier

A Cray XT high-performance computing system at the Department of Energy’s (DOE) Oak Ridge National Laboratory (ORNL) is the world’s fastest supercomputer for science. The latest edition of the twice-yearly ranking of the world’s top 500 computers (www.top500.org) will be released Tuesday in Austin at an annual international supercomputing conference.

The Cray XT, called Jaguar, has a peak performance of 1.64 petaflops (quadrillion floating point operations, or calculations, per second), incorporating a 1.382-petaflop XT5 system and a 266-teraflop XT4 system. The two components of Jaguar are ranked second and eighth, respectively, on the current Top500 list of the world’s supercomputers.

"This accomplishment is the culmination of our vision to regain leadership in high-performance computing and harness its potential for scientific investigation," said Undersecretary for Science Raymond L. Orbach. "I am especially gratified because we make this machine available to the entire scientific community through an open and transparent process that has resulted in spectacular scientific results ranging from the human brain to the global climate to the origins of the Universe."

ORNL Director Thom Mason said the real value of the new machine will be measured by the scientific breakthroughs that will now be possible.

"We are proud to be home to the world’s most powerful computer dedicated to open science, but we are more excited about the ability of Oak Ridge and the Department of Energy to take a leading role in finding solutions to scientific challenges such as new energy sources and climate change," Mason said.

In June, a DOE supercomputer named Roadrunner at Los Alamos National Laboratory was the first to break the petascale barrier. Built with advanced IBM Cell processors, Roadrunner helps ensure the reliability of America’s nuclear weapons stockpile.

ORNL embarked on a three-year series of aggressive upgrades designed to make its Cray XT, which began as a 26-teraflop system in 2005, the world’s most powerful computing system. The Cray XT was upgraded to 119 teraflops in 2006 and 263 teraflops in 2007. In 2008, with approximately 182,000 AMD Opteron processing cores, the new 1.64-petaflop system is more than 60 times as powerful as the original.
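
The upgrade figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the peak numbers quoted in this article, and the variable and function names are made up for the example.

```python
# Peak performance figures quoted in this article, in teraflops (TF).
PEAK_TFLOPS_BY_YEAR = {2005: 26.0, 2006: 119.0, 2007: 263.0, 2008: 1640.0}

# Component peaks quoted for the 2008 system, in teraflops.
XT5_PEAK_TFLOPS = 1382
XT4_PEAK_TFLOPS = 266


def growth_factors(peaks_by_year):
    """Return (year, peak, ratio to the earliest year's peak) tuples."""
    years = sorted(peaks_by_year)
    baseline = peaks_by_year[years[0]]
    return [(y, peaks_by_year[y], peaks_by_year[y] / baseline) for y in years]


combined_peak_tflops = XT5_PEAK_TFLOPS + XT4_PEAK_TFLOPS

if __name__ == "__main__":
    for year, peak, ratio in growth_factors(PEAK_TFLOPS_BY_YEAR):
        print(f"{year}: {peak:7.0f} TF, {ratio:5.1f}x the 2005 system")
    print(f"XT5 + XT4 = {combined_peak_tflops} TF")
```

The two components sum to 1,648 teraflops, which the article quotes as 1.64 petaflops, and the 2008 peak works out to roughly 63 times the 2005 figure.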

Thomas Zacharia, ORNL’s associate director for Computing and Computational Sciences, says petascale machines like Jaguar help advance critical scientific application areas by enabling researchers to get answers faster and explore complex, dynamic systems. In a matter of a few days, Jaguar has already run scientific applications ranging from materials to combustion on the entire system, sustaining petaflops performance on multiple applications. A calculation that once took months can now be done in minutes. A 2008 report from the DOE Office of Science, America’s largest funder of basic physical science programs at universities and government laboratories, said six of the top ten recent scientific advancements in computational science used Jaguar to provide unprecedented insight into supernovas, combustion, fusion, superconductivity, dark matter and mathematics.

The DOE’s Office of Science makes Jaguar available to scientists in academia, industry and government to tackle the world’s most complicated projects. Through the agency’s Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program, which allocates the supercomputer’s resources through a peer-reviewed proposal system, researchers were allocated more than 140 million processor hours for 30 projects.

To date the computer simulations on Jaguar have focused largely on addressing new forms of energy and understanding the impact on climate resulting from energy use. For example, INCITE projects have simulated enzymatic breakdown of cellulose to make production of biofuels commercially viable as well as coal gasification processes to help industry design near-zero-emission plants. Combustion scientists have studied how fuel burns, which is important for fuel-efficient, low-emission engines. Computer models have helped physicists use radio waves to heat and control ionized fuel in a fusion reactor. Similarly, engineers have designed materials to recover energy escaping from vehicle tailpipes. Simulation insights have enabled biologists to design new drugs to thwart Alzheimer’s fibrils and engineer the workings of cellular ion channels to detoxify industrial wastes.

Jaguar’s superlative speed is matched by substantial memory that allows scientists to solve complex problems, sizeable disk space for storing massive amounts of data and unmatched speed to read and write files. High-speed Internet connections enable users from around the world to access the machine, and high-end visualization helps them make sense of the avalanche of data Jaguar generates.

Twice a year, the TOP500 list ranks powerful computing systems on their speed in running a benchmark program called HPL, for High-Performance Linpack. In June of 2007, Jaguar solved the largest HPL challenge ever: a matrix problem with nearly 5 trillion elements. The achievement highlights Jaguar’s skill in balancing processor speed and system memory.
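
To give a sense of what a 5-trillion-element HPL problem implies, the following rough sketch estimates the matrix order, the memory needed to hold the matrix, and the approximate operation count. It assumes a square matrix of exactly 5 trillion double-precision (8-byte) elements and uses only the leading-order (2/3)N^3 term for the cost of the dense solve; the function name is hypothetical and the article gives only the "nearly 5 trillion" figure.

```python
import math


def hpl_estimates(elements, bytes_per_element=8):
    """Rough sizing for a dense N x N linear solve with the given element count.

    Returns (matrix order N, memory in terabytes, leading-order flop count).
    """
    n = math.isqrt(int(elements))              # N such that N * N <= elements
    memory_tb = elements * bytes_per_element / 1e12
    flops = (2.0 / 3.0) * n ** 3               # leading-order cost of LU factorization
    return n, memory_tb, flops


if __name__ == "__main__":
    n, memory_tb, flops = hpl_estimates(5e12)
    print(f"matrix order N ~ {n:,}")
    print(f"matrix storage ~ {memory_tb:.0f} TB")
    print(f"operations     ~ {flops:.2e}")
```

Under these assumptions the matrix has a little over 2.2 million rows and columns and occupies about 40 terabytes, which is why the benchmark stresses memory as well as processor speed.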

ORNL Supercomputer Simulation Wins Prize for Fastest-Running Science Application

Materials simulation breaks 1.3 petaflops

A team led by Thomas Schulthess of the U.S. Department of Energy’s Oak Ridge National Laboratory received the prestigious 2008 Association for Computing Machinery (ACM) Gordon Bell Prize Thursday after attaining the fastest performance ever in a scientific supercomputing application.

Schulthess is group leader of ORNL’s Computational Materials Science Group and recently accepted a position as director of the Swiss National Supercomputing Center at Manno, an institution of ETH Zurich. He and colleagues Thomas Maier, Michael Summers and Gonzalo Alvarez, all of ORNL, achieved 1.352 quadrillion calculations a second—or 1.352 petaflops—on ORNL’s Cray XT Jaguar supercomputer with a simulation of superconductors, or materials that conduct electricity without resistance. By modifying the algorithms and software design of its DCA++ code to maximize speed without sacrificing accuracy, the team was able to boost performance tenfold with the help of John Levesque and Jeff Larkin of Cray Inc.

Jaguar was recently upgraded to a peak performance of 1.64 petaflops, making it the world’s first petaflop system dedicated to open research. The team’s simulation made efficient use of 150,000 of Jaguar’s 180,000-plus processing cores to explore electrical conductance.

To put the achievement into perspective, it would take every man, woman and child on earth more than 500 years to work through as many calculations as DCA++ gets through in a single day—and that’s assuming each of us worked day and night solving one calculation a second.
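
The comparison above can be reproduced with simple arithmetic. The sketch below assumes a world population of about 6.7 billion (an approximate 2008 figure that is not stated in the article) and one calculation per person per second, around the clock; the function name is made up for the example.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25


def years_for_humanity(flops_per_second, population, calcs_per_person_per_second=1.0):
    """Years for `population` people to match one day of machine work."""
    machine_calcs_in_one_day = flops_per_second * SECONDS_PER_DAY
    human_calcs_per_second = population * calcs_per_person_per_second
    seconds_needed = machine_calcs_in_one_day / human_calcs_per_second
    return seconds_needed / (SECONDS_PER_DAY * DAYS_PER_YEAR)


if __name__ == "__main__":
    # 1.352 petaflops sustained by DCA++; population is an assumed round figure.
    years = years_for_humanity(1.352e15, 6.7e9)
    print(f"about {years:.0f} years")
```

With these inputs the answer comes out at roughly 550 years, consistent with the "more than 500 years" quoted above.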

Researchers have known about superconductors for nearly a century and have prized these materials both for their ability to conduct electricity without resistance, or energy loss, and for their especially strong magnetic fields. Superconducting materials have obvious potential application in power transmission, and superconducting magnets have found a place in hospital magnetic resonance imaging machines, particle accelerators such as Europe’s Large Hadron Collider, and magnetic levitation transportation systems.

The challenge is that superconducting materials must be very, very cold. Even so-called high-temperature superconductors, discovered in the mid-1980s, must be chilled to a “transition temperature” of around –200°F before they exhibit their amazing behavior. In addition, a full scientific explanation of how high-temperature superconductors work is still missing.

The team used the DCA++ application within a promising mathematical framework known as the two-dimensional Hubbard model. These simulations were the first in which the team had enough computing power to move beyond ideal, perfectly ordered materials. By looking at materials with disorder, or impurities, the team is moving toward the necessarily imperfect materials found in the real world.

“The real materials are very inhomogeneous,” noted team member Thomas Maier of ORNL.

Specifically, the team focused on chemical disorder in high-temperature superconductors known as cuprates—layers of copper oxide separated by layers of an insulating material. By advancing our understanding of the interplay between these imperfections and superconductivity, the work promises to help researchers push transition temperatures ever higher, possibly approaching the lofty goal of “room-temperature superconductors,” or materials that exhibit this behavior without artificial cooling.

The team studied the local repulsion between electrons on the same atom. Because electrons have a negative electrical charge, they push one another away in what is known as a Coulomb repulsion. For the material to become superconducting, however, the electrons must overcome this repulsion and join into units called Cooper pairs. The team is looking to take advantage of an earlier discovery that indicates the insulating material promotes this process by drawing electrons away from the copper oxide layer.

“If you draw electrons away from the copper oxide layers, they become superconducting,” Maier said. “Then the question is, what happens if you replace lanthanum with strontium, for instance. You do have different potentials, but you should also have different Coulomb repulsions on each site.”

To achieve the sustained speed demonstrated in the simulation, the team made two fundamental changes to the DCA++ application, allowing it to delay memory-intensive operations and use a less memory-intensive data form. Both of these techniques exploit the fact that DCA++ uses the Monte Carlo approach, which relies on random sampling of a variable to explore systems such as the two-dimensional Hubbard model that do not lend themselves to an exact solution.
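
For readers unfamiliar with the Monte Carlo idea, the toy sketch below shows random sampling standing in for an exhaustive calculation. It is not the DCA++ algorithm and not the Hubbard model: it uses a small ring of classical up/down spins, chosen only because the exact answer can still be computed by brute force for comparison. All names and parameters are illustrative assumptions.

```python
import itertools
import math
import random


def ring_energy(spins):
    """Energy of a ring of +1/-1 spins with nearest-neighbour coupling J = 1."""
    n = len(spins)
    return -sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def exact_mean_energy(n, beta):
    """Thermal average energy from all 2**n configurations (small n only)."""
    num = den = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        e = ring_energy(spins)
        w = math.exp(-beta * e)        # Boltzmann weight of this configuration
        num += e * w
        den += w
    return num / den


def sampled_mean_energy(n, beta, samples, seed=0):
    """Estimate the same average from uniformly random configurations."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(samples):
        spins = [rng.choice((-1, 1)) for _ in range(n)]
        e = ring_energy(spins)
        w = math.exp(-beta * e)
        num += e * w
        den += w
    return num / den


if __name__ == "__main__":
    print("exact  :", exact_mean_energy(8, 0.5))
    print("sampled:", sampled_mean_energy(8, 0.5, samples=20_000, seed=1))
```

For eight spins the brute-force sum is trivial, but the number of configurations doubles with every added spin, so sampling quickly becomes the only practical route. Production codes use far more sophisticated sampling schemes than the uniform draw shown here.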

Between the two approaches, the team was able to boost the speed of the application by a factor of about 10, according to team member Marcus Eisenbach of ORNL’s National Center for Computational Sciences. This increase in speed allows the team to look at a wider variety of materials in increased detail.

The Gordon Bell Prize is administered by ACM and recognizes leadership in computational science and engineering. The prize was announced in Austin, Texas, in conjunction with the SC08 supercomputing conference.

Library of Flames Illuminates Design of Advanced Combustion Devices

Next-generation engines and industrial burners may use less fuel, emit fewer pollutants

Two-thirds of the petroleum Americans use goes for transportation. The remaining one-third heats buildings and generates electricity in steam turbines. In the not-so-distant future, engines, furnaces, and power-generation devices may burn alternative fuels and employ advanced technology. Supercomputers at the National Center for Computational Sciences (NCCS) are hastening the arrival of advanced combustion devices that will consume less energy and emit fewer pollutants. On these machines mechanical engineer Jacqueline Chen of Sandia National Laboratories (SNL) leads an effort to simulate the combustion of diverse fuels. The result is a library of science data that captures complex aero-thermo-chemical interactions and provides insight into how flames stabilize, extinguish, and reignite. Chen’s data libraries will assist engineers in the development of models that will be used to design next-generation combustion devices burning alternative fuels.

“If low-temperature compression ignition concepts employing dilute fuel mixtures at high pressure are widely adopted in next-generation autos, fuel efficiency could increase by as much as 25 to 50 percent,” Chen said. With mechanical engineer Chun Sang Yoo of SNL and computational scientist Ramanan Sankaran of ORNL, Chen recently used Jaguar at the NCCS to simulate combustion of ethylene, a hydrocarbon fuel. Their simulation generated more than 120 terabytes (120 trillion bytes) of data—more than ten times as much as contained in the printed contents of the Library of Congress.

Advanced combustion technology depends on lifted flames, which result when cold fuel and hot air mix and ignite in a high-speed jet. If the speed increases too much, lifted flames can blow out. For flames to stabilize—or continue to burn downstream from the burner—turbulence, which mixes fuel with air to enable burning, must exist in balance with key ignition reactions that occur upstream of where the flames appear.

In industrial burners used for power generation, lifted flames reduce thermal stresses to nozzles by minimizing contact between the flame and the nozzle. Lifted flames are also integral to the workings of direct-injection gasoline engines, compression-ignition diesel engines, and gas turbines, in which streams of cold fuel and hot oxidizer are partially premixed prior to combustion. The position downstream of a fuel injector at which a diesel fuel jet establishes a flame influences the degree of premixing needed and affects combustion and soot formation. Proper positioning of lifted flames in advanced engines could burn fuel so cleanly that emissions of nitrogen oxide, a major contributor to smog, would be nearly undetectable, Chen said.

To explore processes underlying ethylene combustion, the group uses direct numerical simulation (DNS), a technique that solves equations governing viscous, heat-conducting fluids without using turbulence models. DNS uses a computational mesh to reveal a turbulent flame’s physical characteristics, such as temperatures and chemical species, on spatial and temporal scales ranging from the smallest to the largest details. Using terascale computing, DNS is feasible for canonical flows with a moderate Reynolds number (an indication of the range of scales in a system), where the dynamic range between the largest and smallest features is approximately 10,000 units.

“Direct numerical simulation is our numerical probe to measure, understand, or see things in great detail at the finest scales where chemical reactions occur,” Chen said. “That’s particularly important for combustion because reactions occurring at the finest molecular scales impact global properties like burning rates and emissions.”

The simulation ran a software application developed at Sandia called S3D, which runs on multiple processing cores to model compressible, reacting flows with detailed chemistry. S3D was one of six applications recently selected to run pioneering “science-at-scale” simulations efficiently employing most or all processing cores of Jaguar, which was upgraded in May 2008 to perform 263 teraflops. The simulation used 30,000 of Jaguar’s 31,000 processing cores and 4.5 million processor hours. Running computationally demanding applications after a major machine upgrade is part of a transition-to-operations activity that begins when a commissioned NCCS system passes a formal acceptance test and its performance is monitored and assessed. Science-at-scale simulations run on leadership machines like Jaguar help advance critical scientific application areas to efficiently exploit petascale computing systems, which are capable of a quadrillion calculations per second and will be available to the scientific community in 2009.
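
The core count and processor-hour figures above imply a rough wall-clock time for the simulation. The sketch below assumes that all 4.5 million processor hours were spent on runs using 30,000 cores, which the article does not state explicitly; the function name is hypothetical.

```python
def wall_clock_hours(processor_hours, cores):
    """Elapsed hours if `processor_hours` are spread evenly over `cores`."""
    return processor_hours / cores


if __name__ == "__main__":
    hours = wall_clock_hours(4.5e6, 30_000)
    share = 30_000 / 31_000
    print(f"about {hours:.0f} hours ({hours / 24:.2f} days) of machine time")
    print(f"using {share:.0%} of Jaguar's cores")
```

Under that assumption the simulation occupied about 97 percent of the machine for roughly 150 hours, or a little over six days.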

The DOE Office of Basic Energy Sciences and Office of Advanced Scientific Computing Research supported this research.

ORNL, GUMC Sign Formal Collaboration Agreement

New relationship lends supercomputing resources to biological research

The fight against cancer and other deadly diseases has found a new ally in a recently established partnership between ORNL and Georgetown University Medical Center (GUMC).

ORNL hosts the world’s most powerful computing complex, with two systems exceeding the petascale. GUMC is a leading biomedical research facility and one of America’s 41 comprehensive cancer centers.

The Cooperative Research and Development Agreement, which spans five years, will foster collaboration between the two institutions and give GUMC researchers access to both ORNL’s leading supercomputing systems and the lab’s expertise in protein and drug modeling. Areas of research will include computational biology, radiation biology, and systems genetics, to name a few.

Essentially, GUMC researchers will be able to simulate complex biological systems on the laboratory’s first-class computing systems, revealing a more accurate picture of interactions between chemical compounds and diseases.

GUMC and its partners have enormous drug libraries that can be tested against different cancer protein targets, said ORNL’s Ed Uberbacher, adding that “these kinds of problems are huge, and the petascale systems at ORNL provide a scale that’s appropriate for the problem. … Current software for drug docking often provides clues about how to build the right drug but usually falls short of directly providing an optimal drug that binds tightly to the target. With the computational power available, we can potentially get more accurate answers more quickly and save time in the drug-development process.”

Cray Workshop Promotes Supercomputing Skills

Researchers gather to work with experts from ORNL and vendors

More than two dozen computational scientists gathered at ORNL in mid-October to hone their Cray XT4 and XT5 supercomputer skills and share tips and experiences.

The 2008 Cray XT Quad-core Workshop was held October 15–17 at ORNL. Sponsored jointly by the National Institute for Computational Sciences (NICS) and the NCCS, the workshop gave scientists an opportunity to meet with experts from the two organizations as well as from supercomputer maker Cray Inc. and chip maker Advanced Micro Devices Inc.

The workshop featured hands-on sessions to help users make the most of ORNL’s two Cray XT supercomputers, the Jaguar system at the NCCS and the Kraken system at NICS. The event also featured talks on a range of issues important to users, including the use of system tools and libraries, chip architecture, system configuration, and optimizing scientific applications for the Cray supercomputers.

The workshop video and presentation slides will soon be posted on the NCCS website under Workshop Archives.