ORNL Hosts Lustre Part II
Jun 30th, 2009 in Highlights
Workshop looks ahead to 2015
Users, engineers, and developers converged on Oak Ridge National Laboratory (ORNL) May 19-20 for part two of the Lustre Scalability Workshop.
Lustre is an open-source cluster file system developed and supported by Sun Microsystems. It is popular in high-performance computing environments because of its scalability.
Sponsored by ORNL, Sun Microsystems, and Cray, Inc., the workshop brought together representatives from the world’s largest Lustre deployments to identify key scalability issues and develop a roadmap for the future, namely bandwidth in the terabytes per second range and the manageability of exabytes of storage by 2015.
“The focus of the workshop was to identify long term, 2015 and beyond, I/O and storage requirements for high performance computing and to discuss how the Lustre file system can meet these requirements,” said Galen Shipman, group leader for technology integration at the National Center for Computational Sciences (NCCS), the ORNL-based organization that runs Jaguar, the fastest computer in the world for open science.
Speakers included Instrumental Incorporated’s Henry Newman, who discussed the Defense Advanced Research Projects Agency’s (DARPA’s) 14 I/O scenarios and their implications for parallel file systems. DARPA, the Department of Defense’s main research and development office, is developing High Productivity Computing Systems (HPCS) for national security purposes and to ensure U.S. leadership in critical technologies. For example, HPCS, or trans-petaflop, systems will allow DARPA to more effectively develop models for weather prediction, ocean and wave prediction, ship design, climate modeling, nuclear stockpile management, and weapons integration. Improved I/O is crucial to the future HPCS systems that will be used in this research.
Andreas Dilger from Sun showcased the latest Lustre designs to meet DARPA’s future file system I/O goals. Dilger outlined the I/O goals of the HPCS project and the architectural improvements and performance enhancements necessary to achieve them. The architectural improvements include end-to-end data integrity, file system integrity checking, and improved recovery. Necessary performance enhancements include better scalability and the combination of multiple network interfaces. Overall, said Dilger, the Lustre file system is capable of meeting DARPA’s HPCS roadmap, and the HPCS program in turn provides motivation to continue developing Lustre.
ORNL’s Shipman discussed the NCCS’s current Lustre-based Spider file system and presented a roadmap for delivering an exascale system within the decade. He also outlined the I/O requirements necessary to achieve the 2015 goals of several organizations, including ORNL, Lawrence Livermore National Laboratory, Pacific Northwest National Laboratory, and other mission partners.
In all, the conference hosted 30 attendees and was held at ORNL’s Joint Institute for Computational Sciences.
“The workshop provided a great opportunity for us to involve our most demanding users in setting the direction we will take with Lustre over the next several years. This is an important part of our ongoing commitment to meet the most demanding I/O and storage needs of the HPC community,” said Peter Bojanic, director of Sun’s Lustre Group.

