StorCloud

 

Global Lambdas for Particle Physics Analysis:
SC|05 Demonstration

Caltech

We achieved over 150 Gbits/s!

At 150Gbits/s one could transfer the contents of all the books and other print collections of the Library of Congress in under 9 minutes, over 130 DVD movies in a minute, or serve 10,000 MPEG2 HDTV movies in real-time.

The Caltech-CERN-Florida-FNAL-Michigan-Manchester-SLAC entry demonstrated high speed transfers of particle physics data between host labs and collaborating institutes in the USA and worldwide. Caltech and FNAL are major participants in CERN's CMS experiment at the Large Hadron Collider (LHC). SLAC is the host of the BaBar collaboration. Using state-of-the-art WAN infrastructure and Grid Web Services based on the LHC Tiered Architecture, we showed real-time particle event analysis requiring transfers of Terabyte-scale datasets. We aimed to maximize the utilization of the available waves to and from the Seattle show floor. The traffic consisted of a realistic mixture of streams: transfers of Terabyte event datasets, both as large aggregate files and as individual event transactions, plus sets of background flows of varied character to absorb remaining capacity. This simulated the environment in which distributed physics analysis will be carried out at the LHC. Achieving over 150 Gbits/s, we easily beat our SC2004 record of ~100 Gbits/s.

Results | Booths | Network Requirements | Storage | Clusters | Applications | Documentation | Logos

Participants

Contributors

Publicity Releases

 

Results

SC05 Weathermap | Winner Announcement | Results | Other entries | Other SC05 award winners

Our team captured the SuperComputing 2005 (SC|05) BandWidth Challenge (BWC) for the third year in succession, following successes in 2003 and 2004 and a second place in 2002. This year we reached a measured peak of 150.7 Gbits/s, easily beating last year's record of 101.13 Gbits/s. We sustained more than 100 Gbits/s for several hours using multiple applications including bbcp and xrootd (developed at SLAC), gridftp and dcache (developed at FNAL and DESY). The extraordinary bandwidth usage achieved was made possible in part by the FAST TCP protocol, which is developed at Caltech and was used by some of the above transfer applications.

Within 2 hours an aggregate of 95.37 TB (Terabytes) was transferred, with sustained transfer rates ranging from 90 Gbps to 150 Gbps and a measured peak of 151 Gbps. During the whole day (24 hours) on which the bandwidth challenge took place, approximately 475 TB were transferred. This number (475 TB) is lower than the Caltech/SLAC/FNAL-led team was capable of, as the team did not always have exclusive access to the waves outside the bandwidth challenge time slot. Multiplying the 95.37 TB transferred in 2 hours by 12 (to represent a whole day) gives approximately 1.1 PB (Petabytes). Transferring this amount of data in 24 hours is equivalent to a transfer rate of 3.8 (DVD) movies per second, assuming an average size of 3.5 GB per movie. Or, since MPEG2 HDTV is usually 13-15 Mbits/s (max 19 Mbits/s), one could serve 10,000 users in real time.
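The arithmetic in the paragraph above can be reproduced with a short sketch. It assumes decimal units (1 TB = 10^12 bytes), the 3.5 GB per movie quoted above, and 15 Mbits/s per MPEG2 HDTV stream served at the 150 Gbits/s peak rate; the function and variable names are illustrative only.

```python
TB = 10**12  # assumption: decimal terabytes
GB = 10**9   # assumption: decimal gigabytes


def average_gbps(total_bytes, seconds):
    """Average rate in Gbits/s for a data volume moved in a given time."""
    return total_bytes * 8 / seconds / 1e9


# 95.37 TB moved during the 2 hour window.
two_hour_bytes = 95.37 * TB
rate_2h = average_gbps(two_hour_bytes, 2 * 3600)

# Extrapolate the 2 hour volume to a full day (x12).
day_bytes = two_hour_bytes * 12
day_pb = day_bytes / 10**15

# DVD movies per second at the extrapolated daily rate (3.5 GB each).
movies_per_second = day_bytes / (24 * 3600) / (3.5 * GB)

# HDTV streams that fit into the peak rate at 15 Mbits/s per stream.
hdtv_streams = 150e9 / 15e6

print(f"average over 2 hours : {rate_2h:.1f} Gbits/s")
print(f"extrapolated per day : {day_pb:.2f} PB")
print(f"DVD movies per second: {movies_per_second:.1f}")
print(f"HDTV streams at peak : {hdtv_streams:.0f}")
```

Under these assumptions the 2 hour average works out to roughly 106 Gbits/s, which is consistent with the sustained rates quoted above.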

Both SLAC and Caltech simultaneously wrote physics data to StorCloud via HBA fibre links:

A total of 164 Tbytes of data was transferred across UltraScienceNet (USN) over a period of one day at an average rate of ~14 Gbits/s. Peak one-way USN utilization during the period observed was 9.1 Gbits/s for Caltech and 8.4 Gbits/s for SLAC (Source: Nagi Rao, ORNL/USN).

The SciNet SC2005 network team assigned taps to monitor 17 of the waves at our booths and recorded a peak of 131 Gbits/s during a 15-minute measurement period.

We learnt how to tune various applications:

The exercise was not at all trivial. We needed to work through repeated system and/or network interface crashes under stress. A great number of kernel, configuration and routing issues had to be worked out in the days before the BWC itself. It is a tribute to the team that these were all worked through successfully.

The result was a great learning experience, and it had lasting value in several areas (a partial list):

We were also very pleased with the participation of our international partners from Brazil, Japan, Korea and the U.K., who worked hard in the days and weeks before the competition to be able to participate effectively. For example, using a 10 Gbits/s wave we were able to sustain about 6.5 Gbits/s between the SLAC/FNAL booth and hosts in the U.K. at the University of Manchester, University College London (UCL), and the University of London Computer Center (ULCC).

The team would like to thank all the companies, institutes and organizations who contributed to this success.

[Figures: Aggregate bandwidth (challenge overall bandwidth) | Cumulative bandwidth | Component bandwidths | SLAC/FNAL booth contributions | SLAC/FNAL/UK contributions]

Booths

SLAC/FNAL booth | Caltech booth
We were located in booths 302 (FNAL/SLAC) and 428 (Caltech). We had fibres from SciNet (booths 230, 630, 914, 1316, 2115), StorCloud (booth 1116), ORNL (booth 2226) and the Internet2 booth (2435). See the floor plan. Delivery of equipment to booth 302 was on 11/9/05 between 7:00am and 12:00pm. Shipping instructions.

Network Requirements

LAN

Spreadsheet of UltraLight hosts | UltraScienceNet at SC2005 cartoon

We used 10GE NICs from Neterion (both XFrame I and II) and Chelsio T110s with TCP Offload Engines (TOE).

  • Neterion used SR optics with LC connectors, and needed SR XENPAKs (with SC connectors).
  • The Chelsio T110 (ToE) NICs used LR optics, and needed LR XENPAKs.

WAN/Waves

Spreadsheet of links | HEP waves | HOPI waves for SC05 | SLAC/FNAL booth waves | FermiLab/SLAC traffic contributions | FNAL at StarLight | CACR/LA/Sunnyvale configurations | Caltech booth, waves, racks etc. | USN Data Plane | USN at SC2005 | Pacific NW GigaPop provides > 0.5Tbits/s for SC|05

We had 22 10Gbits/s waves to the Caltech and SLAC/FNAL booths. Of these:

  • There were 15 waves to the Caltech booth (from Florida (1), Korea/GLORIAD (1), Brazil (1 * 2.5Gbits/s), Caltech (2), LA (2), UCSD, CERN (2), U Michigan (3), FNAL(2)).
  • There were seven 10Gbits/s waves to the SLAC/FNAL booth (2 from SLAC, 1 from the UK, and 4 from FNAL).
The waves were provided by Abilene, Canarie, Cisco (5), ESnet (3), GLORIAD (1), HOPI (1), Michigan Light Rail (MiLR), National Lambda Rail (NLR), TeraGrid (3) and UltraScienceNet (4).
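As a cross-check of the wave counts listed above, the per-booth tallies can be sketched as a small table. The counts and rates are taken from the two bullets; the one assumption is that UCSD, which is listed without a count, contributed a single 10 Gbits/s wave. The names `CALTECH_WAVES`, `SLAC_FNAL_WAVES` and `tally` are illustrative only.

```python
# Waves per booth as listed above: (source, number of waves, Gbits/s per wave).
# Assumption: UCSD is listed without a count and is taken as one 10 Gbits/s wave.
CALTECH_WAVES = [
    ("Florida", 1, 10.0),
    ("Korea/GLORIAD", 1, 10.0),
    ("Brazil", 1, 2.5),
    ("Caltech", 2, 10.0),
    ("LA", 2, 10.0),
    ("UCSD", 1, 10.0),
    ("CERN", 2, 10.0),
    ("U Michigan", 3, 10.0),
    ("FNAL", 2, 10.0),
]
SLAC_FNAL_WAVES = [
    ("SLAC", 2, 10.0),
    ("UK", 1, 10.0),
    ("FNAL", 4, 10.0),
]


def tally(waves):
    """Return (number of waves, nominal one-way capacity in Gbits/s)."""
    count = sum(n for _, n, _ in waves)
    capacity = sum(n * gbps for _, n, gbps in waves)
    return count, capacity


caltech_count, caltech_gbps = tally(CALTECH_WAVES)
slac_fnal_count, slac_fnal_gbps = tally(SLAC_FNAL_WAVES)
total_count = caltech_count + slac_fnal_count
total_gbps = caltech_gbps + slac_fnal_gbps

print(f"Caltech booth  : {caltech_count} waves, {caltech_gbps} Gbits/s nominal")
print(f"SLAC/FNAL booth: {slac_fnal_count} waves, {slac_fnal_gbps} Gbits/s nominal")
print(f"Total          : {total_count} waves, {total_gbps} Gbits/s nominal")
```

With that assumption the counts add up to the 15 + 7 = 22 waves stated above. The nominal capacity is slightly below 220 Gbits/s because the Brazil wave ran at 2.5 Gbits/s.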

Caltech:
SLAC/FNAL: Of these waves, as seen in the diagram, four went to FNAL via StarLight, two to SLAC via ESnet (one routed, the other dedicated Layer 2 via USN), and one to UKLight.

Storage

StorCloud Logical configuration | Caltech StorCloud connection | StorCloud WeatherMap
We had 100 Gbits/s of fibre channel capacity from the Caltech and SLAC/FNAL booths to StorCloud. All partitions were striped across the 160 disks, i.e. as if 160 spindles served a single volume. We striped 2 FC volumes in each node to make a single RAID0 device using the mdadm tool and a large 1024 KB chunk size:
mdadm -Cv -f /dev/md0 -l 0 -c1024 -n2 /dev/sd{b,c}
Each LUN from StorCloud showed up as 750 GB of disk space in the system. The storage was provided by 3PARdata and consisted of 2 systems, where each system could achieve 2.8 GBytes/s read and 1.6 GBytes/s write. The HBAs were loaned from QLogic.
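To illustrate what the RAID0 layout created by the mdadm command above means for I/O, here is a minimal sketch of how a byte offset in the striped volume maps onto the two member volumes. It assumes equal-sized members, so that 1024 KB chunks simply alternate between them; `raid0_locate` is a hypothetical helper for illustration and is not part of mdadm.

```python
CHUNK_BYTES = 1024 * 1024  # 1024 KB chunk, matching "-c1024"
N_DEVICES = 2              # two FC volumes per node, matching "-n2"


def raid0_locate(logical_offset, chunk=CHUNK_BYTES, n_devices=N_DEVICES):
    """Map a byte offset in the striped volume to (device index, offset on device).

    Assumes equal-sized member devices, so consecutive chunks are laid out
    round-robin across the devices.
    """
    chunk_index, within_chunk = divmod(logical_offset, chunk)
    device = chunk_index % n_devices
    device_offset = (chunk_index // n_devices) * chunk + within_chunk
    return device, device_offset


# The first four chunks alternate between the two member volumes.
for i in range(4):
    print(i, raid0_locate(i * CHUNK_BYTES))
```

A large sequential transfer therefore touches both volumes in turn, which is why striping lets one node draw on the bandwidth of both FC volumes at once.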

The two booths were provisioned as follows:

Clusters

SLAC racks | FNAL racks | Caltech racks | Caltech configuration | Host configurations | SLAC equipment | Caltech Equipment
Caltech:
SLAC:
FNAL: At FNAL there was a large dCache cluster. At SC2005 there was a LambdaStation.

Applications

We transferred files of HEP data using the SLAC developed bbcp application. We also displayed single events from the BaBar detector using the WIRED visualization application.

We monitored and displayed network performance using the Caltech developed MonALISA application.

We also used MonALISA to control waves on the network using GMPLS, in collaboration with Cisco, Pacific Wave and Calient.

SLAC used the SLAC-developed xrootd application to fetch BaBar events between mini-Petacache clusters at SLAC and SC05. We used 3 pairs of hosts per 10 Gbits/s wave with 125 xrootd clients per receiving host. Using standard Linux 2.6.12 New Reno TCP on a single 10 Gbits/s wave we were able to achieve over 9.7 Gbits/s in one direction and over 16 Gbits/s peak for 5 minutes in two directions simultaneously.
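A rough sketch of what the xrootd configuration above implies per client follows. It assumes one receiving host per host pair in a given direction and an even share of the 9.7 Gbits/s one-way rate across all clients; both are simplifications, and the variable names are illustrative only.

```python
HOST_PAIRS_PER_WAVE = 3           # from the text above
CLIENTS_PER_RECEIVING_HOST = 125  # from the text above
ONE_WAY_GBPS = 9.7                # one-way rate achieved on a single wave

# Assumption: one receiving host per pair, so clients scale with the pair count.
clients_per_wave = HOST_PAIRS_PER_WAVE * CLIENTS_PER_RECEIVING_HOST

# Assumption: the load is shared evenly across host pairs and clients.
gbps_per_host_pair = ONE_WAY_GBPS / HOST_PAIRS_PER_WAVE
mbps_per_client = ONE_WAY_GBPS * 1000 / clients_per_wave

print(f"clients per wave   : {clients_per_wave}")
print(f"rate per host pair : {gbps_per_host_pair:.2f} Gbits/s")
print(f"rate per client    : {mbps_per_client:.1f} Mbits/s")
```

Seen this way, filling the wave relied on many concurrent clients of modest individual rate rather than a few very fast streams.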

Documentation etc.

Photos | Award ceremony photo
FNAL/SLAC: Open Science Grid at SC2005 | Caltech Handout | PingER poster | Lambda Station Presentation | Bandwidth Challenge presentation | KEK presentation
Posters: BandWidth Challenge | ESLEA | Mini-Petacache | Logos

Logos of Participants/Partners

CERN FIU KAIST KEK Kyungpook University U U Florida
Abilene Level(3) NLR UltraLight
Ciena Cisco Chelsio Neterion QLogic
Boston HP NexSan Sun Microsystems 3PARdata

Julian Bunn's page | SLAC/Manchester Setup | eVLBI setup | Bandwidth and GMPLS demos | SLAC logo (Adobe Illustrator)

More on bulk throughput
Bulk throughput measurements | Bulk throughput simulation | Windows vs. streams | Effect of load on RTT and loss | Bulk file transfer measurements | FAST TCP Stack Measurements | QBSS measurements

Demonstrations
SC2001 challenge | iGrid2002 demonstration | SC2002 SLAC/FNAL SC2002 bandwidth challenge | SC2003 bandwidth challenge | SC2004 bandwidth challenge

Created June 15, 2005: Les Cottrell, SLAC