IEPM

SC2000 bulk thruput measurements to SLAC

Introduction

SLAC participated with FNAL in the SC2000 show in Dallas to illustrate the needs and challenges of data-intensive science, in particular for the Particle Physics Data Grid (PPDG). At the show our booth had connectivity via SCInet to Internet2 and to NTON. We measured the thruput on both links during the show. This was done in the spirit of the SC2000 Network Challenge, though we did not officially enter the challenge since we did not expect to have NTON connectivity to SLAC in time. The NTON link to SLAC went live late Monday afternoon, November 6, 2000. The NTON link to SCInet was an OC48 (2.4 Gbits/s) packet-over-SONET link.

On the show floor in our booth there were two Dell Intel PCs: one a dual 533 MHz Pentium III PowerEdge with a 64-bit PCI bus, the other a single-processor machine running at 833 MHz with a 32-bit PCI bus. Both were running the Linux 2.4-test10 kernel, and both had 3Com Gigabit Ethernet (GigE) interfaces to a Cisco Catalyst 6009 (with a SUP1A supervisor and an MSFC) graciously loaned to us by Cisco for the duration of the show. The Catalyst 6009 had two GigE interfaces to SCInet; these were bonded together using Gigabit EtherChannel and connected to an Extreme Networks switch at the SCInet NOC. At the SLAC end the NTON OC48-POS came into a Cisco 12012 GSR router. From the GSR there was a GigE interface via a Catalyst 5500 to pharlap.slac.stanford.edu (a Sun E4500 with 6 processors running at 336 MHz), and a second GigE interface, also via a Catalyst 5500, to datamove5.slac.stanford.edu (a Sun E4500 with 4 CPUs running at 400 MHz).

The measurements were made by Davide Salomoni and Steffen Luitz, with help from Les Cottrell, all of SLAC.

NTON tests

The early tests ran into problems with packet losses on the show floor and elsewhere. On the morning of Thursday 11/9/00 (the last day of the show), our tests showed a peak transfer rate from the booth in Dallas to SLAC via NTON of around 990 Mbits/s, achieved using the two PCs on the floor and two Suns (pharlap and datamove5) at SLAC. A screenshot of the application (iperf) shows the peak rate as measured in MBytes/s by the MIB variables on the Catalyst interface ports. The counters were not read on the MSFC router, since with the MSFC doing hardware switching its counters did not reflect the actual load. The best results were achieved with a 128 KByte window size and 25 parallel streams; bigger window sizes caused noticeable performance degradation.
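
For illustration, a minimal sketch of how such a transfer could be driven from one of the booth PCs with the iperf command-line client; the receiver host name, test duration, and report interval below are assumptions for the example, not the exact values used at the show:

    import subprocess

    # TCP memory-to-memory test: 128 KByte socket window and 25 parallel
    # streams (the combination that gave the best results), reporting every
    # 2 seconds. The far end runs "iperf -s" with a matching window size.
    cmd = [
        "iperf",
        "-c", "pharlap.slac.stanford.edu",  # receiving host (assumed for the example)
        "-w", "128K",                       # TCP window (socket buffer) size
        "-P", "25",                         # number of parallel TCP streams
        "-t", "60",                         # test duration in seconds (illustrative)
        "-i", "2",                          # report interval in seconds
    ]
    subprocess.run(cmd, check=True)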

In the last timeslot we also made another interesting experiment: pushing data via UDP as fast as possible to SLAC and seeing what arrived at the GSR. With our two PCs we achieved a sending rate of about 1.25 Gbits/s (1 Gbit/s from the Dell PowerEdge and 250 Mbits/s from the other PC; we do not understand why the latter's rate was only 250 Mbits/s). Unfortunately we could not take advantage of the Gigabit EtherChannel to SCInet, since the two PCs happened to be mapped to the same interface. Of the 1 Gbit/s of raw bandwidth we sent, about 975 Mbits/s (5-minute average) arrived at the GSR at SLAC, which is still an interesting result.
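
The UDP counterpart can be sketched the same way, again with an assumed host name, target rate, and duration (classic iperf takes the offered rate via -b):

    import subprocess

    # UDP blast: ask iperf to send at (close to) a 1 Gbit/s target rate and
    # let the receiver ("iperf -s -u") report how much actually arrived.
    subprocess.run([
        "iperf", "-u",
        "-c", "pharlap.slac.stanford.edu",  # receiving host (assumed for the example)
        "-b", "1000M",                      # target sending rate in bits/s
        "-t", "300",                        # duration in seconds (illustrative)
    ], check=True)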

The 990 Mbits/s was measured over a peak lasting a few seconds (see the screenshot), using 2-second sampling (as opposed to the 5-second sampling used in the bandwidth challenge).
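
The arithmetic behind such counter-based rate estimates is simple; a sketch, assuming 64-bit octet counters of the ifHCInOctets kind (the counter width and sample values here are assumptions):

    def rate_mbps(octets_t0, octets_t1, interval_s, counter_bits=64):
        """Convert two readings of an interface octet counter, taken
        interval_s seconds apart, into Mbits/s, allowing for counter wrap."""
        delta = octets_t1 - octets_t0
        if delta < 0:                       # counter wrapped between samples
            delta += 2 ** counter_bits
        return delta * 8 / interval_s / 1e6

    # Example: about 247.5 MBytes more in a 2-second sample is 990 Mbits/s.
    print(rate_mbps(0, 247_500_000, 2.0))   # -> 990.0

The shorter the sampling interval, the less a brief peak is smoothed away, which is why 2-second samples can show a peak that a 5-minute average would hide.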

This thruput (roughly 124 MBytes/s) exceeds the goal of 100 MBytes/s that was set by the PPDG early in 2000.

Internet2 tests

In another test we also tried to pump traffic from the booth to SLAC using the normal Internet2/Stanford link, and we saw (as shown by rtr-msfc-dmz on a 5-minute average) slightly more than 300 Mbits/s sustained coming into SLAC, plus around 40 Mbits/s going out. The CPU utilization on the MSFC was around 81%.

The path characterization shows the route and some pchar estimates for each of the links along it.

Ping measurements (56-byte pings separated by 1-second intervals) from SLAC to the booth, starting at 13:56 PST on Thursday November 9th just before the show closed and we lost connectivity, showed a min/avg/max RTT of 48.6/53.1/172 msec and no loss in 370 pings. The ping RTT distribution is shown below.
Ping frequency distribution from SLAC to SC2000
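
The summary numbers and the distribution in the figure can be reproduced from the raw RTT samples; a minimal sketch (the 5 msec bin width is an assumption):

    def summarize_pings(rtts_ms, sent, bin_ms=5):
        """Given the RTTs (msec) of the pings that returned and the number
        sent, return (min, avg, max), the loss fraction, and a coarse
        frequency distribution of RTTs like the plot above."""
        stats = (min(rtts_ms), sum(rtts_ms) / len(rtts_ms), max(rtts_ms))
        loss = 1.0 - len(rtts_ms) / sent
        hist = {}
        for rtt in rtts_ms:
            left_edge = int(rtt // bin_ms) * bin_ms   # left edge of the bin
            hist[left_edge] = hist.get(left_edge, 0) + 1
        return stats, loss, hist

    # For the 370 pings above this would give roughly (48.6, 53.1, 172) msec
    # and a loss fraction of 0.0.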

SC2000 floor tests

We also made tests with iperf running on the same PowerEdge in the SLAC booth, sending TCP data to a dual-CPU 733 MHz Dell host running Red Hat Linux (kernel 2.2.12) at the Caltech SC2000 booth. With this we achieved only 300 Mbits/s. With the PowerEdge sending iperf TCP data to both the Caltech machine and the second Dell in the SLAC booth we achieved 800 Mbits/s; at this rate the PowerEdge was saturated.
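
The two-destination test can be sketched as two concurrent iperf clients started from the same sender, with the aggregate of their reports giving the combined sending rate (the host names and duration below are placeholders):

    import subprocess

    # One iperf TCP client per destination, run concurrently from the
    # PowerEdge; waiting on both lets each print its own throughput report.
    destinations = ["caltech-booth-host", "slac-booth-pc2"]  # placeholder names
    clients = [subprocess.Popen(["iperf", "-c", host, "-t", "60"])
               for host in destinations]
    for client in clients:
        client.wait()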

Conclusions

The ready availability in the future of such high-speed connectivity on wide-area links will change how we approach things and open up new applications requiring the transfer of large amounts of data. Such applications include data-intensive science and multimedia. Some examples are given below.

Acknowledgements

We would especially like to thank the following for their assistance during the show: Richard Mount of SLAC for encouraging and challenging us to "go for it"; Dave Millsom of SLAC for making the NTON link to SLAC work just in time; Hal Edwards of Nortel and Bill Lennon of LLNL for providing the NTON connection at SLAC; Paul Dasprit for coordinating NTON activities at SC2000; Bill Wing of ORNL for coordinating our connection to SCInet; and Cisco for loaning us the Catalyst 6009 and MSFC.

Created November 9, 2000; last updated November 16, 2000.
Comments to iepm-l@slac.stanford.edu