IEPM

Comparison of Performance between ESnet, .Edu, and XIWT sites


Introduction

Most of the IEPM/PingER measurements in the U.S. focus on sites that are connected to National Research and Education Networks (NRENs), in particular ESnet and Internet 2. PingER Deployment provides more information on the deployment of the IEPM/PingER project. ESnet provides connectivity to the DoE-run national labs and to universities with major DoE ER programs. Internet 2 provides connectivity for the research and higher education community of the U.S. Most of the universities monitored by IEPM/PingER are connected via Internet 2 or have plans to connect.

In addition to the IEPM/PingER project, the Cross Industry Working Team (XIWT) has also deployed the PingER tools. 70% of the XIWT monitor-remote site pairs are in the .com domain.

Thus for the U.S. we have data for pairs of sites in very different domains:

  1. ESnet sites seen from 3 ESnet monitoring sites (BNL, HEPNRC/FNAL and SLAC);
  2. .Edu (mainly Internet 2 sites) seen from 2 Edu sites (CMU and UMD);
  3. XIWT (mainly .com sites) seen from 8 sites (Bell South, Compaq, DirecPC, HP, WestGroup, NIST, CNRI and Intel).
We can thus compare the results from these 3 domains to see how performance stands today and to look at its history. The graph below shows the results; the error bars mark the 75th and 25th percentiles of the distributions:
ESnet vs. Edu vs XIWT loss
It can be seen that all three sets have good performance (better than 1% loss). Until recently, at least, ESnet had the best performance. To within the statistical accuracy, ESnet performance has been holding steady at < 0.1% loss. Edu pair performance has been improving as more sites are connected to one of the Internet 2 backbones: vBNS or Abilene. Edu performance is approaching that of ESnet-to-ESnet pairs. XIWT packet loss is about 3-5 times higher than for ESnet or .Edu pairs.
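The median and quartile error bars described for the graph above can be computed along the following lines. This is a minimal sketch, not the PingER analysis code: the function name `loss_summary`, the sample values, and the choice of the "inclusive" quantile interpolation are all assumptions made for illustration.

```python
from statistics import quantiles

def loss_summary(pair_losses):
    """Return (25th percentile, median, 75th percentile) of per-pair loss in percent."""
    # n=4 yields the three quartile cut points; the interpolation method
    # is an assumption, since the page does not say which one PingER uses.
    q1, q2, q3 = quantiles(pair_losses, n=4, method="inclusive")
    return q1, q2, q3

# Hypothetical loss percentages for five site pairs in one month (not PingER data).
example_losses = [0.05, 0.08, 0.10, 0.20, 0.60]
q1, med, q3 = loss_summary(example_losses)
print(f"median {med:.2f}%, error bar from {q1:.2f}% to {q3:.2f}%")
```

Using the median and quartiles rather than the mean and standard deviation keeps a single badly behaved pair from dominating the summary for its group.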

Looking in more detail at the ESnet to ESnet measurements, they include the 3 monitoring sites (BNL, HEPNRC and SLAC) and about 7 remote sites (ANL, BNL, FNAL, JLAB, LBL, ORNL, LLNL and SLAC). The results, broken out by monitoring site, are shown in the graph below.
ESnet Labs to ESnet Labs packet loss
The lines are linear least squares fits to exponentials, intended to guide the eye. It can be seen that the SLAC (green open triangles) and HEPNRC results are holding fairly steady, while the loss seen from BNL (red open squares) has increased by about 30% since January 1998.
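A linear least squares fit to an exponential is commonly done by fitting a straight line to the logarithm of the values. The sketch below illustrates that approach under stated assumptions: the function name `fit_exponential` and the data series are hypothetical, and the page does not say exactly how its fits were produced.

```python
import math

def fit_exponential(times, values):
    """Fit values ~ a * exp(b * t) by ordinary least squares on ln(values).

    All values must be positive, since the logarithm is taken.
    """
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    b = sxy / sxx                         # slope of the line in log space
    a = math.exp(mean_y - b * mean_t)     # intercept, mapped back from log space
    return a, b

# Hypothetical series: months since the start of monitoring, loss in percent.
months = [0, 6, 12, 18, 24]
loss = [0.10, 0.11, 0.11, 0.12, 0.13]
a, b = fit_exponential(months, loss)
print(f"fitted loss at month 0: {a:.3f}%, fractional change per month: {math.exp(b) - 1:.4f}")
```

Note that fitting in log space minimizes relative rather than absolute deviations, which suits loss rates that span a wide range.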
Looking in more detail at BNL, it is not immediately apparent why the increase should have occurred. According to Jim Leighton of ESnet, the traffic load there fits well within the T3 capacity. The packet loss is low. The half-hourly PingER loss and RTT measured from BNL to SLAC and from SLAC to BNL are shown below for August 1999 through January 2000 (note that the 100% losses contribute to unreachability but not to the losses reported above).
BNL - SLAC PingER RTT & Loss Aug-99 thru Jan-00
SLAC - BNL PingER RTT & Loss Aug-99 thru Jan-00
The plots show that the RTT is larger and more variable when measured from BNL to SLAC than when measured from SLAC to BNL. Given that ping measures the round trip, which covers both directions of the path whichever end starts it, this may be due to the monitoring host at BNL being more loaded than the one at SLAC.
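The distinction drawn above, where samples with 100% loss count toward unreachability rather than toward the reported loss, can be sketched as follows. The function name `summarize_samples`, the use of a mean, and the sample values are assumptions for illustration; the actual PingER aggregation may differ.

```python
def summarize_samples(loss_percentages):
    """Split half-hourly samples into reachable and unreachable ones.

    Returns (mean loss over reachable samples, fraction of samples unreachable).
    A sample with 100% loss is treated as "unreachable" and left out of the loss figure.
    """
    if not loss_percentages:
        return None, None
    reachable = [p for p in loss_percentages if p < 100.0]
    unreachable_count = len(loss_percentages) - len(reachable)
    mean_loss = sum(reachable) / len(reachable) if reachable else None
    unreachability = unreachable_count / len(loss_percentages)
    return mean_loss, unreachability

# Hypothetical samples: three normal measurements and one total outage.
mean_loss, unreachability = summarize_samples([0.0, 0.0, 10.0, 100.0])
print(mean_loss, unreachability)
```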
Looking at the SLAC archived PingER data, it appears that the losses observed Jan 1-Jan 5 are related to something at BNL, since the measurements from SLAC to ESnet-Labs show losses only to BNL, whereas BNL shows losses to all ESnet-Labs for this period. Mike O'Connor (BNL) reports that "The losses at BNL Jan. 3rd through the 5th were caused by an Ethernet switch we inserted into the FE link between the site firewall and our border gateway router (horus), the switch was replaced following autonegotiation problems."
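The reasoning above, that a problem is local to a site when every lossy monitor-remote pair involves that site, can be sketched as a simple set intersection. The function name `suspect_sites`, the 1% threshold, and the loss values are hypothetical and chosen only to illustrate the idea.

```python
def suspect_sites(loss_by_pair, threshold=1.0):
    """Return the sites common to every pair whose loss exceeds `threshold` percent.

    loss_by_pair maps (monitor, remote) -> loss in percent.
    An empty result means no single site explains all of the lossy pairs.
    """
    lossy = [pair for pair, loss in loss_by_pair.items() if loss > threshold]
    if not lossy:
        return set()
    common = set(lossy[0])
    for pair in lossy[1:]:
        common &= set(pair)
    return common

# Hypothetical losses (percent) for one period; not measured values.
example = {
    ("SLAC", "BNL"): 5.0,
    ("SLAC", "FNAL"): 0.0,
    ("SLAC", "ANL"): 0.1,
    ("BNL", "SLAC"): 6.0,
    ("BNL", "FNAL"): 4.0,
    ("BNL", "ANL"): 5.0,
}
print(suspect_sites(example))
```

With these values every lossy pair has BNL at one end, so BNL is the only site returned.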

Looking at the utilization of the BNL border router below (plot provided by Chin Guok of ESnet), it can be seen that several times over the last year there were huge bursts of traffic (above 45 Mbps).
MRTG plot of BNL border router ESnet 43Mbps port utilization
The discard graph for the ATM interface showed corresponding peaks, see below.
MRTG plot of BNL ESnet ATM interface packet discards

Now if you look at the peaks for BNL on the 'ESnet to ESnet Labs median % packet loss' graph above, they fall pretty close to the discard peaks in the chart above. Chin Guok's guess is that BNL was over-driving the DS3 with their Fast Ethernet, and the PingER pings were getting dropped.
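The comparison of loss peaks with discard peaks was made by eye from the graphs. One way to make such a comparison explicit is sketched below, assuming the peak dates have already been read off each plot. The function name `matched_peaks`, the one-week tolerance, and the dates are all hypothetical.

```python
from datetime import date, timedelta

def matched_peaks(loss_peaks, discard_peaks, tolerance=timedelta(days=7)):
    """Return the loss-peak dates that lie within `tolerance` of any discard-peak date."""
    return [
        p for p in loss_peaks
        if any(abs(p - d) <= tolerance for d in discard_peaks)
    ]

# Hypothetical peak dates; these are not values read from the MRTG or PingER plots.
loss_peaks = [date(1999, 3, 10), date(1999, 7, 2), date(1999, 11, 20)]
discard_peaks = [date(1999, 3, 12), date(1999, 11, 18)]
print(matched_peaks(loss_peaks, discard_peaks))
```

A high fraction of matched peaks would be consistent with the explanation above, although it would not by itself prove that the discards caused the ping losses.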


Created 4 August 1999, updated 30 Jan, 2000.
URL: http://www-iepm.slac.stanford.edu/pinger/tools.html
Comments to iepm-l@slac.stanford.edu