Effect of Load on RTT and Loss



With the success of BaBar and the need to support multiple remote computer centers, high performance between the remote computer centers and SLAC became imperative. To help understand the performance, we set out to identify the bulk data flows, see how well they performed, identify where the bottlenecks were located, and identify the impact on other traffic. For more information on how we tuned the TCP stack and application to optimize bulk-data throughput, and on the impact on hosts etc., see High Performance Throughput Tuning/Measurements. For measurements made on a test network, see Internet2 Land Speed Record.

SLAC Link Traffic

SLAC has 2 high speed production links to the Internet: The utilisation of the SLAC ATM link is seen below in an MRTG plot. It is seen that for long periods (days at a time), the link is saturated and during this period there was more traffic to SLAC (green) than from SLAC (blue).
SLAC ATM link load
The protocol distribution is seen below. The data are obtained from the JNFlow tool, which analyzes and reports on Netflow data from the SLAC border router interface to the ESnet ATM. In this period (the 24 hours prior to 9:00am 8/26/00) there is more traffic inbound to SLAC (the negative numbers). The main protocols are FTP, TCP Other (this is mainly Objectivity database traffic), and ssh, all of which are TCP based. There is also some UDP traffic supporting the Andrew File System (AFS). Outbound the traffic is mainly FTP.
Protocols for 24 hours
The top 25 sources and destinations are shown in the following graphs.
Top 25 sources
The top 14 communicating pairs are shown in the table below:
Source                       Dest                    Protocol                           Packets   Bytes
DATAMOVE3.SLAC.Stanford.EDU                          TCP:51823/4020                     61141037  47420682306
FTP2.SLAC.Stanford.EDU                               TCP:ftp-data                       9467220   13094015557
FTP2.SLAC.Stanford.EDU       tourmalet.Colorado.EDU  TCP:ftp-data                       5174732   7271402667
NORIC03.SLAC.Stanford.EDU                            TCP:ssh                            850924    1272630791
DATAMOVE6.SLAC.Stanford.EDU                          TCP:6779/3954                      844144    1187437954
DATAMOVE6.SLAC.Stanford.EDU                          TCP:6779/2155                      841254    1182204791
LOWRIE.SLAC.Stanford.EDU                             TCP:ssh                            706996    1059782372
DATAMOVE1.SLAC.Stanford.EDU                          TCP:6779/4946                      1177886   976244220
DATAMOVE1.SLAC.Stanford.EDU                          TCP:6779/3965                      1132436   962653009
DATAMOVE6.SLAC.Stanford.EDU                          TCP:6779/3879                      622026    874387920
DATAMOVE3.SLAC.Stanford.EDU                          TCP:ssh                            15733257  818168976
AFS05.SLAC.Stanford.EDU      neutrino3.Stanford.EDU  UDP:afs3-callback/afs3-fileserver  536720    767098934
AFS09.SLAC.Stanford.EDU                              UDP:afs3-callback/afs3-fileserver  551399    752882947
DATAMOVE1.SLAC.Stanford.EDU                          TCP:6779/4131                      643259    666342669


For the record the traceroutes from SLAC to the top 5 sites were recorded.


To try to characterize the paths to the top 5 sites we used pathchar. Pathchar allows a user to find the bandwidth, delay, average queue and loss rate of every hop between a source and destination on the Internet. Pathchar is a useful tool, but it does not give exact results and the error is not negligible; in our case all results between 20 and 30 Mbps should probably be considered equivalent.

Performance between SLAC and top communicating sites

The PingER RTT and losses for the top 5 sites for the week up until 9am 8/25/00 are shown below. The generally good tracking between similar hosts and the lack of tracking between sites indicate that the increases in RTT are not due to a common cause across all sites. This would appear to rule out congestion on the SLAC-ESnet ATM link as a common cause of the increases in RTT.
The transfer rates measured over 5 minute intervals when the top sites are communicating heavily with SLAC vary. This may be partially due to multiple transfers competing for network bandwidth, though it may also be partially due to the applications and hosts.
5-37Mbps, avg=15Mbps
3.6-27Mbps, avg=10.1Mbps

One way performance

Surveyor provides one way losses and delays between SLAC and the UK (UKERNA), Colorado and CERN. It is seen that there is considerable asymmetry in the performance, the one way delays to SLAC being more variable (a sign of more congestion). As mentioned above, most of the traffic is going to SLAC, so this is consistent with the asymmetry.
One way delays to CERN One way delays to Colorado One way delays to UK
Below we show the one way delay and losses of the link from Colorado to SLAC. It is seen that around 18:30 UTC (11:30am PDT) the median one way delay (green dots) increases by a factor of 3 to 4.
Delay Colorado to SLAC Loss Colorado to SLAC
The routes between Colorado and SLAC are shown below. No changes in the routes were noted in the 24 hours (the routes were measured every 20 minutes).
Route Colorado to SLAC Route SLAC to Colorado

Active measurements

To provide further insight we used iperf to generate TCP traffic from SLAC (a Sun Ultra 2 running Solaris 5.6 without the SACK extension) to CERN. Iperf was set to have 256kbyte windows and 10 parallel streams. We ran iperf in this fashion for 40 minutes from 17:19:17 8/26/00 thru 17:59:21 8/26/00, simultaneously measuring the ping RTT and loss (sending a 100byte ping once a second with a timeout of 20 seconds). While doing this we also observed the link utilization. The aggregate measured throughput from SLAC to CERN was 20Mbps (26Mbps), which is close to the bottleneck bandwidth. The ping loss was less than 0.05% (0 pings lost in 2400 sent), the minimum ping RTT was 166ms, the average 258ms (226ms) and the maximum was 411ms (373ms). We followed this up by measuring the ping RTT and loss for 40 minutes without generating any iperf traffic, starting at 18:53 and ending at 19:33 8/26/00. In this case there was no packet loss (2400 pings sent, 2400 received) and the minimum RTT was 166ms, the average was 167ms and the maximum was 189ms. The loading on the CERN to USA link and on the SLAC to ESnet ATM link is shown below. The blue peak just around 17:00 hours PDT on Aug 26 on the SLAC-ESnet ATM link and the green peak at about 2:30 on the CERN-USA link correspond to the iperf traffic.
SLAC ESnet ATM load
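The "less than 0.05%" bound on ping loss follows from the sample size: with one ping per second for 40 minutes and none lost, the smallest loss that could have been registered is a single packet. A minimal sketch of that arithmetic (the function names are illustrative and not taken from any tool used for these measurements):

```python
def loss_percent(sent, received):
    """Percentage of pings for which no reply was received."""
    return 100.0 * (sent - received) / sent


def loss_resolution_percent(sent):
    """Smallest non-zero loss that `sent` probes can resolve (one lost ping)."""
    return 100.0 / sent


pings_sent = 40 * 60  # one ping per second for 40 minutes
print(loss_percent(pings_sent, pings_sent))           # loss when every ping is answered
print(round(loss_resolution_percent(pings_sent), 3))  # one lost ping, in percent
```

With 2400 probes a single lost ping corresponds to roughly 0.04%, so zero observed losses can only be quoted as an upper bound of that order.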
Plots of the RTTs for the load and the no load measurements are shown below. Some of the statistics for the RTT in msec. are:
	Loaded	NoLoad	Units
Average	258.7	167.2	msec.
Stdev	52.3	0.9	msec.
Median	260	167	msec.
IQR	109	0	msec.
Min	166	166	msec.
Max	411	189	msec.
It can be seen that the iperf load appears to increase the average and median RTT by about 100msec and the distribution is much flatter for the loaded case. On a 25Mbit/sec link a queuing delay of 100msec. would correspond to about 2.5Mbits, or about 300kbytes of data, or about 210 packets with a maximum segment size of 1460 bytes. The increase in RTT with loading seen on the SLAC CERN link is similar to the increase in RTT observed on the Colorado link above. It is interesting that on the SLAC CERN link no ping packet loss was observed in either the loaded or unloaded case.
Load vs. noload RTTs
Load vs noload Hist
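The queue estimate quoted for the loaded case is simply link rate multiplied by the extra delay. A short sketch of the arithmetic, assuming a 25Mbit/sec bottleneck, 100msec of queuing delay and 1460-byte segments as in the text (the function name is illustrative):

```python
def queue_from_delay(link_mbps, delay_ms, mss_bytes=1460):
    """Data that must be queued to add `delay_ms` on a link of `link_mbps`.

    Returns (bits, bytes, packets); packets assumes every segment is full size.
    """
    bits = link_mbps * 1e6 * delay_ms / 1e3
    nbytes = bits / 8
    packets = nbytes / mss_bytes
    return bits, nbytes, packets


bits, nbytes, packets = queue_from_delay(25, 100)
print(f"{bits / 1e6:.1f} Mbits, about {nbytes / 1e3:.0f} kbytes, about {packets:.0f} packets")
```

Header overhead and smaller packets from other traffic are ignored, so the packet count is only a rough lower bound on the queue length.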
We also measured the effect of varying the TCP window size for iperf (with 10 parallel streams) on the RTT. To do this we used iperf to send TCP data from SLAC to CERN for 40 minutes at times when the SLAC to CERN link was not congested. At the same time, once a second, we measured the ping RTT and loss for 100 byte packets with a timeout of 20 seconds. This was repeated for various TCP window sizes from 8kbytes to 300 kbytes. For each measurement we noted the ping loss and RTT, and the iperf thruput and window size. The losses observed in these measurements were always less than or equal to 3 packets in 2400. The graph below shows the RTT versus window size. It is seen that the RTT (median, average, 90 percentile and IQR) increases steeply between a TCP window size of 55 kbytes and 64 kbytes. The magnitude of the increase is about 60-10msec. for the median and average RTTs. At the same time there is little change in the iperf thruput. The RTT distributions are also shown below to illustrate the marked change as one goes from a TCP window size of 55kbytes to 64kbytes.
RTT versus window size RTT dist for 2 window sizes
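One rough way to read the step between 55 and 64 kbytes is to compare the aggregate window of the 10 streams with what the path can carry in one round trip. The sketch below is an illustrative estimate rather than a measured result: it assumes the 166 msec minimum RTT and a bottleneck of roughly 25 Mbits/s quoted elsewhere on this page, takes 1 kbyte as 1000 bytes, and ignores protocol overhead and competing traffic.

```python
BOTTLENECK_MBPS = 25.0  # assumed bottleneck rate
MIN_RTT_MS = 166.0      # minimum ping RTT reported for SLAC to CERN


def window_limited_mbps(window_kbytes, streams, rtt_ms):
    """Ceiling on aggregate TCP throughput: one window per stream per RTT."""
    bits_in_flight = window_kbytes * 1000 * 8 * streams
    return bits_in_flight / (rtt_ms / 1000.0) / 1e6


for window in (8, 32, 55, 64, 256):
    ceiling = window_limited_mbps(window, streams=10, rtt_ms=MIN_RTT_MS)
    regime = "window-limited" if ceiling < BOTTLENECK_MBPS else "link-limited"
    print(f"{window:>3} kbytes: ceiling {ceiling:6.1f} Mbits/s ({regime})")
```

Under these assumptions the ceiling crosses the bottleneck rate at a little over 50 kbytes per stream, in the same neighborhood as the observed step. Beyond that point extra window can only sit in queues, which would be consistent with the RTT rising while the thruput changes little.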
To further study the impact of iperf and other link loading on the ping performance, we wrote a script that loops over a set of parallel stream counts (1, 2, 5, 10, 15) and a set of window sizes (8kB, 16kB, 32kB, 50kB, 55kB, 60kB, 64kB, 128kB, 256kB and 500kB). For each combination it sent TCP iperf traffic for 60 seconds (the loaded case) and simultaneously measured the iperf thruput and the ping minimum/average/maximum RTT and losses (one 100Byte ping/second with a timeout of 20 seconds). For the following 60 seconds it sent no iperf load but measured the ping RTT and loss again (this is referred to as the iperf unloaded case). The measurements were then repeated for a different window size.
From observing the link MRTG utilization plots it appears that when this link gets loaded (i.e. over 50% utilization) it stays that way for long intervals (one hour to days). Thus the differences in background (non test generated iperf) link utilization between the iperf loaded measurements and the iperf unloaded measurements should be small, since the measurements are made within 60 seconds of one another. To a fair approximation we therefore assume the background loads are similar for the iperf loaded and unloaded cases. Further, the thruput we achieve with iperf is expected to be dependent on the competing background load.
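The loaded/unloaded alternation described above can be sketched as a schedule generator. This is not the original script: the host name is a placeholder, the loop order is an assumption, and the ping flags are those of Linux iputils ping rather than the Solaris ping used at the time. The iperf options (`-c`, `-w`, `-P`, `-t`) are the standard iperf version 2 client options. The sketch only builds the command lines; it does not run them.

```python
from itertools import product

STREAMS = [1, 2, 5, 10, 15]
WINDOWS_KB = [8, 16, 32, 50, 55, 60, 64, 128, 256, 500]
INTERVAL_S = 60  # length of each loaded and each unloaded phase


def iperf_cmd(host, window_kb, streams, seconds=INTERVAL_S):
    # iperf 2 client: -w TCP window, -P parallel streams, -t duration in seconds
    return ["iperf", "-c", host, "-w", f"{window_kb}K",
            "-P", str(streams), "-t", str(seconds)]


def ping_cmd(host, seconds=INTERVAL_S):
    # ASSUMPTION: Linux iputils flags. -c count (one per second by default),
    # -s 100 bytes of payload, -W 20 second reply timeout.
    return ["ping", "-c", str(seconds), "-s", "100", "-W", "20", host]


def schedule(host):
    """Yield (phase, streams, window_kb, commands) for every loaded/unloaded pair."""
    for streams, window_kb in product(STREAMS, WINDOWS_KB):
        # Loaded phase: iperf and ping run side by side.
        yield ("loaded", streams, window_kb,
               [iperf_cmd(host, window_kb, streams), ping_cmd(host)])
        # Unloaded phase: ping only, immediately afterwards.
        yield ("unloaded", streams, window_kb, [ping_cmd(host)])


plan = list(schedule("remote.example.org"))  # placeholder host name
print(len(plan), "phases,", len(plan) * INTERVAL_S / 60, "minutes in total")
```

Keeping each unloaded phase directly after its loaded phase is what allows the background load to be treated as roughly the same for both.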
A scatter plot of the average and maximum loaded and unloaded RTTs versus the iperf measured thruput can be seen below. The power series curves are fits to the average RTTs and are to guide the eye. The R² values for the 2 curves are shown. It is seen that there is a strong correlation for the loaded average RTT with the iperf thruput and a medium correlation for the unloaded average RTT (uAvg). It is also seen that the loaded RTTs are higher than the unloaded by about 50 msec. Though not shown, the minimum RTTs differ by less than 3 msec. on average and the medians of the minimum RTTs are identical.
RTT vs Iperf Mbps

To look more closely at the effect of thruput loading on loss we made measurements over the weekend of September 30 thru October 1, 2000, with and without iperf loading for a longer period (12 hours) between SLAC and CERN and between SLAC and Caltech. At this time the SLAC link was lightly loaded (apart from our traffic), as was the CERN to USA link. The measurements were made in 2 sets: 256kByte window with 2 streams, and 64kByte window with 8 streams. The details of the methodology are available in Bulk thruput: windows versus streams. The results are shown in the table below (the numbers in parentheses are the losses for the unloaded case, i.e. iperf not running). It is seen that though the absolute loss is low (< 1%) in all cases, in the case of the SLAC to CERN link the impact of the iperf thruput is large (more than a factor of 20 in the loss percentage for the larger window case and about a factor of 5 for the smaller window/more streams case). It is also seen that the difference between the loaded and unloaded losses is greater for a given link when the thruput is greater (i.e. when we are using smaller windows and larger numbers of streams).

Destination        256kB window     64kB window
CERN Loss          0.26% (0.01%)    0.37% (0.07%)
Caltech Loss       0.21% (0.42%)    0.69% (0.39%)
CERN Thruput       8.82Mbits/s      24.7Mbits/s
Caltech Thruput    46Mbits/s        62.8Mbits/s
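The loaded-to-unloaded ratios and differences can be recomputed directly from the table. A small sketch, with the loss percentages copied from the table and the helper name `compare` purely illustrative:

```python
# (loaded %, unloaded %) ping loss, copied from the table above
LOSS = {
    ("CERN", "256kB"): (0.26, 0.01),
    ("CERN", "64kB"): (0.37, 0.07),
    ("Caltech", "256kB"): (0.21, 0.42),
    ("Caltech", "64kB"): (0.69, 0.39),
}


def compare(loaded, unloaded):
    """Return (ratio, difference in percentage points) of loaded vs unloaded loss."""
    return loaded / unloaded, loaded - unloaded


for (site, window), (loaded, unloaded) in LOSS.items():
    ratio, diff = compare(loaded, unloaded)
    print(f"{site:8s} {window:6s} ratio={ratio:5.1f}  difference={diff:+.2f} points")
```

Because the unloaded losses are so small, the ratios are sensitive to a handful of lost packets, so the differences are probably the more robust comparison.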




Created August 25, 1999