Bulk Throughput Measurements - U Michigan


SLAC to U Michigan

On November 17, 2001, measurements were made between pharlap at SLAC and speedy at the University of Michigan. Speedy was a dual 797MHz PC running Linux 2.4; its site was connected to Internet 2 via an OC12 (622Mbps) link. Pharlap was a Sun E4500 with 6*336MHz CPUs and a GE interface running Solaris 5.8. SLAC had a 1Gbps link to the Stanford campus and from there a 622Mbps link to CalREN and Internet 2. The routes in both directions used Abilene. The ping response for 119 default-length (64 Byte) pings from SLAC to U Michigan was min/avg/max (mdev) = 67/67.24/67.5 (0.29) ms. The pipechar output from SLAC to U Michigan was also recorded, as were the losses to individual nodes along the path, measured with pingroute. The losses were higher for large (1000Byte) packets than for small (100Byte) packets: 8 in 1000 versus 0 in 1000. We also tried 1000Byte "fast" pings (i.e. sending a packet as soon as a response is received, without waiting for the regular timeout period); the loss rate was 0 in 1000 pings.

The window buffer sizes on pharlap are shown below:
ndd /dev/tcp tcp_max_buf = 4194304
ndd /dev/tcp tcp_cwnd_max = 2097152
ndd /dev/tcp tcp_xmit_hiwat = 16384
ndd /dev/tcp tcp_recv_hiwat = 24576

The window buffer sizes on speedy are shown below:

more /proc/sys/net/core/wmem_max = 1677721600
more /proc/sys/net/core/rmem_max = 1677721600
more /proc/sys/net/core/rmem_default = 1677721600
more /proc/sys/net/core/wmem_default = 1677721600
more /proc/sys/net/ipv4/tcp_rmem = 4096        87380   174760
more /proc/sys/net/ipv4/tcp_wmem = 4096        16384   131072
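
To put the buffer settings listed above in context, the following rough sketch computes the bandwidth-delay product of the path and the throughput ceiling a single TCP stream can reach with a given window (window / RTT). It uses the 67.24 ms average RTT and the 622Mbps link speed quoted earlier. The function names are illustrative only, and the calculation assumes the window is the sole limit (no loss, no host or disk bottleneck), which is unlikely to hold exactly on a real path.

```python
# Rough window-vs-RTT arithmetic; names are illustrative, not from the original page.

RTT_S = 67.24e-3      # average ping RTT from SLAC to U Michigan, in seconds
LINK_BPS = 622e6      # OC12 bottleneck, in bits per second


def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return bandwidth_bps * rtt_s / 8.0


def window_limited_mbps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on one stream's throughput when only the window limits it."""
    return window_bytes * 8.0 / rtt_s / 1e6


if __name__ == "__main__":
    print(f"BDP: {bdp_bytes(LINK_BPS, RTT_S) / 1e6:.2f} MBytes")
    # 16384 is the tcp_xmit_hiwat default on pharlap; 256 KBytes is one of the
    # iperf windows used in the measurements.
    for window in (16384, 256 * 1024, 4096 * 1024):
        print(f"{window:>8} bytes -> {window_limited_mbps(window, RTT_S):.2f} Mbits/s")
```

Under these assumptions the path needs roughly 5 MBytes in flight to fill the OC12, while the 16384-byte default send buffer would cap a single stream at about 2 Mbits/s unless the application requests a larger window.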

The iperf throughput from SLAC to UMich as a function of streams and window size, seen below, appears to indicate that streams are more effective than windows in achieving high throughput, and that even the maximum number of streams available from iperf (40) is not sufficient. The maxima (the top 10% of throughputs) are above 8.5Mbits/s, which is quite low.
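
One possible back-of-envelope explanation for why streams help more than windows is the well-known Mathis et al. steady-state model, in which the throughput of a single TCP stream under random loss is bounded by roughly C * MSS / (RTT * sqrt(p)), with C about sqrt(3/2). This model is not part of the measurements reported here; it is offered only as a sketch. The MSS of 1460 bytes is an assumption, and the loss probability is borrowed from the 8-in-1000 ping loss seen with 1000Byte packets, which may well differ from the loss TCP actually experienced. The aggregate estimate further assumes that streams lose packets independently.

```python
import math

# Illustrative only: Mathis-style loss-limited throughput estimate.
# MSS is an assumed value; loss rate is taken from the 1000Byte ping results.


def mathis_mbps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Approximate loss-limited throughput of one TCP stream, in Mbits/s."""
    c = math.sqrt(1.5)  # constant for periodic loss without delayed ACKs
    return c * mss_bytes * 8.0 / (rtt_s * math.sqrt(loss)) / 1e6


def aggregate_mbps(n_streams: int, mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Naive aggregate: n independent streams each reaching the single-stream bound."""
    return n_streams * mathis_mbps(mss_bytes, rtt_s, loss)


if __name__ == "__main__":
    per_stream = mathis_mbps(1460, 67.24e-3, 8 / 1000)
    print(f"one stream: {per_stream:.2f} Mbits/s")
    print(f"40 streams: {aggregate_mbps(40, 1460, 67.24e-3, 8 / 1000):.1f} Mbits/s")
```

Under these assumptions a single stream is loss-limited to a few Mbits/s regardless of window size, so adding streams raises the aggregate while enlarging the window does not, which is at least qualitatively consistent with the observation above.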

The iperf throughput as a function of time for a window of 256KBytes and 3 streams, shown in the second of the two graphs below, appears to have dropped from about 70-80Mbits/s to under 1Mbit/s around November 15th.
To try to uncover the reason for this drop, we made iperf TCP throughput measurements for varying durations (3-350 secs), streams (1-25 streams), and window sizes (64KBytes - 4096KBytes). The only correlation seemed to be with time: as seen below, there are periods of a few hundred seconds where the throughput increases to over 60Mbits/s. The various smaller step functions in throughput reflect changes in the number of parallel streams (from 1 to 20 streams at about 13000 seconds, from 20 to 5 streams at 26000 seconds, from 5 to 12 streams at 41000 seconds and from 12 to 8 streams at about 53000 seconds).
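
A parameter sweep of this kind can be sketched as follows. This is not the script that was actually used; it only builds iperf client command lines over a grid of durations, stream counts and window sizes, using the standard iperf flags `-c` (server), `-t` (duration in seconds), `-P` (parallel streams) and `-w` (window size). The host name `receiver.example.org` and the specific grid points are placeholders chosen for illustration within the ranges quoted above.

```python
import itertools

# Illustrative sweep builder; host name and grid points are hypothetical.


def iperf_sweep_commands(host, durations_s, streams, windows_kb):
    """Return one iperf client command (as an argv list) per grid point."""
    commands = []
    for duration, n_streams, window_kb in itertools.product(
        durations_s, streams, windows_kb
    ):
        commands.append(
            [
                "iperf",
                "-c", host,               # connect to the iperf server
                "-t", str(duration),      # test duration in seconds
                "-P", str(n_streams),     # number of parallel streams
                "-w", f"{window_kb}K",    # requested TCP window size
            ]
        )
    return commands


if __name__ == "__main__":
    grid = iperf_sweep_commands(
        "receiver.example.org",           # placeholder, not the real host
        durations_s=[3, 30, 350],
        streams=[1, 5, 12, 25],
        windows_kb=[64, 256, 1024, 4096],
    )
    for argv in grid:
        print(" ".join(argv))
```

Each command could then be passed to `subprocess.run` and its output logged with a timestamp, so that throughput can be plotted against time as well as against streams and window size.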
We also measured the bbcp performance with varying windows and streams (first of the two graphs seen below). The large variability of the bbcp performance requires further study. Also noteworthy is the large discrepancy between the best performances measured with iperf and those measured with bbcp (much larger).

After we talked to Thomas Hacker of UMich, he suggested that the problem might be that we were monitoring the host by name and, since it is multi-homed, the measurements might not be using the new GE interface. We therefore repeated the above tests using the IP address of the GE interface. The results shown below indicate that we now see maxima of over 320Mbits/s.
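
One way to check for this kind of problem in advance is to list every address a host name resolves to before starting a test. The sketch below does this with Python's standard `socket.getaddrinfo`; the function name `resolve_ipv4` is illustrative. If a name returns several addresses, or an address that is not the intended interface, the measurement tool can be pointed at the desired IP address directly, as was done here.

```python
import socket

# Illustrative helper; not part of the original measurement tools.


def resolve_ipv4(name: str) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses that a name resolves to."""
    infos = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # "localhost" is used only so the example runs anywhere.
    for address in resolve_ipv4("localhost"):
        print(address)
```

Note that name resolution only shows which addresses are advertised for the name; which interface the remote host actually uses for a given address still has to be confirmed with its administrators.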

Created October 28, 2001, last update October 29, 2001.