Progress Report – 28 April 2003

 

 

Project Title:     INCITE:  Edge-based Traffic Processing and Inference

                   for High-Performance Networks

 

 

 

 

2.  SLAC - Current Accomplishments  (through April 2003)

 

ABwE – Available Bandwidth Estimator (Tasks 2, 4)

 

We have installed, configured and tested a new method for available bandwidth[1] estimation (ABwE). During practical measurements, starting in November 2002, we have collected and evaluated hundreds of monitoring results over a wide range of paths and studied the behavior of this tool under real network conditions. In the past few months we have begun comparing these results with those of other bandwidth monitoring tools, particularly Iperf measurements.
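To illustrate the packet dispersion principle on which ABwE is built, here is a minimal sketch (not the actual ABwE code; the packet size and timing values are hypothetical): the capacity of the link that spaced a back-to-back packet pair is the packet size divided by the arrival gap.

    # Minimal sketch of the packet-pair dispersion principle used by
    # bandwidth estimators such as ABwE. Not the actual ABwE code;
    # the numbers are hypothetical.
    PACKET_SIZE_BITS = 1500 * 8  # a full-size 1500-byte Ethernet frame

    def capacity_from_dispersion(arrival_gap_s):
        """Capacity (bits/s) of the link that spaced a back-to-back
        packet pair: the narrower the link, the larger the gap."""
        return PACKET_SIZE_BITS / arrival_gap_s

    # Two 1500-byte packets sent back-to-back arrive 120 us apart,
    # implying a ~100 Mbit/s bottleneck (12000 bits / 120e-6 s).
    print(capacity_from_dispersion(120e-6) / 1e6, "Mbit/s")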

            The Iperf measurement is very popular and looks relatively simple. It is widely used by many ISPs and network specialists for different types of network measurement, including available bandwidth. Unfortunately, the Iperf results can be misleading because, for paths with large congestion windows, they are very dependent on parameters such as the number of parallel streams and TCP’s maximum window size. Setting the optimal parameters is quite a delicate process, and such “tuning” often uses quite sophisticated methods. Some of these methods have been developed at SLAC, and some are built into other network measurement tools, for example the nettest-2 tool developed at LBNL. During our analysis we discovered that even if the Iperf parameters have been set by these methods or by an expert, the optimal parameters do not stay constant: after some time (several hours or days) they can become invalid. This is because the load on the path plays an important role and the path itself can change; on occasion the path between two hosts can change quite often. Iperf is a TCP-type tool, and all TCP applications depend not only on the bandwidth between source and destination but also on the load on the reverse path. For example, setting a large window size to achieve good performance might be counter-productive on heavily loaded lines. The Iperf results with well and poorly set parameters can differ by several hundred percent.
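The window “tuning” referred to above is essentially the bandwidth-delay-product rule: the aggregate TCP window must cover bandwidth times round-trip time, split across the parallel streams. A sketch of that arithmetic (the function name and example values are our own illustrative choices):

    # Sketch of the bandwidth-delay-product rule behind Iperf window
    # tuning. Example values are hypothetical.
    def window_per_stream_bytes(bandwidth_bps, rtt_s, n_streams):
        bdp_bytes = bandwidth_bps * rtt_s / 8  # bandwidth-delay product
        return bdp_bytes / n_streams

    # A 622 Mbit/s OC12 path with a 70 ms RTT, driven by 4 parallel
    # streams, needs roughly 1.4 MB of window per stream.
    print(window_per_stream_bytes(622e6, 0.070, 4) / 1e6, "MB")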

            Instead of making our own Iperf measurements, we compared the ABwE results with the Iperf results from the IEPM-BW project. The source (Iperf client) uses TCP to send as many packets as it can in a fixed amount of time. The aggregated traffic from all the parallel streams, achieved during a set number of seconds, represents the “achievable bandwidth” for TCP applications. If the path is relatively empty, the aggregated traffic approaches the real capacity of the path and the estimate is approximately correct. However, if the path is congested on some link(s), the Iperf packets share this link with other user traffic (cross-traffic) and the Iperf result will probably be an over-estimate of the available bandwidth. This is because Iperf is typically set to achieve “the maximum of the possible”: it grabs all available bandwidth on the path by using multiple parallel streams, so other single-stream applications on congested links shared with the Iperf traffic will not get their fair share. This is especially the case if the Iperf traffic is a major component of the bottleneck link. In some cases, because Iperf is very aggressive, it can even suppress lighter applications such as interactive or web traffic, and via this suppression obtain a little more bandwidth than just the free part of the path capacity. This principle (“grabbing all the bandwidth and even a little bit more”) has been verified by several people in the past on experimental paths with one or two hops; however, how much Iperf suppresses other applications on real networks has not yet been quantified. In this way the Iperf results vary to reflect the situation on the path at a particular time. The practical results obtained by this method usually lie in the range between the “available bandwidth” and the “real capacity” of the path. A further disadvantage of the Iperf method is that during a measurement it can saturate the bottleneck link, sending hundreds of Mbytes of test data into the network. Thus it should not be used for very frequent measurements.
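For reference, an Iperf run of the kind described above is driven by the standard iperf client flags (-c host, -t seconds, -P parallel streams, -w window); here is a hedged sketch of wrapping such a run from Python (the host name and parameter values are hypothetical):

    # Sketch of an IEPM-BW-style achievable-bandwidth probe wrapping
    # the standard iperf client flags. Host and values are hypothetical.
    import subprocess

    def run_iperf(host, seconds=10, streams=4, window="1M"):
        cmd = ["iperf", "-c", host, "-t", str(seconds),
               "-P", str(streams), "-w", window]
        # The aggregate ([SUM]) line of the output is the
        # "achievable bandwidth" discussed above.
        return subprocess.run(cmd, capture_output=True, text=True).stdout

    print(run_iperf("remote.example.edu"))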

            We believe that packet dispersion techniques can report results quite comparable to Iperf measurements. The packet dispersion methods used in ABwE or PathChirp are more modest in the sense that they load the network with little extra monitoring traffic. They are also much less dependent on parameter settings, and so leave less room for ambiguity (widely differing results on the same path). However, these methods have other weak points, so in some situations their results can differ from Iperf’s or from each other’s.
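PathChirp’s probing differs from simple packet pairs: it sends a “chirp” train whose inter-packet gaps shrink exponentially, so a single train sweeps a whole range of probing rates. A minimal sketch of that spacing (the packet size and spread factor are illustrative assumptions, not the tool’s actual defaults):

    # Sketch of the exponentially spaced "chirp" probe train that
    # gives PathChirp its name. Packet size and spread factor are
    # illustrative assumptions, not the tool's actual defaults.
    PACKET_BITS = 1000 * 8

    def chirp_send_times(first_gap_s, factor, n_packets):
        times, t, gap = [0.0], 0.0, first_gap_s
        for _ in range(n_packets - 1):
            t += gap
            times.append(t)
            gap /= factor  # gaps shrink: the probing rate rises
        return times

    # Instantaneous probing rates swept by one 10-packet chirp.
    times = chirp_send_times(1e-3, 1.2, 10)
    for t0, t1 in zip(times, times[1:]):
        print("%6.2f Mbit/s" % (PACKET_BITS / (t1 - t0) / 1e6))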

            Currently, we are evaluating ABwE and PathChirp. One of our main tasks for the near future will be to demonstrate that these tools give accurate results for real network paths, especially those with high speeds (>= 100 Mbits/s). We started this type of testing at the beginning of 2003; the first stage was to find a relationship between the Iperf and ABwE results. Some results of this work are shown in the following graphs. The first results were presented at PAM2003 in the talk by Jiri Navratil (the paper “The Practical Approach to Bandwidth Monitoring” contains only the results from 2002).

 

            We illustrate the current capability of ABwE with several example graphs shown below. Each example has been selected to characterize a different situation that we encounter. ABwE always reports 3 values: DBC – the Dominating Bottleneck Capacity (i.e. the capacity of the link that currently dominates the limitation of the available bandwidth); CT – the cross-traffic; and AB – the available bandwidth. The graphs show the situation on different paths over 24 hours, with measurements made at 2.5 minute intervals (one monitoring cycle for all 22 remote nodes/paths takes 2.5 minutes). In all the examples we compare the ABwE data (AB) with the throughput results obtained by Iperf (the bars), as described in the previous paragraphs.
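The three reported values are related in the obvious way: the available bandwidth is what remains of the dominating bottleneck capacity after cross-traffic is subtracted. A small sketch of that bookkeeping (the class and field names are hypothetical, not ABwE’s actual output format):

    # Sketch of the relation among the three values ABwE reports.
    # Class and field names are hypothetical.
    class AbweSample:
        def __init__(self, dbc_mbps, ct_mbps):
            self.dbc_mbps = dbc_mbps  # Dominating Bottleneck Capacity
            self.ct_mbps = ct_mbps    # Cross-Traffic on that link

        def ab_mbps(self):
            """Available Bandwidth, floored at zero when saturated."""
            return max(self.dbc_mbps - self.ct_mbps, 0.0)

    # A 100 Mbit/s narrow link carrying 35 Mbit/s of cross-traffic
    # leaves about 65 Mbit/s available.
    print(AbweSample(100.0, 35.0).ab_mbps())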

            For a better understanding of the graphs (especially the high peaks) we must point out that the real paths we measure are very dynamic. At one moment the path is relatively empty; in the next moment it may be heavily congested. ABwE reflects these changes quite well, as demonstrated in Fig. 2. There is a Narrow Link (the capacity-limiting link) in the path, and it determines the upper limit of the path (in Fig. 2 the Narrow Link is 100 Mbits/s). Very often the Narrow Link is also the DBC. However, since most paths consist of many links with different capacities, it is possible for the DBC to move to another, faster link that carries much more cross-traffic and has less available bandwidth. In such a case the packets being transmitted by the router/switch node are delivered at the full speed of this link and are thus compressed, so we see peaks of cross-traffic much higher than the Narrow Link capacity. This doesn’t mean that the original Narrow Link has disappeared; rather, another source of bottleneck dominated at that particular moment.

            The other type of peaks, the negative peaks or valleys (drops in DBC) seen for example in Figs. 4 and 5, represents another situation on the path: a fully loaded path (utilization close to 1). Such valleys are caused by extremely intensive and “aggressive” traffic loads, usually traffic directed to the same destination host that ABwE is monitoring. One source of such a situation is illustrated in Fig. 4 and is caused by high local networking activity (e.g. NFS or AFS activity) on the remote host being monitored. A second source of such valleys is the IEPM-BW Iperf measurements themselves. This is illustrated in Fig. 5, where the valleys of DBC are in good agreement with the beginnings of the Iperf bars. If there is an overlap between the 0.5 second ABwE measurement and the 10 second Iperf measurement, such a valley appears.
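The timing condition behind the Fig. 5 valleys reduces to a simple interval-overlap test; a minimal sketch, with hypothetical timestamps:

    # Sketch of the timing condition behind the Fig. 5 valleys: a
    # valley appears when the 0.5 s ABwE probe falls inside a 10 s
    # Iperf run. Timestamps are hypothetical.
    ABWE_PROBE_S = 0.5
    IPERF_RUN_S = 10.0

    def probes_collide(abwe_start, iperf_start):
        """True if the two measurement intervals overlap."""
        return (abwe_start < iperf_start + IPERF_RUN_S and
                iperf_start < abwe_start + ABWE_PROBE_S)

    print(probes_collide(104.2, 100.0))  # True: valley expected
    print(probes_collide(111.0, 100.0))  # False: no overlap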

 

 

 

 

 

Figure 1:  The result of the experiment testing the narrow link in the path between SLAC and NERSC. There was very low cross-traffic (XTraffic, red line) on this path. The bottleneck capacity measured by ABwE (green line) agreed well with the narrow link capacity (100 Mbits/s). The estimate of the available bandwidth is close to the capacity. The results of the IEPM Iperf measurements are the black bars (repeated every 90 minutes). The agreement between ABwE and Iperf is within 5%.

 

 

 

Figure 2:  The results of the experiment between SLAC and the Internet2 office in Ann Arbor. The route is via Abilene and MichNet (the Michigan regional network). Low cross-traffic (red line) dominates on this path, with individual peaks of high traffic. The bottleneck capacity measured by ABwE (green line) shows the domination of the narrow link capacity (100 Mbits/s) somewhere in the path. If the cross-traffic in any part of the path increases, we see it as a value that determines the new bottleneck. The “instant” bottleneck caused by high speed devices in the path can be very different from the dominating bottleneck. The ABwE estimate remains constant at the level of 100 Mbits/s. The agreement with Iperf (black bars) is within 5%.

Figure 3:  The results of the experiment between SLAC and RICE. There is the expected amount of cross-traffic (red line), at about 30-40% of capacity, during the 24 hour period. The bottleneck capacity measured by ABwE (green line) shows an average value around 114 Mbits/s. The estimate of the available bandwidth moves between 60 and 100 Mbits/s. The agreement with the IEPM Iperf results (black bars) is very good (within 5-10%).

 

 

Figure 4:  The results of the experiment testing the ESnet high speed path between SLAC and FNAL. There is modest cross-traffic (red line) at about 40-90 Mbits/s. The bottleneck capacity measured by ABwE (green line) shows an average value of 410 Mbits/s. The estimate of the available bandwidth varies between 300 and 400 Mbits/s, with individual drops caused by randomly appearing cross-traffic. The agreement with the IEPM Iperf results (black bars) is very good.

 

 

Figure 5:  The results of the experiment testing the high speed path between SLAC and NERSC. There is cross-traffic (red line) with visibly increasing and decreasing trends over 24 hours. The bottleneck capacity measured by ABwE (green line) shows a value of 622 Mbits/s, which is the real capacity of the OC12 line between the two labs. The estimate of the available bandwidth moves according to the cross-traffic profile. Individual drops in the ABwE estimate correspond (in most cases) to the Iperf measurements made by IEPM-BW. A drop is visible when the 10 second Iperf measurement and the 0.5 second ABwE probe overlap in time. The agreement with the IEPM Iperf results (black bars) is very good, within 10%.

 

 

 

Figure 6:  The results of the experiment testing the high speed path between SLAC and CALTECH. There is high cross-traffic (red line) at about 100-300 Mbits/s, with a pattern typical of paths where the total traffic is an aggregate of the activity of many people: the cross-traffic increases during the daytime and decreases during the night. The agreement with the IEPM Iperf results (black bars) is very good most of the time. However, there are periods when Iperf gives much lower results. Compared to the previous examples, the bottleneck capacity measured by ABwE (green line) also follows a different curve, changing smoothly during the day. This is probably because the ABwE method uses relative relations among all the measured values.

 

            The ABwE lightweight bandwidth estimation toolkit has been carefully evaluated and now provides good bandwidth estimates (i.e. good agreement with Iperf and our general experience) in over 80% of cases. It can make an estimate in real time (< 1 second) with minimal impact (40 kbits). ABwE provides feedback to IEPM: a significant difference between the measurements can be an indication that the parameters (windows & streams) used by our heavier weight Iperf estimator need to be re-evaluated. A paper on ABwE was presented at the PAM03 conference.
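The feedback described above amounts to a disagreement test between the two estimators; one plausible sketch (the 30% threshold is our illustrative choice, not a project-defined value):

    # Sketch of the ABwE -> IEPM feedback: a large relative
    # disagreement between the lightweight ABwE estimate and the
    # Iperf result flags the Iperf parameters (windows & streams)
    # for re-evaluation. The 30% threshold is an illustrative choice.
    def needs_retuning(abwe_mbps, iperf_mbps, threshold=0.30):
        top = max(abwe_mbps, iperf_mbps)
        if top == 0:
            return False
        return abs(abwe_mbps - iperf_mbps) / top > threshold

    print(needs_retuning(abwe_mbps=95, iperf_mbps=40))  # True: re-tune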

 


3.  Future Accomplishments and Milestones

 

ABwE – Available Bandwidth Estimator (Tasks 2, 4, 6)

 

In the previous paragraphs we demonstrated the current capability of ABwE. It allows us to do continuous monitoring, estimating the available bandwidth on a path with an accuracy between 80% and 85%. Currently, we monitor more than 20 representative paths to different destinations. We are probing several ESnet sites, many Abilene sites, 5-7 sites in Europe (connected via Géant), three sites in Japan and one site in Canada. It is quite a good set of probes, and in about 80% of cases it gives results similar to those presented in Figs. 1-6.

            Unfortunately, for some sites we still have difficulties interpreting our results because they did not match well with other measurements. It is hard to say which methods give more accurate results in these situations; there are more factors that should be tested and verified. The problems can be split into three main categories: the devices on the path work in a different fashion than we expect (packet dispersion problems); there is a traffic policy (traffic shaping) in some devices on the path which can limit any type of transfer (including Iperf); and bottleneck node problems on high speed segments, which can generate bursts of packets.

            In the near future, we will concentrate on comparing our two methods (ABwE and PathChirp) and developing them within the framework of INCITE. We are optimistic based on the first results obtained in March.

            With their real-time capability and low impact, ABwE and PathChirp are very suitable for providing real-time feedback on anomalous changes in bandwidth performance. We will also work on prediction algorithms using ABwE as a source of information. Due to increasing interest from the networking community in testing the ABwE methods, we are going to prepare a standalone version for common use. We will also work on publishing our monitoring results via tools used in other systems (such as MonALISA, used in some Grid projects, etc.).
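One simple form such real-time feedback could take is to flag samples that fall far below a running baseline; a hedged sketch (the window length and 50% drop threshold are illustrative assumptions, not an INCITE algorithm):

    # Sketch of real-time anomaly feedback on an ABwE time series:
    # flag a sample that drops far below the recent moving average.
    # Window length and drop threshold are illustrative assumptions.
    from collections import deque

    class BandwidthWatch:
        def __init__(self, window=20, drop=0.5):
            self.history = deque(maxlen=window)
            self.drop = drop

        def update(self, ab_mbps):
            """Return True if this sample looks anomalous."""
            baseline = (sum(self.history) / len(self.history)
                        if self.history else ab_mbps)
            self.history.append(ab_mbps)
            return ab_mbps < (1.0 - self.drop) * baseline

    watch = BandwidthWatch()
    for sample in [90, 92, 88, 91, 30]:  # Mbit/s; the last is anomalous
        print(sample, watch.update(sample))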

 

 

 

PathChirp – Chirp Probing Tool (Tasks 2, 4, 6)

 

We will continue to discuss the possibility of running chirping from many locations (main destination points such as the DOE Labs, CERN, IN2P3, RAL and INFN, and also sites located on the “paths” to these points), as with PingER or ABwE, and permanently monitoring selected links in order to present the spectra of cross-traffic on these links on our web pages. The current version of PathChirp reports results into a file on the remote site. In a version useful for wide area monitoring we will need to send this report on-line (as specified by the user with the -w option) to the sender site. Once PathChirp has this feature, we can continue to discuss how to present results from the chirp probing tool on the SLAC and INCITE web pages, and other general uses of this tool.
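A minimal sketch of what on-line reporting to the sender site could look like (this only illustrates the idea; it is not PathChirp’s actual -w implementation, and the host, port and record format are hypothetical):

    # Sketch of forwarding a per-measurement result record from the
    # remote site back to the sender site over TCP. Not PathChirp's
    # actual "-w" code; host, port and record format are hypothetical.
    import json
    import socket

    def send_report(host, port, record):
        with socket.create_connection((host, port)) as sock:
            sock.sendall((json.dumps(record) + "\n").encode())

    send_report("sender.example.edu", 9000,
                {"path": "slac-caltech", "ab_mbps": 310.5,
                 "timestamp": 1051500000})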

            The packet dispersion tools work on a different time scale from most other standard methods: milliseconds rather than the hours typical elsewhere. They can therefore give much more information. Their greatest advantage and strongest feature will likely be “on-line mapping of cross-traffic” on a chosen path at a very fine time scale. Other tools are unable to do this because they would have to generate a heavy load on the line to make such a measurement. This feature allows us to see the details of the traffic on the line.

 

 

 

Tomo/Topo Tools (Tasks 3, 4, 6)

 

We will conduct validation tests of the new “sandwich probe” developed at RICE.  We will also continue programming new graphical representations (e.g. zoom, subnet graphs, single route displays) to present the complicated tree structures we obtain from the topo/tomo measurements. This will include extending the node information available via drill-down, adding new metrics (loss, CT) to how we display the hops, and providing an archive of measurements with the ability to search historically. We will continue to deploy the traceroute measurement tools to more SciDAC and Grid sites.

 

New Theory and Tools (Tasks 1, 2, 3)

 

We will continue the development and implementation of the packet probing and tomo/topo toolsets.

SLAC

·        Regression test the new versions of the ABwE and chirp probing and topology/tomography tools; validate results, compare with other tools, and find the regions of applicability.

·        Set up regular measurements with chosen tools, archive data, provide historical navigation of the archive.

·        Provide public access to the archive via standard mechanisms.

·        Modify an application to make a proof of concept of a network-aware application using the Rice measurements.

 

 

4.  Project Management

 

Interactions

We are planning a visit by Dr. J. Navratil to Rice University in September. He will present a lecture about wide area monitoring, the principles of the ABwE tool, and results.

 



[1] Available bandwidth “is the maximum IP-layer throughput that the path can provide to a flow, given the path’s current cross-traffic load”: C. Dovrolis, P. Ramanathan and D. Moore, “What do packet dispersion techniques measure?”, Proc. IEEE INFOCOM, 2001.