Project Title: INCITE: Edge-based Traffic Processing and Inference for High-Performance Networks
2. SLAC - Current Accomplishments (through April 2003)
ABwE – Available Bandwidth Estimator (Tasks 2, 4)
We have installed, configured and
tested a new method for available bandwidth[1]
estimation (ABwE). During practical measurements, which started
in November 2002, we have collected and evaluated hundreds of monitoring
results over a wide range of paths and studied
the behavior of this tool in real network conditions. In the past few months we
have started to compare these results to the results of other tools used for
network bandwidth monitoring, particularly to Iperf
measurements.
The Iperf measurement is very popular and looks relatively simple. It is widely
used by many ISPs and network specialists for different types of network
measurement, including available bandwidth.
Unfortunately, the Iperf results can be misleading
because, for paths that need a large congestion window,
they depend strongly on parameters such as the number
of parallel streams and TCP’s maximum window
size. Setting the optimal parameters is quite a delicate process.
Such “tuning” often uses quite sophisticated methods. Some of these methods
have been developed at SLAC and some are built into other
network measurement tools, for example the nettest-2 tool
developed at LBNL. During
our analysis we discovered that even if the Iperf parameters
have been set up by these methods or by an expert, the optimal parameters do
not stay constant; after some
time (several hours or days) they can become
invalid. This is because the load on the path plays an important role and the
path itself can change. On occasion the path between two hosts can change
quite often. Iperf is a TCP-based tool, and all TCP applications
depend not only on the bandwidth between source and destination but also on
the load in the reverse path. For example, setting a large window size to
achieve good performance might be counter-productive on heavily loaded lines. The
Iperf results between well and poorly set parameters could differ by several
hundred percent.
Instead of using our own Iperf measurements, we compared
the ABwE results with the Iperf results from the IEPM-BW project. The source
(Iperf client) uses TCP to send as many
packets as it can in a fixed amount of time. The
aggregated traffic, from all the parallel streams, achieved during a set number of
seconds represents the “achievable bandwidth” for TCP
applications. If the path is
relatively empty, the aggregated traffic approaches
the real capacity of the path and the estimate is approximately correct.
However, if the path is congested
on some link(s), the Iperf
packets share this link with
other user traffic (cross-traffic)
and the Iperf result will probably be an over-estimate of the available
bandwidth.
This is because the capability to send packets to the line
(the "strength") of all applications is not the same.
Iperf is typically configured to achieve
"the maximum of the possible", so it will
grab all available bandwidth on the path by using multiple parallel
streams, and other single stream applications on
congested links shared
with the Iperf traffic will not
get their fair share. This is especially the case
if the Iperf traffic is
a major component of the traffic on the bottleneck link. In some cases, because
Iperf is very aggressive, it can even suppress some light applications
such as interactive traffic or web traffic and, via this suppression, obtain a
little more bandwidth than only the free part of the path capacity. This principle
(“grabbing all bandwidth and even a little bit more”) has been verified by
several people in the past on experimental paths with one or two hops.
However, how much Iperf suppresses the other applications on real networks has
not yet been quantified. The Iperf results can vary in this way to
reflect the situation on the path at a particular time. The practical results
obtained by this method usually fall in the range between the
“available bandwidth” and the “real capacity” of the path. A further
disadvantage of the Iperf
method is that
during a measurement it can saturate the bottleneck link, sending
hundreds of Mbytes of test data to the network during the testing period.
Thus it should not be used for very frequent
measurements.
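The difference in load between the two approaches can be made concrete with a little arithmetic. The sketch below is illustrative only: the 400 Mbits/s throughput is an assumed value, while the 10 second Iperf duration and the 40 kbit ABwE probe size are the figures quoted elsewhere in this report. The function name is hypothetical.

```python
def iperf_volume_bytes(throughput_mbps, duration_s):
    """Bytes injected by a TCP test that sustains the given throughput."""
    return throughput_mbps * 1e6 * duration_s / 8

# Assumed example: an Iperf run achieving 400 Mbits/s for 10 seconds
iperf_bytes = iperf_volume_bytes(400, 10)

# One ABwE estimate, using the 40 kbit figure quoted in this report
abwe_bytes = 40e3 / 8

ratio = iperf_bytes / abwe_bytes
print(f"Iperf: {iperf_bytes / 1e6:.0f} Mbytes, ABwE: {abwe_bytes:.0f} bytes")
print(f"Iperf sends about {ratio:.0f} times more data per measurement")
```

Under these assumptions a single Iperf run carries several orders of magnitude more test data than one ABwE estimate, which is why the two cannot be scheduled at the same frequency.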
We believe that packet dispersion techniques can report results quite comparable
to Iperf measurements. The packet dispersion methods used in ABwE or pathchirp
are more modest in the sense that they load the network with little extra
monitoring traffic. The ABwE or pathchirp
methods are also much less dependent on the setting of parameters and so
leave less room for ambiguity (getting very different results) on one path.
However, these methods have other weak points, so the results could
differ in some situations (compared to Iperf
or between themselves).
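The basic idea behind packet dispersion can be sketched with a generic packet-pair calculation: two packets sent back to back are spread out by the slowest link, and the spacing at the receiver implies a rate. This is only an illustration of the general technique, not the actual ABwE or pathchirp algorithm; the function name, probe size and dispersion value are assumptions chosen for the example.

```python
def packet_pair_rate_mbps(packet_size_bytes, dispersion_s):
    """Rate (Mbits/s) implied by the spacing of two back-to-back packets."""
    if dispersion_s <= 0:
        raise ValueError("dispersion must be positive")
    return packet_size_bytes * 8 / dispersion_s / 1e6

# Hypothetical example: 1500-byte probes arriving 120 microseconds apart
rate = packet_pair_rate_mbps(1500, 120e-6)
print(f"implied rate: {rate:.1f} Mbits/s")
```

Because only a few small packets are needed per estimate, the monitoring traffic stays small, but the result is sensitive to anything on the path that alters packet spacing.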
Currently,
we are in the phase of evaluating ABwE and pathchirp.
One of our main tasks for the near future will be to show that these tools
give good and accurate results for real
network paths,
especially those with high speeds (>=
100 Mbits/s). We started this type of testing at the beginning of
2003 and the first stage was to find a relationship between the Iperf and
the ABwE results. Some results of this work are shown in the following graphs. The first
results were presented at PAM2003 in the talk by Jiri
Navratil (the paper “The Practical Approach to
Bandwidth Monitoring” contains only the results from 2002).
We illustrate the current
capability of ABwE with several example graphs shown
below. Each of the examples has been selected to
characterize different situations that we encounter. ABwE
always reports 3 values: DBC, the Dominating
Bottleneck Capacity (i.e. the capacity of the link that is
currently the dominant limitation of the available bandwidth); CT, the cross-traffic; and
AB, the available bandwidth. The graphs show
the situation on different paths during 24 hours with
measurements made at 2.5 minute intervals (one
monitoring cycle for all 22 remote nodes/paths takes 2.5
minutes). In all
examples we compare the ABwE data
(AB) with the throughput results obtained by Iperf (the bars)
as described in the previous paragraphs.
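One simple way to read the three reported values together is sketched below. It assumes the relation AB = DBC - CT and defines utilization as CT / DBC; both are assumptions made for illustration, since this report does not spell out how ABwE derives AB. The function names and the sample values are hypothetical. The number of samples per path per day follows directly from the 2.5 minute cycle.

```python
def available_bandwidth(dbc_mbps, ct_mbps):
    """AB under the assumed relation AB = DBC - CT, floored at zero."""
    return max(dbc_mbps - ct_mbps, 0.0)

def utilization(dbc_mbps, ct_mbps):
    """Fraction of the dominating bottleneck used by cross-traffic."""
    if dbc_mbps <= 0:
        raise ValueError("DBC must be positive")
    return ct_mbps / dbc_mbps

# One estimate per path every 2.5 minutes over 24 hours
samples_per_day = int(24 * 60 / 2.5)

# Hypothetical sample: DBC = 100 Mbits/s, CT = 30 Mbits/s
ab = available_bandwidth(100.0, 30.0)
u = utilization(100.0, 30.0)
print(samples_per_day, ab, u)
```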
For a better
understanding of the graphs (especially
the high peaks) we must point out that the real
paths that we measured are very
dynamic. Thus at one
moment the path is relatively empty, and in
the next moment the path may be heavily congested. ABwE can
reflect these changes quite well as demonstrated
in Fig. 2. There
is a Narrow Link (the
capacity limiting link) in the
path, and it determines the upper
limit of the path (in Fig. 2 the Narrow Link is 100
Mbits/s). Very often the Narrow Link is also the DBC. However, since most paths
consist of many links with different capacities, it is possible
for the DBC to move to another faster link that has much more
cross-traffic and less available bandwidth. In such a
case the packets
transmitted by the router/switch node are delivered
at the full speed of this link and are thus
compressed. As a result we see peaks
of cross-traffic which are much higher than the Narrow
Link capacity. This doesn’t
mean that the original Narrow
Link has disappeared; rather,
another bottleneck dominates at this
particular moment.
The other type of peaks, the negative peaks or valleys
(negative DBC), seen in
Fig. 4 and 5 for example, represent
another situation on the path, i.e. a fully
loaded path (utilization close to 1). Such valleys are
caused by extremely intensive and
“aggressive” traffic loads, usually
traffic directed to the same destination host that ABwE is monitoring.
One source of such a situation is illustrated in Fig. 4 and
is caused by high local networking activity (e.g. NFS
or AFS activity) on the remote host being monitored. A second source
of such valleys is the IEPM-BW Iperf
measurements themselves. This is
illustrated in Fig. 5, where
the valleys of DBC are in good
agreement with the beginning of the Iperf bars. If there is an overlap
between the 0.5 second ABwE measurement and the 10 second Iperf measurement then such a
valley appears.
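When checking whether a given valley coincides with an Iperf run, the test is a plain interval overlap between the two measurement windows. The following is a minimal sketch; the function name and the timestamps are hypothetical, and only the 0.5 second and 10 second durations come from the text above.

```python
ABWE_PROBE_S = 0.5   # duration of one ABwE measurement
IPERF_RUN_S = 10.0   # duration of one IEPM-BW Iperf measurement

def probes_overlap(abwe_start_s, iperf_start_s):
    """True if the ABwE probe window intersects the Iperf run window."""
    abwe_end = abwe_start_s + ABWE_PROBE_S
    iperf_end = iperf_start_s + IPERF_RUN_S
    return abwe_start_s < iperf_end and iperf_start_s < abwe_end

# Hypothetical timestamps (seconds since the start of the day)
print(probes_overlap(105.0, 100.0))  # probe falls inside the Iperf run
print(probes_overlap(111.0, 100.0))  # probe starts after the Iperf run ended
```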


Figure 1: The result of the experiment for testing the
narrow link in the path between SLAC and NERSC. There was very low cross-traffic (XTraffic)
on this path (red line). The Bottleneck Capacity measured by ABwE
(green line) agreed well with the narrow link
capacity (100 Mbits/s). The estimate of the available bandwidth is close
to the capacity. The results of the IEPM
Iperf measurements are shown as black bars (repeated every 90 minutes). The
agreement between ABwE and Iperf is
within 5%.


Figure 2: The results of the experiment between SLAC
and Internet2 office in


Figure 3: The results of the experiment between SLAC
and RICE. There is an expected amount
of cross-traffic (red line) at about 30-40% of capacity during the 24 hour period.
The Bottleneck Capacity measured by ABwE (green line) shows an average value
around 114 Mbits/s. The estimate of the available bandwidth varies between 60
and 100 Mbits/s. The agreement with the IEPM Iperf results (black bars) is very
good (within 5-10%).


Figure 4: The results of the experiment for testing the
ESnet high speed path between SLAC and FNAL. There is modest cross-traffic (red-line)
at about 40 – 90 Mbits/s. The Bottleneck
Capacity measured by ABwE (green line) shows an average value of 410 Mbits/s.
The estimate of Available bandwidth varies between 300 - 400 Mbits/s with
individual drops caused by randomly appearing cross traffic. The agreement with
the IEPM Iperf results (black bars) is very good.


Figure 5: The results of the experiment for testing the
high speed path between SLAC and NERSC.
There is cross-traffic (red-line) with visibly increasing and decreasing trends
over 24 hours. The Bottleneck Capacity
measured by ABwE (green line) shows a value of 622 Mbits/s, which is the real
capacity of the OC12 line between the two labs. The estimate of the available bandwidth
follows the cross-traffic profile. Individual drops in the ABwE estimate
correspond to the Iperf measurements made by IEPM-BW (in most cases). A drop
is visible when the Iperf measurement (10 seconds) and the probing
time of ABwE (0.5 seconds) coincide in
time. The agreement with the IEPM Iperf results (black bars) is very good,
within 10%.


Figure 6: The
results of the experiment for testing the high speed path between SLAC and
CALTECH. There is high cross-traffic (red-line) at about 100 – 300 Mbits/s with
a special pattern typical of paths where
the total traffic is an aggregate of the activity of many people. The
cross-traffic is increasing during the day time and decreasing during the
night. The agreement with the IEPM Iperf results (black bars) is very good for
most of the time. However, there are periods when Iperf gives much lower
results. Compared to the previous examples, the curve of the
bottleneck capacity measured by ABwE (green line) is also different: it changes
smoothly during the day. This is probably because the ABwE method uses
relative relations between all measured values.
The ABwE
lightweight bandwidth estimation toolkit has been carefully evaluated and now provides
good bandwidth estimates (i.e. good agreement with Iperf
and our general experience) in over 80% of the cases. It can make
an estimate in real time (< 1 second) with minimal impact (40 kbits). ABwE is providing feedback to IEPM. A
significant difference between the measurements can be an indication that
the parameters (windows & streams) used
for our heavier weight Iperf
estimator in IEPM need to be re-evaluated. A paper on ABwE was
presented at the PAM03 conference.
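The feedback from ABwE to IEPM described above could be implemented as a simple comparison of the two estimates. The sketch below is an assumption about how such a check might look, not the mechanism actually used in IEPM; the function names, the 20% threshold and the sample values are hypothetical.

```python
def relative_difference(abwe_mbps, iperf_mbps):
    """Difference between the two estimates, relative to the Iperf result."""
    if iperf_mbps <= 0:
        raise ValueError("Iperf throughput must be positive")
    return abs(abwe_mbps - iperf_mbps) / iperf_mbps

def needs_retuning(abwe_mbps, iperf_mbps, threshold=0.2):
    """Flag a path whose Iperf windows/streams may need re-evaluation.

    The threshold is a hypothetical value chosen for this example.
    """
    return relative_difference(abwe_mbps, iperf_mbps) > threshold

# Hypothetical readings in Mbits/s
print(needs_retuning(380.0, 400.0))  # estimates agree within 5%
print(needs_retuning(380.0, 150.0))  # Iperf far below ABwE: check parameters
```

A flag of this kind only says that the two tools disagree; deciding which one is wrong still requires looking at the path.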
3. Future Accomplishments and Milestones
In the previous paragraphs we have demonstrated the current
capability of ABwE. It allows us to do continuous monitoring and
to estimate the available bandwidth on a path with an accuracy between 80% and 85%.
Currently, we are trying to monitor more than 20 representative paths
to different destinations. We are probing several ESnet
sites, three sites
in
Unfortunately,
on some sites we still have difficulties interpreting our results because they
do not match well with other measurements. It is hard to say which methods
give more accurate results in these situations. There are still more factors
which should be tested and verified. The problems in such situations can
possibly be split into three main categories: the
devices on the path work in a different fashion than
we expect (packet dispersion problems); there
is a traffic policy (traffic shaping) in some devices on the path which can
limit any type of transfer (including Iperf); and
there are problems with bottleneck node(s) on high speed segments, which can generate
bursts of packets.
In
the near future, we will concentrate on comparing both our methods (ABwE and pathchirp)
and developing them in the framework of INCITE.
We are optimistic based on the first results obtained in March.
With
their real-time capability and low impact, ABwE and pathchirp are very suitable
for providing real-time feedback on anomalous changes in bandwidth performance.
We will also work on prediction
algorithms using ABwE as a source of information. Due to increasing interest
from the networking community in testing ABwE methods, we are going to prepare a
standalone version for common use. We will also work on publishing our monitoring
results via tools used in other systems (such as MonALISA,
used in some Grid projects, etc.).
We will continue to discuss how to run
chirping from many locations (main destination points such as DOE Labs, CERN, IN2P3,
RAL, INFN and also sites located on the “paths” to these points), like PingER or
ABwE, and permanently monitor selected links in
order to present the spectra of cross-traffic on these links on
our web pages. The current version of pathchirp
reports results into a file on the remote site. In a version useful
for wide area monitoring we will need to send this report on-line (as
specified by the user with option -w) to the sender
site. With this feature in pathchirp we can continue to discuss how to
present results from the packet chirp tool on the SLAC and INCITE web pages and
other general uses of this tool.
The packet dispersion tools work on different time scales than most other standard methods, which typically operate on the order of hours rather than msec. These tools can give much more information. Their greatest advantage and strongest feature will likely be “on-line mapping of cross-traffic” on a chosen path at a very fine time scale. Other tools are unable to do this because they would generate a heavy load on the line to make such a measurement. This feature allows us to see details of traffic on the line.
We will conduct validation tests
of the new “sandwich probe” developed at RICE.
We will also continue programming new graphical representations (e.g.
zoom, subnet graphs, single route displays) to present the complicated tree
structures we obtain from the topo/tomo measurements.
This will include extending the node information available via drill-down, adding
new metrics (loss, CT) to how we display the hops, and
providing an archive of measurements with the ability to search historically.
We will continue to deploy the traceroute measurement tools to more SciDAC
and Grid sites.
We will continue the development and implementation of the packet probing and tomo/topo toolsets.
SLAC
· Regression test the new versions of the ABwE and chirp probing and topology/tomography tools, validate results, compare with other tools, find the regions of applicability.
· Set up regular measurements with chosen tools, archive data, provide historical navigation of the archive.
· Provide public access to the archive via standard mechanisms.
· Modify an application to make a proof of concept of a network-aware application using the Rice measurements.
4. Project Management
We are planning a visit by Dr. J.Navratil to
[1] The available bandwidth “is the maximum IP-layer throughput that the path can provide to a flow, given the path’s current cross-traffic load” (Dovrolis, Ramanathan, Moore, “What do packet dispersion techniques measure?”).