Report on IEPM PPDG Efforts for the Quarter October - December 2005
Report by Les Cottrell for the
IEPM team,
SLAC
Bandwidth/Throughput Monitoring
Since our last update, we incorporated several features to
improve the manageability of the toolkit. These include:
functionality for entering comments about the target
and monitoring host nodes; methods to synchronize what is
being monitored from each of the monitoring hosts; and
a mechanism for archiving the "current" state of IEPM-BW on
monitoring hosts to simplify reinstallation of an existing
system in case of problems.
We reintroduced Pathload measurements into the IEPM-BW
tests and also updated our versions of probes such as
Iperf and Pathchirp.
Several new types of scatterplots have been added to
plot the results of the probes against each other.
Work was done on the plateau anomaly detection code.
For example, to prevent multiple alerts, once an alert is
generated (after a 6 hour sustained drop), the bandwidth
must recover for at least 3 hours for another alert to
be generated. The Holt-Winters
anomaly detection algorithm was incorporated to run in parallel
with the plateau algorithm so that we can compare the two.
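The alert suppression rule described above can be sketched as a small state machine. This is only an illustration, not the IEPM-BW code: the function name, the fixed baseline, and the 50% drop threshold are assumptions made for the example; only the 6 hour and 3 hour durations come from the text.

```python
def plateau_alerts(samples, baseline, drop_fraction=0.5,
                   drop_hours=6.0, recover_hours=3.0):
    """Return the times (in hours) at which an alert would be generated.

    samples: list of (time_in_hours, bandwidth) pairs, sorted by time.
    baseline and drop_fraction are hypothetical inputs; the report does
    not say how the plateau code decides that bandwidth has dropped.
    """
    threshold = baseline * (1.0 - drop_fraction)
    alerts = []
    armed = True        # True when a new alert is allowed
    drop_start = None   # start time of the current run of low samples
    ok_start = None     # start time of the current run of recovered samples
    for t, bw in samples:
        if bw < threshold:
            ok_start = None
            if drop_start is None:
                drop_start = t
            # Alert only after the drop has been sustained long enough.
            if armed and t - drop_start >= drop_hours:
                alerts.append(t)
                armed = False
        else:
            drop_start = None
            if ok_start is None:
                ok_start = t
            # Re-arm only after a sufficiently long recovery.
            if not armed and t - ok_start >= recover_hours:
                armed = True
    return alerts
```

With hourly samples, a recovery of only two hours between two long drops yields a single alert, while a recovery of three hours or more allows the second drop to raise its own alert.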
Analysis of the ping data is now done on a regular basis to
look for packet loss and packet reordering. Time series graphs
are generated from this analysis showing the packet loss and
reordering so that they can be compared side by side with the
probe timeseries graphs.
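As a rough illustration of this kind of analysis, the sketch below derives loss and reordering fractions from the sequence numbers of the ping replies in arrival order. The function name and the reordering definition used (a reply counts as reordered if its sequence number is lower than one already seen) are assumptions for the example; the report does not describe how the actual analysis defines reordering.

```python
def loss_and_reordering(sent, received_seqs):
    """Return (loss_fraction, reordered_fraction) for one ping run.

    sent: number of echo requests sent.
    received_seqs: sequence numbers of the replies, in arrival order.
    Duplicate replies are ignored.
    """
    seen = set()
    highest = -1
    reordered = 0
    for seq in received_seqs:
        if seq in seen:
            continue  # duplicate reply, skip it
        seen.add(seq)
        if seq < highest:
            reordered += 1   # arrived after a later-numbered reply
        else:
            highest = seq
    received = len(seen)
    loss = 1.0 - received / sent if sent else 0.0
    reorder = reordered / received if received else 0.0
    return loss, reorder
```

Computing one such pair per measurement interval gives the two time series that can be plotted next to the probe graphs.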
Functionality has been added to save the alerts in a
database table so that a report of all alerts for a given period
can be generated. Currently all 2005 data is being run
through the new plateau algorithm to check that it is
working and for alert archival purposes.
Passive Monitoring
Given the promise of using passive monitoring via Netflow to
measure the performance of bulk-data applications over the
network, we proposed a technique for using this data for
forecasting. We look at the modes of the application
performance distributions, devise methods for choosing
the most appropriate mode (e.g. eliminate anomalous peaks, choose the
most frequent mode for a particular time of day, day of week etc.)
and provide the parameters of this mode for forecast estimation.
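A minimal sketch of the mode selection idea follows, assuming throughput samples have already been extracted from the Netflow records and tagged with the hour of day. The histogram bin width, the minimum-fraction cut used to discard anomalous peaks, and all function names are hypothetical choices for illustration, not the technique as actually proposed.

```python
from collections import Counter
from statistics import mean, pstdev


def dominant_mode(throughputs, bin_width=10.0, min_fraction=0.2):
    """Pick the most frequent histogram peak and return its parameters."""
    if not throughputs:
        return None
    bins = Counter(int(x // bin_width) for x in throughputs)
    total = len(throughputs)
    # A candidate mode is a local maximum of the histogram that holds
    # at least min_fraction of the samples (sparse peaks are dropped).
    candidates = [
        b for b, n in bins.items()
        if n >= bins.get(b - 1, 0) and n >= bins.get(b + 1, 0)
        and n / total >= min_fraction
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda b: bins[b])
    members = [x for x in throughputs if int(x // bin_width) == best]
    return {
        "center": mean(members),
        "spread": pstdev(members),
        "fraction": len(members) / total,
    }


def forecast_for_hour(samples, hour, **kwargs):
    """samples: list of (hour_of_day, throughput) pairs."""
    return dominant_mode([v for h, v in samples if h == hour], **kwargs)
```

The returned center and spread stand in for "the parameters of this mode"; the same filtering could be done by day of week instead of hour.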
PingER and Developing Region Monitoring
Through Internet2 we have made contact with people in Palestine and have
an agreement to install PingER on a couple of hosts there.
With a student from NIIT/Pakistan we are working on developing a web site to
enable locating specified hosts by triangulating on ping RTTs from
landmark sites. As part of this we wrote a new version of the
reverse traceroute server script to also enable pings. This has been
successfully installed at about 5 landmark sites.
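To illustrate the idea of locating a host from landmark RTTs, here is a simplified sketch. It assumes that a minimum RTT can be converted to distance at about 100 km per millisecond of round trip (roughly two thirds of the speed of light in fibre, halved for the one-way path) and then does a coarse grid search for the point that best fits all landmarks. The conversion factor, the grid search, and the function names are assumptions for the example and not the method used by the web site; real RTTs include routing and queueing delay, so they overestimate distance.

```python
import math

EARTH_RADIUS_KM = 6371.0
KM_PER_MS_RTT = 100.0  # assumed: ~200 km/ms one way in fibre, halved for RTT


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def locate(landmarks, step_deg=1.0):
    """Estimate (lat, lon) of a target host.

    landmarks: list of (lat, lon, min_rtt_ms) measured from each landmark.
    Returns the grid point minimising the squared mismatch between the
    great-circle distance and the RTT-derived distance.
    """
    best, best_err = None, float("inf")
    lat = -90.0
    while lat <= 90.0:
        lon = -180.0
        while lon < 180.0:
            err = sum(
                (great_circle_km(lat, lon, la, lo) - rtt * KM_PER_MS_RTT) ** 2
                for la, lo, rtt in landmarks
            )
            if err < best_err:
                best, best_err = (lat, lon), err
            lon += step_deg
        lat += step_deg
    return best
```

At least three landmarks that do not lie on a common great circle are needed for the fit to have a single best point; more landmarks help average out inflated RTTs.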
One of our future goals is to integrate different monitoring infrastructures
into a Federation. As a first step in this,
we are working on a front end to the AMP/NLANR ping data measurements
so they can be accessed, analyzed and displayed by the PingER project.
We are also working to integrate our traceroute analysis program
with AMP. We are also working with MonALISA to make IEPM-BW data available
via MonALISA.
In collaboration with NIIT/Pakistan, we installed two new
PingER monitoring sites at NTC/PERN in Pakistan. This
should enable us to better evaluate Internet performance within
Pakistan.
Following the move of BINP from the dedicated 512kbits/s
Novosibirsk to KEK link to GLORIAD, we measured and analyzed
the performance and found that it has gone up by a factor of 10.
However, it is not consistent and has large diurnal variations which need
to be understood.
Testbeds
With Caltech, Manchester, FNAL, CERN and others, we once again
prepared for and entered the SC2005 (in Seattle)
BandWidth Challenge (BWC). We put together a
web site to publicize our efforts. Equipment loans were secured from
Sun, Cisco, Boston Computers, QLogic, Neterion, and Chelsio.
We arranged for seven 10 Gbits/s waves to the SLAC/FNAL booth
(2 from SLAC, 4 from FNAL and one from the UK). At SLAC we installed an
xrootd cluster of ten Sun v20z dual 1.8GHz Opterons, plus 4 file servers. At
SC2005 we installed eight file servers from Boston Computers,
a cluster of ten Sun v20z with dual 2.4GHz Opterons, and a
40Gbits/s fibre channel connection to 20 TBytes in the
StorCloud booth at SC2005.
Our team won the BWC for the third year in succession, this year achieving
over 150Gbits/s, and we put out
several press releases.
Following the success of using
xrootd
in the bandwidth challenge we worked with the
developers to evaluate its performance with 10Gbits/s Network Interfaces
from Chelsio and Neterion.
We have made contact with Microsoft and have put together an MOU to evaluate
a new TCP stack for Windows Vista, on real networks.
Admin, visits, papers, presentations, proposals etc.
We hosted a visit by the Rector of the National University of Sciences and
Technology and the Dean of NUST's Institute of Information Technology.
They met with the vice president of Stanford, the director of SLAC, the Dean
of Stanford Hospital and many others.
Yee Ting Li, a postdoc from UCL England, joined the IEPM team to work on
the Terapaths project.
Publications:
- Dynamically Forecasting Network Performance of Bulk Data Transfer Applications using Passive Network Measurements
accepted by CHEP06.
- Quantifying the Digital Divide: A scientific overview of the connectivity of South Asian and African Countries
accepted by CHEP06.
- Bringing the Internet to China
Essay by Les Cottrell in Symmetry magazine, volume 02, issue 09, November 05.
- Characterization and Evaluation of TCP and UDP-based Transport on Real Networks
R. Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard
Hughes-Jones, Michael Chen, Larry McIntosh and Frank Leers,
to be published in NOMS 2006.
We made the following presentations: