Report on IEPM efforts for PPDG for the quarter October
- June 2004
Report prepared by Les Cottrell, June 23, 2004
Collaboration with IEPM, Network Performance Monitoring
Web/Grid Services
The web services access to the IEPM-BW single- and multi-stream TCP throughput
(from iperf), and to bandwidth capacity and utilization (from ABwE), has been
upgraded to use the latest NMWG response (May 4 '04) and request (Mar 25 '04)
schema. We continue to support the older schema for MonALISA. We have added
access to the PingER Round Trip Times (RTT) for the most recent month of
measurements from SLAC. To facilitate this we upload the PingER data into the
Oracle database. We have also added web services support to access the IEPM-BW
traceroute hop information. The documentation for the web services has been
updated. We updated the web service client examples that provide interactive
access to the IEPM-BW data. We have also added a client example for the PingER
data. A presentation on the status of the SLAC web services was made at GGF10
in Berlin.
We co-authored updates for and submitted the GGF NMWG recommendation entitled
"A Hierarchy of Network Performance Characteristics for Grid Applications and
Services".
Bandwidth/Throughput measurement (IEPM-BW)
We extended our traceroute analysis and visualization tool to add: drill down to
view the bandwidth anomaly detection plots; detection of multiport end hosts,
hop stuttering, and 30-hop timeouts; AS caching and timeouts; and annotations.
We wrote and published "Correlating Internet Performance Changes and Route
Changes to Assist in Trouble-shooting from an End-user Perspective". We
presented this paper at the Passive and Active Measurement Workshop, Antibes,
Juan-les-Pins, France, April 19-20, 2004.
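As an illustration of the kind of comparison involved in correlating route
changes with performance changes, the following is a minimal sketch that
compares two successive traceroutes hop by hop. It is not the code of the
IEPM-BW tool; the function name `changed_hops`, the use of "*" for an
unresponsive hop, and the example addresses are all assumptions made for this
example.

```python
def changed_hops(previous, current, unresponsive="*"):
    """Return the 1-based hop numbers at which two traceroutes differ.

    Hypothetical helper, not part of IEPM-BW. Each route is a list of hop
    addresses; a hop that did not answer is recorded as `unresponsive` and
    is skipped, since it gives no evidence of a change either way.
    """
    changes = []
    for hop_number in range(1, max(len(previous), len(current)) + 1):
        # A route that ends early has no address at this hop (None).
        before = previous[hop_number - 1] if hop_number <= len(previous) else None
        after = current[hop_number - 1] if hop_number <= len(current) else None
        if unresponsive in (before, after):
            continue
        if before != after:
            changes.append(hop_number)
    return changes


# Example with documentation-range addresses (illustrative only).
earlier = ["10.0.0.1", "192.0.2.1", "198.51.100.7", "203.0.113.9"]
later = ["10.0.0.1", "*", "198.51.100.20", "203.0.113.9"]
print(changed_hops(earlier, later))
```

A non-empty result at a time that coincides with a throughput change would
flag that measurement pair for closer inspection.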
We added an IEPM-BW monitoring host at NIIT in Pakistan.
Lightweight Bandwidth Estimation
We extended ABwE to measure the RTT, and redesigned and reimplemented the
internal protocol so that it works better with typical security constraints.
It is now deployed on more than 100 nodes (over 70 on PlanetLab and 40 on
IEPM-BW).
Bandwidth performance anomalous events
To address the problem of having too many network performance plots to review
manually for important anomalies, we designed and implemented an enhanced
version of the NLANR "plateau" algorithm. We created a library of interesting
anomalous events from our IEPM-BW monitoring data. We evaluated the performance
of the modified algorithm in terms of events captured versus false positives.
We wrote a paper describing this work. It was accepted for publication at the
SIGCOMM Network Trouble Shooting conference in Portland OR in September 2004.
Our next steps will be to integrate this into IEPM-BW and to improve the
algorithm.
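To give a feel for this style of detection, the following is a simplified
sketch of a plateau-like detector for sustained drops in throughput. It is
not the NLANR algorithm as published, nor our enhanced version. The function
name `detect_drops` and the buffer lengths and sensitivity are illustrative
assumptions. The general idea of a history buffer of normal samples plus a
buffer of consecutive outliers is what it aims to show.

```python
from collections import deque
from statistics import mean, pstdev


def detect_drops(samples, history_len=20, trigger_len=5, sensitivity=2.0):
    """Return the start indices of sustained drops in a throughput series.

    Simplified, hypothetical scheme (parameter defaults are assumptions):
      * the history buffer holds recent "normal" samples;
      * a sample below mean - sensitivity * stddev of the history is a trigger;
      * trigger_len consecutive triggers declare an event, after which the
        triggers are adopted as part of the new baseline;
      * an isolated run of fewer triggers is discarded as an outlier.
    """
    history = deque(maxlen=history_len)
    triggers = []
    events = []
    for i, value in enumerate(samples):
        if len(history) < history_len:
            history.append(value)  # still learning the baseline
            continue
        threshold = mean(history) - sensitivity * pstdev(history)
        if value < threshold:
            triggers.append(value)
            if len(triggers) == trigger_len:
                events.append(i - trigger_len + 1)  # index of first trigger
                history.extend(triggers)  # the new level becomes the baseline
                triggers.clear()
        else:
            triggers.clear()  # run was too short: treat it as an outlier
            history.append(value)
    return events


# Synthetic series: a stable baseline, one isolated dip, then a sustained drop.
series = [98, 102] * 10 + [50] + [100] + [50] * 8
print(detect_drops(series))
```

Raising `trigger_len` or `sensitivity` trades missed events for fewer false
positives, which is the balance the evaluation above measures.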
PingER
We extended the PingER monitoring to include monitoring sites at NIIT in
Rawalpindi, Pakistan, and in Rio de Janeiro and Sao Paulo, Brazil. We extended
the time series plots to add TCP throughput and also to better enable access to
the data (see for example
http://pinger.fnal.gov/cgi-bin/graph_pings.pl?src_regexp=slac.stanford.edu&dest_regexp=slac.stanford.edu).
Much work went into updating remote hosts that were no longer pingable, in
particular recovering all hosts in Australia (blocked at WU, Seattle), Africa
and Alaska. We also added monitoring hosts in Pakistan and Brazil and ensured
hosts were correctly entered into the database.
We attended the Internet2 "Extending the Reach of Advanced Networking -
International Workshop" in Arlington VA on 22 April 2004, and gave 2 talks.
We prepared a report on
Internet performance to Bangladesh for Professor H. Cerdeira of ICTP to
present on her trip to Bangladesh.
Outreach
We worked with DESY to understand the throughput performance for various HENP
applications, and with NASA to assist GLAST in achieving adequate throughput to
GSFC. We worked with BaBar and IN2P3 to try new TCP stacks with real BaBar data
transfers. To aid in this we set up a host at SLAC for outside users to use for
high speed transport testing.
We worked with the MIT Haystack project to assist them in achieving higher
throughput performance.
Advanced TCP Stack and UDP Evaluation
We worked with the developers of LTCP, H-TCP, UDT, altAIMD and FAST to get
the latest kernels and code and ensure they were ported to an acceptable (from
a security viewpoint) version of Linux.
Proposals and Representation
We submitted proposals to the DoE Office of Science:
- Measurement and Analysis for the Global Grid and Internet End-to-end
performance (MAGGIE) - with LBNL, Internet2, PSC, and U Delaware
- INCITE Ultra – New Protocols, Tools, Security, and Testbeds for Ultra
High-Speed Networking - with Rice and LANL
- TeraPaths: A QoS Enabled Collaborative Data Sharing Infrastructure
for Peta-scale Computing Research - with BNL, U Michigan, Stony Brook
University and the University of New Mexico
- Automatic and Continuous Buffer Tuning for Data Transfers in High-Energy
Physics - with LANL
We also submitted proposals on:
- Understanding Effective Connectivity to and within Africa (AfricaPingER)
together with eJDS/ICTP, NITDA and FOSSFA - submitted to
IDRC/Canada
- Gateway to Science and Development: an Innovative Integrated Approach
to the Enhancement of Science in Developing Countries with applications to
Disaster Prevention and Evaluation of Natural Resources together
with ICTP, Aidworld Humanitarian ICT, CONAE, National Academy of Sciences,
Kharkov, Ukraine, STAC Vietnam and the VUB, Belgium - submitted to the EU
INCO
- Middleware for Optimized Network-Aware Data Dissemination (ONADD)
with GATech - submitted to NSF
We successfully set up a formal collaboration on network monitoring with the
NIIT in Rawalpindi, Islamabad. A formal MOU was signed, and we are now actively
working together with regular fortnightly phone meetings.
We attended:
IPv6
We continue to maintain the PingER6 monitoring infrastructure, which currently
has about 30 remote hosts. ABwE has been ported to IPv6 and is now running
successfully at about 5 sites (CERN, CESnet, SOX/Atlanta, GATech, SLAC).