Report on IEPM PPDG efforts for PPDG for the quarter October - June 2004

Report prepared by Les Cottrell, June 23, 2004

Collaboration with IEPM, Network Performance Monitoring

Web/Grid Services

The web services access to the IEPM-BW single and multi-stream TCP throughput (from iperf), and to bandwidth capacity and utilization (from ABwE), has been upgraded to use the latest NMWG response (May 4 '04) and request (Mar 25 '04) schema. We continue to support the older schema for MonALISA. We have added access to the PingER Round Trip Times (RTT) for the most recent month of measurements from SLAC. To facilitate this we upload the PingER data into the Oracle database. We have also added web services support to access the IEPM-BW traceroute hop information. The documentation for the web services has been updated. We updated the web service client examples that provide interactive access to the IEPM-BW data. We have also added a client example for the PingER data. A presentation on the status of the SLAC web services was made at GGF10 in Berlin.

We co-authored updates for, and submitted, the GGF NMWG recommendation entitled "A Hierarchy of Network Performance Characteristics for Grid Applications and Services".

Bandwidth/Throughput measurement (IEPM-BW)

We extended our traceroute analysis and visualization tool to add: drill-down to the bandwidth anomaly detection plots; detection of multiport end hosts, hop stuttering and 30-hop timeouts; AS caching and timeouts; and annotations. We wrote and published "Correlating Internet Performance Changes and Route Changes to Assist in Trouble-shooting from an End-user Perspective". We presented this paper at the Passive and Active Measurement Workshop, Antibes, Juan-les-Pins, France, April 19-20, 2004.

We added an IEPM-BW monitoring host at NIIT in Pakistan.

Lightweight Bandwidth Estimation

We extended ABwE to measure the RTT, and redesigned and reimplemented its internal protocol so that it works better with typical security constraints. It is now deployed on more than 100 nodes (over 70 on PlanetLab and 40 on IEPM-BW).

Bandwidth performance anomalous events

There are too many network performance plots to review manually for important anomalies. To address this, we designed and implemented an enhanced version of the NLANR "plateau" algorithm. We created a library of interesting anomalous events from our IEPM-BW monitoring data. We evaluated the performance of the modified algorithm in terms of events captured versus false positives. We wrote a paper describing this work. It was accepted for publication at the SIGCOMM Network Trouble Shooting conference in Portland OR in September 2004. Our next steps will be to integrate this into IEPM-BW and to improve the algorithm.
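The general idea of a plateau-style detector, and of scoring it by events captured versus false positives, can be sketched as follows. This is a simplified illustration and not the enhanced algorithm described in the paper: the function names, the parameters (history_len, trigger_len, sensitivity, tolerance) and their default values are all assumptions made for the example. A sample is treated as suspicious when it falls more than a chosen number of standard deviations below the mean of a history buffer, and an event is flagged only when enough consecutive suspicious samples accumulate.

```python
from collections import deque
from statistics import mean, pstdev


def detect_drops(samples, history_len=20, trigger_len=5, sensitivity=2.0):
    """Return the start indices of sustained drops in a throughput series.

    All parameter names and defaults are hypothetical tuning values.
    """
    history = deque(maxlen=history_len)  # recent samples taken as "normal"
    trigger = []                         # consecutive suspicious samples
    events = []
    for i, x in enumerate(samples):
        if len(history) < history_len:
            history.append(x)            # still learning the baseline
            continue
        threshold = mean(history) - sensitivity * pstdev(history)
        if x < threshold:
            trigger.append(x)
            if len(trigger) == trigger_len:
                events.append(i - trigger_len + 1)  # index where the drop began
                history.clear()
                history.extend(trigger)  # the new level becomes the baseline
                trigger = []
        else:
            trigger = []                 # isolated dips are ignored
            history.append(x)
    return events


def score(detected, labelled, tolerance=3):
    """Return (labelled events captured, detections matching no label)."""
    captured = sum(
        any(abs(d - t) <= tolerance for d in detected) for t in labelled
    )
    false_positives = sum(
        all(abs(d - t) > tolerance for t in labelled) for d in detected
    )
    return captured, false_positives
```

For a series of 20 samples near 100 followed by 10 samples at 50, detect_drops flags a single event at index 20, while a single low sample followed by a return to normal is not flagged. Requiring several consecutive suspicious samples is what trades detection delay for fewer false positives.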

PingER

We extended the PingER monitoring to include monitoring sites at NIIT in Rawalpindi, Pakistan, and in Rio de Janeiro and Sao Paulo, Brazil. We extended the time series plots to add TCP throughput and also to better enable access to the data (see for example http://pinger.fnal.gov/cgi-bin/graph_pings.pl?src_regexp=slac.stanford.edu&dest_regexp=slac.stanford.edu). Much work went into updating remote hosts that were no longer pingable, in particular recovering all Australian hosts (blocked at WU, Seattle) and hosts in Africa and Alaska. We also ensured that hosts were correctly entered into the database.
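The example URL above selects a source and destination by regular expression. A minimal sketch of building such a URL for other host patterns is shown below. The endpoint and the src_regexp and dest_regexp parameter names are taken from the example URL; the helper name graph_url is hypothetical, and no other parameters of the script are assumed.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in the text.
BASE = "http://pinger.fnal.gov/cgi-bin/graph_pings.pl"


def graph_url(src_regexp, dest_regexp):
    """Build a time series plot URL for the given host patterns."""
    query = urlencode({"src_regexp": src_regexp, "dest_regexp": dest_regexp})
    return BASE + "?" + query


print(graph_url("slac.stanford.edu", "slac.stanford.edu"))
```

Using urlencode means that any characters in a pattern that are not safe in a URL are percent-encoded rather than passed through verbatim.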

We attended the Internet2 "Extending the Reach of Advanced Networking - International Workshop" in Arlington VA 22 April 2004, and gave 2 talks:

We prepared a report on Internet performance to Bangladesh for Professor H. Cerdeira of ICTP to present on her trip to Bangladesh.

Outreach

We worked with DESY to understand the throughput performance for various HENP applications, and with NASA to assist GLAST in achieving adequate throughput to GSFC. We worked with BaBar and IN2P3 to try new TCP stacks with real BaBar data transfers. To aid in this we set up a host at SLAC for outside users to use for high speed transport testing.

We worked with the MIT Haystack project to assist them in achieving higher throughput performance.

Advanced TCP Stack and UDP Evaluation

We worked with the developers of LTCP, H-TCP, UDT, altAIMD and FAST to get the latest kernels and code, and to ensure they were ported to a version of Linux that is acceptable from a security viewpoint.

Proposals and Representation

We submitted proposals to the DoE Office of Science:

We also submitted proposals on:

We successfully set up a formal collaboration on network monitoring with the NIIT in Rawalpindi, Islamabad. A formal MOU was signed, and we are now actively working together with regular fortnightly phone meetings.

We attended:

IPv6

We continue to maintain the PingER6 monitoring infrastructure with currently about 30 remote hosts. ABwE has been ported to IPv6 and is now successfully running on about 5 sites (CERN, CESnet, SOX/Atlanta, GATech, SLAC).