INCITE voice meeting between SLAC & Rice, 3/27/02
Authors: Les Cottrell. Created: March 27, 2002
Attendees: Rob Nowak, Yolanda Tsang, Jiri Navratil, Connie Logg, Les Cottrell
Introduction
Some thoughts before the meeting (from Les, Jiri & Yolanda):
Since only Bob may be with us for the meeting, we should focus on the
topology/tomography measurements.
- Topology: we would like to see topology maps of the connectivity from SLAC
to a few tens of sites (e.g. the IEPM-BW sites). These should be produced
automatically as web-displayable images (e.g. gif, jpg or png) on a daily basis,
so that they can be linked to from various web pages. To do this:
- We need the data from the Rice topology measurements (fat boy) to be
interfaced to a graphics package that can display the map. Jiri has such a
graphics package that he wrote a year or so ago. Jiri, can you post a typical
graph created by this package? Jiri needs to revive this package or try another,
say from CAIDA (Warren has some pointers to possible packages). Then Jiri and Yo
need to define an interface from her topology package to the graphics package and
then implement it.
- We need to automate the measurements and analysis. Among other things this
means the job must be callable from a cron table, and we need to be able to
either replace MatLab with a program that uses some numerical library (e.g. IMSL,
NAG etc.), or we need to decide how to run MatLab automatically in a robust
form. This may require SLAC to purchase a MatLab license, or the jobs to be run
at Rice, or run the jobs at SLAC when there are spare licenses (e.g. late at
night).
- We (SLAC) need to set up an infrastructure to install the various Rice
servers at the IEPM-BW sites. It would help if there was only one Rice server
that did chirp, fat boy etc. at each remote host. Is this reasonable? We (SLAC)
need to start and stop the Rice server(s) at the remote sites on demand for
robustness and security reasons. Once a server is started we can make the
measurement(s) from the SLAC end and then kill the server.
- We need to match the fat boy topologies to the traceroute topologies.
- Automating traceroute measurements is easy. In fact Warren already has
traceroutes (1) (and even pings to individual nodes along the route) for about 80
sites seen from SLAC, and we could extend this to about another 10 measuring
hosts. Yo can already take multiple traceroutes into her program and, using
MatLab, create tables of connectivity. So what is needed (as above) is a
graphics program and an interface between the fat boy connectivity tables and
the graphical mapping program. This is probably easier than item 1, and maybe
should be attempted first or at least in parallel with 1.
- In addition we want to get the RTT measurements out of the traceroute
measurements, so the links (edges) can be labelled (and possibly colored) with
the average RTTs to the node at the end of the edge.
- We want to make tomography loss measurements and apply them to the maps.
- Yo has discussed a way to do this. It is quite intensive so we need to
discuss how to do this, i.e. how to reduce the number of end nodes (e.g. by
breaking the world into regions with common connectivity), how many probes per end
node, how often etc.
- Once we have the measurements we need to decide how/whether we can label
the graphs or how to visualize the data.
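As a rough illustration of the traceroute items above (connectivity tables, edges labelled with the average RTT to the node at the end of the edge, and an interface to a graphics package), here is a minimal Python sketch. It is not Yo's MatLab program or Jiri's package. The input layout, the function names `build_edges` and `to_dot`, the host names and RTT values, and the choice of Graphviz DOT text as the interface format are all assumptions made for the example.

```python
from collections import defaultdict


def build_edges(traces):
    """Turn traceroutes into a table of edges with average RTTs.

    traces: list of (source, hops), where hops is an ordered list of
    (node, rtt_ms) pairs going outward from the source (assumed layout).
    Returns {(parent, child): mean RTT in ms from the source to child}.
    """
    samples = defaultdict(list)
    for source, hops in traces:
        prev = source
        for node, rtt_ms in hops:
            samples[(prev, node)].append(rtt_ms)
            prev = node
    return {edge: sum(v) / len(v) for edge, v in samples.items()}


def to_dot(edges):
    """Render the edge table as Graphviz DOT text with RTT labels."""
    lines = ["digraph topology {"]
    for (parent, child), rtt in sorted(edges.items()):
        lines.append(f'  "{parent}" -> "{child}" [label="{rtt:.1f} ms"];')
    lines.append("}")
    return "\n".join(lines)


# Made-up example data: two routes from SLAC sharing the first two hops.
traces = [
    ("slac", [("r1", 1.0), ("r2", 10.0), ("siteA", 20.0)]),
    ("slac", [("r1", 3.0), ("r2", 12.0), ("siteB", 30.0)]),
]
edges = build_edges(traces)
print(to_dot(edges))
```

The shared hops collapse into single edges, so the branch point (here `r2`) shows up directly in the output. A script like this could be called from a cron table and its output handed to whatever drawing tool is chosen.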
Yolanda wrote:
I have some clarifications to the points that Les made. I also try to point out some difficulties and limitations that arise. If we can identify the problems at an early stage, it will definitely ease future development.
Item 1
SLAC is interested in topology estimation, specifically branch point identification,
which we call the logical topology. Rice has a tool for estimating the logical topology;
however, the current tool can only do binary tree identification, that is, each node
can only have two children. The estimate can be modified so that some of the
links with a low "index" value are collapsed in order to get a topology closer
to the real one. This is done manually at the current stage and automation is
needed here. Besides, since it is a delay-based estimation, links with very
high bandwidth are likely to have insignificant or zero delay. We might have
difficulties in identifying those links.
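To make the automation of the collapsing step more concrete, here is a minimal Python sketch. It is not the Rice tool. The tree representation (a child-to-parent map), the function name `collapse_low_index_links`, the threshold and the index values are all assumptions for illustration; what the "index" value actually is and how a threshold should be chosen are left open.

```python
def collapse_low_index_links(parent, index, threshold):
    """Collapse internal links whose index value is below threshold.

    parent: {node: parent_node} for every node except the root.
    index:  {node: index value of the link parent -> node}.
    An internal node whose incoming link is below the threshold is
    removed and its children are re-attached to the nearest surviving
    ancestor. Leaves (receivers) and the root are never removed.
    """
    internal = set(parent.values())  # nodes that have children
    removed = {n for n in parent if n in internal and index[n] < threshold}

    def surviving_ancestor(node):
        par = parent[node]
        while par in removed:
            par = parent[par]
        return par

    return {n: surviving_ancestor(n) for n in parent if n not in removed}


# Made-up binary tree: root -> i1 -> (A, B), root -> i2 -> (C, D).
parent = {"i1": "root", "i2": "root", "A": "i1", "B": "i1", "C": "i2", "D": "i2"}
index = {"i1": 0.01, "i2": 0.8, "A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
collapsed = collapse_low_index_links(parent, index, threshold=0.1)
print(collapsed)
```

In this example the link into `i1` is collapsed, so the root ends up with three children (A, B and i2), which a strictly binary tree cannot express.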
Item 2
Mark or Rui has a program for plotting the traceroute information. However,
it simply shows the logical topology without extra information on the links
and internal nodes. What Les meant is that, instead of only the topology, we should also
take advantage of the existing information and place it on the map. This includes, but
is not limited to, the average round trip delay on each logical link and
the IP address of each internal (branching) node.
Currently, we plot the logical tree using matlab. The coordinates of the nodes are
computed in matlab as well. However, this does not scale to a large number of
receivers. Jiri has a better program for plotting the connectivity; however,
scaling might remain a problem.
Item 3
Based on the number of measurements (number of packets sent and received),
we will estimate the individual link loss rates on the logical tree-structured network.
To provide correlated information, we will send closely spaced packet pairs
to receivers. Consider the following simple example:
      A      <- sender
      |
     / \
    B   C    <- receivers
In order to isolate information for each link, we need four types of measurements,
(AB, AB), (AB, AC), (AC, AC) and (AC, AB), where (., .) is a packet pair and AB denotes a
packet from A to B. To achieve reasonable estimates, we need to have an adequate
number of probes for each measurement.
The questions arising include (1) What does "adequate" mean? If it is
a good link, we might have difficulties in seeing losses and thus will require
more packets. If it is a lossy link, we do not want to further degrade the
network service, so we will only send a small number of probes. (2) How many
receivers should we include in the tree? You can see that the number of
measurements depends on the number of receivers. If there is a large number of
receivers, the number of measurements might be considerable.
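For the two-receiver example above, a simplified estimator can be sketched in Python. This is not the estimator Yolanda describes. It uses only the cross pairs (AB, AC) and assumes that the two packets of a closely spaced pair share the same fate on the common link and that losses on different links are independent. The function name and the counts are hypothetical. Under those assumptions, with success probabilities a (shared link), b (branch to B) and c (branch to C): P(B arrives) = ab, P(C arrives) = ac, P(both arrive) = abc, so a = P(B)P(C)/P(both).

```python
def estimate_link_loss(n_pairs, n_b, n_c, n_both):
    """Estimate per-link loss rates from (AB, AC) packet-pair counts.

    n_pairs: number of pairs sent
    n_b:     pairs where the packet to B arrived
    n_c:     pairs where the packet to C arrived
    n_both:  pairs where both packets arrived
    Assumes perfect loss correlation within a pair on the shared link.
    """
    if min(n_pairs, n_b, n_c, n_both) <= 0:
        raise ValueError("need at least one pair with both packets received")
    p_b = n_b / n_pairs
    p_c = n_c / n_pairs
    p_both = n_both / n_pairs
    shared_ok = p_b * p_c / p_both   # a = (ab)(ac) / (abc)
    to_b_ok = p_both / p_c           # b = (abc) / (ac)
    to_c_ok = p_both / p_b           # c = (abc) / (ab)
    return {
        "shared": 1.0 - shared_ok,
        "to_B": 1.0 - to_b_ok,
        "to_C": 1.0 - to_c_ok,
    }


# Made-up counts consistent with a = 0.9, b = 0.8, c = 0.5 and 1000 pairs.
loss = estimate_link_loss(n_pairs=1000, n_b=720, n_c=450, n_both=360)
print(loss)
```

The guard on `n_both` reflects question (1): on a very lossy path too few pairs arrive intact for the ratios to be meaningful, and on a very good link the loss counts are too small to resolve without more probes.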
Meeting notes
We went over the information above and decided on the following:
- It appears that we should start with getting topology information from
traceroutes since they are easy to automate and we have a rich set of data
already available and being continuously updated. Yolanda knows how to
access this data.
- Ryan will be going to CAIDA for an internship this summer, so he may be
able to uncover & use some of CAIDA's traceroute analyses and mapping
tools (e.g. those used in skitter).
- Jiri will see if he can get his topology drawing tool to work.
- Warren will document whether any of the CAIDA tools he knows about would
be of any help, and if not why not.
- Yolanda will tell Jiri what the format of the connectivity output
is from the topology program.
- It may turn out to be difficult to get topology information from the
traceroutes since the interfaces of a single router typically have
different IP addresses, so correlating which links go into and out of a given
router may be hard. If there is a lot of ambiguity then using the Rice
topology tool to help resolve it may be useful. Rui has been thinking
about how to use the traceroutes to get the general topology and the Rice
fatboy to resolve ambiguities. Rob will get Rui to put his ideas down in a
document so we can understand them.
- Ryan reports (2) that he has developed a tool that can send ICMP requests back
to back to 2 different hosts. This could be used to do topology mapping to
remote hosts without requiring a server to be installed. However, he needs
to validate that the results from using the round trip ICMP probes are close
to those obtained by using the one way UDP probes. If this is
successful then this would be a big advantage especially for probing remote
hosts where we cannot install a server. In this case we could use the tool
for mapping the ~ 75 PingER Beacon sites.
- Given the intensity of probes needed for the topology (goes as roughly the
square of the number of hosts in the group being mapped) we will probably
need to group the remote nodes into sets of about 10, e.g. by world region
initially. Nodes in a given region will have similar connectivity.
- Rob believes it would be relatively easy to remove the dependence on
Matlab from the fatboy topology mapping. Rob will discuss this with Ryan.
- Warren, Yolanda and Les will discuss how we (SLAC) might deploy the Rice
fatboy tool in PingER.
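To put numbers on the grouping item in the list above, here is a small Python sketch. It assumes that the probing effort is proportional to the number of receiver pairs, n(n-1)/2, which matches "roughly the square of the number of hosts" but is only an assumption about the real cost. The function names are hypothetical and the groups are taken to be of equal size rather than real world regions.

```python
def receiver_pairs(n_hosts):
    """Number of distinct receiver pairs among n_hosts remote hosts."""
    return n_hosts * (n_hosts - 1) // 2


def grouped_pairs(n_hosts, group_size):
    """Pairs to probe when hosts are split into groups of group_size.

    Only pairs inside the same group are probed; the last group holds
    whatever hosts are left over.
    """
    full_groups, leftover = divmod(n_hosts, group_size)
    return full_groups * receiver_pairs(group_size) + receiver_pairs(leftover)


# ~75 PingER Beacon sites, mapped all at once or in sets of about 10.
print(receiver_pairs(75))     # all sites in one group
print(grouped_pairs(75, 10))  # seven groups of 10 plus one group of 5
```

Under these assumptions, mapping 75 sites in one group needs 2775 pair measurements, while sets of 10 need 325, at the price of learning nothing about connectivity between groups.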
(1): Try looking at the data files in /nfs/oceanus/u4/traceping/data/.
I think the format is pretty straightforward and a perl script can easily
convert it to a more matlab-friendly format.
(2): I've written code that can send back-to-back ICMP echo request
packets (pings) as fast as they can be transmitted onto the network. Yes, it
does support sending them to different hosts. The spacing between them is
usually less than a millisecond, depending on the rate of the link at the sender.
I do not use "ping". The new code was written from scratch.
Ryan Christopher King [ryanking@owlnet.rice.edu]
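For readers unfamiliar with the idea in note (2), the following Python sketch shows how two ICMP echo requests could be built and sent back to back to two different hosts. It is not Ryan's code. The function names, the identifier value and the sequence numbers are assumptions, and an interpreted sender will likely not achieve the sub-millisecond spacing he describes. Opening a raw ICMP socket normally requires administrator privileges, so `send_pair` is defined but not called here.

```python
import socket
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for an echo request (code is 0)


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement Internet checksum over data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request: type, code, checksum, id, sequence."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, csum, ident, seq) + payload


def send_pair(host_a: str, host_b: str, ident: int = 0x1234) -> None:
    """Send one echo request to each host with no delay in between."""
    first = build_echo_request(ident, 1)
    second = build_echo_request(ident, 2)
    with socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP) as s:
        s.sendto(first, (host_a, 0))
        s.sendto(second, (host_b, 0))


packet = build_echo_request(0x1234, 1, b"abc")
print(packet.hex())
```

Both packets are built before the socket is opened so that nothing but the two `sendto` calls sits between the transmissions.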