11/1/00, at SLAC
Rough notes by Les Cottrell and Robin Tasker
Paul Kummer, Robin Tasker, Alan Clarkson from DL and Warren Matthews, Les Cottrell from SLAC
The GTS has been tested on a small (1605) router at DL. It appears to work well for big packets (1000 bytes), but it seems to kick in early for 104 byte packets, probably due to the limited power of the test router (1605), which with 104 byte packets can only transmit < 4 Mbits/s. The ACLs allow one to select the machines and port numbers at both ends. There is no CAR available in the 1605 (as far as can be told from the Cisco release notes), but it does have WRED and WFQ. They hope to test WRED and WFQ on the DL test bed before they try with the UKERNA router.
The effect of the marking and applying QoS was clearly visible in PingER peak packet loss (which went from 40% to 10%), but since Abilene opened up the peering and bandwidth it is less noticeable.
The goal is to make tests from DL to Stanford. We will have accounts on the Stanford machine (loggy).
We discussed how to make measurements. We will probably use pings (possibly with an extended version of PingER) on the low and high priority queues while there is a generated load. Gen_send (at http://www.citi.umich.edu/projects/qbone/generator.html) allows one to specify the UDP bits/second to send, the packet size, and the frequency. This allows one to generate evenly spaced or bursty traffic and to report on throughput, losses etc. We also want to record the routes.
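For reference, the core of such a generator - evenly spaced UDP sends at a target bit rate - can be sketched in a few lines. This is a hedged illustration of the idea, not gen_send's actual code; the function name and parameters are our own:

```python
import socket
import time

def send_load(dest, port, bits_per_sec, pkt_size, duration):
    """Send evenly spaced UDP packets of pkt_size bytes at roughly
    bits_per_sec for duration seconds; returns (packets_sent, elapsed)."""
    interval = (pkt_size * 8) / float(bits_per_sec)  # gap between sends, seconds
    payload = b"x" * pkt_size
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sent = 0
    start = time.time()
    next_send = start
    while time.time() - start < duration:
        sock.sendto(payload, (dest, port))
        sent += 1
        next_send += interval          # fixed schedule, so timing errors don't accumulate
        delay = next_send - time.time()
        if delay > 0:
            time.sleep(delay)
    sock.close()
    return sent, time.time() - start
```

Bursty traffic would follow the same pattern but send back-to-back packet trains separated by longer gaps.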
RL still has congestion problems at their firewall due to security ACLs. This will be fixed with a new line-speed firewall when they upgrade to 622 Mbps later this year. DL will be getting an Extreme Networks switch/router that is advertised to run at wire speed even with filtering. There will be a dedicated Extreme box for external filtering. This upgrade will be summer 2001.
Richard Hughes-Jones (RHJ) of Manchester has some tools for measuring jitter. There are good contacts with Richard.
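As one concrete (and hedged) definition, jitter is often computed as the mean absolute difference between successive delay samples, roughly in the spirit of the RTP interarrival-jitter idea; the RHJ tools may well use a different estimator:

```python
def jitter(delays_ms):
    """Mean absolute difference between successive delay samples (ms).
    One common jitter estimator; not necessarily the one the RHJ tools use."""
    if len(delays_ms) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return sum(diffs) / len(diffs)

# e.g. three RTT samples of 10, 12 and 11 ms give a jitter of 1.5 ms
```

Applying the same estimator to ping RTTs and to the RHJ one-way delays would give directly comparable numbers for the agreement check discussed below.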
Standard tools to be available on the end hosts will be nping, ntrace, tcpdump, iperf, gen_send/gen_recv, pchar, and possibly Richard Hughes-Jones's jitter tools. We need to set the iperf servers to run all the time (e.g. using inetd). Loggy and rtlin1 will have iperf tied into inetd to keep it running all the time. We also need to see if we can allow ports 5001-5009 into rtlin1 so we can use rtlin1 as an iperf server from clients outside DL. Gen_send needs modifying to allow selection of the port, and maybe some other features such as selecting the output file and setting the duration of the test. Paul is looking at modifying gen_send/gen_recv.
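A minimal sketch of the inetd hookup follows. The service name, user and install path are assumptions, and it needs verifying that iperf behaves correctly when launched this way; keeping a plain `iperf -s` alive under a process supervisor or startup script is the obvious alternative:

```
# /etc/services -- register the iperf control port (5001 is iperf's default)
iperf           5001/tcp

# /etc/inetd.conf -- have inetd launch an iperf server when a client connects
iperf   stream  tcp     nowait  nobody  /usr/local/bin/iperf  iperf -s
```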
The idea is to have 2 queues chosen by CAR. Iperf will be on ports 5001-5009 at normal priority, and then other apps on port 6000.... Can we also mark the ICMP packets (e.g. by using TOS bits)? Can multi-homing help? Will CAR over-ride ping TOS bits (Paul believes it will)? We agreed to start out by not using port-number marking, but rather only IP addresses for marking. We also discussed multi-home versus multi-host, and agreed multi-home is preferred since it simplifies matters in case the machines are not identical. In the background, run ping to characterize the performance etc. Every now and again run traceroute to keep track of route changes (only keep diffs). Use gen_send for traffic generation, e.g. for background traffic, and wind it up to see the effect on regular-priority and high-priority pings (loss/RTT/jitter). Run gen_send vs iperf, both at regular priority, and see the effect on iperf throughput as gen_send increases; then repeat with iperf at high priority. Try something similar with the RHJ tools to measure jitter, to see if it is more sensitive to queue management and also how it agrees with ping jitter measurements.
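On the Cisco side, marking by source IP address with CAR might look roughly like this. This is a hedged sketch: the interface, ACL numbers, rate/burst values and precedence values are all assumptions to be replaced with the real NY PoP configuration:

```
! ACLs select traffic by IP address only (no port matching, per the agreement above)
access-list 101 permit ip host <rtlin-H-address> any
access-list 102 permit ip host <rtlin-N-address> any
!
interface Serial0
 ! conform and exceed actions both mark-and-forward, so CAR here classifies rather
 ! than polices; precedence 5 = high queue, 0 = normal queue
 rate-limit output access-group 101 2000000 8000 8000 conform-action set-prec-transmit 5 exceed-action set-prec-transmit 5
 rate-limit output access-group 102 2000000 8000 8000 conform-action set-prec-transmit 0 exceed-action set-prec-transmit 0
```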
Next we can play with the scheduler that decides how to select packets from multiple queues (e.g. by using WFQ). We can also play with more queues, for example one emulating a Less than Best Effort service, one for normal traffic, and one for expedited service. It would be useful to learn from Cisco whether they have any plans on this. The GTS is just there to guarantee congestion by limiting the available bandwidth (currently set to 2 Mbits/s).
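To get a feel for what such a scheduler does, a toy weighted round-robin over three service classes can be modelled as below. This is a simplification for intuition only; real WFQ computes per-packet virtual finish times rather than counting packets:

```python
from collections import deque

def weighted_round_robin(queues, weights):
    """Drain packets from queues in proportion to weights: each pass,
    queue i may send up to weights[i] packets. Returns the output order."""
    out = []
    while any(queues):                 # until every queue is empty
        for q, w in zip(queues, weights):
            for _ in range(w):
                if q:
                    out.append(q.popleft())
    return out
```

With weights such as (4, 2, 1) for expedited, normal and Less than Best Effort queues, the expedited queue gets the largest share but the others are never starved.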
Another thing, a bit closer to applications, is to do HTTP GETs on standard files of several sizes and see the impact of QoS. This will help show the TCP set-up effects.
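A minimal timing harness for those GETs might look like this (Python sketch; the server name and file-naming scheme in the usage comment are hypothetical):

```python
import time
import urllib.request

def timed_get(url):
    """Fetch url; return (bytes_received, elapsed_seconds). The elapsed time
    includes TCP and HTTP set-up, so small files expose the set-up cost."""
    start = time.time()
    with urllib.request.urlopen(url) as resp:
        body = resp.read()
    return len(body), time.time() - start

# hypothetical usage: fetch a small and a large test file from the far-end web server
# for size in (4096, 65536):
#     print(size, timed_get("http://example.org/test-%d.bin" % size))
```

Comparing the 4K and 64K load times at normal and high priority separates the connection set-up effect from the bulk-transfer effect.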
It may be interesting to look at changing the GTS on the DL testbed to see the effect on iperf. There may be other Cisco QoS features we can test, e.g. FRED. We may need to improve our contacts with Cisco to learn more of upcoming features and to have better sources of information.
DL will discuss with UKERNA access to NY router utilization information (e.g. MRTG) and whether other information is available (e.g. flows from OCXMON). The same may be desirable for other routers on the paths. There may also be other instructive information in the routers, e.g. queue lengths etc.
It would be good to have a Surveyor at DL. DL have already talked to ANS about this. Les will contact Guy Almes and Matt Zekauskas to try and move this forward. This would also provide route history.
The UKERNA schedule has slipped by a month. UKERNA (project management) wants a more detailed project plan in 2 months. So we need to focus on what happens in the next 3 months.
UKERNA wants monthly reports from the project. There will be a more formal report to a NetworkShop in March. Les will make sure that DL is on the distribution list for the monthly IEPM reports, and DL will make sure the DL/UKERNA reports are also sent to SLAC.
Tasks Identified
================
- accounts on loggy - WM
- buffer size upgrade on loggy - WM
- install general software on loggy and rtlin1 - WM RT
- install PingER on loggy and rtlin1 - WM RT
- install traceroute monitor stuff on loggy and rtlin1 - WM RT
- multihome rtlin1 - RT
- characterise link both ways between rtlin(H) & loggy, rtlin(L) & loggy;
  use PingER at 15 min intervals with 100 pings of 100 and 1000 bytes - WM RT
- traceroute script to compare against previous routes - WM
- RHJ jitter tests (modified to run from a script) - RT
- iperf for throughput - LC
- queue lengths etc in NY PoP - needs checking with cisco and UKERNA - PSK
- further DL testbed testing - AC/RT/PSK
- understand existing pkt marking in NY PoP and relative marking of our test traffic - PSK
- final config of cisco 1605R -> UKERNA router, i.e. - PSK

                 ACL        WRED        ACL
                  |           |          |
                  V           V          V
                  |-- queue -- rtlin(H) --|
    <--GTS--- sched                       |-- CAR <--- Abilene
                  |-- queue -- rtlin(N) --|

  where rtlin1 is multihomed with IP addresses rtlin(H) and rtlin(N)

  so (reversed):

           WRED                              GTS
            |                                 |
            V                                 V
          |-> everything else  -- queue1 ----------> 622 Mbps (620?)  (high)
          |-> loggy->rtlin(H)  -- queue2 ---|
    CAR --|                                 |
          |-> evagore->icfamon -- queue3 ---|
          |-> loggy->rtlin(N)  -- queue4 ----------> 2 Mbps           (normal)

  but where does all the other traffic go in this scheme, i.e. high, normal or elsewhere?

- ability to get throughput data out of NY PoP router during tests - PSK
- check of scheduling algorithms available within NY PoP router - PSK
- more detailed Project Plan - PSK
- modify gen_send to use a specified port and be scriptable - PSK
- modify RHJ stuff as appropriate - RT
- modify ping for testbed via Perl to be scriptable etc - WM
- iperf via Perl script - LC
- http-get Perl script needed - WM RT
- experimental test suite - ALL

  for ------------ background increasing ------------------> via gen_send()
      input:  pkt size = 100, 1000, 1400?
      output: actual thruput
  do
      ping N     | input:  pkt sizes (15) between 100 - 1000 * 100
      ping H     | output: rtt, pkt loss
      iperf N    | input:  windows, streams
      iperf H    | output: thruput
      RHJ N      | input:  pkt sizes (15) * 100 pkts
      RHJ H      | output: one-way transit time, loss, jitter
      http-get N | input:  filesize (4K, 64K)
      http-get H | output: load time

  We estimate that each test run will produce < 1 Mbyte of data, but we need to add a
  textual description of the data in the raw dataset.
  - done at times through the day/week (to show no effect!); and decide the frequency of each time point
  - need to know the time for a single test suite to run
- set up email list - use of Log Book - archiver within mail (URLs allowed!) - RT
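The "only keep diffs" route tracking mentioned above amounts to something like the following (a hedged sketch; the actual monitor running on loggy presumably differs):

```python
import time

def record_if_changed(route, history):
    """Append (timestamp, route) to history only when the hop list differs
    from the most recently recorded one; returns True when stored.
    route is a list of hop addresses as reported by traceroute."""
    if history and history[-1][1] == route:
        return False                   # unchanged route: keep no duplicate
    history.append((time.time(), route))
    return True
```

Run after each periodic traceroute, this keeps the archive small while preserving the full route-change history.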
Notes from 03/11/2000
=====================
- Possible problems with flow routing and GTS in cisco: they're mutually exclusive. So alternatives....

    ---> --> CAR ---> Queue --> Shape --> --->

  Is it possible to use CAR to mark the pkt rather than drop it? The problem remains that
  unless there is congestion there will be no WRED effect, i.e. what we do now with ns2 and
  evagore testing. BGP can use multiple links on a per-pkt basis, which could potentially
  lead to pkt re-ordering, which for IP isn't a problem. We could separate the functions so
  that CAR is performed at NY and the Queue/Shape function runs in a router at DL, but does
  this invalidate the purpose? Probably not, except that it could be viewed as a glorified
  benchtest; it would not control the trans-atlantic congestion and the effect on jitter.
  Could this be done on an I2 router? This must be the preferred option! PSK to talk with
  UKERNA; WM to Hawaii.
- traceroute monitor done by Warren on loggy/rtlin2 route - see
  http://www-iepm.slac.stanford.edu/monitoring/qos/
- accounts done at Stanford (loggy) but we need passwords!!!!
- need consolidated action list with timescales -> project plan for UKERNA
- need VC in December (early) to continue progression of work items - RT WM
- division of work for test suite etc
- PingER on rtlin1, send the data files via ftp to SLAC and to DL - RT