Dr. R. Les Cottrell, MS 97, Stanford Linear Accelerator Center (SLAC), 2575
Sand Hill Road, Menlo Park, California 94025,
<cottrell@slac.stanford.edu>
Prof. Harvey Newman, Caltech, Pasadena, California <newman@hep.caltech.edu>
Other contacts.
Shipping & contacts: CERN, StarLight, Sunnyvale
This is a joint SLAC, Caltech and CERN project with Level 3, Cisco and
StarLight as sponsors. We wish to demonstrate high network throughput on trans-continental
(10 Gbits/s) and
trans-Atlantic (2.5 Gbits/s) links between Sunnyvale, CERN/Geneva and SC2002/Baltimore.
We hope
to demonstrate high (> 1 Gbit/s) disk-to-disk throughput and even
higher (several Gbits/s sustained) memory-to-memory throughputs between
the above sites. The initial showcase will be at SC2002
in Baltimore November 18-21 2002.
Results
Our first tests were iperf TCP measurements
using the Caltech FAST TCP stack, from 6 hosts
at SC2002 to Sunnyvale. We achieved about 5Gbits/s.
The 3 hosts in the SLAC booth delivered about 3Gbits/s according to the
router SNMP statistics.
For the formal challenge we ran iperf to Sunnyvale from 8 hosts in the
Caltech booth and 2 in the SLAC booth, and to Chicago from 5 hosts in the
SLAC booth. We started the challenge around 9pm, lost power in the SLAC
booth soon after, recovered, and achieved about
11.5Gbits/s sustained over the 2 links according to the
SCinet measurements. The Sunnyvale link
was getting about 8-9Gbits/s.
The official SCinet results indicated that LBNL won the bandwidth
challenge, achieving over 16Gbits/s, and the Caltech/SLAC challenge
was second with a 12.44 Gbits/s peak and 10.67Gbits/s sustained over 15
minutes.
We (NIKHEF, SLAC & Caltech) also used the testbed to win the
Internet2 Land Speed Record.
Setups:
Routing plan,
Chicago - Amsterdam network setup,
DataTag Testbed,
Overall including Sunnyvale,
10GE setup
Overall,
Chicago
(6*2.2GHz PCs 192.91.236.1,2,3,4,5,6),
CERN
(6*2.2GHz PCs, 192.91.239.1,2,3,4,5,6),
DataTag, Sunnyvale (12
PCs + 4 disk servers, addresses),
SC2002 (5PCs, addresses),
SARA (4 HP 2*2.4GHz PCs 145.146.97.16,17,18,19)
DataTag
reservation form
available to
DataTag users
following the
rules.
FAQ on FAST
Network
The demonstration will utilize a loaned Level 3 OC192 POS (10Gbits/s) circuit from StarLight
in Chicago to the Level 3 gateway at 1380 Kifer Road, Sunnyvale.
In Sunnyvale Cisco has loaned us a GSR 12406
router with an OC192 POS and 20 1GE interfaces. Also in Sunnyvale we will have
2 racks of colocation space loaned from Level 3. The GSR will go
in one rack, and in the second rack will
go 12 Linux servers plus a RAID disk farm to be provided by Caltech.
At Baltimore and CERN there will be similar setups.
At StarLight we will be connecting to a Juniper T640 router and
then to Baltimore (10Gbits/s) and CERN (2.5 Gbps). We will
have the GSR on a 90 day loan and Level 3 will leave
the circuit lit for 30 days, with the possibility of negotiating for longer. We
will have an estimate of the turn up date from Level 3 on the afternoon of Monday 4th
November.
Routing:
- Traffic between CERN, Starlight and Sunnyvale:
We are connected at 10 GbE to the Juniper T640 managed
by Linda. By default the traffic is not routed via this
10 GbE link. In order to route this traffic via the 10
GbE link, I need to know the address of the subnet at
Sunnyvale.
- Traffic between CERN, Starlight and SLAC:
We have two connections to Abilene: one at 1 Gbps for
production and one at 10 Gbps for tests (via the Juniper
T640). By default, the traffic is routed via the
production peering. In order to route test traffic
via the 10 GbE link, I need to know from which machines
(from which subnet) you are conducting tests at SLAC.
Please note that if you reach the Starlight via ESnet,
we cannot route the traffic via the 10 GbE link.
- Linda assigned the following /30 for the pt-to-pt link:
192.5.175.129 CHI/T640
192.5.175.130 SLAC
It is set up for static routing.
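As a quick sanity check on the point-to-point addressing above, the following minimal sketch uses Python's standard ipaddress module to confirm that the two addresses are the two usable hosts of a single /30. The variable names are illustrative only and are not part of the original configuration.

```python
import ipaddress

# Addresses and prefix length are taken from the list above.
chi = ipaddress.ip_interface("192.5.175.129/30")   # CHI/T640 end
slac = ipaddress.ip_interface("192.5.175.130/30")  # SLAC end

# A /30 holds four addresses: network, two usable hosts, broadcast.
link = chi.network
hosts = list(link.hosts())

print("link network:", link)
print("usable hosts:", [str(h) for h in hosts])
print("same subnet :", chi.network == slac.network)
```

Both ends falling in the same /30 is what allows each router to reach the other directly, so the static routes only need to name the far end as next hop.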
1 GSR model 12406 10 rack units, 6 slots, draws 16A, need 20Amps/120V AC circuit
OC192/POS interface + 20 * 1 GE interfaces, plus 20 850nm multimode
1000Base-SX small form factor pluggable (SFP) GBICs. Parts
list.
Weight of 12406 chassis & power supplies is 140lbs, 10 port GE card is
10lbs, OC192 card is 9.5lbs, route processor 6lbs. Overall weight ~ 180lbs with
GBICs & cables.
OC192 POS connection 1310nm SC connector at Sunnyvale; OC192 POS connection 1550nm SC connector at StarLight
12*1 computer server ea 1 rack unit & 2.5Amps, 120V
Model: ACME Server 6012PE
Motherboard: Supermicro P4DPR-I
CPU : Intel 2.4 GHz
Memory : 1 GB PC2100 DDR ECC Registered
Hard Drive : 80GB IDE, Maxtor, 7200 RPM
4 disk servers ea with 16 IDE drives, 4 rack units & 50 Amps, 480W to run
600W to spin up
weight 90 lbs/server
Dual P4 2.4 , e7500 Chipset, dual gigabit should be approximately 200W.
8 disks of 120 GB on each ATA RAID array; two such arrays per server.
PCIX 1/2/3/6 33MHz, PCIX 4 66MHz and PCIX 5 100MHz.
Slot
occupancy: 1 & 3
had a RAID controller,
slot 3 and 6 a Syskonnect (1GE) NIC, and slots 3 and 6 were empty.
Need 2 racks (rack has 42 units),
1 for GSR utilize 10 rack units and 20Amps/120V
1 for servers utilize 28 rack units and 50Amps/110V
Need punch outs between cabinets.
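The rack request above can be cross-checked with a little arithmetic. The sketch below is a rough budget only: the unit counts, the 2.5 A per compute server and the 120 V supply come from the list, while the per-disk-server current is an assumption derived here from the 600 W spin-up figure rather than a number stated in the list.

```python
# Figures taken from the equipment list above.
RACK_UNITS = 42                      # units available per rack
GSR_UNITS, GSR_DRAW_A, GSR_CIRCUIT_A = 10, 16.0, 20.0
CPU_SERVERS, CPU_UNITS, CPU_AMPS = 12, 1, 2.5
DISK_SERVERS, DISK_UNITS = 4, 4
DISK_SPINUP_W, SUPPLY_V = 600.0, 120.0

# Assumption: size each disk server by its spin-up power at 120 V.
disk_amps = DISK_SPINUP_W / SUPPLY_V

server_units = CPU_SERVERS * CPU_UNITS + DISK_SERVERS * DISK_UNITS
server_amps = CPU_SERVERS * CPU_AMPS + DISK_SERVERS * disk_amps

print(f"GSR rack   : {GSR_UNITS}/{RACK_UNITS} units, "
      f"{GSR_DRAW_A} A on a {GSR_CIRCUIT_A} A circuit")
print(f"server rack: {server_units}/{RACK_UNITS} units, {server_amps} A")
```

Under that assumption the server rack comes to 28 units and 50 A, which agrees with the figures requested for the second rack.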
People needing access to Level 3 Gateway in Sunnyvale, access
procedure.
- September 24: face to face discussions at iGrid2002 between Les Cottrell of
SLAC, Paul Fernes of Level 3, and Linda Winkler of StarLight to discuss
concepts.
- September 28: face to face discussions between Les Cottrell and Harvey Newman of
Caltech at ICTP, Trieste, Italy to further expand the concept to include Caltech.
- October 11: Harvey Newman proposes involving CERN also.
- October 14: Paul Fernes agrees to propose project to Level 3 management.
- October 18: Cisco contacted to see if they can loan a GSR router to be
located at Sunnyvale, to accept the OC192 POS from StarLight and
provide 1GE interfaces to servers.
- October 21: Phone meeting between Les Cottrell, Richard Mount and Gary
Buhrmaster of SLAC, Harvey Newman of Caltech, and Paul Fernes, Raymond.Struble
and Sarah.Bleau of Level 3.
- October 25: Phone meeting with SLAC, Caltech, Level 3 people, Linda Winkler of
StarLight, Mark Potter and Rodney Sepulveda of Cisco.
- October 30: Cisco send loan agreement for Cisco GSR 12406 plus OC192 and 20 GE
interfaces to SLAC.
- October 31: Level 3 puts order into system to turn up circuit from
Sunnyvale to StarLight. Phone meeting between Cisco, Level 3 & SLAC.
List of people needing access to Sunnyvale sent, shipping
address for Sunnyvale determined, the Cisco loan agreement signed by
SLAC and returned to Cisco. Cisco expects to be able to ship within one work day.
Web site created.
- November 1: Caltech ship 12 cpu & 4 disk servers to Sunnyvale. Cisco
locks in the GSR, will need a 1550nm interface instead of 1310nm. Voice
conference.
- November 5: Cisco locating card to support 1550nm interface, reconfiguring
order, getting approval for increased loan cost. Hope for answer later
today. Level 3 say they will have the circuit turned up and tested by Friday
November 8, 2002. Linda assigns address etc. for
Sunnyvale - StarLight OC192 link.
- November 6: Cisco has 1550nm card, have reconfigured order and got
approval, still need a power supply. Procedures for
access to Sunnyvale gateway sorted out.
- November 7: Suresh Singh from Caltech visits Sunnyvale. SLAC able to ssh
to hosts at CERN & StarLight. Voice conference. Cisco hopes for delivery
of GSR Nov 8, 2002. Level 3 will turn up circuit Nov 8, 2002, time unknown
at the moment, Ray will determine. If GSR not available circuit will be put in
loop-back at Sunnyvale end. Linda is traveling Sunday Nov 10, and will be
engulfed by SC2002 Monday and part of Tuesday. Gary will determine AS
(autonomous system) for Sunnyvale end. Fiber patched through at StarLight,
need circuit ID & loop-back to proceed.
- November 8: 16 * 1GE port card fails in Cisco 7609 at CERN. Gary
recommends using 198.51.111.0/24 for the Sunnyvale subnet (not advertised
anywhere else), the AS is 3671. Cisco has all parts for GSR, tested and burnt
it in by 1:30pm. Mark Potter & Rodney Sepulveda deliver to Sunnyvale 3pm
and with Gary install in rack. Level 3 were unable to conduct their
end-to-end tests and will complete them Monday. In the meantime they have
made the link available to us. Gary & Linda have worked out the subnets
etc. and Gary has (theoretically) configured the GSR. Gary & Linda will
try out the circuit Saturday. Suresh has about half the compute servers
configured. He hopes to complete them Saturday Nov 9 '02, then they will
need to be mounted in the rack, and connected to the GSR. The first 4u disk
server was delivered to Caltech.
- November 9: Linda reports "I think they still have a loop up in
Chicago
SONET alarms : LOL, PLL, LOS
SONET defects : LOL, PLL, LOF, LOS, SEF, AIS-L, AIS-P
The Chicago end is configured and ready to go."
- November 10-11: Linda flew to Baltimore on Nov 10. Gary reconnected the
loop-back at Sunnyvale on Nov. 10. Charlie Cumalat reported the circuit
is being tested starting at 10:00am Nov. 11, and should be completed by 2pm
Nov. 11. Ray Struble will check with Sarah about re-connecting
the Juniper router
at StarLight when the circuit test is completed. 2:45pm Level 3 call to
report circuit testing (circuit ID SNVACAID-CHCGILDC-00001) completed. Gary
connects GSR at Sunnyvale. Can't see Juniper at StarLight. Level 3
found test sets still connected & so disconnected them. Still did not have
end-to-end connectivity. Called Level 3 Technical Customer Assistance
support, opened ticket 536745.
- Nov 12: Level 3 set up conference call with Sarah Bleau at StarLight, Bryan Brown
at TCAM, Fernando Romero at Level 3 Virginia, Gary Buhrmaster at Sunnyvale
and Les at SLAC. Also brought in tech at Level 3 Gateway in Chicago.
With light-meter at StarLight, deduced after 45 mins that the circuit had
been left in loopback. Measured light after removing loopback, and checked
xmit (cct 14 from Level 3) and rcv (cct 20 from Level 3). After plugging back
into Juniper, it went green. Gary also sees link as live from GSR at
Sunnyvale. Gary now able to see router at StarLight. But we then needed a
static route inserted at StarLight. This has been done. Still need to get
external routing from StarLight to rest of world.
- Nov 13: Julian and Jan Lindheim reached the
point where they feel they can buy the remaining seven 4U servers required.
Of the 8 in total, four will go to Sunnyvale and four to the SC2002 show
floor. With the XFS file system and two striped RAID0 arrays per box they
are getting 200 MB/sec read and 170 MB/sec write (info. from Julian). They
will try to understand and do better from now on, but they are already
reasonably close to their target of 240 MB/sec per box. With the additional
use of the disks in the one-disk 1U servers, they should be able to read and
write 1 GB/sec or more. When they try to read/write and also use 2 GbE ports
on each file server more careful measurements, etc. will be needed.
- Nov 14: Linda says Bill Nickless will work on configuring the
Juniper tonight, so we expect to be able to reach the Sunnyvale addresses via
Abilene.
- Nov 15: 7am PST, can't see Sunnyvale from Starlight CERN/Caltech cluster
or from CERN. Bill Allcock runs into problem/bug with Juniper router that
hinders announcing Sunnyvale.
- Nov 16: CERN, SCinet and others get routing working between CERN,
StarLight and SC2002. Can't see Sunnyvale hosts from SLAC, probably will
remain that way, but we can get to SLAC from CERN & StarLight. Gary sets
up network in booth at SC2002.
- Nov 17: Set up NT and Linux hosts in booth. The "commodity" and the QBSS
network is up at SC2002. This means SC2002-4,5,6 are connected. 3:00pm Connie
has sc2002-4 & 6 running. We are still waiting for SCinet to get the 10Gbits/s
interface running. The remaining Linux hosts (sc2002-1,2,3, hebe, iphicles) are
connected to the 10Gbit/s interface. Antony Antony from NIKHEF stopped by and we got
264Mbits/s iperf TCP from NIKHEF/Amsterdam to Sunnyvale with 1 stream and a
64MByte window. The hosts (145.146.96.3 and 198.51.111.10) both have 1GE
interfaces and the path should be 10Gbit/s from NIKHEF through StarLight to
Sunnyvale.
- Nov 18: 10GE interface and advertising working. Jerrold gets many of
the ssh and iperf problems with remote sites resolved. Got regular demos (pingworld,
ABW & replay working (still has problems with stalling)). Problems with power
in booth. Installed 3 Caltech machines in our booth to get around Cisco 7609
8Gbits/s limitation. Interesting ping RTTs seen to SLAC, FNAL, TRIUMF
possibly due to testing causing congestion.
- Nov 19: Get Sunnyvale router SNMP monitoring working. Work on QBSS demo.
Made iperf TCP measurements
using the Caltech FAST
TCP stack, from 6 hosts
at SC2002 to Sunnyvale. We achieved about 5Gbits/s. The 3 hosts in the
SLAC booth delivered about 3Gbits/s according to the
router SNMP statistics.
- Nov 20: Yesterday's tests caught attention of NOC. They decide to move
challenge for us to a later time (5pm) to avoid competition with other
traffic. Start running at 5pm with 8 hosts in Caltech booth and 4 in SLAC
booth. Get about 9 Gbits/s to Sunnyvale. Also run floodperf from 4 hosts and
sequential from 1. Floodperfs generate about 1.2Gbits/s. NOC reports
they need to do more work to accommodate measuring throughput from 2 booths.
Challenge is rescheduled to a later time when NOC can measure. After intense
discussions decide to formally merge SLAC & Caltech challenges so we can
aggregate the 2 10GE links (one via TeraGrid to Sunnyvale, the other via
Abilene). Move 1 host from Caltech to SLAC booth and recompile and reinstall
the Linux kernel to support the new Caltech kernel on 3 SLAC hosts (sc2002-2,3
and iphicles). Run iperf to Sunnyvale from 8 hosts in Caltech booth and 2 in
SLAC booth, and to Chicago from 5 hosts in SLAC booth. Start up
challenge around 9pm, lose power in SLAC booth, recover, achieve about
11.5Gbits/s sustained to the 2 links according to the
SCinet measurements. The Sunnyvale link
was getting about 8-9Gbits/s. Finish up around 10:45pm. Went for dinner to
celebrate. At the time our challenge was leading in bandwidth by a wide
margin (next closest was ANL with about 3.6MBits/s). LBNL still has to
make its challenge.
- Nov 21: LBNL using 3 * 10GE links and the UDP based Visapult application
won with 17Gbits/s.
- Nov 22: Three disk servers delivered to Sunnyvale.
- Nov 26: Les, Gary & Charley go to Sunnyvale, install disk servers in rack,
connect up power & network fibers. Run into disk image problems. Succeed
in booting 2 disk servers (198.51.111.56 and 198.51.111.64), however there
could be file system problems and the RAID needs work.
- Nov 27: Cisco sends SLAC loan agreement for 10GE card. Can logon to
Sunnyvale disk servers from CERN.
- Dec 2: Ray Struble of Level(3) reports he hopes to extend the
circuit loan for an additional 60-90 days. Also working with Cisco to
loan more 10GE cards. T640 at StarLight cannot currently
accommodate more 10GE cards. Loan agreement signed and sent from SLAC
to Cisco.
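The Nov 17 entry reports 264Mbits/s from a single TCP stream with a 64MByte window between hosts with 1GE interfaces. A single stream can send at most one window per round trip, so the sketch below asks how long the round trip would have to be before that window became the bottleneck. The log does not give the round-trip time, and reading "64MByte" as 64 * 2**20 bytes is an assumption; the function names are illustrative.

```python
def window_limited_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on one TCP stream: one full window per round trip."""
    return window_bytes * 8 / rtt_s


def rtt_at_rate(window_bytes: float, rate_bps: float) -> float:
    """Round-trip time at which the window alone caps the stream at rate_bps."""
    return window_bytes * 8 / rate_bps


WINDOW_BYTES = 64 * 2**20   # assumption: binary megabytes
LINK_BPS = 1e9              # 1GE host interfaces (from the log)
MEASURED_BPS = 264e6        # single-stream iperf result (from the log)

max_rtt = rtt_at_rate(WINDOW_BYTES, LINK_BPS)
implied_rtt = rtt_at_rate(WINDOW_BYTES, MEASURED_BPS)

print(f"window fills 1GE for any RTT up to {max_rtt:.3f} s")
print(f"window would cap at 264 Mbit/s only at RTT {implied_rtt:.2f} s")
```

Under these assumptions the window could keep a 1GE interface full for round trips of up to roughly half a second, and would only cap the stream at the measured rate with a round trip of about two seconds, which suggests that something other than the window size limited that measurement.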
Created October 31
Comments to iepm-l@slac.stanford.edu