IEPM

Retrieving PingER Historical Data

Retrieving Data

Since its inception, the PingER project has amassed more than 5 years' worth of historical ping data, measured from a few tens of monitoring sites around the world to many hundreds of remote sites in over 100 countries. PingER provides some canned tabular and graphical reports. If further insight is needed, the PingER data can be, and has been, accessed and used by collaborators and others in the industry to research network-related theories. When access to archived data is needed for a user's research, there are a few options for selecting the data to download. When using this data, please acknowledge the IEPM-PingER project.

PLEASE NOTE: All times associated with the raw data downloaded from PingER are given in GMT.

  1. Historical data is available for FTP download from here. Within the directory, you can choose to receive data for a specific year, dating back to 1997, or recent data contained in the hep/ directory. Simply download the data to your local area and go to work on analysing it.

  2. You can visit our online distribution of historical data at http://www-iepm.slac.stanford.edu/cgi-wrap/pingtable.pl and select options in the form to view the data you need. From the web page, the data can be downloaded in tab-separated-value (.tsv) format, which makes it easy to import into Excel. Alternatively, you can write a script that issues an HTTP GET to pingtable.pl with the appropriate QUERY_STRING in the URL and work with the data locally:
    e.g. http://www-iepm.slac.stanford.edu/cgi-wrap/pingtable.pl?format=tsv&dataset=hep&file=packet_loss&by=by-site&size=100&tick=hourly&year=2004&month=02&day=22&from=SLAC&to=WORLD&ex=none
    The example above returns the hourly packet loss data from SLAC to all the sites in the world for Feb 22, 2004. Stepping through each of the last X days in this way retrieves all the data for the request. The data comes back in tab-separated format, which can be imported into Excel or your favorite spreadsheet application.

  3. If you are looking for collective data from all monitored sites for a specific metric (see Figure 1), please contact us by email, describing what data (time window, metric) you need and what your intended use is. We will get back to you and will probably prepare a zipped tar file of the data and make it available. The tarball contains zipped files, one for each day, for each metric, for each packet size (100 or 1000). If no metric is specified, only a tarball of the 'average_rtt' data will be returned. The tarball will be stored for a few days in a read-only FTP directory from which you can download it. A typical tarball is quite large (over a GByte).

    Once the tarball is untarred and the files unzipped, you will find that the individual files are space-separated and start with a header line containing the hour numbers. Each following line has the structure:
    source_host_name destination_host_name metric_for_1sthour ... metric_for_nthhour source_host_name destination_host_name (see Figure 2).

    Figure 1: Metrics

    average_rtt/                   out_of_order_packets/
    conditional_loss_probability/  packet_loss/
    duplicate_packets/             throughput/
    ipdv/                          unpredictability/
    iqr/                           unreachability/
    minimum_packet_loss/           zero_packet_loss_frequency/
    minimum_rtt/
    

    Figure 2: Example of format

    For example, the first two lines of the file packet_loss-100-by-site-2004-02-24.txt are:
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 
    pinger.slac.stanford.edu www.jinr.dubna.su 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 pinger.slac.stanford.edu www.jinr.dubna.su www.jinr.dubna.su
    
    (i.e. the losses were always 0.000: no packets were lost out of 10.)

  4. If you need the raw data values (as gathered by getdata.pl from the monitoring hosts) seen from SLAC, you can visit http://www.slac.stanford.edu/cgi-wrap/ping_data.pl and enter the begin and end dates for the data you are looking for. Alternatively, you can write a script that issues HTTP GET requests with a URL like
    http://www.slac.stanford.edu/cgi-wrap/ping_data.pl?in_form=1&begin_hour=00&begin_min=00&begin_sec=00&begin_day=dd&begin_month=mm&begin_year=yyyy&begin_offset=&begin_point=y&end_hour=23&end_min=59&end_sec=59&end_day=dd&end_month=mm&end_year=yyyy&end_offset=&end_point=y
    to get all the data needed. NOTE: You will have to replace dd, mm and yyyy with the day, month and year, respectively, of the begin and end dates you are requesting data for.

    The contents of each line returned for the request are as follows:

    source_host_name source_host_addr destination_host_name destination_host_addr size unix_epoch_time sent rcvd min avg max seq_rcv(i=1,rcvd) rtt_rcv(i=1,rcvd) 
    
    Here seq_rcv and rtt_rcv are each repeated rcvd times, once per packet received. For example:
    pinger.slac.stanford.edu 134.79.240.30 multivac.sdsc.edu 132.249.20.57 100 1077235276 10 10 28.682 32.445 35.427 1 2 3 4 5 6 7 8 9 10 31.4 32.4 33.3 35.1 35.4 34.9 32.6 28.6 31.2 29.1
    
    The raw data is saved locally at SLAC in files of the form /nfs/slac/g/net/pinger/pingerdata/hep/data/<node>/ping-<yyyy-mm-dd>.txt.gz, for example /nfs/slac/g/net/pinger/pingerdata/hep/data/pinger.slac.stanford.edu//ping-2007-03-23.txt.gz.
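
As a rough illustration of the day-by-day stepping described in option 2, the Python sketch below builds one pingtable.pl URL per day. The base URL and query parameters are copied from the example in option 2. The function names (pingtable_url, daily_urls) and their default arguments are hypothetical, chosen only for this example.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Base URL and parameter names are taken from the example URL in option 2.
PINGTABLE_URL = "http://www-iepm.slac.stanford.edu/cgi-wrap/pingtable.pl"


def pingtable_url(day, metric="packet_loss", size=100, src="SLAC", dst="WORLD"):
    """Build the pingtable.pl URL for one day of hourly, by-site TSV data.

    Hypothetical helper: the defaults simply mirror the example in option 2.
    """
    params = [
        ("format", "tsv"),
        ("dataset", "hep"),
        ("file", metric),
        ("by", "by-site"),
        ("size", size),
        ("tick", "hourly"),
        ("year", f"{day.year:04d}"),
        ("month", f"{day.month:02d}"),
        ("day", f"{day.day:02d}"),
        ("from", src),
        ("to", dst),
        ("ex", "none"),
    ]
    return PINGTABLE_URL + "?" + urlencode(params)


def daily_urls(last_day, num_days):
    """Return one URL per day for the num_days days ending on last_day."""
    first_day = last_day - timedelta(days=num_days - 1)
    return [pingtable_url(first_day + timedelta(days=i)) for i in range(num_days)]


if __name__ == "__main__":
    for url in daily_urls(date(2004, 2, 22), 3):
        # Each URL could then be fetched, e.g. with urllib.request.urlopen(url).
        print(url)
```

Each generated URL can be fetched with any HTTP client and the response saved as a .tsv file for that day.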
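
The by-site files described in option 3 (header line of hour numbers, then one line per source/destination pair) could be read with a short parser along the following lines. This is a minimal sketch: the names parse_by_site and to_float are hypothetical, and mapping non-numeric tokens to None is an assumption about how missing hours might appear, not something the format description states.

```python
def to_float(token):
    """Convert a token to float, or return None if it is not numeric.

    Assumption: hours without a measurement may hold a non-numeric token.
    """
    try:
        return float(token)
    except ValueError:
        return None


def parse_by_site(text):
    """Parse by-site file text into {(source, destination): {hour: value}}."""
    lines = [line for line in text.splitlines() if line.strip()]
    hours = [int(tok) for tok in lines[0].split()]  # header line: 0 1 2 ... 23
    table = {}
    for line in lines[1:]:
        tokens = line.split()
        src, dst = tokens[0], tokens[1]
        # Take one value per header hour; the host names repeated at the
        # end of the line are ignored.
        values = [to_float(tok) for tok in tokens[2:2 + len(hours)]]
        table[(src, dst)] = dict(zip(hours, values))
    return table


# Sample modelled on Figure 2 (24 hourly packet loss values of 0.000).
SAMPLE = (
    " ".join(str(h) for h in range(24)) + " \n"
    "pinger.slac.stanford.edu www.jinr.dubna.su "
    + " ".join(["0.000"] * 24)
    + " pinger.slac.stanford.edu www.jinr.dubna.su\n"
)

if __name__ == "__main__":
    for pair, by_hour in parse_by_site(SAMPLE).items():
        print(pair, by_hour[0], by_hour[23])
```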
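
The raw data line format listed in option 4 can be split into named fields as sketched below. The PingRecord class and parse_raw_line function are hypothetical names introduced for illustration. The handling of lines where no packets were received (rcvd of 0) is an assumption, since the format description does not say which fields are present in that case.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class PingRecord:
    src_name: str
    src_addr: str
    dst_name: str
    dst_addr: str
    size: int
    time: datetime            # converted from unix_epoch_time, in GMT/UTC
    sent: int
    rcvd: int
    min_rtt: Optional[float]
    avg_rtt: Optional[float]
    max_rtt: Optional[float]
    seqs: List[int]           # sequence numbers of received packets
    rtts: List[float]         # round-trip time of each received packet


def parse_raw_line(line):
    """Parse one whitespace-separated raw data line into a PingRecord."""
    tok = line.split()
    sent, rcvd = int(tok[6]), int(tok[7])
    rest = tok[8:]
    # Assumption: with rcvd == 0 the min/avg/max fields may be missing.
    if rcvd > 0 and len(rest) >= 3:
        min_rtt, avg_rtt, max_rtt = (float(x) for x in rest[:3])
    else:
        min_rtt = avg_rtt = max_rtt = None
    seqs = [int(x) for x in rest[3:3 + rcvd]]
    rtts = [float(x) for x in rest[3 + rcvd:3 + 2 * rcvd]]
    return PingRecord(
        src_name=tok[0], src_addr=tok[1], dst_name=tok[2], dst_addr=tok[3],
        size=int(tok[4]),
        time=datetime.fromtimestamp(int(tok[5]), tz=timezone.utc),
        sent=sent, rcvd=rcvd,
        min_rtt=min_rtt, avg_rtt=avg_rtt, max_rtt=max_rtt,
        seqs=seqs, rtts=rtts,
    )


# The example line from option 4.
EXAMPLE = (
    "pinger.slac.stanford.edu 134.79.240.30 multivac.sdsc.edu 132.249.20.57 "
    "100 1077235276 10 10 28.682 32.445 35.427 1 2 3 4 5 6 7 8 9 10 "
    "31.4 32.4 33.3 35.1 35.4 34.9 32.6 28.6 31.2 29.1"
)

if __name__ == "__main__":
    record = parse_raw_line(EXAMPLE)
    print(record.time.isoformat(), record.sent, record.rcvd, record.avg_rtt)
```

Converting unix_epoch_time with an explicit UTC time zone keeps the timestamps consistent with the note that all PingER times are in GMT.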

Revised 23 March 2007.
URL: http://www-iepm.slac.stanford.edu/pinger/mon-req.html
Comments to iepm-l@slac.stanford.edu