Restoring PingER Historical Data
Restoring Data
Recently, we have had to restore several portions of missing PingER data for
the pingtable.
Missing data shows up in pingtable.pl as complete columns that contain
only a period (.), i.e. no data.
Although restoring data is not a common practice, we have had to repeat the task
often enough to need documentation on how to carry it out.
This process essentially consists of running the analysis codes that are
typically run overnight via crontabs. You should NOT attempt to reanalyze the
existing data until you have corrected the problem that caused the gap in the data
to begin with; otherwise, reanalyzing the data is a waste of time. Unfortunately,
the problems that cause data gaps are too varied to be fully summarized here,
although, as problems do arise, they should be documented and linked to from this page.
Typically you may need to run getdata.pl to gather the data for the missing dates.
- All the analysis scripts are located in /afs/slac/package/netmon/pinger/analysis.
- The primary scripts used for analysis are analyze-all.pl, analyze-hourly.pl, analyze-daily.pl, and analyze-monthly.pl.
- The analyze-all.pl script is a wrapper script that runs analyze-hourly.pl for all values of packet size and site.
- analyze-all.pl executes analyze-hourly.pl for the number of days specified with the --date [nn]days option,
so you do not need to execute analyze-hourly.pl separately for each day that needs to be restored.
For example, with nn set to 5 it analyzes 5 days of data in a single run, which can save a lot of time.
# Example of executing analyze-all.pl:
analyze-all.pl --date [nn]days
- You MUST run analyze-all.pl first, followed by analyze-daily.pl, then finally analyze-monthly.pl, because each script feeds off the data
generated by the previous script. Running analyze-all.pl takes about 2.5 minutes per day of data for each combination of
packet size (100 bytes or 1000 bytes) and grouping (by-site or by-node), i.e. about 10 minutes per day of data in total.
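The required ordering can be sketched as a small shell wrapper that runs each stage in turn and stops at the first failure, so that a later stage never reads incomplete output. This is only an illustration under stated assumptions: run_stages and DRY_RUN are hypothetical names, the 5days value is just an example, and the arguments that real analyze-daily.pl and analyze-monthly.pl runs need are deliberately left out.

```shell
#!/bin/sh
# Sketch only: run_stages and DRY_RUN are illustrative names, not part of PingER.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

run_stages() {
    # Each argument is one complete command line, given in dependency order.
    for stage in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "would run: $stage"
        else
            # Stop at the first failure so later stages never read partial output.
            sh -c "$stage" || { echo "failed: $stage" >&2; return 1; }
        fi
    done
}

# Hourly first (via the wrapper), then daily, then monthly.
# Real daily/monthly runs need further arguments, which are omitted here.
run_stages \
    "analyze-all.pl --date 5days" \
    "analyze-daily.pl" \
    "analyze-monthly.pl"
```

Setting DRY_RUN=0 would execute the stages for real; keeping the dry run as the default makes it easy to review the sequence before anything is reanalyzed.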
- If needed, analyze-hourly.pl can be run directly to analyze the data for one specific day in the past.
Just include "--date 2004-05-16" to analyze the hourly data for May 16, 2004.
# Example of executing analyze-hourly.pl for May 16, 2004:
analyze-hourly.pl --basedir /nfs/slac/g/net/pinger --usemetric --dataset hep --date 2004-05-16
# Input data comes from file names of the form:
# /nfs/slac/g/net/iepm-bw/pingerdata/hep/data/[machinename]/ping-2004-05-16.txt.gz
# The format of these files is described in example 2 of
# http://www-iepm.slac.stanford.edu/pinger/tools/retrievedata.html
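Before reanalyzing a day, it can help to confirm that the raw input files for that day actually exist. The following is a minimal sketch, assuming the directory layout shown in the example above (one subdirectory per machine, each holding ping-<date>.txt.gz files); missing_inputs is a hypothetical helper name, not an existing PingER tool.

```shell
#!/bin/sh
# Sketch only: missing_inputs is an illustrative helper, not part of PingER.
# Usage: missing_inputs <data-dir> <yyyy-mm-dd>
# Prints the name of every machine directory under <data-dir> that has no
# ping-<date>.txt.gz file, i.e. no raw input for that day.
missing_inputs() {
    data_dir=$1
    day=$2
    for machine_dir in "$data_dir"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "$machine_dir" ] || continue
        if [ ! -f "${machine_dir}ping-${day}.txt.gz" ]; then
            basename "$machine_dir"
        fi
    done
}

# Example call, using the data directory named above:
# missing_inputs /nfs/slac/g/net/iepm-bw/pingerdata/hep/data 2004-05-16
```

Any machine listed by this check would presumably need its data gathered (for instance with getdata.pl) before analyze-hourly.pl can produce results for that day.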
- analyze-daily.pl and analyze-monthly.pl must each be run for packet sizes of 100 and 1000, and for both by-node and by-site data. analyze-daily.pl must also be run for the 60 and 120 day calculations. In total, analyze-monthly.pl needs to be run 4 times (2 packet sizes × 2 groupings), while analyze-daily.pl has to be run 12 times to account for the 60 and 120 day calculations as well.
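The run counts above can be checked by enumerating the combinations. The sketch below only prints labels and does not invoke any script; the label format is illustrative rather than real option syntax, and the "default" window is an assumption standing for the ordinary daily run alongside the 60 and 120 day calculations.

```shell
#!/bin/sh
# Sketch only: enumerates the run combinations described above without
# invoking any script. The labels are illustrative, not real option names.
list_monthly_runs() {
    for size in 100 1000; do
        for grouping in by-node by-site; do
            echo "analyze-monthly.pl: size=$size grouping=$grouping"
        done
    done
}

list_daily_runs() {
    # Assumption: "default" stands for the ordinary daily run; 60 and 120 are
    # the extra day-window calculations mentioned in the text.
    for window in default 60 120; do
        for size in 100 1000; do
            for grouping in by-node by-site; do
                echo "analyze-daily.pl: size=$size grouping=$grouping window=$window"
            done
        done
    done
}

list_monthly_runs
list_daily_runs
```

Using the printed list as a checklist makes it harder to forget one of the combinations when restoring data by hand.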
- analyze-all.pl runs serially (one analyze-hourly job after another). To submit all the jobs for a certain time dimension (hourly, daily, monthly, etc.) to the batch system, use the following scripts (in the same directory as the analyze-* scripts):
- submit_hourly_jobs.sh <yyyy-mm-dd>
- submit_daily_jobs.sh <yyyy-mm>
- submit_monthly_jobs.sh
- submit_allmonth_jobs.sh
- submit_allyears_jobs.sh
Be sure to specify the date (for hourly submissions) or the year and month (for daily submissions), as you would when running the analyze-* scripts individually, e.g.
- submit_hourly_jobs.sh 2004-05-16
- submit_daily_jobs.sh 2004-05
Note: the submit_* scripts must be run on a system licensed for the LSF batch tools, such as the Systems group's public machines.
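When several consecutive days are missing, the hourly submissions can be generated in a loop. The sketch below only prints the commands so that they can be reviewed first. It assumes GNU date (for the -d option); print_hourly_submits is a hypothetical helper name; and checking for bsub, the standard LSF submission command, is only a rough indication that the host is suitable, since whether the submit_* scripts call bsub directly is an assumption.

```shell
#!/bin/sh
# Sketch only: print_hourly_submits is an illustrative helper, not part of PingER.
# Assumes GNU date (for the -d option).
# Usage: print_hourly_submits <first-day yyyy-mm-dd> <number-of-days>
print_hourly_submits() {
    first_day=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        day=$(date -u -d "$first_day +$i days" +%Y-%m-%d) || return 1
        echo "submit_hourly_jobs.sh $day"
        i=$((i + 1))
    done
}

# bsub is the standard LSF submission command; finding it on PATH suggests
# (but does not prove) that this host can run the submit_* scripts.
if command -v bsub >/dev/null 2>&1; then
    echo "bsub found; this host looks usable for the submit_* scripts"
else
    echo "bsub not found; run the submit_* scripts on an LSF host" >&2
fi

print_hourly_submits 2004-05-16 3
```

The printed lines can be pasted into a shell on an LSF host once they look right.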
- The crontabs that normally run these scripts are located on pinger@pinger at "[HOMEDIR]/.trs/crontab" and can
be reviewed with trscrontab -l and edited with trscrontab -e. A recent copy of the trscrontab
is available here.
The trscrontab is executed each night after midnight, so once analyze-all.pl has been run
you can simply wait until the next morning for the other files to be updated.
- Once these lines have been executed, you should be able to see the restored data in the
pingtable.
Revised 9 April 2008
URL: http://www-iepm.slac.stanford.edu/pinger/mon-req.html
Comments to iepm-l@slac.stanford.edu