
In this report we explain how the Holt-Winters algorithm can detect loss of performance in time series of dynamic bottleneck capacity (dbcap) measured by the Available Bandwidth Estimation (ABwE) abing tool. We explain how the Holt-Winters algorithm works, and analyze several hosts to demonstrate that the method functions correctly.

The Holt-Winters algorithm is a quantitative forecasting method that uses recursive equations to predict trend behavior. It models a time series under the assumption that the future will follow the same pattern as the past. In our case the data show seasonal patterns with weekly periodicity, so the equations include a factor that uses information from past weeks to make a more accurate prediction of what is going to happen in the future. The Holt-Winters equations [1] are:

S_{t} = α (y_{t} / I_{t-L}) + (1 - α)(S_{t-1} + b_{t-1})

b_{t} = γ (S_{t} - S_{t-1}) + (1 - γ) b_{t-1}

I_{t} = β (y_{t} / S_{t}) + (1 - β) I_{t-L}

F_{t+m} = (S_{t} + m b_{t}) I_{t-L+m}

Where:

**y_{t}** is the observed value at time *t*; in this case, the dbcap measurement at that time.

**S _{t}** is the smoothed observation at time t.

**b _{t}** is the trend smoothing at time t.

**I_{t}** is the seasonal smoothing at time *t*. In the equations we can see that we always use the seasonal index from one full season back, I_{t-L}.

**L** is the number of periods that complete 1 season. If we are making
our measurements every 3 minutes and the season is 1 week long, *L= 3360* (there are 10080 minutes in one
week so just divide this number by 3).

**F_{t+m}** is the forecast/prediction *m* periods ahead.

**m** is the number of periods ahead we want to predict. If we are making our measurements every *n* minutes, *m* = 3 means that we are predicting *3n* minutes into the future.

**α** is the overall smoothing parameter.

**β** is the seasonal smoothing parameter.

**γ** is the trend smoothing parameter.

*But, what do these parameters mean?*

**α** is essentially the short-term parameter. A large value of α gives a large weight to measurements in the recent past, while a small value of α gives more weight to measurements further in the past.

**γ** is the trend parameter. A large value of γ gives more weight to the difference between the last smoothed observations, while a small value of γ uses information from further in the past.

**β** is the seasonal parameter. A large value of β gives a large weight to the present ratio between the observation and the smoothed observation, while a small value of β gives more weight to that ratio in past weeks.
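To make the recursion concrete, here is a minimal Python sketch of one update step and the forecast. The function and variable names are ours, not from the original implementation; initialization of the level, trend and seasonal indices is discussed next.

```python
# Sketch of the multiplicative Holt-Winters update described above.
# alpha = overall, beta = seasonal, gamma = trend smoothing parameter.

def holt_winters_step(y_t, S_prev, b_prev, I_prev_season, alpha, beta, gamma):
    """One update of the smoothed level S, trend b and seasonal index I.

    y_t           : observation at time t
    S_prev, b_prev: smoothed level and trend at t-1
    I_prev_season : seasonal index at t-L (same point one season earlier)
    """
    S_t = alpha * y_t / I_prev_season + (1 - alpha) * (S_prev + b_prev)
    b_t = gamma * (S_t - S_prev) + (1 - gamma) * b_prev
    I_t = beta * y_t / S_t + (1 - beta) * I_prev_season
    return S_t, b_t, I_t

def forecast(S_t, b_t, I_future_season, m):
    """F_{t+m} = (S_t + m*b_t) * I_{t-L+m}."""
    return (S_t + m * b_t) * I_future_season
```

With a flat series, zero trend and a unit seasonal index, the state is a fixed point of the update and the forecast simply reproduces the level.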

In the equations we can see that we need values corresponding to periods before the data begins, such as *S_{t-1}* and *b_{t-1}* at *t* = 1, so initial values must be chosen. We will just use *S_{0}* = 0.

b_{0} = (1/L) Σ_{k=1..L} (y_{L+k} - y_{k}) / L

This value of the trend factor *b* is essentially an average of the differences between the observations of the first two weeks, divided by the length of a week.

For k = 1, 2, 3, …, L:

I_{k} = (1/N) Σ_{j=1..N} y_{L(j-1)+k} / A_{j},   where A_{j} = (1/L) Σ_{i=1..L} y_{L(j-1)+i} is the average of week *j*

This value of the seasonal index averages, for position *k* of the week, the observed value at that position in each of the *N* weeks divided by the average of the observed values for that week.
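These two initializations can be sketched as follows, assuming `y` holds `N` complete weeks of `L` bins each (function names are illustrative):

```python
# Sketch of the Holt-Winters initializations described above.

def initial_trend(y, L):
    """b_0: average of the week-over-week differences of the first two
    weeks, divided by the season length L."""
    return sum((y[L + k] - y[k]) / L for k in range(L)) / L

def initial_seasonal_indices(y, L, N):
    """I_k: for each position k in the season, average over the N weeks of
    (observation at that position) / (that week's mean)."""
    week_means = [sum(y[j * L:(j + 1) * L]) / L for j in range(N)]
    return [sum(y[j * L + k] / week_means[j] for j in range(N)) / N
            for k in range(L)]
```

For a perfectly constant series the initial trend is zero and every seasonal index is 1, as expected.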

To find the parameters α, β and γ we use the minimum mean square error, where the error is the difference between the forecast and the observed value at time *t*. The minimization is:

min_{α,β,γ} (1/n) Σ_{t=1..n} (F_{t} - y_{t})^{2}

In every case we got a trend parameter γ equal to zero, and α and β generally take values from 0^{+} to 0.3.
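A coarse way to carry out this minimization is a grid search over the parameter triple. The sketch below is purely illustrative: `run_holt_winters` is a hypothetical helper that returns the one-step-ahead forecasts for a given parameter triple, and a real implementation might use a numerical optimizer instead of a grid.

```python
# Illustrative minimum-mean-square-error fit of (alpha, beta, gamma).
import itertools

def mse(forecasts, observed):
    pairs = list(zip(forecasts, observed))
    return sum((f - y) ** 2 for f, y in pairs) / len(pairs)

def fit_parameters(observed, run_holt_winters, grid=(0.0, 0.1, 0.2, 0.3)):
    """Return the (alpha, beta, gamma) triple on the grid whose one-step
    forecasts have minimum MSE against the observations."""
    best = None
    for alpha, beta, gamma in itertools.product(grid, repeat=3):
        err = mse(run_holt_winters(observed, alpha, beta, gamma), observed)
        if best is None or err < best[0]:
            best = (err, alpha, beta, gamma)
    return best[1:]
```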

This method requires regularly spaced data, which we don't have, so we sort the data into 3-minute bins. If there is more than one measurement in a particular bin we average the data points; but some bins may contain no data, and the Holt-Winters algorithm requires not only regularly spaced measurements but also data at every time point. To manage this problem, we first fill the first week so that it has measurements for all the 3-minute intervals: find the nearest week in the future such that, for each empty bin in the first week, some intermediate week has data corresponding to that bin, and fill each empty bin of the first week with the average of the corresponding data points from those intermediate weeks. For the following weeks, if data is missing we use the previous week to fill the hole.
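A simplified sketch of this gap-filling scheme (our own code, with illustrative names; it averages over all later weeks that have data for an empty first-week bin, rather than stopping at the nearest sufficient week):

```python
# Sketch of the gap-filling step: `weeks` is a list of equal-length lists of
# binned dbcap values, with 0 marking an empty bin.

def fill_gaps(weeks):
    first = weeks[0]
    for k, v in enumerate(first):
        if v == 0:                       # empty bin in the first week
            later = [w[k] for w in weeks[1:] if w[k] != 0]
            if later:                    # average the corresponding later bins
                first[k] = sum(later) / len(later)
    for j in range(1, len(weeks)):       # later weeks copy the previous week
        for k, v in enumerate(weeks[j]):
            if v == 0:
                weeks[j][k] = weeks[j - 1][k]
    return weeks
```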

This is an example of some measurements made.

```
#date       time      dbcap
06/20/2004  18:01:05  916.200
06/20/2004  18:07:39  966.800
06/20/2004  18:20:33  943.500
06/20/2004  18:25:14  1000.0
06/20/2004  18:27:04  700
```

We transform this data into this:

```
#date       time      dbcap
06/20/2004  18:01:05  916.200
06/20/2004  18:04:05  0
06/20/2004  18:07:05  966.800
06/20/2004  18:10:05  0
06/20/2004  18:13:05  0
06/20/2004  18:16:05  0
06/20/2004  18:19:05  943.500
06/20/2004  18:22:05  0
06/20/2004  18:25:05  850     (here we average 1000 and 700)
```
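The binning step can be sketched as below (a simplified illustration, assuming timestamps have been converted to seconds since the first measurement; empty bins get 0 and multiple points in a bin are averaged):

```python
# Sketch of the 3-minute binning transformation shown above.

def bin_measurements(points, bin_seconds=180):
    """points: list of (t_seconds, dbcap) pairs sorted by time.
    Returns one value per 3-minute bin; empty bins become 0."""
    if not points:
        return []
    t0 = points[0][0]
    n_bins = int((points[-1][0] - t0) // bin_seconds) + 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, v in points:
        i = int((t - t0) // bin_seconds)
        sums[i] += v
        counts[i] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

Run on the example measurements above (offsets in seconds from 18:01:05), this reproduces the binned series, including the 850 average for the last bin.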

In the following graph we can see the results for the SLAC to Caltech path: the moving averages of the measurements and of the forecast. The problem zone is marked with the red circle. We are currently looking for the best way to trigger alerts; in this case we use the standard deviation of the forecast. The condition is: if the residual (forecast minus observation) is bigger than the standard deviation in 70% of the points for the last 2.5 hours, we trigger an alarm.
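The trigger condition can be sketched as follows (our own illustrative code; 2.5 hours of 3-minute bins gives a 50-point window):

```python
# Sketch of the alert rule: alarm when the residual (forecast - observation)
# exceeds the forecast's standard deviation for at least 70% of the points
# in the last 2.5 hours (50 three-minute bins).

def should_alert(forecasts, observations, stddev, window=50, fraction=0.70):
    recent_f = forecasts[-window:]
    recent_o = observations[-window:]
    violations = sum(1 for f, y in zip(recent_f, recent_o)
                     if (f - y) > stddev)
    return violations >= fraction * len(recent_f)
```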

We can zoom in on the red circle to see what happens:

We can see that the alarm is triggered, so the loss of performance is detected.
Detailed results from all the nodes using this algorithm can be seen HERE.

In [2], Jake Brutlag presents an implementation of the Holt-Winters algorithm which is for the most part the same as the version discussed in the previous sections. The two techniques differ primarily in how they calculate the seasonal coefficients: Brutlag's implementation stores the seasonal contribution as an additive component of the forecast, while the implementation discussed above stores it as a proportion of the baseline, hence a multiplicative factor. Further, [2] also handles seasonal variability (e.g. application requests fluctuate wildly minute by minute during the peak hours of 4-8 pm, but at 1 am application requests hardly vary at all) via a deviation envelope (confidence band) around the forecast.

The measure of deviation is a weighted average absolute deviation, updated via exponential smoothing:

d_{t} = γ |y_{t} - F_{t}| + (1 - γ) d_{t-m}

Where:

**d_{t-m}** is the deviation from the last seasonal cycle (here *m* is the seasonal period),

**y_{t}** is the observed value at time *t*,

**F_{t}** is the forecast value,

**γ** is the seasonal adaptation parameter.

The confidence band is the collection of intervals for each time point y_{t} in the time series. A simple mechanism to detect an anomaly is to check whether an observed value falls outside the confidence band (a violation). However, this often yields a large number of false positives. A more robust mechanism is to use a moving window of a fixed number of observations and generate an alert if the number of violations within the window exceeds a specified threshold. Details on how to choose the model parameters (α, β, γ) and how to initialize and update the baseline, linear trend and seasonal trend can be found in [2].
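A sketch of the deviation update and the windowed violation count follows. The band width factor `delta` is an assumption on our part ([2] discusses suitable scaling factors for the deviation envelope), and the window is kept as a bounded deque:

```python
# Sketch of Brutlag-style aberrant behavior detection.
from collections import deque

def update_deviation(y_t, f_t, d_prev_cycle, gamma):
    """Smoothed deviation: d_t = gamma*|y_t - f_t| + (1 - gamma)*d_{t-m}."""
    return gamma * abs(y_t - f_t) + (1 - gamma) * d_prev_cycle

def detect(y_t, f_t, d_prev_cycle, violations, threshold=23, delta=2.0):
    """violations: deque(maxlen=28) of booleans, the moving window.
    Flags y_t outside the band f_t +/- delta*d_{t-m}, then alerts when the
    window holds more than `threshold` violations."""
    violations.append(abs(y_t - f_t) > delta * d_prev_cycle)
    return sum(violations) > threshold
```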

The results in this section were produced with the following model parameter guidelines:

**α**: observations in the last day account for 99% of the weight in the baseline calculation.

**β**: observations in the last day account for 50% of the weight in the linear trend calculation.

**γ**: observations in the last seven days account for 99% of the weight in the seasonal trend calculation.

The numerical values based on the above guidelines and the formulae in [2] are:

α = 0.00955

β = 0.001443

γ = 0.00137
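These values follow from the weight guideline: if the last *n* points are to carry a fraction *p* of the total exponential-smoothing weight, then (1 - param)^n = 1 - p. A quick check of our own (the bucket counts 480 per day and 3360 per week come from the 3-minute spacing):

```python
# Solve (1 - param)^n = 1 - p for the smoothing parameter, where n is the
# number of recent points and p the fraction of weight they should carry.

def smoothing_param(points, weight_fraction):
    return 1.0 - (1.0 - weight_fraction) ** (1.0 / points)

alpha = smoothing_param(480, 0.99)    # baseline: 99% of weight in last day
beta = smoothing_param(480, 0.50)     # trend: 50% of weight in last day
gamma = smoothing_param(3360, 0.99)   # seasonal: 99% of weight in last week
```

Rounded, these give α ≈ 0.00955, β ≈ 0.001443 and γ ≈ 0.00137, matching the values above.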

Each time point corresponds to a 3-minute bucket, and the seasonal cycle period is kept at 7 days (480 × 7 = 3360 time points). The window length is 28, with a threshold of 23 violations to trigger an alert. The original dataset has several spans of missing observations; used directly, these generate forecasts in which the seasonal cycle is out of sync with the observations, causing incorrect alerts. Hence the dataset is preprocessed as in the previous implementation to fill the gaps with estimates from the previous week, where the first week is initially filled to ensure no gaps.

**Florida Dataset**

The vertical black lines indicate the alerts triggered by the algorithm. In an expanded view (shown below) of the black line near week 32, we find a sharp dip in the dbcap value on a Sunday lasting about an hour, unlike anything the history dictates or the algorithm has experienced; hence it triggers an alert indicating anomalous behavior. The initial black lines have a similar cause: although the dips in those cases are smaller, the number of violations exceeds the threshold within the window of observations, causing an alert.

**Florida Dataset (around Weeks 32-33)**

We have already seen how the data in the Florida dataset behaves. The other monitored hosts show a variety of characteristic patterns:

**NSLAB:** nice periodicity.

**MCS:** interesting step change and loss of performance during a whole week.

**LSA:** clear abnormal behavior with no seasonal periodicity.

**JPAPAN:** very continuous data with some downward impulses.

**UTDALLAS:** impulses every 2 weeks and some random ones.

**TRIUMF:** step changes followed by very periodic behavior.

**SOX:** pronounced weekly periodic shape with different magnitudes, followed by an almost flat behavior.

[1] Engineering Statistics Handbook, 6.4.3.5 Triple Exponential Smoothing, Available at http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc435.htm

[2] Jake D. Brutlag, "Aberrant Behavior Detection in Time Series for Network Monitoring", in Proceedings of the Fourteenth USENIX System Administration Conference (LISA XIV), 2000.

__Questions or comments on this document? Please contact us:__

*Felipe Haro* felipeharo@gmail.com

*Mahesh Chhaparia* maheshkc@slac.stanford.edu

*Les Cottrell* cottrell@slac.stanford.edu