We now begin the description of our approach. Note that we could
choose a threshold level anywhere in the continuous range from zero to any
large positive number. As a first order of business, we reduce the continuum
of choices to a finite set of discrete choices for the threshold level. From a
trading perspective, discretizing the levels makes a lot of
sense. To convince ourselves of that, let us consider the situation where we
calculate the observed spread using the last traded price in both the securi-
ties. Even though the spread is calculated that way, we can typically expect
to buy one security at the offered price and sell the other at the bid price. The
implication is that we may have to be willing to give up the bid-ask spread
in both the stocks when putting on the spread. This is called slippage in the
process of trading. Now let us consider two candidate threshold levels that
are spaced very close to each other. Note that if the spacing between them is
less than slippage costs, then as far as trading goes, the two levels are virtu-
ally indistinguishable. Therefore, it does make sense to have the candidate
threshold levels spaced apart by at least the estimated slippage on trading.
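The discretization argument above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the function names, the `ratio` parameter, and the choice to estimate slippage as the sum of the two bid-ask spreads are all assumptions made for the example.

```python
def estimated_slippage(bid_a, ask_a, bid_b, ask_b, ratio=1.0):
    """Assumed estimate: bid-ask spread given up in A plus `ratio` times that in B."""
    return (ask_a - bid_a) + ratio * (ask_b - bid_b)


def candidate_thresholds(slippage, max_level):
    """Return levels slippage, 2*slippage, ... up to max_level (distances from the mean)."""
    if slippage <= 0:
        raise ValueError("slippage must be positive")
    # The small epsilon guards against floating-point round-off in the division.
    n_levels = int(max_level / slippage + 1e-9)
    return [k * slippage for k in range(1, n_levels + 1)]


# Hypothetical quotes for the two securities.
slip = estimated_slippage(bid_a=10.0, ask_a=10.5, bid_b=20.0, ask_b=20.25)
levels = candidate_thresholds(slip, max_level=3.0)
print(slip, levels)
```

Spacing the grid by the slippage estimate means no two candidate levels are closer together than the cost of trading, so each level is distinguishable in practice.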
Having established the candidate threshold levels that extend symmet-
rically above and below the mean value of the spread series, we are now
ready to start the process. We begin with a simple count of the number of
times the spread exceeds a particular threshold. When the threshold is above
the mean, this is the number of times the spread is greater than the
threshold. Similarly, when the threshold is below the mean, it is the number of
times the spread value is below the threshold. This
counting method mimics the trading style where we put on a spread position
whenever we observe that the threshold has been exceeded and liquidate the
position when we hit the mean value. If we go with the assumption that the
spread moves are symmetric about its long run mean value, we can margin-
ally improve the estimate and reduce the bias by averaging the frequency
count for the positive and negative values for the same absolute value for the
threshold.
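The counting step, together with the symmetric averaging, might look like the sketch below. The function name and the convention of expressing the threshold as a distance from the mean are assumptions made for illustration.

```python
from statistics import fmean


def symmetric_count(spread, level, mean=None):
    """Average of the counts above mean + level and below mean - level."""
    if mean is None:
        mean = fmean(spread)  # use the sample mean when no long-run mean is given
    above = sum(1 for x in spread if x - mean > level)
    below = sum(1 for x in spread if x - mean < -level)
    return 0.5 * (above + below)


# Small hand-checkable example: three values above +0.5, two below -0.5.
sample = [-2, -1, 0, 1, 2, 3]
print(symmetric_count(sample, level=0.5, mean=0.0))
```

Dividing the averaged count by the number of observations gives a frequency that can serve as a rough probability estimate for that level.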
This count for each threshold level can then be multiplied by the profit
value corresponding to the threshold level to obtain the raw profit function.
Given that the data come from a small sample, we can expect this function to be
noisy. Figure 8.2a is the plot of the probability estimates from a simple
count of the number of level crossings. The underlying spread series is a
white noise sample with 75 data points. Figure 8.2b is the plot of the profit
made for trading the spread at a particular threshold, which is equal to the
threshold value itself. Figure 8.2b is therefore a straight line with slope 1.
The raw profit profile is a product of the two values for a given threshold.
Figure 8.3 is a plot of the raw profit profile for the white noise sample
with 75 data points, as shown. One can see how noisy and jagged the curve
is. If the raw curve were used as is, it would be difficult to draw
any meaningful conclusion about where exactly the thresholds should be placed.
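To make the construction concrete, here is a rough sketch that builds a raw profit profile for a simulated white noise sample of 75 points. The seed, the unit variance, the grid of levels, and the normalization of the count by the sample size are assumptions for the example; it does not reproduce the data behind the figures.

```python
import random
from statistics import fmean


def raw_profit_profile(spread, levels):
    """For each level return (level, probability estimate, level * probability)."""
    mean = fmean(spread)
    n = len(spread)
    profile = []
    for level in levels:
        above = sum(1 for x in spread if x - mean > level)
        below = sum(1 for x in spread if x - mean < -level)
        prob = 0.5 * (above + below) / n  # symmetric average of the two frequencies
        # Profit per trade is taken to equal the threshold value itself.
        profile.append((level, prob, level * prob))
    return profile


rng = random.Random(0)  # fixed seed so the run is repeatable
spread = [rng.gauss(0.0, 1.0) for _ in range(75)]  # white noise, 75 points
levels = [0.1 * k for k in range(1, 31)]  # assumed grid of candidate levels
profile = raw_profit_profile(spread, levels)

for level, prob, profit in profile:
    print(f"{level:4.1f}  {prob:6.3f}  {profit:6.3f}")
```

Printing the third column against the first shows the jaggedness described above: with so few observations, neighboring levels can give noticeably different raw profit values.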