many different names. Methods that use a log-like function for the cost
measure are popularly called maximum entropy methods, or MEM methods
for short. If the cost measure is the sum of the squared differences between
adjacent points in the estimated function, then it is called Tikhonov-Miller
regularization.
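To make the two styles of cost measure concrete, here is a minimal sketch in Python; the function names and the NumPy usage are our own choices for illustration, not anything specified in the text:

```python
import numpy as np

def entropy_penalty(z):
    """MEM-style, log-based cost: the negative entropy of the
    (positive) estimate, after normalizing it to a distribution."""
    z = np.asarray(z, dtype=float)
    p = z / z.sum()
    return np.sum(p * np.log(p))

def roughness_penalty(z):
    """Tikhonov-Miller-style cost: the sum of squared differences
    between adjacent points of the estimate."""
    z = np.asarray(z, dtype=float)
    return np.sum(np.diff(z) ** 2)
```

A perfectly flat estimate has zero roughness penalty, while the entropy penalty is smallest (most negative) for a flat, maximally spread-out estimate.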
So, the question now turns to what kind of regularization we should use
in our case; that is, what property of the function should we try to capture
in the regularizing cost measure? In addition to being monotonically de-
creasing, the other property that we can expect from the function is that it
is smooth. We could therefore use the Tikhonov-Miller regularization to en-
sure smoothness of the resulting estimate. We now describe the Tikhonov-
Miller process.
We are given data points (x1, y1), (x2, y2), (x3, y3), ..., (xn, yn). The xi
series in the data set refers to the threshold values, and the yi series are the
counts corresponding to them. If this data set is representative of the actual
function, then we should use the values of y as is in the final function. However,
the data are from a single sample set and therefore contain peculiarities
unique to this sample, leading to a step-function form for the counts.
We expect the curve to be a smooth monotonic decreasing function. As
mentioned earlier, we introduce a penalty term for the roughness of the
curve. This penalty term is the sum of squared differences between adjacent
points of the estimated function. The cost function to minimize is now a
weighted sum of the two cost measures as shown next.
Let z1, z2, z3, ..., zn represent the estimated function at the points
x1, x2, x3, ..., xn, respectively. The cost function with the two terms is then
cost = (y1 - z1)^2 + (y2 - z2)^2 + ... + (yn - zn)^2
       + λ[(z1 - z2)^2 + (z2 - z3)^2 + ... + (z(n-1) - zn)^2]     (8.1)
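As a concrete check on the weighted-sum cost just described, here is a minimal sketch in Python; the function name and the use of NumPy are our own, not from the text:

```python
import numpy as np

def tikhonov_miller_cost(y, z, lam):
    """Cost of equation (8.1): least-squares fit error plus
    lam (the trade-off factor lambda) times the roughness penalty.

    y   -- observed counts at the threshold values
    z   -- candidate estimate of the smooth function at the same points
    lam -- trade-off factor lambda
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    fit_error = np.sum((y - z) ** 2)     # sum of squared residuals
    roughness = np.sum(np.diff(z) ** 2)  # squared adjacent differences
    return fit_error + lam * roughness
```

With lam = 0 the cost reduces to the plain least-squares fit error; raising lam charges progressively more for any step-like wiggle in z.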
It is easy to recognize that the first part is a least-squared cost measure, and
the second part is the penalty for roughness of the curve. Note that this cost
measure is multiplied by a term λ. This is the trade-off factor, and it is a
measure of how much fit error we are willing to allow in order to reduce the
smoothness cost by one unit. The problem of Tikhonov-Miller regularization
is to minimize this function with an appropriate value of λ. Note that
the choice of λ is crucial, as it determines the trade-off between smoothness
and fit error and, in turn, the final shape of the function. So, how do we
determine the correct value for λ and the resulting regularized function? We
will describe the process by way of example.
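For a fixed λ, the minimization itself has a closed form: setting the gradient of (8.1) to zero gives the linear system (I + λ DᵀD) z = y, where D is the (n−1) × n first-difference matrix. This derivation is standard linear algebra rather than something spelled out in the text, and the sketch below (our own naming, using NumPy) is one way to carry it out:

```python
import numpy as np

def regularized_estimate(y, lam):
    """Minimize the cost in equation (8.1) for a fixed lambda.

    Solves (I + lam * D.T @ D) z = y, where D is the first-difference
    matrix, i.e. (D @ z)[i] = z[i+1] - z[i].
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    D = np.diff(np.eye(n), axis=0)      # (n-1) x n difference matrix
    A = np.eye(n) + lam * (D.T @ D)
    return np.linalg.solve(A, y)
```

At λ = 0 the estimate reproduces the data exactly; as λ grows, the estimate flattens toward the mean of the counts, which is the trade-off the text describes.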
Let us consider a simulated white noise sample that we generated. We
design the threshold values and do a basic count of the number of times the
132 STATISTICAL ARBITRAGE PAIRS