working with 0 and 1 data, the number of 1s is simply the sum of x[j] among
those days, which we can conveniently obtain as follows:
sum(x[i:(i+(k-1))])
The use of sum() and vector indexing allows us to do this computation
compactly, avoiding the need to write a loop, so it’s simpler and faster. This
is typical R.
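As a small illustration, the vectorized sum can be checked against an explicit loop. The data in x and the values of i and k below are made up for this sketch and are not taken from the text.

```r
# Hypothetical 0/1 data, for illustration only
x <- c(1, 0, 0, 1, 1, 1, 0, 1)
i <- 2   # first day of the window (assumed value)
k <- 3   # window length (assumed value)

# Vectorized: count the 1s among days i through i+k-1
vec_count <- sum(x[i:(i+(k-1))])

# Equivalent explicit loop, shown only for comparison
loop_count <- 0
for (j in i:(i+k-1)) loop_count <- loop_count + x[j]

print(vec_count)
print(loop_count)
```

Both approaches count the same 1s; the vectorized form simply leaves the looping to sum().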
The same is true for this expression, on line 9:
mean(abs(pred-x[(k+1):n]))
Here, pred contains the predicted values, while x[(k+1):n] has the actual
values for the days in question. Subtracting the second from the first gives us
values of either 0, 1, or −1. Here, 1 or −1 correspond to prediction errors
in one direction or the other, predicting 0 when the true value was 1 or vice
versa. Taking absolute values with abs(), we have 0s and 1s, the latter
corresponding to errors.
So we now know which days gave us errors. It remains to calculate the
proportion of errors. We do this by applying mean(), where we are exploiting
the mathematical fact that the mean of 0 and 1 data is the proportion of 1s.
This is a common R trick.
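The steps just described can be traced on a tiny example. The vectors pred and actual below are hypothetical values chosen for illustration; actual stands in for x[(k+1):n].

```r
# Hypothetical predictions and true values, for illustration only
pred   <- c(1, 0, 1, 1, 0)
actual <- c(1, 1, 0, 1, 0)

diffs <- pred - actual   # each element is 0, 1, or -1
errs  <- abs(diffs)      # 1 marks an error, 0 a correct prediction

# The mean of 0/1 data is the proportion of 1s, here the error rate
err_rate <- mean(errs)

print(errs)
print(err_rate)
```

Here two of the five predictions are wrong, so the mean of errs equals sum(errs)/length(errs), the proportion of errors.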
The above coding of our preda() function is fairly straightforward, and
it has the advantage of simplicity and compactness. However, it is probably
slow. We could try to speed it up by vectorizing the loop, as discussed
in Section 2.6. However, that would not address the major obstacle to speed
here, which is all of the duplicate computation that the code does. For
successive values of i in the loop, sum() is being called on vectors that differ by
only two elements. Except for cases in which k is very small, this could really
slow things down.
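A rough count of the elements touched makes the duplication concrete. The values of n and k below are assumptions picked for illustration, and the count ignores the cost of comparisons and assignments.

```r
# Assumed sizes, for illustration only
n <- 1000   # number of days
k <- 100    # window length

# Summing from scratch: each of the n-k windows adds up k elements
from_scratch <- (n - k) * k

# Updating instead: one initial sum of k elements, then each of the
# remaining n-k-1 windows touches only two elements (one added, one dropped)
updating <- k + 2 * (n - k - 1)

print(from_scratch)
print(updating)
```

Under these assumed sizes, recomputing each sum touches far more elements than updating the previous sum would, and the gap widens as k grows.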
So, let’s rewrite the code to take advantage of previous computation. In
each iteration of the loop, we will update the previous sum we found, rather
than compute the new sum from scratch.
1  predb <- function(x,k) {
2     n <- length(x)
3     k2 <- k/2
4     pred <- vector(length=n-k)
5     sm <- sum(x[1:k])
6     if (sm >= k2) pred[1] <- 1 else pred[1] <- 0
7     if (n-k >= 2) {
8        for (i in 2:(n-k)) {
9           sm <- sm + x[i+k-1] - x[i-1]
10          if (sm >= k2) pred[i] <- 1 else pred[i] <- 0
11       }
12    }
13    return(mean(abs(pred-x[(k+1):n])))
14 }
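The updating idea can be checked in isolation by computing every window sum both ways and comparing the results. This is only a sketch on made-up data; the names scratch and running are hypothetical and do not appear in the functions above.

```r
# Hypothetical 0/1 data, for illustration only
x <- c(1, 0, 0, 1, 1, 1, 0, 1, 0, 0)
k <- 3
n <- length(x)

# Window sums computed from scratch for each i
scratch <- sapply(1:(n-k), function(i) sum(x[i:(i+k-1)]))

# Window sums computed by updating the previous sum:
# add the element entering the window, subtract the one leaving it
running <- numeric(n-k)
running[1] <- sum(x[1:k])
for (i in 2:(n-k)) running[i] <- running[i-1] + x[i+k-1] - x[i-1]

print(scratch)
print(running)
print(all(scratch == running))
```

Because the two vectors of sums agree, thresholding either one against k/2 yields the same predictions.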