The Art of R Programming

   for (i in 1:(n-k)) {
      if (sum(x[i:(i+(k-1))]) >= k2) pred[i] <- 1 else pred[i] <- 0
   }
   return(mean(abs(pred-x[(k+1):n])))
}

predb <- function(x,k) {
   n <- length(x)
   k2 <- k/2
   pred <- vector(length=n-k)
   sm <- sum(x[1:k])
   if (sm >= k2) pred[1] <- 1 else pred[1] <- 0
   if (n-k >= 2) {
      for (i in 2:(n-k)) {
         sm <- sm + x[i+k-1] - x[i-1]
         if (sm >= k2) pred[i] <- 1 else pred[i] <- 0
      }
   }
   return(mean(abs(pred-x[(k+1):n])))
}

Since the latter avoids duplicate computation, we speculated it would be faster. Now is the time to check that.
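Before timing, it may help to confirm that the incremental update is valid. The following is a minimal sketch, not code from the text: the vector, the window size, the seed, and the names direct and running are all illustrative assumptions. It compares window sums recomputed from scratch against sums maintained by adding the entering element and subtracting the leaving one.

```r
# Illustrative check (assumed example, not from the original text):
# the running sum should equal a freshly computed window sum everywhere.
set.seed(1)                              # arbitrary seed, for reproducibility
x <- sample(0:1, 20, replace = TRUE)
k <- 5
n <- length(x)

# Direct approach: re-add all k elements of every window.
direct <- sapply(1:(n - k), function(i) sum(x[i:(i + k - 1)]))

# Incremental approach: add the element entering the window,
# subtract the one leaving it.
running <- numeric(n - k)
running[1] <- sum(x[1:k])
for (i in 2:(n - k)) {
  running[i] <- running[i - 1] + x[i + k - 1] - x[i - 1]
}

stopifnot(all(direct == running))
```

The direct version does about k additions per window, while the incremental version does two, which is the reason to expect a speedup for large k.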

> y <- sample(0:1,100000,replace=T)
> system.time(preda(y,1000))
   user  system elapsed
  3.816   0.016   3.873
> system.time(predb(y,1000))
   user  system elapsed
  1.392   0.008   1.427

Hey, not bad! That’s quite an improvement. However, you should always ask whether R already has a fine-tuned function that will suit your needs. Since we’re basically computing a moving average, we might try the filter() function, with a constant coefficient vector, as follows:

predc <- function(x,k) {
   n <- length(x)
   f <- filter(x,rep(1,k),sides=1)[k:(n-1)]
   k2 <- k/2
   pred <- as.integer(f >= k2)
   return(mean(abs(pred-x[(k+1):n])))
}
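To see what the filter() call contributes, here is a small sketch on made-up data (the vector x, the value of k, and the names f_all, trailing, and direct are assumptions for illustration). With sides=1 and a coefficient vector of k ones, each output element is the sum of the current element and the k-1 before it, so the first k-1 outputs are NA and elements k through n-1 are the window sums needed for prediction. The call is written as stats::filter() in case another attached package masks filter().

```r
# Made-up example data; not taken from the original text.
x <- c(1, 0, 1, 1, 0, 1, 0, 0)
k <- 3
n <- length(x)

# One-sided filter with k unit coefficients: a trailing sum of width k.
f_all <- stats::filter(x, rep(1, k), sides = 1)

# The first k - 1 positions have no full window behind them.
stopifnot(all(is.na(f_all[1:(k - 1)])))

# Positions k..(n-1) hold the sums of windows 1..(n-k).
trailing <- as.numeric(f_all[k:(n - 1)])
direct <- sapply(1:(n - k), function(i) sum(x[i:(i + k - 1)]))

stopifnot(all(trailing == direct))
```

The subscript k:(n-1) drops both the leading NA values and the final window, which has no following element to predict.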
