Advanced Rails - Building Industrial-Strength Web Apps in Record Time


This gives us predictable results:

samples.sum    # => 53.0
samples.length # => 5
samples.mean   # => 10.6

Everyone is familiar with the mean, but the problem is that by itself, the mean is
nearly worthless for describing a data set. Consider these two sets of samples:

samples1 = %w(10 11 12 10 10 9 12 10 9 9).map{|x| x.to_f}
samples2 = %w( 2 11 6 14 20 21 3 4 8 13).map{|x| x.to_f}

These two data sets in fact have the same mean, 10.2. But they clearly represent
wildly different performance profiles, as can be seen from their graph (see
Figure 6-1).

We need a new statistic to measure how much the data varies from the mean. That
statistic is the standard deviation. The standard deviation of a sample is calculated by
taking the root mean square deviation from the sample mean. In Ruby, it looks like this:

module Enumerable
  def population_stdev
    Math.sqrt( map{|x| (x - mean) ** 2}.mean )
  end
end

This code maps over the collection, taking the square of the deviation of each
element from the mean. It then takes the mean of those squared values, and takes the
square root of the mean, yielding the standard deviation.
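As a rough illustration, the same calculation can be run against the two data sets from earlier. This is only a sketch: the mean helper defined here is an assumption standing in for whatever definition the chapter uses, and the mean is cached in a local variable so that it is not recomputed for every element.

```ruby
module Enumerable
  # Assumed helper: arithmetic mean of a numeric collection.
  def mean
    inject(0.0) { |acc, x| acc + x } / count
  end

  # Root mean square deviation from the mean.
  def population_stdev
    m = mean # compute once rather than once per element
    Math.sqrt(map { |x| (x - m) ** 2 }.mean)
  end
end

samples1 = %w(10 11 12 10 10 9 12 10 9 9).map { |x| x.to_f }
samples2 = %w( 2 11 6 14 20 21 3 4 8 13).map { |x| x.to_f }

puts samples1.population_stdev.round(2) # prints 1.08
puts samples2.population_stdev.round(2) # prints 6.45
```

Although the means are identical, the second data set's standard deviation is roughly six times that of the first, which matches the difference in spread between the two profiles.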

However, this is only half the story. What has been introduced so far is the
population standard deviation, while what we really want is the sample standard
deviation. Without completely diving into the relevant mathematics, the basic
difference between the two is whether the data represent an entire population or only a
portion of it.
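The conventional way to account for this distinction is to divide the sum of squared deviations by n - 1 instead of n when the data are only a sample (often called Bessel's correction). The following is a minimal sketch under that assumption; the method name sample_stdev and the mean helper are hypothetical and not taken from the text.

```ruby
module Enumerable
  # Assumed helper: arithmetic mean of a numeric collection.
  def mean
    inject(0.0) { |acc, x| acc + x } / count
  end

  # Hypothetical name: standard deviation using the n - 1 divisor.
  # Not meaningful for collections with fewer than two elements.
  def sample_stdev
    m = mean
    sum_of_squares = inject(0.0) { |acc, x| acc + (x - m) ** 2 }
    Math.sqrt(sum_of_squares / (count - 1))
  end
end

samples1 = %w(10 11 12 10 10 9 12 10 9 9).map { |x| x.to_f }
samples2 = %w( 2 11 6 14 20 21 3 4 8 13).map { |x| x.to_f }

puts samples1.sample_stdev.round(3) # prints 1.135
puts samples2.sample_stdev.round(3) # prints 6.795
```

Because the divisor is smaller, the sample figure is always slightly larger than the population figure for the same data, and the gap narrows as the number of samples grows.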

Figure 6-1. Two vastly different response-time profiles with the same mean

[Figure 6-1 plots: samples1 and samples2]
