Let’s plot this on a graph, where we’ll make the x coordinate the number of
customers (we call this the explanatory variable) while the number of staff is
plotted as the y coordinate (called the response variable). It is the number of
customers that explains the number of staff needed and not the other way
around. The average number of customers per store is plotted as 6 (x is measured
in thousands of customers a month, so this means 6000 customers) and the average
number of staff per store is 40. The regression
line always passes through the ‘average point’, here (6, 40). There are formulae
for calculating the regression line, the line which best fits the data (also known as
the least squares line). In our case the line is ŷ = 20.8 + 3.2x, so the slope is
3.2, which is positive (the line rises from left to right). The line crosses the
vertical y axis at 20.8, the intercept. Substituting x = 6 gives
20.8 + 3.2 × 6 = 40, confirming that the line passes through the average point.
The term ŷ is the estimate of the y value obtained from
the line. So if we want to know how many staff should be employed in a store
that receives 5000 customers a month, we can substitute x = 5 into the
regression equation and obtain ŷ = 20.8 + 3.2 × 5 = 36.8, or about 37 staff.
This shows that regression has a very practical purpose.
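
The substitution can also be sketched in a few lines of Python. This is only an
illustration: the intercept and slope are the values quoted in the text, while
the function name predict_staff and the choice to pass customers in thousands
are assumptions made for the example.

```python
# Coefficients quoted in the text: y-hat = 20.8 + 3.2x
INTERCEPT = 20.8
SLOPE = 3.2


def predict_staff(customers_thousands):
    """Estimate staff needed for a store, given monthly customers in thousands.

    Hypothetical helper for illustration; it simply evaluates the regression line.
    """
    return INTERCEPT + SLOPE * customers_thousands


# The line should pass through the average point (6, 40).
at_average = predict_staff(6)

# Estimate for a store receiving 5000 customers a month (x = 5).
estimate = predict_staff(5)

print(round(at_average, 1))  # value of the line at the average point
print(round(estimate, 1))    # estimate before rounding to whole staff
print(round(estimate))       # estimate rounded to whole staff
```

Rounding to a whole number of staff happens only at the end, so the same
function can be reused for any other store size within the range of the data.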