Discrete Mathematics for Computer Science

(Romina) #1
Complexity of Programs 299

(T(100)), and suppose you need to use this information to estimate how much time P
will take on a data set of size 10,000. (That is, suppose you need to estimate T(10,000)).
Can you do so?
One method is to find a function that passes through the given points and then evaluate
this function at other points. We call such a function an approximating curve. A simple
solution is the following: Plot the two points, (10, T(10)) and (100, T(100)) on a graph,
draw a straight line through them, and extend the line over larger values of x. See where
the straight line crosses the line x = 10,000, as shown in Figure 5.7. Use this value for
T(10,000).


Time
Consumed

10 100 5000 10,000
Time

Figure 5.7 Linear extrapolation of time.


This approximation makes a totally unjustified assumption. For it to be reasonable to
approximate the function T with a straight line, that time function must be a linear function
(or be fairly close to one at all points), and there is no reason to assume that T is linear.
You could improve the approximation a bit by measuring time for three sizes of data sets
(say, 10, 100, and 1000) and choosing the parabola that passes through those three points
as the approximating curve. However, that approximation would only be reasonable if you
had reason to believe the function was either quadratic or linear. For example, for some
computer programs, T will be an exponential function of the size of the data set. If that is
the case, then no amount of curve fitting with polynomial functions would give reasonable
grounds for extrapolation such as this.
The ideas presented in this section will provide tools for one to determine that the
time consumption of an algorithm is approximately linear, approximately quadratic, ap-
proximately exponential, or whatever. After coming to such a general conclusion, you can
(if you are very careful) legitimately use tools such as the curve-fitting procedure just de-
scribed to approximate times.
We are not saying that actual timing statistics are unimportant. We know this can be a
very important way to compare algorithms. All we are saying is that timing statistics are
usually far from enough.

Free download pdf