Part IX: Business Intelligence
TABLE 57-2 (continued)
Problem Type        Primary Algorithms
Forecasting         Time Series
Sequence Analysis   Sequence Clustering

These are guidelines only, because not every data mining problem falls into these categories. In addition, there may be other algorithms that you can apply to the listed problem types.

Decision Trees
The decision trees algorithm is the most accurate for many problems. It operates by building a decision tree beginning with the All node, which corresponds to all the training cases, as shown in Figure 57-3. An attribute is then chosen to split those cases into groups, each of which is split again based on another attribute, and so on. The goal is to generate leaf nodes with a single predictable outcome. For example, if the goal is to identify who will purchase a bike, then each leaf node should contain cases that are either all bike buyers or all non-buyers, with no mixture of the two (or as close to that goal as possible).

FIGURE 57-3
An example decision tree.
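The splitting process described above can be sketched in a few lines of Python. This is a generic, entropy-based illustration of the idea only, not the algorithm as Analysis Services implements it. The attribute names (commute, owns_car), the training cases, and every function name are hypothetical, chosen to match the bike buyer example.

```python
from collections import Counter
from math import log2

def entropy(cases, target):
    """Shannon entropy of the target column; 0 means the group is pure."""
    counts = Counter(case[target] for case in cases)
    total = len(cases)
    return -sum((n / total) * log2(n / total) for n in counts.values())

def split(cases, attribute):
    """Group cases by their value for one attribute."""
    groups = {}
    for case in cases:
        groups.setdefault(case[attribute], []).append(case)
    return groups

def best_attribute(cases, attributes, target):
    """Pick the attribute whose split leaves the lowest weighted entropy."""
    def weighted_entropy(attribute):
        groups = split(cases, attribute).values()
        return sum(len(g) / len(cases) * entropy(g, target) for g in groups)
    return min(attributes, key=weighted_entropy)

def build_tree(cases, attributes, target):
    """Return a leaf (the majority outcome) or a nested node dictionary."""
    outcomes = Counter(case[target] for case in cases)
    # Stop when the node is pure or no attributes remain to split on.
    if len(outcomes) == 1 or not attributes:
        return outcomes.most_common(1)[0][0]
    attribute = best_attribute(cases, attributes, target)
    remaining = [a for a in attributes if a != attribute]
    return {
        "split_on": attribute,
        "children": {
            value: build_tree(group, remaining, target)
            for value, group in split(cases, attribute).items()
        },
    }

# Hypothetical training cases; the full list plays the role of the All node.
cases = [
    {"commute": "short", "owns_car": "no", "bike_buyer": "yes"},
    {"commute": "short", "owns_car": "yes", "bike_buyer": "yes"},
    {"commute": "long", "owns_car": "no", "bike_buyer": "yes"},
    {"commute": "long", "owns_car": "yes", "bike_buyer": "no"},
    {"commute": "short", "owns_car": "yes", "bike_buyer": "yes"},
]
tree = build_tree(cases, ["commute", "owns_car"], "bike_buyer")
print(tree)
```

With these five cases, the sketch splits first on commute, because the short-commute group is already a pure leaf of bike buyers. It then splits the long-commute group on owns_car, which leaves every leaf node with a single outcome.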