Microsoft® SQL Server® 2012 Bible




Part IX: Business Intelligence


Problem Type         Primary Algorithms
Forecasting          Time Series
Sequence Analysis    Sequence Clustering

These are guidelines only, because not every data mining problem falls into these
categories. In addition, other algorithms may also apply to the listed problem types.

Decision Trees
The decision trees algorithm is the most accurate for many problems. It builds a decision
tree beginning with the All node, which corresponds to all the training cases, as shown in
Figure 57-3. An attribute is then chosen to split those cases into groups, each group is
split again on another attribute, and so on. The goal is to generate leaf nodes with a
single predictable outcome. For example, if the goal is to identify who will purchase a
bike, then each leaf node should contain cases that are either all bike buyers or all
non-buyers, with no mixture of the two (or as close to that goal as possible).
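
The splitting process described above can be illustrated with a minimal sketch in Python. This is not the Microsoft Decision Trees implementation: it assumes a generic entropy-based split criterion, and the attribute names (commute, owns_car, bike_buyer) and the sample cases are hypothetical, invented only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Impurity of a group: 0 when every case has the same outcome."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def group_by(cases, attr):
    """Partition cases by the value each one has for attr."""
    groups = {}
    for case in cases:
        groups.setdefault(case[attr], []).append(case)
    return groups


def best_attribute(cases, attributes, target):
    """Pick the attribute whose split leaves the purest groups (assumed criterion)."""
    def weighted_entropy(attr):
        groups = group_by(cases, attr).values()
        return sum(len(g) / len(cases) * entropy([c[target] for c in g]) for g in groups)
    return min(attributes, key=weighted_entropy)


def build_tree(cases, attributes, target):
    labels = [case[target] for case in cases]
    # Leaf node: a single outcome, or no attributes left to split on.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    attr = best_attribute(cases, attributes, target)
    remaining = [a for a in attributes if a != attr]
    return {attr: {value: build_tree(group, remaining, target)
                   for value, group in group_by(cases, attr).items()}}


# Hypothetical training cases; the full list plays the role of the All node.
cases = [
    {"commute": "short", "owns_car": "no",  "bike_buyer": True},
    {"commute": "short", "owns_car": "yes", "bike_buyer": True},
    {"commute": "short", "owns_car": "yes", "bike_buyer": True},
    {"commute": "long",  "owns_car": "no",  "bike_buyer": True},
    {"commute": "long",  "owns_car": "yes", "bike_buyer": False},
    {"commute": "long",  "owns_car": "yes", "bike_buyer": False},
]

tree = build_tree(cases, ["commute", "owns_car"], "bike_buyer")
print(tree)
```

In this sketch the first split is on commute, because it isolates a group made up entirely of bike buyers. The remaining mixed group is then split on owns_car, so every leaf ends up holding a single outcome.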

FIGURE 57-3
This is a great example of the decision tree being implemented.

TABLE 57-2 (continued)


