Microsoft® SQL Server® 2012 Bible



Part IX: Business Intelligence


■ Key: Choose the columns that uniquely identify a row in the training data.
■ Input: Mark each column to use in a prediction; generally this includes the
predictable columns as well. The Suggest button may aid in selection after
the predictable columns have been identified. Avoid inputs with values that are
unlikely to occur again as input to a trained model. For example, an address
might be effective at training a model, but a model built to look for a
specific address is unlikely to match the values of any new customer.
■ Predictable: Identify all columns the model should predict.
■ Specify Columns’ Content and Data Type: Adjust the data type (Boolean, Date,
Double, Long, or Text) as needed. The Detect button calculates continuous versus
discrete content types for numeric data. Available content types include the
following:
    ■ Key: Contains a value that, either alone or with other keys, uniquely identifies a
    row in the training table.
    ■ Key Sequence: Acts as a key and provides order to the rows in a table. It is used
    to order rows for the sequence clustering algorithm.
    ■ Key Time: Acts as a key and provides order to the rows in a table based on a
    time scale. It is used to order rows for the time series algorithm.
    ■ Continuous: Continuous numeric data, often the result of some calculation or
    measurement, such as age, height, or price.
    ■ Discrete: Data that can be thought of as a choice from a list, such as occupation,
    model, or shipping method.
    ■ Discretized: Analysis Services transforms a continuous column into a set of
    discrete buckets, such as ages 0–10, 11–20, and so on.
    ■ Ordered: Defines an ordering on the training data, but without assigning
    significance to the values used to order. For example, if values of 5 and 10 are used
    to order two rows, 10 simply comes after 5; it is not “twice as good” as 5.
    ■ Cyclical: Similar to ordered, but repeats values defined by a cycle in the data,
    such as day of month or month of quarter.
■ Create Testing Set: In SQL Server 2012, the mining structure can hold both the
training and testing data directly, instead of using separate tables. Specify the
percentage or number of rows to be held out for testing models.
■ Completing the Wizard: Provide names for the overall mining structure and
the first mining model within that structure. Select Allow Drill Thru to enable the
direct examination of training cases from within the data mining viewers.
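
The Discretized content type in the list above can be illustrated with a short Python sketch. It is only an illustration of the bucket idea: the function name `age_bucket` and the fixed bucket width are assumptions made for this example, and Analysis Services chooses its own bucket boundaries rather than using this rule.

```python
def age_bucket(age: int, width: int = 10) -> str:
    """Map a non-negative age to a bucket label such as '0-10' or '11-20'.

    Hypothetical helper for illustration only; it mirrors the example
    buckets in the text, not the boundaries Analysis Services computes.
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    if age <= width:
        # The first bucket also contains zero, so it is one value wider.
        return f"0-{width}"
    # Later buckets cover (k * width + 1) through ((k + 1) * width).
    k = (age - 1) // width
    return f"{k * width + 1}-{(k + 1) * width}"


if __name__ == "__main__":
    for sample in (0, 10, 11, 20, 21):
        print(sample, "->", age_bucket(sample))
```

Once a column is bucketed this way, a model treats each label as one choice from a list, exactly as it would treat a Discrete column.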
When the wizard finishes, the new mining structure with a single mining model is created,
and the new structure opens in the Data Mining Designer where you can make changes,
such as adding columns, to the model.
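
The holdout idea behind the Create Testing Set step can be sketched in Python as follows. This is a minimal illustration, not how Analysis Services implements it: the function `split_holdout`, its parameters, and the seeded shuffle are assumptions made for the example, as is the choice to apply the smaller limit when both a percentage and a row count are given.

```python
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_holdout(
    rows: Sequence[T],
    percent: Optional[float] = None,
    max_rows: Optional[int] = None,
    seed: int = 0,
) -> Tuple[List[T], List[T]]:
    """Split rows into (training, testing) by percentage and/or row count.

    Hypothetical helper: when both limits are supplied, the smaller
    resulting test-set size is used (an assumption of this sketch).
    """
    if percent is None and max_rows is None:
        raise ValueError("specify percent, max_rows, or both")
    n = len(rows)
    limits = []
    if percent is not None:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        limits.append(int(n * percent / 100))
    if max_rows is not None:
        if max_rows < 0:
            raise ValueError("max_rows must be non-negative")
        limits.append(min(max_rows, n))
    test_size = min(limits)

    # Shuffle row positions with a fixed seed so the split is repeatable.
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    test_idx = set(indices[:test_size])

    training = [row for i, row in enumerate(rows) if i not in test_idx]
    testing = [row for i, row in enumerate(rows) if i in test_idx]
    return training, testing


if __name__ == "__main__":
    train, test = split_holdout(list(range(10)), percent=30)
    print(len(train), "training rows,", len(test), "testing rows")
```

Keeping both sets in one structure, as the wizard does, means every model built on that structure is trained and tested against the same partition of the data.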