Microsoft® SQL Server® 2012 Bible



Chapter 57: Data Mining with Analysis Services




■ Analysis Management Objects (AMO) provides an environment for creating and
managing mining structures and other metadata, but not for prediction queries.
■ The Data Mining Extensions (DMX) language supports most model creation and
training tasks and has a robust prediction query capability. DMX can be sent to
Analysis Services via the following:
    ■ ADOMD.NET for managed (.NET) languages
    ■ OLE DB for C++ code
    ■ ADO for other languages
DMX is a SQL-like language modified to accommodate mining structures and tasks. For
purposes of performing prediction queries against a trained model, the primary language
feature is the prediction join. Because the DMX query is issued against the Analysis
Services database, the models can be directly referenced. DMX also adds a number of
mining-specific functions such as the Predict and PredictProbability functions that
return the most likely outcome and the probability of that outcome, respectively.
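As a minimal sketch of a prediction join, the following DMX query scores every row of a relational table against a trained model. The model name [TM Decision Tree], the data source [Adventure Works DW], the table dbo.ProspectiveBuyer, and all column names are assumptions chosen for illustration; substitute the names from your own mining structure and case table.

```sql
-- Hypothetical names throughout: [TM Decision Tree] is an assumed trained model,
-- [Adventure Works DW] an assumed data source defined in the Analysis Services database.
SELECT
    t.[ProspectKey],
    Predict([Bike Buyer])            AS PredictedBuyer,   -- most likely outcome
    PredictProbability([Bike Buyer]) AS Probability       -- probability of that outcome
FROM
    [TM Decision Tree]
PREDICTION JOIN
    OPENQUERY([Adventure Works DW],
        'SELECT ProspectKey, CommuteDistance, NumberCarsOwned, TotalChildren
         FROM dbo.ProspectiveBuyer') AS t
ON
    -- Map each model input column to the matching source column
    [TM Decision Tree].[Commute Distance]  = t.[CommuteDistance] AND
    [TM Decision Tree].[Number Cars Owned] = t.[NumberCarsOwned] AND
    [TM Decision Tree].[Total Children]    = t.[TotalChildren]
```

The ON clause is what distinguishes a prediction join from a relational join: it maps source columns to model inputs rather than matching rows on key values.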
Another useful form of the prediction join is a singleton query, whereby data is provided
directly by the application instead of read from a relational table. One example would be to
return the probability that a particular case will perform an action, for instance, the
probability that [Bike Buyer] = 1, to gauge whether a customer is likely to buy a bike.
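A singleton query along these lines might look like the following sketch. The model name [TM Decision Tree], the input column names, and the literal values are assumptions for illustration only. NATURAL PREDICTION JOIN matches the supplied columns to model columns by name, so no ON clause is needed.

```sql
-- Hypothetical model and column names; the literal values stand in for
-- data supplied by the application for a single customer.
SELECT
    Predict([Bike Buyer])               AS PredictedBuyer,
    PredictProbability([Bike Buyer], 1) AS ProbabilityOfBuying  -- probability that [Bike Buyer] = 1
FROM
    [TM Decision Tree]
NATURAL PREDICTION JOIN
    (SELECT 35           AS [Age],
            '5-10 Miles' AS [Commute Distance],
            2            AS [Number Cars Owned]) AS t
```

Passing the target state as the second argument to PredictProbability returns the probability of that specific state, rather than the probability of whichever state the model considers most likely.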
The Business Intelligence Development Studio aids in the construction of DMX queries via
the Query Builder within the mining model prediction view. As with the Mining Accuracy
Chart, select the model and case table to be queried, or alternatively press the singleton
button in the toolbar to specify values directly. Specify SELECT columns and prediction
functions in the grid at the bottom. SQL Server Management Studio also offers a DMX query
type with metadata panes for drag-and-drop access to mining structure column names and
prediction functions.


Algorithms


When working with data mining, it is useful to understand mining algorithm basics and
when to apply each algorithm. Table 57-2 summarizes common algorithms used for the
problem categories presented in this chapter’s introduction.

TABLE 57-2 Common Mining Algorithm Usage

Problem Type      Primary Algorithms
----------------  ----------------------------------------------------------------------
Segmentation      Clustering, Sequence Clustering
Classification    Decision Trees, Naive Bayes, Neural Network, Logistic Regression
Association       Association Rules, Decision Trees
Estimation        Decision Trees, Linear Regression, Logistic Regression, Neural Network