Microsoft® SQL Server® 2012 Bible

1182

Part IX: Business Intelligence

Transform Defi nition/Use

Percentage Sampling Splits a data fl ow by randomly sampling the rows for a given percentage. For example, this could be used to separate a data set into training and testing sets for data mining. Pivot Denormalizes a data fl ow, similar to the way an Excel pivot table oper- ates, making attribute values into columns. Row Count Counts the number of rows in a data fl ow and places the result into a variable. Confi gure by populating the VariableName property.

Row Sampling Nearly identical to the Percentage Sampling transform, except that the approximate number of rows to be sampled is entered, rather than the percentage of rows. Script Component Using a script as a transformation enables transformations with com- plex logic to act on a data fl ow. Slowly Changing Dimension

Compares the data in a data fl ow to a dimension table, and, based on the roles assigned to particular columns, maintains the dimension. This component is unusual in that it does not have an editor; instead, a wizard guides you through the steps to defi ne column roles and interactions with the dimension table. At the conclusion of the wizard, several components are placed on the design surface to accomplish the dimension maintenance task. Sort Sorts the rows in a data fl ow by selected columns. Confi gure by selecting the columns to sort by. Then, in the lower pane, choose the sort type, the sort order, and the comparison fl ags appropriate to the data being sorted. This is a fully blocking transform; it can adversely affect performance. Term Extraction Builds a new data fl ow based on terms it fi nds in a Unicode text column (DT_WSTR or DT_NTEXT). This is the training part of text mining, whereby strings of a particular type generate a list of commonly used terms, which is later used by the Term Lookup component to identify similar strings. Term Lookup Provides a “join” between a Unicode text column (DT_WSTR or DT_ NTEXT) in the data fl ow and a reference table of terms built by the Term Extraction component. Union All Combines rows from multiple data fl ows into a single data fl ow, assuming the source columns are of compatible types. Confi gure by connecting as many data fl ows as needed to the component. Then, using the editor, ensure that the correct columns from each data fl ow are mapped to the appropriate output column. Unpivot Makes a data fl ow more normalized by turning columns into attribute values. For example, a data fl ow with one row for each quarter and a column for revenue by region could be turned into a three-column data fl ow: Quarter, Region, and Revenue.

TABLE 52- 4 (continued)

c52.indd 1182c52.indd 1182 7/31/2012 10:29:30 AM7/31/2012 10:29:30 AM

http://www.it-ebooks.info

Microsoft® SQL Server® 2012 Bible

Get our desktop app

Company

Features

Documentation

Resources