Microsoft® SQL Server® 2012 Bible

(Ben Green) #1

1147


Chapter 51: Business Intelligence Database Design


51


Instead of placing the product, productcategory, and productsubcategory into one
dimension, each individual table was included in the data warehouse schema. This
approach does add some slight complexities when building an OLAP cube. Therefore, careful
consideration should be taken before implementing this modeling approach.

Ensuring Consistency within a Data Warehouse


Inconsistent data can quickly void any confi dence that end users have garnered when
working with any data system. Because one of the primary reasons for designing a star
schema is to provide faster query response time when performing OLAP queries, it is pivotal
that the results are accurate and consistent. To ensure this level of accuracy and consis-
tency, data warehouse developers typically institute two basic design rules during the load
process.

The fi rst rule is to ensure that you are properly handling null or invalid values. This is
usually accomplished by replacing these values with values that are more meaningful
to an end user. The decision as to what these values are should not be determined by a
single person or group. This is an effort that should be decided upon by the organiza-
tion as a whole because the data warehouse will likely be used by a broad audience.
This results in a consistent interpretation across the entire organization. A common
practice is to insert rows into dimensions that satisfy as many inconsistent or invalid
data scenarios. The values are usually replaced with values such as Unknown, NA,
Internal, and so on.

The second rule relates to the fact table. After data has been inserted into the fact
table, it should not be updated or deleted. This could lead to varying results from
day to day. If this is a requirement, you should consider creating a separate table to
maintain these transient transactions. You should also inform the users of the level
of volatility of this data. This can establish user expectations about data and query
results for particular data sets and reduce the likelihood of inconsistent data for other
data sets.

Loading Data


When loading a data warehouse, it typically starts with loading the dimensions and then
the facts. The obvious reason is to satisfy the fact tables’ foreign key constraints. In the
SQL Server world, the Extraction, Transformation and Loading (ETL) is accomplished using
SQL Server Integration Services (SSIS). Chapter 52, “Building, Deploying, and Managing
ETL Workfl ows in Integration Services,” provides a detailed explanation of how to do this.
Figure 51-3 shows a simple dimension load with SSIS, and Figure 51-4 illustrates a fact load.
This chapter focuses on loading the data with pure Transact-SQL.

c51.indd 1147c51.indd 1147 7/31/2012 10:28:42 AM7/31/2012 10:28:42 AM


http://www.it-ebooks.info
Free download pdf