Microsoft® SQL Server® 2012 Bible

(Ben Green) #1

1144


Part IX: Business Intelligence


that acts as the source for OLAP. The result of coupling these two is typically a cube, which
Chapter 53, “Building Multidimensional Cubes in Analysis Services with MDX,” discusses.

This chapter defi nes key concepts for data warehousing and techniques used to build and
load a data warehouse.

Data Warehousing


A data warehouse is the structural foundation of most OLAP and BI solutions. Data
warehouses, in most cases, are less normalized than the OLTP databases that act as its
sources. A typical data warehouse contains two types of tables: facts and dimensions. Fact
tables contain the measurable values, such as sales, number of sales, and count of items, to
mention a few. The values are often referred to as measures; dimensions categorize or group
those measurable values. As SQL Server has evolved, so has its capability to build OLAP
solutions on database structures that are not data warehouses. However, using a data ware-
house as the data source has several benefi ts and offers the developer a seamless process
for design and data refresh.

Designing a Data Warehouse Using a Star Schema


Data warehouse design technique is referred to as a Star Schema. As mentioned earlier, a
data warehouse is composed of facts and dimensions. The layout of these tables resembles a
star, as shown in Figure 51-1.

You see from the diagram that the one fact is central to multiple dimensions. Each
dimension is representative of a way that the measures in the fact can be grouped,
aggregated, or categorized. For example, you could use the SalesAmount column from
the FactInternetSales table and aggregate it by Customer to see the total sales for each
customer.

A fact table usually consists of two types of columns: The surrogate key column, which is
defi ned in the Surrogate Key section of this chapter, and the facts (or measures) column.

The dimension tables, on the other hand, contains a primary key (surrogate key), and
alternate key, and one or more attributes. The alternate key is not always present in the
dimension. It is, in most cases, the primary key for each row from the operational system.
The attributes are the data that categorizes the dimensions. For example, a student dimen-
sion may include attributes for name, address, city, state, and e-mail address. Dimensions
in many instances are denormalized versions of the data structures in the OLTP databases.

The most diffi cult decision that most organizations face is determining which dimensions
and facts to use in their data warehouse. The information technology group often tries to
make this decision without the involvement of the business, which is a critical mistake.

c51.indd 1144c51.indd 1144 7/31/2012 10:28:41 AM7/31/2012 10:28:41 AM


http://www.it-ebooks.info
Free download pdf