
The link in this case carries the key matches (from and to). This type of link structure can
be utilized to connect master key selections or to explain the key mapping/changing from
one source system to another. It can also be utilized to represent multilevel hierarchies
(not shown here).
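
As a rough illustration, a link that carries from/to key matches could be sketched as below. This is a minimal sketch, not the book's schema: the class name SameAsLink, its fields, the key values, and the record source are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SameAsLink:
    """One row of a hypothetical link that maps one business key to another."""
    from_key: str        # key as it appears in the originating system
    to_key: str          # key it matches or changes to in the other system
    record_source: str   # where the mapping came from (assumed attribute)
    load_date: date      # when the mapping was loaded (assumed attribute)


# Illustrative rows only; none of these values come from the source text.
links = [
    SameAsLink("CRM-1001", "ERP-A77", "key_mapping.xlsx", date(2024, 1, 5)),
    SameAsLink("CRM-1002", "ERP-B12", "key_mapping.xlsx", date(2024, 1, 5)),
]


def map_key(key, link_rows):
    """Return every key that `key` is mapped to by the link rows."""
    return [row.to_key for row in link_rows if row.from_key == key]


mapped = map_key("CRM-1001", links)
```

Because each row stores both ends of the match, the same structure can be read in either direction, which is what makes it usable for key mapping between source systems.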


Note that importing the Excel spreadsheet shows the first step toward managed
self-service BI (managed SSBI). Managed SSBI is the next step in the evolution of data
warehousing: it allows business users to interact with the raw data sets in the
warehouse and to affect their own information marts by changing the data.


The data vault model not only provides immediate business value but also is capable of
tracking all relationships over time. It demonstrates the different hierarchies of data
that are possible when loading into the warehouse (even though this example is highly
focused on two particular business keys at the moment).


By tracking the changes to business keys, which exposes the relationships across and
between business keys, the business can begin to ask and answer the following
questions:



  • How long does my customer account stay in sales before it is passed to procurement?

  • Can I compare an AS-SOLD image with an AS-CONTRACTED image and an
    AS-MANUFACTURED image with an AS-FINANCED image?

  • How many customers do I actually have?

  • How long does it take for a customer/product/service to make it from initial sale to final delivery in
    my business?


Many of these questions cannot be answered without a consistent business key that spans
the different lines of business.
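
To make the point concrete, the question "How many customers do I actually have?" can be sketched as grouping keys that are known to match. This is a minimal sketch under assumptions: the function name, the key values, and the use of a union-find grouping are illustrative choices, not a method described in the text.

```python
def count_distinct_entities(keys, same_as_pairs):
    """Count groups of keys, treating each (a, b) pair as 'a is the same as b'."""
    parent = {k: k for k in keys}

    def find(k):
        # Follow parent pointers to the group root, shortening the path as we go.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for a, b in same_as_pairs:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(a), find(b)
        parent[root_a] = root_b  # merge the two groups

    return len({find(k) for k in parent})


# Hypothetical keys from three lines of business.
keys = ["SALES-1", "SALES-2", "PROC-9", "FIN-4"]
pairs = [("SALES-1", "PROC-9"), ("PROC-9", "FIN-4")]
customer_count = count_distinct_entities(keys, pairs)
```

Without the key matches, the four keys would be counted as four customers; with them, SALES-1, PROC-9, and FIN-4 collapse into one.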


Why Restructure the Data From the Staging Area?


Restructuring allows integration across multiple systems into a single place in the target
data warehouse without changing the data set itself (i.e., no conformity). This is called
passive integration. Data are considered passively integrated by business key because
the raw data are not changed; they are integrated by location alone (e.g., all
individual customer account numbers exist in the same hub, while all corporate
customer account numbers exist in a different hub).
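
Passive integration can be sketched as a hub load that collects business keys from several sources without altering them. This is a minimal sketch, assuming a hub is modeled as a dictionary keyed by the raw business key; the function name, the source names, the key values, and the record_source and load_ts attributes are hypothetical.

```python
from datetime import datetime, timezone


def load_hub(hub, staged_keys, record_source, load_ts):
    """Add business keys not yet in the hub; never change a key or an existing row."""
    for key in staged_keys:
        if key not in hub:
            hub[key] = {"record_source": record_source, "load_ts": load_ts}
    return hub


# One hub per kind of business key, here individual customer account numbers.
hub_individual_customer = {}
load_ts = datetime(2024, 1, 5, tzinfo=timezone.utc)

# Two hypothetical source systems deliver overlapping account numbers.
load_hub(hub_individual_customer, ["AC-100", "AC-200"], "sales", load_ts)
load_hub(hub_individual_customer, ["AC-200", "AC-300"], "procurement", load_ts)
```

The overlapping key AC-200 lands once, and the raw key values are stored exactly as delivered, so integration happens by where the keys are placed rather than by conforming the data.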


In the age of big data, staging areas are also known as landing zones, data dumps, or data
junkyards. Staging areas are a logical concept that can manifest themselves physically in


Chapter 6.2: Introduction to Data Vault Modeling