
and RDBMS engines on demand. The claim is not that this will be fast, only that it can be
accomplished easily.


Deeper analysis of this subject is covered in Data Vault 2.0 boot camp training courses
and in Data Vault 2.0 published materials; it is beyond the scope of this book.


Business Keys


Business keys have been around for as long as there have been data in operational
applications. Business keys should be smart or intelligent keys and should be mapped to
business concepts. That said, most business keys today are source system surrogate IDs,
and they exhibit the same problems as the sequences mentioned above.


A smart or intelligent key is generally defined as a sum of components where digits or
pieces of a single field contain meaning to the business. At Lockheed Martin, for
example, a part number consisted of several pieces (it was a superkey of sorts). The part
key included the make, model, revision, and year of the part, like a vehicle identification
number (VIN) found on automobiles today.
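To make the idea of a smart key concrete, the following is a minimal sketch of decomposing such a key into its meaningful components. The key layout (MAKE-MODEL-REVISION-YEAR separated by hyphens), the sample value, and the names PartKey and parse_part_key are all assumptions made for illustration; they are not the actual Lockheed Martin part number format.

```python
from typing import NamedTuple


class PartKey(NamedTuple):
    """Components of a hypothetical smart part key."""
    make: str
    model: str
    revision: str
    year: int


def parse_part_key(key: str) -> PartKey:
    """Split a key of the assumed form MAKE-MODEL-REVISION-YEAR."""
    parts = key.strip().upper().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected 4 components, got {len(parts)}: {key!r}")
    make, model, revision, year = parts
    if not (year.isdigit() and len(year) == 4):
        raise ValueError(f"year component must be 4 digits: {year!r}")
    return PartKey(make, model, revision, int(year))


# Made-up sample value; each piece of the single field carries business meaning.
part = parse_part_key("ACME-X100-B-2015")
print(part.make, part.model, part.revision, part.year)
```

Because every component is embedded in the key itself, the business can read meaning directly from the value, which a surrogate ID cannot offer.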


The benefits of a smart or intelligent key stretch far beyond the simple surrogate or
sequence business key. These business keys usually exhibit the following positive
behavior at the business level:



  • They hold the same value for the life of the data set.

  • They do not change when the data are transferred between and across business OLTP applications.

  • They are not editable by the business (most of the time) in the source system application.

  • They can be considered master data keys.

  • They cross business processes and provide ultimate data traceability.

  • Their largest benefit: they allow parallel loading (as hashes do) and also work as keys for
    geographically distributed data sets, without needing recomputation or lookups.
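The last point in the list above can be illustrated with a small sketch. It assumes two sites that load overlapping customer records independently; the record layout, the customer_no field, and the helper names are hypothetical. Sequences assigned locally depend on load order, so the same customer can receive a different surrogate at each site, whereas the business key is identical everywhere and the data sets merge without any lookup.

```python
# Hypothetical records loaded independently at two sites.
site_a = [
    {"customer_no": "CUST-001", "name": "Ada"},
    {"customer_no": "CUST-002", "name": "Grace"},
]
site_b = [
    {"customer_no": "CUST-002", "name": "Grace"},
    {"customer_no": "CUST-003", "name": "Edsger"},
]


def assign_sequences(rows):
    """Assign a local surrogate sequence in arrival order (site-specific)."""
    return {row["customer_no"]: seq for seq, row in enumerate(rows, start=1)}


def key_by_business_key(rows):
    """Index rows by the business key itself; no counter or lookup is needed."""
    return {row["customer_no"]: row for row in rows}


seq_a = assign_sequences(site_a)
seq_b = assign_sequences(site_b)
# The same customer has a different surrogate at each site, so combining the
# sites would require a reconciliation lookup.
print(seq_a["CUST-002"], seq_b["CUST-002"])

# With business keys, each site loads in parallel and the results merge directly.
merged = {**key_by_business_key(site_a), **key_by_business_key(site_b)}
print(sorted(merged))
```

A hash computed from the business key would behave the same way in this sketch, since it is also derived from the key value rather than from load order.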


They do have three drawbacks: (a) length: smart business keys are generally long and can
exceed 40 characters; (b) meaning over time: the base definition can change every 5–15
years or so (just look at how the VIN has evolved over the last 100 years); and (c)
mutability: source applications can sometimes change the business keys, which wreaks
havoc on any analytics that depend on them.


If given the choice between surrogate sequences, hashes, and natural business keys,
natural business keys would be the preference. The original definition (even today) states


Chapter 6.2: Introduction to Data Vault Modeling