The Management of Large Amounts of Data
Managing large amounts of data is a demanding problem in its own right. The handling of
large volumes of data is a discipline unto itself.
Managing the volume of data does not, however, remove the need to create an architecture
for that data.
Active/Passive Indexing of Data
One of the most useful design techniques available to the architect is creating different
kinds of indexes on the data. An index helps find data: locating data through an index is
almost always faster than searching the data directly. Indexes therefore have their place
in analytic processing.
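The difference between searching the data directly and locating it through an index can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the text: the sample records, the field names, and the helper functions `scan`, `build_index`, and `lookup` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"id": 1, "customer": "acme", "amount": 120},
    {"id": 2, "customer": "globex", "amount": 75},
    {"id": 3, "customer": "acme", "amount": 30},
]


def scan(records, field, value):
    """Search the data directly: every record is examined."""
    return [r for r in records if r[field] == value]


def build_index(records, field):
    """Map each value of `field` to the positions of the records that hold it."""
    index = defaultdict(list)
    for pos, record in enumerate(records):
        index[record[field]].append(pos)
    return dict(index)


def lookup(records, index, value):
    """Locate records through the index; records with other values are never touched."""
    return [records[pos] for pos in index.get(value, [])]


customer_index = build_index(records, "customer")

# Both routes return the same records; only the amount of work differs.
assert lookup(records, customer_index, "acme") == scan(records, "customer", "acme")
```

The scan does work proportional to the number of records on every query, while the index pays that cost once when it is built and then answers each lookup by going straight to the matching positions.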
Most indexes are built by starting with a user requirement to access data and then
building an index to satisfy that requirement. An index built in this manner can be called
an active index because there is an expectation that it will be actively used.
But there is another type of index that can be built: the passive index. With a passive
index, there is no user requirement to start with. Instead, the index is built “just in
case” somebody in the future wants to access the data according to how the data are
organized. Because there is no active requirement behind the building of the index, it is
called a “passive” index.
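One way to picture the distinction is the following Python sketch. It assumes that a "user requirement" can be reduced to a list of field names someone has asked to search by; that reduction, the sample data, and the function names `index_by` and `build_indexes` are hypothetical and serve only as illustration.

```python
def index_by(records, field):
    """Map each value of `field` to the positions of the records that hold it."""
    index = {}
    for pos, record in enumerate(records):
        if field in record:
            index.setdefault(record[field], []).append(pos)
    return index


def build_indexes(records, required_fields):
    """Return (active, passive) indexes, each keyed by field name.

    Active indexes cover the fields named in a known user requirement.
    Passive indexes cover every other field found in the data, built
    "just in case" someone later wants to access the data that way.
    """
    all_fields = sorted({field for record in records for field in record})
    active = {f: index_by(records, f) for f in required_fields}
    passive = {f: index_by(records, f) for f in all_fields if f not in required_fields}
    return active, passive


# Hypothetical repetitive records; only "sensor" has a stated access requirement.
readings = [
    {"sensor": "s1", "hour": 9, "status": "ok"},
    {"sensor": "s2", "hour": 9, "status": "fault"},
    {"sensor": "s1", "hour": 10, "status": "ok"},
]

active, passive = build_indexes(readings, required_fields=["sensor"])

print(sorted(active))   # fields indexed because a requirement exists
print(sorted(passive))  # fields indexed without any requirement
```

In this sketch the two kinds of index are structurally identical. What separates them is only the reason they were built, which matches the definition above: the passive indexes on the remaining fields exist before anyone has asked for them.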
Fig. 9.2.8 shows both active and passive indexes that can be built.
Chapter 9.2: Analyzing Repetitive Data