Chapter 4 • The Data Resource 105
related to meaningful data tend not to be desirable
because they are not stable. For example, a customer
identification number based on geographical region
and standard industrial classification (SIC) code will
no longer be valid if a customer moves or changes
primary businesses. Thus, it is wise to design a
meaningless, sequentially assigned code as the iden-
tifier and use such data as geographical location and
SIC code as other descriptive data.
2. Naming Distinct and meaningful names must be
given to each kind of data retained in organizational
databases. If two data elements have the same name,
their meaning will be confusing to users. If the same
data element is referred to by different names that are
never associated, business managers will think that
these are different pieces of data. Many organizations
develop a naming scheme or template for constructing
all data names, with common terms to be used for dif-
ferent elements of the scheme. For example, a data
name of employee-monthly-pay indicates which enti-
ty, which time period, and which type of data. Each of
the three components of this data name would be lim-
ited to a restricted vocabulary; for example, the time
period would have values such as daily and weekly,
and abbreviations for each could be assigned.
Standard names make naming new data elements eas-
ier and give a user a quick start on knowing what data
are on a report or in a certain database.
3. Definition Each data entity and element is given a
description that clarifies its meaning. The definition
should apply to all business circumstances and users.
Terms such as customer, employee, and product might,
surprisingly, not have universal meaning. For example,
does customer refer to someone who has bought from
you or any potential consumer of your products or
services? Over the years, different parts of the business
might have developed their own interpretation of such
terms, so definitions must be constructed through
review by a broad range of organizational units.
4. Integrity rules The permissible range or set of
values must be clear for each data element. These
integrity rules add to the meaning of data conveyed
by data definitions and names. For example, a data
element of region is probably limited to some set of
valid values based upon sales territories or some
other artificial construct. In addition, a central and
single standard for valid values can be used by those
developing all data capture applications to detect
mistakes. Also, because exceptions might be per-
mitted, the integrity rules might specify who can
authorize deviations or under what circumstances
values outside of the valid set can be authorized.
5. Usage rights These standards prescribe who can
do what and when to each type of data. Such securi-
ty standards state the permissible uses for every type
of data (e.g., whole databases, individual files in a
database, particular records, or data elements in a
file). For example, a business manager might be
restricted to retrieving only the employee-monthly-pay
data element, only during regular business hours,
from an authorized device, and only about herself
and those people she supervises.
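Two of the standards above, naming and integrity rules, can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the vocabularies, the region values, the "data steward" role, and the helper names `build_data_name` and `check_region` are hypothetical and do not come from any particular product.

```python
from typing import Optional

# Hypothetical restricted vocabularies for a three-part naming template:
# entity, time period, and type of data.
ENTITIES = {"employee", "customer", "product"}
PERIODS = {"daily", "weekly", "monthly"}
DATA_TYPES = {"pay", "sales", "hours"}


def build_data_name(entity: str, period: str, data_type: str) -> str:
    """Return a standard data name such as 'employee-monthly-pay'."""
    for value, vocabulary, part in (
        (entity, ENTITIES, "entity"),
        (period, PERIODS, "time period"),
        (data_type, DATA_TYPES, "type of data"),
    ):
        if value not in vocabulary:
            raise ValueError(f"{value!r} is not an approved {part}")
    return f"{entity}-{period}-{data_type}"


# Hypothetical integrity rule: region is limited to a set of valid values,
# and only named roles may authorize a value outside that set.
VALID_REGIONS = {"north", "south", "east", "west"}
EXCEPTION_AUTHORIZERS = {"data steward"}


def check_region(value: str, authorized_by: Optional[str] = None) -> bool:
    """Accept a valid region, or an exception approved by an authorizer."""
    if value in VALID_REGIONS:
        return True
    return authorized_by in EXCEPTION_AUTHORIZERS
```

Because every data capture application would call the same `check_region` rule, a mistyped region is rejected in the same way everywhere, while a deliberate exception still needs a named authorizer.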
These data standards should be retained in a standards
database called a metadata repository or data
dictionary/directory (DD/D). This central repository of
data about data helps users learn more about organization-
al databases. Database management systems should also
use the DD/D to access and authorize use of data.
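As a rough illustration of how a DD/D entry might carry usage rights that software can check, consider the following Python sketch. The `DictionaryEntry` and `UsageRight` structures, the `is_permitted` function, the business hours, and the wording of the definition are all hypothetical; a real repository would store far more about each data element.

```python
from dataclasses import dataclass, field
from datetime import time


@dataclass
class UsageRight:
    """Who may do what, and when, to one data element (hypothetical layout)."""
    role: str
    action: str
    start: time  # earliest permitted time of day
    end: time    # latest permitted time of day


@dataclass
class DictionaryEntry:
    """One data element as it might be recorded in the DD/D."""
    name: str
    definition: str
    usage_rights: list = field(default_factory=list)


def is_permitted(entry: DictionaryEntry, role: str, action: str, at: time,
                 device_authorized: bool, requester: str, subject: str,
                 supervises: set) -> bool:
    """Check one request against the usage rights stored for a data element."""
    if not device_authorized:
        return False
    # A requester may see only her own data and that of people she supervises.
    if subject != requester and subject not in supervises:
        return False
    return any(
        right.role == role
        and right.action == action
        and right.start <= at <= right.end
        for right in entry.usage_rights
    )


# Illustrative entry: managers may retrieve monthly pay from 8:00 to 17:00.
monthly_pay = DictionaryEntry(
    name="employee-monthly-pay",
    definition="Pay earned by an employee in one month (illustrative wording).",
    usage_rights=[UsageRight("manager", "retrieve", time(8, 0), time(17, 0))],
)
```

With this entry, a manager named "ann" who supervises "bob" could retrieve his monthly pay at 10:00 from an authorized device, but not at 22:00, and could not update it at any hour.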
MASTER DATA MUST CONFORM Almost all information
systems and databases refer to common subject areas of data
(i.e., people, things, places) and often enhance that common
data with local data relevant to only that application or
database. All applications that use common data from these
areas, such as customer, product, employee, invoice, and
facility, must refer to the same values, or different parts of the
organization cannot talk with one another without confusion.
Master data management (MDM) refers to the disciplines,
technologies, and methods to ensure the currency, meaning,
and quality of reference data within and across various
subject areas (White and Imhoff, 2006). MDM ensures that
everyone knows the current description of a product, the
current salary of an employee, and the current billing address
of a customer. MDM does not address sharing transactional
data, such as customer purchases.
Usually, no single source system contains the “golden
record” of all relevant facts about a data subject. For exam-
ple, customer master data might be integrated from customer
relationship management, billing, ERP, and purchased data
sources. MDM determines the best source for each piece of
data (e.g., customer address or name) and makes sure that all
applications reference the same virtual “golden record.”
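The idea of choosing the best source for each piece of data can be sketched in a few lines of Python. The source systems, the customer values, and the `golden_record` function below are hypothetical and serve only to illustrate the idea; they do not describe how any particular MDM product works.

```python
# Hypothetical records for one customer held in three source systems.
SOURCES = {
    "crm": {"name": "Acme Corp.", "address": "12 Old Road"},
    "billing": {"name": "ACME", "address": "500 Main Street"},
    "erp": {"name": "Acme Corporation", "credit_limit": 50000},
}

# Hypothetical MDM decision: the agreed-upon best source for each attribute.
BEST_SOURCE = {"name": "crm", "address": "billing", "credit_limit": "erp"}


def golden_record(sources: dict, best_source: dict) -> dict:
    """Assemble a virtual golden record, one attribute at a time."""
    record = {}
    for attribute, system in best_source.items():
        value = sources.get(system, {}).get(attribute)
        if value is not None:  # skip attributes the chosen source lacks
            record[attribute] = value
    return record
```

Here the name would come from the customer relationship management system and the address from billing, so every application that calls `golden_record(SOURCES, BEST_SOURCE)` sees the same combined view even though no single system holds it.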
There are three popular architectures for master data
management: identity registry, integration hub, and persistent
approach. In the registry approach, the master data
remain in their source systems, and applications refer to the
registry to determine where the agreed-upon source of the
particular data (such as customer address) resides. The reg-
istry helps each system match its master record with corre-
sponding master records in other source systems. Thus, an
application may have to access several databases to retrieve
all the data it needs, and a database may need to allow more
applications to access it. In the integration hub approach,