The Internet Encyclopedia (Volume 3)


Content Representation and Organization

Content representation and organization refers to the
process in which data and document models are created
and relationships are mapped out as the architecture for
templates and interfaces. Subsequently, such models are
converted into logic designs and implemented into the sys-
tem. Creation of representation and organization schemes
and vocabularies requires methodologies from computer
science, linguistics, library and information science, and
other allied disciplines. There are two kinds of tools in
content representation and organization: (1) metadata
schemes that contain data elements describing various
types of data sets and documents and (2) controlled vo-
cabularies that are used to assign subject categories and
indexing terms to documents.

Metadata Schemes
A metadata scheme defines a set of properties about data
or documents. In the digital library community, there
have been several proposals: the Dublin Core metadata
sets (Weibel, Kunze, Lagoze, & Wolf, 1998), the Warwick
Framework (Lagoze, 1996), and the IEEE LOM/IMS
metadata set for learning objects (IEEE/LTSC, 2002; IMS,
2002). The main purpose of metadata schemes is to pro-
vide a consistent way to record data about data sets
or documents and to encode such data in a computer-
understandable manner. The most common metadata
elements include authors, titles, dates of creation/
publication, owner/publisher, description, and category/
subject terms. Some metadata schemes provide coarse-
grained metadata elements to allow quick generation or
creation of a metadata repository. Dublin Core is an exam-
ple of this kind. With a structure containing three pack-
ages of content, intellectual property, and instantiation,
it includes 15 elements and their subelements (Table 1).
This set enumerates the very general yet essential data el-
ements necessary for identifying and locating documents.
Metadata schemes may also include finer-grained data
elements to describe documents in greater detail. For ex-
ample, Dublin Core’s educational extension, DC-ED, em-
ploys a set of elements designed specifically for describing
the grade level, audience, pedagogy, and other details for
a learning resource on the Web. In this sense, metadata
schemes are the data models for collecting information
about documents on the Web. Such information is used
to search, browse, and locate documents on a topic over
the Web. Another use of metadata schemes is to design
them as templates to capture content directly from the
Web. An example is Web-based forms, which have been
widely used to capture customer data, survey data, sales
orders, and best practices, to name a few.

Table 1: Elements in Dublin Core

Content        Intellectual property   Instantiation
Title          Creator                 Date
Subject        Publisher               Type
Description    Contributor             Format
Source         Rights                  Identifier
Language
Relation
Coverage

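
The three-package grouping of the Dublin Core elements in Table 1 can be sketched as a small lookup structure. This is an illustrative sketch only: the lowercase element names, the helper functions `unknown_elements` and `group_by_package`, and the sample record values are assumptions made for the example and do not come from the Dublin Core specification.

```python
# Grouping of the 15 Dublin Core elements into the three packages of Table 1.
DUBLIN_CORE_PACKAGES = {
    "content": ["title", "subject", "description", "source",
                "language", "relation", "coverage"],
    "intellectual_property": ["creator", "publisher", "contributor", "rights"],
    "instantiation": ["date", "type", "format", "identifier"],
}

# Flat set of all element names, used to check records.
DUBLIN_CORE_ELEMENTS = {
    name for names in DUBLIN_CORE_PACKAGES.values() for name in names
}


def unknown_elements(record):
    """Return the keys of `record` that are not Dublin Core elements, sorted."""
    return sorted(set(record) - DUBLIN_CORE_ELEMENTS)


def group_by_package(record):
    """Split a flat record into the three packages, skipping absent elements."""
    return {
        package: {name: record[name] for name in names if name in record}
        for package, names in DUBLIN_CORE_PACKAGES.items()
    }


# Hypothetical record; the values are invented for illustration only.
example_record = {
    "title": "Workforce Development Handbook",
    "creator": "J. Doe",
    "date": "2003-01-01",
    "format": "text/html",
}
```

A Web form used as a capture template could run `unknown_elements` on each submission to reject fields that are not part of the scheme before the record is stored.
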
Sometimes no standard metadata scheme is suitable
for representing the types of data and documents being
created. In such circumstances, a customized metadata
scheme is needed. The customized scheme may be created
based on a standard such as Dublin Core, with a set
of extended elements. It may also be created from scratch
without using any existing standard. However, for inter-
operability, a metadata standard should be adopted when-
ever possible. Since metadata schemes are the data mod-
els for collecting information about data and documents,
they often need to be embedded in content authoring tools
so that content authors can create metadata while they
are creating documents. In either case, i.e., creating an ex-
tended metadata scheme or a scheme for capturing data in
a distributed environment, careful design is necessary to
guarantee a sound representation of the data to allow easy
retrieval and browsing of the content. Designing metadata
schemes usually involves a data modeling process.
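
One way to picture a customized scheme built on a standard is as a base element set plus extensions. The following is a minimal sketch under stated assumptions: the class `MetadataScheme`, its methods, and the extension element names (`audience`, `grade_level`, `pedagogy`) are hypothetical and are not taken from DC-ED or any other published specification. Making `audience` required is likewise an invented local policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataScheme:
    """A named set of allowed elements, some of which may be required."""
    name: str
    elements: frozenset
    required: frozenset = frozenset()

    def extend(self, name, extra_elements, extra_required=()):
        """Return a new scheme that keeps every base element and adds more."""
        return MetadataScheme(
            name=name,
            elements=self.elements | frozenset(extra_elements),
            required=self.required | frozenset(extra_required),
        )

    def validate(self, record):
        """Return a list of problems; an empty list means the record conforms."""
        keys = set(record)
        problems = [f"unknown element: {k}" for k in sorted(keys - self.elements)]
        problems += [
            f"missing required element: {k}" for k in sorted(self.required - keys)
        ]
        return problems


# Base scheme: the 15 Dublin Core elements, none of them required.
dublin_core = MetadataScheme(
    name="Dublin Core",
    elements=frozenset({
        "title", "subject", "description", "source", "language",
        "relation", "coverage", "creator", "publisher", "contributor",
        "rights", "date", "type", "format", "identifier",
    }),
)

# Hypothetical extension for learning resources (names are assumptions).
learning_scheme = dublin_core.extend(
    "learning-resource (hypothetical)",
    ["audience", "grade_level", "pedagogy"],
    extra_required=["audience"],
)
```

Because the extended scheme is a superset of the base, any record that uses only Dublin Core elements remains readable by tools that know only the standard, which is the interoperability argument made above.
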
Modeling is a process of synthetic analysis and abstrac-
tion, in which system developers “construct an abstract
description of a system in order to explain or predict cer-
tain system properties or phenomena” (Schreiber, Akker-
mans, Anjewierden, Hoog, Shadbolt, & Wielinga, 2000, p.
128). The result of this process is a model in which each
class of real-world objects has a transparent, one-to-one
correspondence to an object in the model. Figure 3 is an
example of the data model for an experience factory in
the domain of workforce development.
The model in Figure 3 provides a high-level view of the
main classes and the relations among them. In the
domain of workforce development, participating classes
include the government, which sponsors workforce programs
such as “Welfare-to-work” and “Study-to-career”;
workforce organizations that initiate projects to execute
programs; people in organizations who prepare documents
describing and disseminating project information
and results; and knowledge captured for sharing promising
practices and lessons learned. A data model such as
this serves as a communication tool for users of the Web
content. It shows what kinds of content there will be on
the Web site and how the classes are related to one another.
Before a data model reaches an acceptable version, discussions
are often held with constituents to solicit feedback.
As with any system development process, this discussion–
revision process is iterative.

[Figure 3 diagram. Classes: Cases that worked, Lessons learned, Document, Projects, Workforce programs, People, Organization, Government. Relations: describes, refers-to, example-of, includes, is-related-to, writes, initiates, is-part-of, sponsors.]

Figure 3: A sample conceptual model for the workforce services domain.
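
A conceptual model of this kind can be written down as a list of (class, relation, class) triples. The sketch below is illustrative and makes assumptions: it lists only relations whose endpoints are stated in the text, the pairing of is-part-of with People and Organization is a guess based on the phrase "people in organizations", and the singular class names and helper functions are invented for the example. The remaining relation labels in Figure 3 are left out because their endpoints cannot be read from the text.

```python
# (subject class, relation, object class) triples for the workforce domain.
RELATIONS = [
    ("Government", "sponsors", "WorkforceProgram"),
    ("Organization", "initiates", "Project"),
    ("Person", "writes", "Document"),
    ("Document", "describes", "Project"),
    ("Person", "is-part-of", "Organization"),  # assumption, see lead-in
]


def classes(relations):
    """Return every class that appears in any triple, sorted by name."""
    return sorted({cls for subj, _, obj in relations for cls in (subj, obj)})


def outgoing(relations, cls):
    """Return (relation, target class) pairs that start at `cls`, in list order."""
    return [(rel, obj) for subj, rel, obj in relations if subj == cls]
```

Keeping the model as plain data makes it easy to revise after each round of feedback from constituents, since adding or removing a relation is a one-line change.
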