The Internet Encyclopedia (Volume 3)



CONTENT REPRESENTATION AND ORGANIZATION

Content representation and organization refers to the process in which data and document models are created and their relationships are mapped out as the architecture for templates and interfaces. These models are then converted into logical designs and implemented in the system. Creating representation and organization schemes and vocabularies requires methodologies from computer science, linguistics, library and information science, and other allied disciplines. There are two kinds of tools in content representation and organization: (1) metadata schemes, which contain data elements describing various types of data sets and documents, and (2) controlled vocabularies, which are used to assign subject categories and indexing terms to documents.

Metadata Schemes

A metadata scheme defines a set of properties about data or documents. In the digital library community, there have been several proposals: the Dublin Core metadata set (Weibel, Kunze, Lagoze, & Wolf, 1998), the Warwick Framework (Lagoze, 1996), and the IEEE LOM/IMS metadata set for learning objects (IEEE/LTSC, 2002; IMS, 2002). The main purpose of metadata schemes is to provide a consistent way to record data about data sets or documents and to encode such data in a computer-understandable manner. The most common metadata elements include authors, titles, dates of creation/publication, owner/publisher, description, and category/subject terms.

Some metadata schemes provide coarse-grained metadata elements to allow quick generation or creation of a metadata repository. Dublin Core is an example of this kind. Structured as three packages (content, intellectual property, and instantiation), it includes 15 elements and their subelements (Table 1). This set enumerates the very general yet essential data elements necessary for identifying and locating documents. Metadata schemes may also include finer-grained data elements to describe documents in greater detail. For example, Dublin Core's educational extension, DC-ED, employs a set of elements designed specifically for describing the grade level, audience, pedagogy, and other details of a learning resource on the Web. In this sense, metadata schemes are the data models for collecting information about documents on the Web. Such information is used to search, browse, and locate documents on a topic over the Web. Another use of metadata schemes is to design them as templates to capture content directly from the

Table 1: Elements in Dublin Core

Content        Intellectual property   Instantiation
Title          Creator                 Date
Subject        Publisher               Type
Description    Contributor             Format
Source         Rights                  Identifier
Language
Relation
Coverage

Web. An example is Web-based forms, which have been widely used to capture customer data, survey data, sales orders, and best practices, to name a few.

Sometimes no standard metadata scheme is suitable for representing the types of data and documents being created. In such circumstances, a customized metadata scheme is needed. The customized scheme may be based on a standard such as Dublin Core, with a set of extended elements. It may also be created from scratch without using any existing standard. For interoperability, however, a metadata standard should be adopted whenever possible. Since metadata schemes are the data models for collecting information about data and documents, they often need to be embedded in content authoring tools so that content authors can create metadata while they are creating documents. In either case, i.e., creating an extended metadata scheme or a scheme for capturing data in a distributed environment, careful design is necessary to guarantee a sound representation of the data and to allow easy retrieval and browsing of the content.

Designing metadata schemes usually involves a data modeling process. Modeling is a process of synthetic analysis and abstraction, in which system developers “construct an abstract description of a system in order to explain or predict certain system properties or phenomena” (Schreiber, Akkermans, Anjewierden, Hoog, Shadbolt, & Wielinga, 2000, p. 128). The result of this process is a model in which each class of objects in the real world has a transparent, one-to-one correspondence to an object in the model. Figure 3 is an example of the data model for an experience factory in the domain of workforce development. The model in Figure 3 provides a high-level view of the main classes and the relations among them.
In the domain of workforce development, participating classes include the government, which sponsors workforce programs such as “Welfare-to-work” and “Study-to-career”; workforce organizations, which initiate projects to execute programs; people in organizations, who prepare documents describing and disseminating project information and results; and knowledge captured for sharing promising practices and lessons learned. A data model such as this serves as a communication tool for users of the Web content. It shows what kinds of content there will be on the Web site and how the classes relate to one another. Before a data model reaches an acceptable version, discussions are often held with constituents to solicit feedback. As with any system development process, this discussion–revision process is iterative.
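
To connect the model back to the metadata schemes discussed earlier, a record for one of the documents in this domain might combine Dublin Core elements with a few extended, domain-specific ones. The following Python sketch is only an illustration under stated assumptions: the element grouping follows Table 1, while the extension element names (`program`, `project`, `lesson_learned`), the `validate` helper, and all record values are hypothetical and not part of any standard.

```python
# The 15 Dublin Core elements, grouped into the three packages of Table 1.
DUBLIN_CORE = {
    "content": ["title", "subject", "description", "source",
                "language", "relation", "coverage"],
    "intellectual_property": ["creator", "publisher", "contributor", "rights"],
    "instantiation": ["date", "type", "format", "identifier"],
}

# Hypothetical extension elements for the workforce domain (an assumption,
# not defined by Dublin Core or any other standard).
WORKFORCE_EXTENSION = ["program", "project", "lesson_learned"]


def allowed_elements():
    """Return every element name the customized scheme defines."""
    core = [name for group in DUBLIN_CORE.values() for name in group]
    return set(core) | set(WORKFORCE_EXTENSION)


def validate(record):
    """Return the sorted element names in `record` that the scheme does not define."""
    return sorted(set(record) - allowed_elements())


# Illustrative record; all values are made up for this example.
record = {
    "title": "Lessons from a welfare-to-work pilot",
    "creator": "Example Workforce Organization",
    "date": "2003-01-15",
    "program": "Welfare-to-work",
    "grade_level": "n/a",  # not defined in this scheme, so it is reported
}

print(validate(record))  # prints ['grade_level']
```

A check of this kind is the sort of logic that could sit behind a Web-based form or an authoring tool, so that authors are held to the agreed element set while they create documents.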

[Figure 3 shows the classes Government, Organization, People, Projects, Document, Workforce programs, Cases that worked, and Lessons learned, connected by the relations sponsors, initiates, is-part-of, writes, describes, refers-to, example-of, includes, and is-related-to.]

Figure 3: A sample conceptual model for the workforce services domain.
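
A conceptual model of this kind can be written down as a list of class-relation-class statements, which makes it easy to circulate among constituents and revise. The Python sketch below is a rough illustration, not the authors' implementation. It encodes only the relations that the text spells out; pairing each relation label with a specific pair of classes is an assumption based on the prose, and the remaining labels in the figure (refers-to, example-of, includes, is-related-to) are left out because their endpoints cannot be determined here. The `outgoing` helper is a hypothetical name.

```python
from collections import defaultdict

# (subject class, relation, object class) statements taken from the prose.
# The label-to-class pairing is an assumption, not a transcription of Figure 3.
RELATIONS = [
    ("Government", "sponsors", "Workforce programs"),
    ("Organization", "initiates", "Projects"),
    ("People", "is-part-of", "Organization"),
    ("People", "writes", "Document"),
    ("Document", "describes", "Projects"),
]


def outgoing(relations):
    """Index the statements by subject class."""
    index = defaultdict(list)
    for subject, relation, obj in relations:
        index[subject].append((relation, obj))
    return dict(index)


model = outgoing(RELATIONS)

# Print a plain-text view of the model for review.
for cls, edges in model.items():
    for relation, obj in edges:
        print(f"{cls} --{relation}--> {obj}")
```

Because the model is plain data, a revision agreed on in a feedback round amounts to adding, removing, or renaming entries in `RELATIONS`.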
