The Internet Encyclopedia (Volume 3)





WEB QUALITY OF SERVICE

versions to save execution time. Thus, when the server
is overloaded, an alternative to rejecting further requests
would be to adapt the quality of responses so that the load
on the server is reduced. Many leading
news and sports Web sites adopt this policy. For example,
the appearance of the Cable News Network Web site at
http://www.cnn.com is often significantly simplified when
important news breaks, to absorb the higher request
rate. One instance was the sharp reduction in CNN
site content during the first hours after the attack on the
World Trade Center on September 11, 2001.
Content degradation is preferred to service outage for
two reasons. One is that it maintains service to all
clients, albeit at a degraded quality, which is preferred to
interruption of service. Another is that it does not incur
rejection cost because service is not denied. As mentioned
before, rejection cost can be considerable when user-level
admission control is used.
To express the flexibility of adaptive Web applications,
an expanded QoS-contract model is proposed. It assumes
that the service exports multiple QoS levels with differ-
ent resource requirements and utility to the user. The
lowest level, by default, corresponds to request rejec-
tion. Its resource requirements are equal to the rejection
cost, and it has no utility to the user. The objective is to
choose a QoS level delivered to each user class such that
utility is maximized under resource constraints. Several
content adaptation architectures have been proposed in
Web QoS literature. They can be roughly classified into
two general types, depending on the reason for adapta-
tion, namely, adaptation to network/client limitations and
adaptation to server load. These two types are described
below.
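
As a minimal sketch of the QoS-contract model described above, the following Python fragment picks one QoS level per user class so that total utility is maximized without exceeding server capacity. The function name choose_levels, the (cost, utility) pairs, and the sample numbers are illustrative assumptions, not part of any published architecture. Exhaustive search is used only for clarity and would not scale to many classes or levels.

```python
from itertools import product


def choose_levels(classes, capacity):
    """Pick one QoS level per user class by exhaustive search.

    classes:  list with one entry per user class; each entry is a list of
              (resource_cost, utility) pairs indexed by QoS level, where
              level 0 stands for rejection (cost = rejection cost, utility 0).
    capacity: total resource budget of the server (hypothetical units).

    Returns (levels, total_utility), or (None, -inf) if nothing fits.
    """
    best_choice, best_utility = None, float("-inf")
    # Enumerate every combination of one level index per class.
    for choice in product(*(range(len(levels)) for levels in classes)):
        cost = sum(classes[c][lvl][0] for c, lvl in enumerate(choice))
        utility = sum(classes[c][lvl][1] for c, lvl in enumerate(choice))
        if cost <= capacity and utility > best_utility:
            best_choice, best_utility = choice, utility
    return best_choice, best_utility


# Two user classes, three levels each (level 0 = rejection).
example_classes = [
    [(1, 0), (4, 5), (8, 9)],
    [(1, 0), (3, 4), (6, 6)],
]
print(choose_levels(example_classes, capacity=10))
```

With these made-up numbers, the first class is served at level 1 and the second at level 2, because serving both at their highest level would exceed the budget.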

Adaptation to Network and Client Limitations
In the first type, adaptation is performed online and is
sometimes called dynamic distillation or online transcod-
ing. For example, see the work of Chandra, Ellis, and
Vahdat (2000). The reason for such adaptation is to cope
with reduced network bandwidth, or client-side limita-
tions. Note that the dynamic distillation algorithm itself
will in fact increase the load on the server. In effect, the
algorithm implements a trade-off where extra computing
capacity on the server is used to compress content on the
fly to conform to reduced network bandwidth. Alterna-
tively, transcoding or distillation proxies may be intro-
duced into the network. For example, a transcoding proxy
can identify a client as a wireless PDA device and convert a
requested HTML page into WML for display on the client’s
limited screen.
Adaptation to client-side limitations can also be done
using layered services. In this paradigm, content delivery
is broken into multiple layers. The first has very limited
bandwidth requirements and produces a rough version of
the content. Subsequent layers refine the content itera-
tively, each requiring progressively more resources. JPEG
images, for example, can be delivered in this manner. An
adaptive service could control the number of layers deliv-
ered to a client depending on the client’s available band-
width. A client with a limited bandwidth may receive a
fraction of the layers only. The number of layers to send
to the client can be determined either by the server or by
the client itself. For example, consider an online video
presentation being multicast to the participants
of a conference call. The server encodes the transmitted
video into multiple layers and creates a multicast group
for each layer. Each client then subscribes to receive a
fraction of the layers as permitted by its resource capacity
and network connectivity. Such adaptation architectures
were initially proposed in the context of streaming
media.
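
The layer-selection decision can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the function name layers_to_send, the per-layer bit rates, and the bandwidth figures are hypothetical, and a real system would also need to estimate the available bandwidth.

```python
def layers_to_send(layer_rates_kbps, available_kbps):
    """Return how many layers, counted from the base layer upward,
    fit within the client's available bandwidth.

    layer_rates_kbps: extra bandwidth needed by each successive layer,
                      base layer first (assumed values).
    available_kbps:   bandwidth the client can sustain (assumed known).
    """
    total = 0.0
    count = 0
    for rate in layer_rates_kbps:
        # A refinement layer is useless without the layers below it,
        # so stop at the first layer that does not fit.
        if total + rate > available_kbps:
            break
        total += rate
        count += 1
    return count


# Base layer plus two refinement layers.
rates = [64, 128, 256]
print(layers_to_send(rates, available_kbps=200))
```

In the multicast example, each client would run such a calculation itself and join only the multicast groups for the layers it can receive.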

Adaptation to Server Load
In the second type of adaptation, content is adapted to
reduce server load. In this case, dynamic distillation or
compression cannot be used because the server itself is
the bottleneck. Instead, content must be preprocessed
a priori. At run time, the server merely chooses which
version to send out to which client. The server in such an
architecture has multiple content trees, each of a differ-
ent quality. For example, it can have a full content tree, a
reduced content tree where some decorative icons, back-
grounds, and long images have been stripped, and a low-
quality text-only tree. A transparent middleware solution
has been described that features a software layer inter-
posed between the server processes and the communi-
cation subsystem. The layer has access to the HTTP re-
quests received by the server and the responses sent. It
intercepts each request and prefixes the requested URL
with the name of the “right” content tree from which
it should be served in accordance with load conditions.
To decide on the “right” content tree for each client, the
interposed content adaptation layer measures the cur-
rent degree of server utilization and decides on the ex-
tent of adaptation that will prevent underutilization or
overload.
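
The URL-rewriting step might look like the following Python sketch. The tree names, the utilization thresholds, and the function names are assumptions made for illustration. The fixed thresholds in particular are a simplification standing in for whatever load-based decision rule the middleware actually applies.

```python
# Hypothetical content-tree prefixes, from highest to lowest quality.
TREES = ["/full", "/reduced", "/text"]

# Hypothetical utilization thresholds separating the three trees.
THRESHOLDS = (0.7, 0.9)


def pick_tree(utilization):
    """Map measured server utilization (0.0 to 1.0) to a content tree."""
    if utilization < THRESHOLDS[0]:
        return TREES[0]
    if utilization < THRESHOLDS[1]:
        return TREES[1]
    return TREES[2]


def rewrite(url_path, utilization):
    """Prefix the requested path with the chosen content tree."""
    return pick_tree(utilization) + url_path


print(rewrite("/news/index.html", 0.5))
print(rewrite("/news/index.html", 0.95))
```

Because the rewrite happens in the interposed layer, neither the client nor the server process needs to know that several trees exist.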
An interesting question is whether load can be
adapted over a continuous range when only a small, finite
number of different content versions (trees) is available.
Such continuous adaptation is possible when the number
of clients is large. To illustrate this point, consider a server
with M discrete service levels (e.g., content trees), where
M is a small integer. These levels are numbered 1, ..., M
from lowest quality to highest quality. The level 0 is added
to denote the special case of request rejection. The admission
control algorithm is generalized, so that instead
of making a binary decision, it determines a continuous
value m, in the range [0, M], which we call the degree of
degradation. This value modulates server load in a continuous
range. In this case, m = 0 means rejecting all requests
(minimum load), and m = M means serving all requests at
the highest quality (maximum load). In general, when m
happens to be an integer, it uniquely determines the service
level (i.e., tree) to be offered to all clients. If m is a
fractional number, composed of an integral part I and a
fraction F (such that m = I + F), the two integers nearest
to m (namely, I and I + 1) determine the two most appropriate
service levels at which clients must be served. The
fractional part F determines the fraction of clients served
at each of the two levels. In effect, m is interpreted to
mean that a fraction 1 − F of clients must be served at level
I, and a fraction F at level I + 1. The policy can be accurately
implemented when the number of clients is large. It
ensures that load can be controlled in a continuous range
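
The interpretation of the degree of degradation m can be sketched in Python as follows. The function names split_levels and assign_level are hypothetical. Random per-client assignment is one assumed way to realize the fractions 1 − F and F, which approximates them well only when the number of clients is large.

```python
import math
import random


def split_levels(m, M):
    """Return {service_level: fraction_of_clients} for a degree of
    degradation m in [0, M], where level 0 means rejection."""
    if not 0 <= m <= M:
        raise ValueError("m must lie in the range [0, M]")
    I = math.floor(m)   # integral part of m
    F = m - I           # fractional part of m
    if F == 0:
        return {I: 1.0}  # integer m: one level for all clients
    return {I: 1.0 - F, I + 1: F}


def assign_level(m, M, rng=random):
    """Pick a service level for one client at random, weighted by the
    fractions from split_levels (an assumed implementation choice)."""
    shares = split_levels(m, M)
    levels = list(shares)
    weights = [shares[level] for level in levels]
    return rng.choices(levels, weights=weights)[0]


print(split_levels(2.25, 3))
```

For m = 2.25 and M = 3, three quarters of the clients are served at level 2 and one quarter at level 3.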