The Internet Encyclopedia (Volume 3)


Performance Guarantees in Web Servers

This is made possible by late binding, supported by most
operating systems today. In late binding, references to called
library functions are not resolved until the call is made
at run time and the corresponding library is dynamically
loaded into the server’s memory. It is therefore possible
to make changes to the shared library without having to
recompile or relink the server software. Once the server
makes the standard library call, the new (modified) library
gets invoked. In the case of middleware for performance
isolation, the modified shared library may implement ac-
counting functions that approximate resource containers.
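The same run-time substitution can be sketched directly in a late-bound language. The following Python sketch is purely illustrative (all names are hypothetical; real middleware would interpose on the C socket library via a modified shared library): it replaces an object's accept call at run time so that every accepted connection is charged against a per-class budget, approximating resource-container accounting.

```python
# A stand-in "library" object; in real middleware the interposed call would
# be the socket library's accept(), swapped in via a modified shared library.
class FakeListener:
    def __init__(self, pending):
        self.pending = list(pending)      # queued (conn, client_class) pairs

    def accept(self):
        return self.pending.pop(0)

def interpose_accounting(listener, budgets):
    """Replace listener.accept with a version that debits per-class budgets.

    Because the call is resolved at run time (late binding), callers that
    invoke listener.accept() pick up the new behavior without relinking.
    """
    original = listener.accept

    def accounted_accept():
        conn, cls = original()
        budgets[cls] -= 1                 # approximate resource accounting
        return conn, cls

    listener.accept = accounted_accept
    return listener
```

Because the caller still writes `listener.accept()`, the accounting version is picked up transparently, which is exactly what late binding buys the middleware approach.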
One of the most obvious libraries to instrument is the
socket library. Hewlett Packard Labs researchers (Bhatti
& Friedrich, 1999) were the first to suggest architectures
where the socket library is replaced with a QoS-sensitive
version, which implements performance isolation. In the
context of regular socket calls, the QoS-sensitive library
dequeues service requests from the server’s well-known
port and classifies them into per-class queues. The ac-
cept() or read() socket calls are modified so that no con-
nections are accepted from a given per-class queue unless
the budget of the corresponding class (maintained by the
modified library) is nonzero. The scheme implements ap-
proximate performance isolation. It has been successfully
integrated into a server platform sold by Hewlett Packard,
called WebQoS.
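In this spirit, a QoS-sensitive accept path can be sketched as follows. The class and method names are illustrative, not taken from WebQoS: requests arriving on the well-known port are classified into per-class queues, and a modified accept() hands a connection to the server only while the class's budget is nonzero.

```python
from collections import deque

class QoSAcceptor:
    """Sketch of a QoS-sensitive socket front end: requests are classified
    into per-class queues, and accept() is gated by a per-class budget."""

    def __init__(self, budgets):
        self.queues = {cls: deque() for cls in budgets}
        self.budgets = dict(budgets)

    def enqueue(self, request, cls):
        # Classification step: move the request from the well-known port's
        # listen queue into its class's own queue.
        self.queues[cls].append(request)

    def accept(self, cls):
        # Modified accept(): no connection is accepted from this class's
        # queue unless the class's budget is nonzero.
        if self.budgets[cls] == 0 or not self.queues[cls]:
            return None
        self.budgets[cls] -= 1
        return self.queues[cls].popleft()
```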

Application-Layer Mechanisms
Current state-of-the-art Web servers, such as Apache (the
most widespread Web server today), maintain a single
process pool for all incoming requests. The single-pool
architecture significantly complicates QoS provisioning
because all requests are treated alike in the server. In a
multi-class server, attainment of performance isolation
can be significantly simplified if the server is designed with
QoS guarantees in mind. The single feature with the greatest
impact in that regard is maintaining a separate
pool of processes for each traffic class. Once the server
identifies an incoming request as belonging to a particular
class, it is queued up for the corresponding process pool.
Several examples of this architecture have been proposed
in the literature. QoS provisioning reduces to controlling
the resource allocation of each separate pool.
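A minimal sketch of this architecture follows, with threads standing in for the worker processes a server such as Apache would use; all names here are illustrative.

```python
import queue
import threading

def worker(q, handle):
    # Each pool member repeatedly serves requests from its class's queue.
    while True:
        request = q.get()
        handle(request)
        q.task_done()

def start_per_class_pools(pool_sizes, handle):
    """One request queue and one worker pool per traffic class; QoS
    provisioning then reduces to choosing the per-class pool sizes."""
    queues = {cls: queue.Queue() for cls in pool_sizes}
    for cls, size in pool_sizes.items():
        for _ in range(size):
            threading.Thread(target=worker, args=(queues[cls], handle),
                             daemon=True).start()
    return queues

def dispatch(queues, request, classify):
    # Once a request is identified as belonging to a particular class,
    # it is queued up for the corresponding pool.
    queues[classify(request)].put(request)
```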

Service Differentiation
An important second category of QoS guarantees is ser-
vice differentiation. The goal of performance isolation,
discussed above, is to logically partition the resource so
that each class of clients would get its own independent
portion. Competition among classes is eliminated by
giving each class exclusive ownership over a subset of
resources. In contrast, service differentiation policies do not
attempt to partition the resource. The resource is shared.
When the resource is in demand by multiple classes, the
differentiation policy resolves the competition, typically
in a way that favors some class over others. Note that per-
formance isolation can also lead to different performance
levels for different classes and hence can be thought of as
a special case of “service differentiation.” One main differ-
ence is that in performance isolation no resource sharing
occurs. Service differentiation policies are classified

depending on what it means to favor a particular class.
There are several ways “favor” can be defined. In the fol-
lowing, we describe the most common examples and their
supporting mechanisms.

Prioritization
The simplest method to provide differentiation is priori-
tization. Consider a situation where a Web service is ac-
cessed by two classes of clients: paying customers and
nonmembers. In contemporary Web services, paying cus-
tomers are usually allowed access to protected parts of the
Web site that are inaccessible to nonpaying users. This
type of differentiation fails to achieve its goal when the
Web site is overloaded. In such a situation, a large group
of nonpaying users can increase load on the server to the
extent that all users (including the paying ones) have dif-
ficulty accessing the site content. Performance isolation
can be applied between paying and nonpaying users, but
it suffers the problem of having to decide on the relative
sizes of the respective resource partitions, which typically
depend on the current load. One approach to circumvent-
ing this problem is to serve clients in absolute priority
order. In this scheme all client requests are queued up in
a single priority queue for server access. Under overload,
the queue overflows. Clients at the tail of the queue are
dropped. These clients, by construction of the queuing
policy, are the lower priority ones.
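This single-queue scheme with drop-from-the-tail behavior can be sketched as a bounded priority queue (illustrative code; lower numbers denote higher priority):

```python
import heapq

class BoundedPriorityQueue:
    """Single queue serving clients in absolute priority order; when the
    queue overflows, the entry at the tail (the one with the worst
    priority, arriving latest) is dropped."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.heap = []        # (priority, arrival_seq, request) tuples

    def push(self, priority, seq, request):
        heapq.heappush(self.heap, (priority, seq, request))
        if len(self.heap) > self.capacity:
            # Overflow: drop the lowest-priority (largest-tuple) entry.
            self.heap.remove(max(self.heap))
            heapq.heapify(self.heap)

    def pop(self):
        # Server accepts the highest-priority request next.
        return heapq.heappop(self.heap)[2]
```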
The problem with prioritization alone is that it fails
to provide meaningful performance guarantees to clients.
The top priority class receives the best service, but very
little can be predicted about the performance received
by other classes. Prioritization, however, becomes an ex-
tremely useful analyzable tool once combined with other
techniques discussed below.

Absolute Delay Guarantees
Prioritization, in conjunction with admission control,
allows customizable absolute delay guarantees to an
arbitrary number of client classes. Consider a case where
there are N classes of paying clients. To recover a fee,
the server is contractually obligated to serve each class
within a maximum time delay specified in the correspond-
ing QoS contract signed with that class. For example, in
an online trading server, first-class clients may be guar-
anteed a maximum response time of 2 s, whereas econ-
omy clients are guaranteed a maximum response time of
10 s. Failure to execute the trade within the guaranteed re-
sponse time results in a commission waiver. Alternatively,
in a Web hosting service, the content provider of each
hosted site might have a QoS contract with the host that
specifies a target service time and a fee paid to the host
per request served within the agreed-upon delay. Hence,
an overloaded hosting server, which consistently fails to
meet the delay constraints, will recover no revenue from
the content providers.
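Under such contracts, a simple admission test can be sketched: estimate a new request's response time as the backlog queued ahead of it plus its own service time, and admit it only if the estimate fits within the class's contracted deadline. The function below is an illustrative sketch, not a production admission controller.

```python
def admit(backlog_service_times, service_time, deadline):
    """Deadline-based admission control sketch.

    backlog_service_times: estimated service times (seconds) of requests
    already queued ahead of the new one; service_time: the new request's
    own estimate; deadline: the class's contracted maximum response time.
    """
    predicted_response = sum(backlog_service_times) + service_time
    # Reject up front rather than spend capacity on a request that will
    # miss its deadline and therefore earn no revenue.
    return predicted_response <= deadline
```

For instance, against the 2-s first-class deadline above, a request with a 0.8-s service estimate would be admitted behind 0.9 s of backlog (predicted 1.7 s) but rejected behind 1.9 s of backlog (predicted 2.7 s).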
Because, in these examples, a host does not derive
revenue from requests that miss their deadlines, admis-
sion control may be used against clients who are unlikely
to meet their timing constraints. The rationale for such
admission control is that scarce server capacity should
not be wasted on clients who cannot generate revenue.
Although, theoretically, admission control refers to a