The Internet Encyclopedia (Volume 3)


Performance Guarantees in Web Servers

This is made possible by late binding, supported by most
operating systems today. In late binding, references to called
library functions are not resolved until the call is made
at run time and the corresponding library is dynamically
loaded into the server’s memory. It is therefore possible
to make changes to the shared library without having to
recompile or relink the server software. Once the server
makes the standard library call, the new (modified) library
gets invoked. In the case of middleware for performance
isolation, the modified shared library may implement ac-
counting functions that approximate resource containers.
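The same run-time substitution can be sketched directly in a late-bound language. The following Python sketch is purely illustrative (all names are hypothetical; real middleware would interpose on the C socket library via a modified shared library): it replaces an object's accept call at run time so that every accepted connection is charged against a per-class budget, approximating resource-container accounting.

```python
# A stand-in "library" object; in real middleware the interposed call would
# be the socket library's accept(), swapped in via a modified shared library.
class FakeListener:
    def __init__(self, pending):
        self.pending = list(pending)      # queued (conn, client_class) pairs

    def accept(self):
        return self.pending.pop(0)

def interpose_accounting(listener, budgets):
    """Replace listener.accept with a version that debits per-class budgets.

    Because the call is resolved at run time (late binding), callers that
    invoke listener.accept() pick up the new behavior without relinking.
    """
    original = listener.accept

    def accounted_accept():
        conn, cls = original()
        budgets[cls] -= 1                 # approximate resource accounting
        return conn, cls

    listener.accept = accounted_accept
    return listener
```

Because the caller still writes `listener.accept()`, the accounting version is picked up transparently, which is exactly what late binding buys the middleware approach.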
One of the most obvious libraries to instrument is the
socket library. Hewlett Packard Labs researchers (Bhatti
& Friedrich, 1999) were the first to suggest architectures
where the socket library is replaced with a QoS-sensitive
version, which implements performance isolation. In the
context of regular socket calls, the QoS-sensitive library
dequeues service requests from the server’s well-known
port and classifies them into per-class queues. The ac-
cept() or read() socket calls are modified so that no con-
nections are accepted from a given per-class queue unless
the budget of the corresponding class (maintained by the
modified library) is nonzero. The scheme implements ap-
proximate performance isolation. It has been successfully
integrated into a server platform sold by Hewlett Packard,
called WebQoS.
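In this spirit, a QoS-sensitive accept path can be sketched as follows. The class and method names are illustrative, not taken from WebQoS: requests arriving on the well-known port are classified into per-class queues, and a modified accept() hands a connection to the server only while the class's budget is nonzero.

```python
from collections import deque

class QoSAcceptor:
    """Sketch of a QoS-sensitive socket front end: requests are classified
    into per-class queues, and accept() is gated by a per-class budget."""

    def __init__(self, budgets):
        self.queues = {cls: deque() for cls in budgets}
        self.budgets = dict(budgets)

    def enqueue(self, request, cls):
        # Classification step: move the request from the well-known port's
        # listen queue into its class's own queue.
        self.queues[cls].append(request)

    def accept(self, cls):
        # Modified accept(): no connection is accepted from this class's
        # queue unless the class's budget is nonzero.
        if self.budgets[cls] == 0 or not self.queues[cls]:
            return None
        self.budgets[cls] -= 1
        return self.queues[cls].popleft()
```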

Application-Layer Mechanisms
Current state-of-the-art Web servers, such as Apache (the
most widespread Web server today), maintain a single
process pool for all incoming requests. The single-pool
architecture significantly complicates QoS provisioning
because all requests are treated alike in the server. In a
multi-class server, attainment of performance isolation
can be significantly simplified if the server is designed with
QoS guarantees in mind. The single feature with the greatest
impact in that regard is maintaining a separate
pool of processes for each traffic class. Once the server
identifies an incoming request as belonging to a particular
class, it is queued up for the corresponding process pool.
Several examples of this architecture have been proposed
in the literature. QoS provisioning reduces to controlling
the resource allocation of each separate pool.
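A minimal sketch of this architecture follows, with threads standing in for the worker processes a server such as Apache would use; all names here are illustrative.

```python
import queue
import threading

def worker(q, handle):
    # Each pool member repeatedly serves requests from its class's queue.
    while True:
        request = q.get()
        handle(request)
        q.task_done()

def start_per_class_pools(pool_sizes, handle):
    """One request queue and one worker pool per traffic class; QoS
    provisioning then reduces to choosing the per-class pool sizes."""
    queues = {cls: queue.Queue() for cls in pool_sizes}
    for cls, size in pool_sizes.items():
        for _ in range(size):
            threading.Thread(target=worker, args=(queues[cls], handle),
                             daemon=True).start()
    return queues

def dispatch(queues, request, classify):
    # Once a request is identified as belonging to a particular class,
    # it is queued up for the corresponding pool.
    queues[classify(request)].put(request)
```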

Service Differentiation
An important second category of QoS guarantees is ser-
vice differentiation. The goal of performance isolation,
discussed above, is to logically partition the resource so
that each class of clients would get its own independent
portion. Competition among classes is eliminated by
giving each class exclusive ownership over a subset of
resources. In contrast, service differentiation policies do not
attempt to partition the resource. The resource is shared.
When the resource is in demand by multiple classes, the
differentiation policy resolves the competition, typically
in a way that favors some class over others. Note that per-
formance isolation can also lead to different performance
levels for different classes and hence can be thought of as
a special case of “service differentiation.” One main differ-
ence is that in performance isolation no resource sharing
occurs. Service differentiation policies are classified

depending on what it means to favor a particular class.
There are several ways “favor” can be defined. In the fol-
lowing, we describe the most common examples and their
supporting mechanisms.

Prioritization
The simplest method to provide differentiation is priori-
tization. Consider a situation where a Web service is ac-
cessed by two classes of clients: paying customers and
nonmembers. In contemporary Web services, paying cus-
tomers are usually allowed access to protected parts of the
Web site that are inaccessible to nonpaying users. This
type of differentiation fails to achieve its goal when the
Web site is overloaded. In such a situation, a large group
of nonpaying users can increase load on the server to the
extent that all users (including the paying ones) have dif-
ficulty accessing the site content. Performance isolation
can be applied between paying and nonpaying users, but
it suffers the problem of having to decide on the relative
sizes of the respective resource partitions, which typically
depend on the current load. One approach to circumvent-
ing this problem is to serve clients in absolute priority
order. In this scheme all client requests are queued up in
a single priority queue for server access. Under overload,
the queue overflows. Clients at the tail of the queue are
dropped. These clients, by construction of the queuing
policy, are the lower priority ones.
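This single-queue scheme with drop-from-the-tail behavior can be sketched as a bounded priority queue (illustrative code; lower numbers denote higher priority):

```python
import heapq

class BoundedPriorityQueue:
    """Single queue serving clients in absolute priority order; when the
    queue overflows, the entry at the tail (the one with the worst
    priority, arriving latest) is dropped."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.heap = []        # (priority, arrival_seq, request) tuples

    def push(self, priority, seq, request):
        heapq.heappush(self.heap, (priority, seq, request))
        if len(self.heap) > self.capacity:
            # Overflow: drop the lowest-priority (largest-tuple) entry.
            self.heap.remove(max(self.heap))
            heapq.heapify(self.heap)

    def pop(self):
        # Server accepts the highest-priority request next.
        return heapq.heappop(self.heap)[2]
```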
The problem with prioritization alone is that it fails
to provide meaningful performance guarantees to clients.
The top priority class receives the best service, but very
little can be predicted about the performance received
by other classes. Prioritization, however, becomes an ex-
tremely useful analyzable tool once combined with other
techniques discussed below.

Absolute Delay Guarantees
Prioritization, in conjunction with admission control,
allows customizable absolute delay guarantees to an
arbitrary number of client classes. Consider a case where
there are N classes of paying clients. To recover a fee,
the server is contractually obligated to serve each class
within a maximum time delay specified in the correspond-
ing QoS contract signed with that class. For example, in
an online trading server, first-class clients may be guar-
anteed a maximum response time of 2 s, whereas econ-
omy clients are guaranteed a maximum response time of
10 s. Failure to execute the trade within the guaranteed re-
sponse time results in a commission waiver. Alternatively,
in a Web hosting service, the content provider of each
hosted site might have a QoS contract with the host that
specifies a target service time and a fee paid to the host
per request served within the agreed-upon delay. Hence,
an overloaded hosting server, which consistently fails to
meet the delay constraints, will recover no revenue from
the content providers.
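Under such contracts, a simple admission test can be sketched: estimate a new request's response time as the backlog queued ahead of it plus its own service time, and admit it only if the estimate fits within the class's contracted deadline. The function below is an illustrative sketch, not a production admission controller.

```python
def admit(backlog_service_times, service_time, deadline):
    """Deadline-based admission control sketch.

    backlog_service_times: estimated service times (seconds) of requests
    already queued ahead of the new one; service_time: the new request's
    own estimate; deadline: the class's contracted maximum response time.
    """
    predicted_response = sum(backlog_service_times) + service_time
    # Reject up front rather than spend capacity on a request that will
    # miss its deadline and therefore earn no revenue.
    return predicted_response <= deadline
```

For instance, against the 2-s first-class deadline above, a request with a 0.8-s service estimate would be admitted behind 0.9 s of backlog (predicted 1.7 s) but rejected behind 1.9 s of backlog (predicted 2.7 s).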
Because, in these examples, a host does not derive
revenue from requests that miss their deadlines, admis-
sion control may be used against clients who are unlikely
to meet their timing constraints. The rationale for such
admission control is that scarce server capacity should
not be wasted on clients who cannot generate revenue.
Although, theoretically, admission control refers to a