Rather than a choice between acceptance and rejection, it is more realistic to choose between acceptance and background service. In other words, for the purposes of the following discussion, a rejected client is put in a separate lowest-priority queue to be served if resources permit.
Admission control has received much attention in the Web QoS literature. Admission control algorithms may be classified as optimistic or pessimistic. The former may admit clients even when they could miss their deadlines; the latter may reject clients unnecessarily. Note that the absence of admission control can be thought of as the extreme case of an optimistic admission control test. Recent results (Abdelzaher & Lu, 2001) have shown that in a server with randomly arriving requests it is possible to use a constant-time admission control test to distinguish clients who will meet their deadlines from those who may not. The test is based on a running counter that maintains a utilization-like metric. The counter is updated by a constant-time operation upon request arrivals and departures. If a request arrives while the counter is below a certain high-water mark, it is guaranteed to meet its deadline. Otherwise, the deadline may be missed and the request is served at the lowest priority level. Evaluation results of this admission control algorithm show that it rarely errs in the sense of unnecessarily rejecting requests. The test is shown to improve revenue at overload not only by eliminating resource consumption wasted on requests that miss their deadlines but also by favoring smaller requests (all other factors being equal) and hence improving server throughput. The derivation of the high-water mark, which represents the client admission threshold, assumes that clients are served in priority order such that more urgent requests are served first. A generalization of this test was later proposed for FIFO scheduling.
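To make the flavor of this test concrete, the following Python sketch maintains such a utilization-like counter with constant-time updates. The high-water mark value and the per-request cost metric are illustrative placeholders, not the actual threshold derived in Abdelzaher and Lu (2001).

```python
class AdmissionController:
    """Sketch of a constant-time admission control test.

    A running utilization-like counter is updated in O(1) on each
    arrival and departure. Requests arriving while the counter is
    below the high-water mark are admitted with a deadline guarantee;
    the rest are demoted to background (lowest-priority) service.
    """

    def __init__(self, high_water_mark):
        self.high_water_mark = high_water_mark  # illustrative threshold
        self.counter = 0.0                      # utilization-like metric

    def on_arrival(self, cost):
        # `cost` is the request's contribution to the utilization metric
        # (assumed known, e.g., service time / relative deadline).
        if self.counter < self.high_water_mark:
            self.counter += cost
            return "guaranteed"   # admitted: will meet its deadline
        return "background"       # deadline may be missed: lowest priority

    def on_departure(self, cost):
        # Constant-time counter update when an admitted request completes.
        self.counter -= cost


ac = AdmissionController(high_water_mark=0.5)   # placeholder value
print(ac.on_arrival(0.3))   # guaranteed (counter 0.0 < 0.5, now 0.3)
print(ac.on_arrival(0.3))   # guaranteed (counter 0.3 < 0.5, now 0.6)
print(ac.on_arrival(0.3))   # background (counter 0.6 >= 0.5)
```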
When priority-driven scheduling is used to meet deadlines, priority should be set proportional to urgency. Urgency can be defined in two ways. In the first, clients of classes with shorter maximum response-time constraints are considered more urgent; the resulting priority assignment is called deadline-monotonic. In the second, urgency is defined by absolute deadline; the resulting priority assignment is called earliest-deadline-first (EDF). For example, consider a request arriving at time 0 with a maximum response time constraint of 10 s. At time t = 9, a second request of a different class arrives with a maximum response time constraint of 2 s. According to the deadline-monotonic priority assignment, the second request should receive higher priority because its maximum response time constraint, 2 s, is tighter than 10 s. According to EDF, the first request should receive higher priority because its absolute deadline, 0 + 10 = 10, is before the absolute deadline of the second request, 9 + 2 = 11.
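The two priority assignments can be checked mechanically. A minimal Python sketch, with the request parameters taken directly from the example, keys each request on its relative deadline for deadline-monotonic scheduling and on its absolute deadline for EDF; a smaller key means higher priority.

```python
# Requests from the example: (arrival_time, relative_deadline) in seconds.
r1 = (0, 10)   # arrives at t = 0, maximum response time 10 s
r2 = (9, 2)    # arrives at t = 9, maximum response time 2 s

dm_key = lambda r: r[1]          # deadline-monotonic: relative deadline
edf_key = lambda r: r[0] + r[1]  # EDF: absolute deadline

print(min([r1, r2], key=dm_key))   # (9, 2): r2 first under deadline-monotonic
print(min([r1, r2], key=edf_key))  # (0, 10): r1 first under EDF (10 < 11)
```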
EDF has been proved to be the optimal priority-driven scheduling policy. Deadline-monotonic scheduling is optimal among time-independent scheduling policies, i.e., those in which priorities are assigned independently of absolute request arrival times. Optimality, here, is defined in terms of the ability to meet a larger number of deadlines. EDF can meet deadlines where deadline-monotonic scheduling fails because it takes arrival times into account. For instance, in the above example, if the first request has an execution time of 9.5 s and the second request has an execution time of 1.5 s, both requests meet their deadlines under EDF, but not under deadline-monotonic scheduling. A problem with EDF is that it is less commonly implemented on standard operating systems, with the exception of embedded real-time operating systems, where application timing is of prime importance. Deadline-monotonic scheduling is therefore a good compromise.
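A small time-stepped simulation (a sketch, not a general-purpose scheduler) reproduces this outcome; a step of 0.5 s suffices because every quantity in the example is a multiple of 0.5 s.

```python
def simulate(requests, key, dt=0.5, horizon=20.0):
    """Tiny preemptive uniprocessor simulator (time-stepped).

    requests: list of (arrival, relative_deadline, exec_time) tuples.
    key(req): priority key; smaller means more urgent.
    Returns a dict mapping each request to its finish time."""
    remaining = {r: r[2] for r in requests}
    finish = {}
    t = 0.0
    while t < horizon and len(finish) < len(requests):
        ready = [r for r in requests if r[0] <= t and r not in finish]
        if ready:
            r = min(ready, key=key)      # highest-priority ready request runs
            remaining[r] -= dt
            if remaining[r] <= 1e-9:
                finish[r] = t + dt
        t += dt
    return finish

# The two requests from the example: (arrival, rel_deadline, exec_time).
reqs = [(0, 10, 9.5), (9, 2, 1.5)]

edf = simulate(reqs, key=lambda r: r[0] + r[1])  # absolute deadline
dm = simulate(reqs, key=lambda r: r[1])          # relative deadline

for r in reqs:
    print(r,
          "EDF finish:", edf[r], "meets:", edf[r] <= r[0] + r[1],
          "| DM finish:", dm[r], "meets:", dm[r] <= r[0] + r[1])
```

Under EDF the first request finishes at 9.5 s and the second at 11 s, meeting both deadlines; under deadline-monotonic scheduling the second request preempts at t = 9, so the first request finishes at 11 s and misses its 10 s deadline.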

Statistical Delay Guarantees
An entirely different line of reasoning is to provide statistical guarantees on delays and deadline misses. Statistical guarantees require queuing analysis of the Web server. This analysis makes two types of assumptions: first, assumptions regarding the queuing structure of the Web server; second, assumptions regarding the request arrival process. In the following, we outline the most important challenges underlying the statistical approach to QoS guarantees.
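As a simple illustration of what such an analysis yields, suppose (purely as a modeling assumption) that the server behaves as an M/M/1 queue with Poisson arrivals at rate λ and exponential service at rate μ. The response time T is then exponentially distributed, with P(T > t) = e^(-(μ-λ)t), so a statistical guarantee of the form "at most 5% of requests are delayed beyond d" can be verified in closed form. The numbers below are illustrative; real Web servers require the more detailed queuing structure derived next.

```python
import math

def mm1_delay_violation_prob(lam, mu, d):
    """P(response time > d) in an M/M/1 queue with arrival rate `lam`
    and service rate `mu`, both in requests/s. Requires lam < mu."""
    assert lam < mu, "the queue must be stable (lam < mu)"
    return math.exp(-(mu - lam) * d)

# Illustrative numbers: 80 req/s offered load, 100 req/s capacity.
p = mm1_delay_violation_prob(lam=80.0, mu=100.0, d=0.15)
print(f"P(delay > 150 ms) = {p:.3f}")   # ~0.050, i.e., a 95% guarantee at 150 ms
```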
Consider the first challenge, namely, deriving the queuing structure of the Web server. This queuing structure depends on the protocol used; we describe HTTP 1.0 for illustration. Consider a typical multithreaded (or multiprocess) Web server. Packets arrive from the network, causing hardware and software interrupts that, respectively, read the packets from the network interface cards and deposit them into a kernel-level input queue called the IP queue. In between interrupts, packets in the IP queue are processed by the kernel and queued for the particular application ports for which they are destined. These ports are often called sockets. Each socket is associated with an input queue, called the listen queue, where incoming connection requests are queued. Independently schedulable entities in the Web server, called worker threads, block waiting for data to be deposited in the listen queue. These threads are unblocked by request arrivals to execute the arriving requests. Each thread implements a loop that processes each incoming request and generates a response. A worker thread that has been unblocked by a request arrival becomes runnable.
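The sketch below shows this structure in miniature for an HTTP 1.0 server in Python: the listen() backlog plays the role of the listen queue, and each worker thread blocks in accept() until the kernel hands it a queued connection. The port number, pool size, and one-line response are illustrative.

```python
import socket
import threading

NUM_WORKERS = 8             # worker pool size (illustrative)

def handle(conn):
    """Process one request and generate a response (placeholder logic)."""
    conn.recv(4096)                     # read the HTTP request
    conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 5\r\n\r\nhello")
    conn.close()                        # HTTP 1.0: one request per connection

def worker(server_sock):
    """Worker thread loop: block on the listen queue, serve, repeat."""
    while True:
        conn, _addr = server_sock.accept()  # blocks until a connection is queued
        handle(conn)

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("", 8080))                 # illustrative port
server.listen(128)                      # the kernel listen-queue (backlog) length

for _ in range(NUM_WORKERS):
    threading.Thread(target=worker, args=(server,), daemon=True).start()
threading.Event().wait()                # keep the main thread alive
```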
Multiple runnable threads may exist at a given time. The order in which such threads get the CPU to execute a request is determined by the CPU scheduling policy. This policy maintains a priority queue called the ready queue. The thread at the top of this queue executes for a particular time quantum or until it blocks. Request processing by a worker thread typically entails access to one or more auxiliary server resources, the most notable being disk I/O. For example, in a Web server, disk I/O is usually needed to read the requested Web page from disk. Access to an auxiliary resource blocks the calling thread, which is then queued for I/O until the awaited resource becomes available. Each resource usually has a queue, which determines the order in which accesses are served; we call it the I/O queue. The resource is made available to the thread at the top of the queue, at which time the corresponding thread becomes runnable again and re-enters the CPU ready queue. When request processing is done, the worker thread sends a response back to the client. Sending the response entails queuing data into the outgoing packet queue for transmission on the network.
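Putting the pieces together, a toy tandem-queue model (a sketch with uniform, illustrative service times) traces a request through the chain of queues named above. A real analysis would model the ready queue with the CPU scheduling policy rather than FIFO, but the chain itself is the queuing structure that a statistical analysis must capture.

```python
import queue
import threading
import time

# One FIFO per server queue named in the text, in the order traversed.
STAGES = ["IP queue", "listen queue", "ready queue",
          "I/O queue", "ready queue (post-I/O)", "outgoing packet queue"]
queues = {name: queue.Queue() for name in STAGES}

def stage_server(name, next_name, service_time):
    """Serve one stage FIFO; forward each request to the next stage."""
    while True:
        req = queues[name].get()
        time.sleep(service_time)        # hold this resource for the request
        print(f"request {req} leaves the {name}")
        if next_name is not None:
            queues[next_name].put(req)

# Chain the stages with a uniform illustrative service time of 10 ms.
for i, name in enumerate(STAGES):
    nxt = STAGES[i + 1] if i + 1 < len(STAGES) else None
    threading.Thread(target=stage_server, args=(name, nxt, 0.01),
                     daemon=True).start()

for r in range(3):                      # three requests arrive at the IP queue
    queues["IP queue"].put(r)
time.sleep(1.0)                         # let the pipeline drain before exiting
```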