PERFORMANCE GUARANTEES IN WEB SERVERS 717

The above discussion identifies five different queues
involved in the Web server's queuing structure: namely, the
IP queue, the listen queue, the ready queue, the I/O queue
(assuming a single I/O resource, e.g., disk), and the out-
going packet queue. The interconnection of these queues
creates a queuing network with loops and other depen-
dencies. For example, threads that repeatedly read and
process disk data essentially loop between the ready queue
and the I/O queue. Moreover, when the number of threads
is fixed, dequeuing from the listen queue is synchronized
with progress in the ready queue in that new requests
are dequeued from the former only when some runnable
thread has finished serving a request or has blocked.
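The coupling between the listen queue and the thread pool can be sketched in a few lines. The code below is a minimal illustration of the synchronization just described, not a real server: with a fixed pool of worker threads, a new request leaves the listen queue only when some thread has finished its current request.

```python
import queue
import threading

# Analogue of the kernel listen queue (bounded, as in a real server).
listen_queue = queue.Queue(maxsize=128)

def worker():
    while True:
        request = listen_queue.get()   # dequeue is coupled to thread progress
        if request is None:
            break                      # shutdown sentinel
        # ... parse the request, read the file (I/O queue),
        #     enqueue the response on the outgoing packet queue ...
        listen_queue.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

for i in range(10):
    listen_queue.put(f"GET /page{i}")

listen_queue.join()                    # wait until every request is served
for _ in threads:
    listen_queue.put(None)
for t in threads:
    t.join()
print("served 10 requests with 4 worker threads")
```

With four threads, at most four requests are in the ready or I/O queues at once; the remaining requests wait in the listen queue, exactly the dependency described above.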
These factors make accurate analysis of the server’s
queuing structure difficult. Instead, many approxima-
tions are possible. For example, it is possible to consider
only the queue for the bottleneck resource. The general
idea is that a Web request is likely to spend most of its
time waiting in the bottleneck queue.
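The bottleneck approximation lends itself to a back-of-the-envelope calculation. The sketch below uses an M/M/1 model, an assumption made here purely for illustration (as discussed next, real Web arrivals are not Poisson), but it captures how delay grows sharply as bottleneck utilization approaches 1.

```python
def bottleneck_delay(arrival_rate: float, service_rate: float) -> float:
    """Mean time spent at the bottleneck under an M/M/1 approximation.

    Assumes Poisson arrivals and exponential service times, which real
    Web traffic violates; used here only to illustrate the bottleneck
    approximation.
    """
    if arrival_rate >= service_rate:
        raise ValueError("bottleneck is overloaded (utilization >= 1)")
    # M/M/1 mean response time: W = 1 / (mu - lambda)
    return 1.0 / (service_rate - arrival_rate)

# Example: a disk serving 100 requests/s under increasing load.
for lam in (50, 90, 99):
    ms = bottleneck_delay(lam, 100) * 1000
    print(f"load {lam}/s -> mean delay {ms:.1f} ms")  # 20.0, 100.0, 1000.0 ms
```

Doubling utilization from 50% to 99% inflates the predicted delay fifty-fold, which is why the bottleneck queue dominates a request's total time in the system.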
The second challenge in providing statistical guaran-
tees in Web servers is to identify the stochastic nature of
the arrival process. Many queuing theory results assume
a continuous Poisson arrival process. This process is char-
acterized by an exponential distribution of interarrival
times. This assumption does not hold for Web traffic.
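The failure of the exponential assumption can be seen by sampling from a heavy-tailed alternative such as the Pareto distribution discussed next. The parameters below are illustrative, not measured values.

```python
import random

def pareto_interarrivals(alpha: float, x_min: float, n: int, seed: int = 0):
    """Draw n Pareto-distributed interarrival times.

    For alpha <= 2 the variance is infinite, and for alpha <= 1 even
    the mean diverges: the hallmark of heavy-tailed Web traffic.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        u = 1.0 - rng.random()                 # u in (0, 1], avoids div by 0
        # Inverse-transform sampling: X = x_min / u^(1/alpha)
        samples.append(x_min / u ** (1.0 / alpha))
    return samples

samples = pareto_interarrivals(alpha=1.2, x_min=0.01, n=10_000)
mean = sum(samples) / len(samples)
print("largest / mean interarrival ratio:", max(samples) / mean)
# Under an exponential distribution this ratio stays small; under a
# heavy-tailed one, a few extreme gaps dominate the whole sample.
```

This burstiness is precisely what invalidates delay predictions derived from Poisson-based queuing formulas.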
Research on Web characterization has identified that
arrival of Web requests is generally modeled by a
heavy-tailed distribution (Crovella & Bestavros, 1997). One
distribution that is commonly used to model request
interarrival times and request execution times is the
Pareto distribution. Breslau, Cao, Fan, Phillips, and Shenker (1999)
also determined that URL popularity follows a Zipf-like
distribution. This information is important for studies of
Web performance because it helps quantify the effects of
caching. To experiment with Web performance in labo-
ratory testbeds, Barford and Crovella (1998) developed a
synthetic Web workload generator that faithfully repro-
duces the characteristics of realistic Web traffic, includ-
ing its heavy-tailed distribution, URL popularity, and ref-
erence locality characteristics. The problem of providing
queuing-theoretic performance predictions for such Web
traffic arriving at a server modeled by the queuing structure
outlined above is still an open research topic.

Relative Guarantees
From queuing theory we know that delay experienced by
a request is a function of server load. Ultimately, the only
way one can reduce delay to meet deadline guarantees is
to keep the load low enough. This implies denying ser-
vice to some requests under high load to improve perfor-
mance. In many cases, however, it is preferred that all
clients receive service when capacity permits. One QoS-
provisioning paradigm that subscribes to this model is
proportional relative differentiated services.
Proportional relative differentiation was first proposed
in the networking community in the context of delay dif-
ferentiation in routers. It has since been extended to server end
systems. In the relative differentiated services model, it is
desired that the ratio between the performance levels of
different classes of traffic be fixed; e.g., it may be that the
delays of two traffic classes in a network router should
be fixed at a ratio of 3:1. In general, if there are multiple
traffic classes in the system, and if Hi is the measured
performance of class i, the relative guarantee specifies that
H1 : H2 : ... : Hn = C1 : C2 : ... : Cn, where Ci is the weight
of class i. Hence, only the relative delay between any pair of
classes is specified. The absolute delay can take any value.
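One well-known way to enforce such ratios is a waiting-time-priority scheduler. The sketch below is an assumed mechanism, not a design from the text above: the head-of-line request with the largest waiting time divided by its class weight Ci is served next, which drives the measured delays toward the ratio C1 : C2 : ... : Cn.

```python
def pick_next(queues, now, weights):
    """Waiting-time-priority sketch: serve the head-of-line request
    with the largest normalized waiting time (now - arrival) / C_i.
    Classes with a small weight C_i are served sooner, so measured
    delays tend toward the ratio C_1 : C_2 : ... : C_n.
    """
    best_class, best_score = None, -1.0
    for cls, q in queues.items():
        if q:  # q[0] is the arrival time of the oldest request
            score = (now - q[0]) / weights[cls]
            if score > best_score:
                best_class, best_score = cls, score
    return best_class

# Two hypothetical classes with a target delay ratio of 3:1.
weights = {"bronze": 3.0, "gold": 1.0}
queues = {"bronze": [0.0], "gold": [0.5]}   # head-of-line arrival times
print(pick_next(queues, now=1.0, weights=weights))   # prints "gold"
```

Even though the bronze request arrived earlier, its larger weight lets the later gold request jump ahead; repeated over many requests, this keeps bronze delays near three times gold delays.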
When the system approaches overload, the delay of all
classes increases, although some classes see a larger delay
increase than others in accordance with the relative de-
lay guarantee. At present, mechanisms for providing rela-
tive delay differentiation are well understood, but not yet
deployed.
Relative delay guarantees make most sense under mod-
erate server load. When the load is light, all classes see
identically good service (no delay). When the load is very
high, all classes suffer unacceptable delays and timeouts.
Hence, it is often useful to combine relative and absolute
delay guarantees in a single framework. The combined
architecture bounds the maximum delay of a class in ad-
dition to providing the correct delay ratio when the bound
has not been reached. The architecture allows specifying
a partial order on absolute and relative time constraints
that determines which constraints should be relaxed first
when network conditions make the combination unrealizable.

Convergence Guarantees
In an environment where unpredictable traffic conditions
make it impossible to satisfy absolute constraints, an
alternative type of performance guarantee has recently
been defined. This guarantee views QoS provisioning as a
convergence problem and employs control theory to en-
sure stability and timeliness of convergence of system per-
formance to the right specification (Abdelzaher, Shin, &
Bhatti, 2002). The statement of the guarantee is that a
performance metric,R, will converge within a specified
exponentially decaying envelope to a fixed value, called
the set point, and that the maximum deviation from the
set point will be bounded at all times.
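Stated more precisely (with notation assumed here rather than taken from the original), the guarantee requires that the error e(t) = R(t) - Rset satisfy

    |e(t)| <= A exp(-a t)   (exponentially decaying envelope), and
    |e(t)| <= Dmax for all t >= 0   (bounded worst-case deviation),

where A and a > 0 define the convergence envelope and Dmax is the maximum tolerated deviation from the set point.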
The absolute convergence guarantee is translated into
a control loop such as those used in industrial control
plants. The loop samples the measured performance met-
ric (e.g., delay), compares it to the set point, and uses
the difference to induce changes in resource allocation
and load. Typically, it performs admission control to re-
duce load, and reallocates resources to alleviate the bot-
tlenecks.
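The loop just described can be sketched as a proportional-integral (PI) admission controller. The structure and gains below are illustrative assumptions, not the cited design:

```python
class AdmissionController:
    """Sketch of a PI control loop tracking a delay set point.

    Each sampling period, the measured delay is compared to the set
    point and the admission probability is adjusted.  The gains are
    illustrative; a real deployment would tune them against a model of
    the server (e.g., via the analysis in Abdelzaher et al., 2002).
    """

    def __init__(self, set_point, kp=0.1, ki=0.02):
        self.set_point = set_point
        self.kp, self.ki = kp, ki
        self.integral = 0.0
        self.admit_prob = 1.0        # fraction of requests admitted

    def update(self, measured_delay):
        # Positive error means the server is too slow: admit less load.
        error = measured_delay - self.set_point
        self.integral += error
        correction = self.kp * error + self.ki * self.integral
        self.admit_prob = min(1.0, max(0.0, self.admit_prob - correction))
        return self.admit_prob

ctl = AdmissionController(set_point=0.1)    # 100 ms delay target
print(ctl.update(0.3))   # overload: admission probability drops below 1.0
```

The integral term removes steady-state error (delay settles at the set point rather than near it), which is what distinguishes this loop from a simple threshold-based admission policy.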
The use of control theory to control the performance of
software processes has gained much popularity in recent
years. Traditionally, control theory was used to model and
control industrial processes described by difference equa-
tions. The intuitive reason that such a theory would be
applicable to server performance control is that input load
on a server resembles input flow into a water tank. The fill
level of the server queue resembles the tank fill level. Ad-
mission control resembles a valve on the input flow pipe.
Hence, the delay dynamics of a Web server are similar to
flow and level control dynamics in a physical plant. The
latter are well understood and can be described by dif-
ference equations. The second reason that control theory
is becoming popular for server control is that feedback
control loops are very robust to modeling errors. Hence,
accurate models of software dynamics are not needed.