The Internet Encyclopedia (Volume 3)



WEB QUALITY OF SERVICE

versions to save execution time. Thus, when the server is overloaded, an alternative to rejecting further requests is to adapt the quality of responses so that the load on the server is reduced. Many leading news and sports Web sites adopt this policy. For example, the appearance of the Cable News Network Web site at http://www.cnn.com is often significantly simplified when important news breaks, to absorb the higher request rate. One instance was the great reduction in CNN site content during the first hours after the attack on the World Trade Center on September 11, 2001.

Content degradation is preferred to service outage for two reasons. One is that it maintains service to all clients, albeit at a degraded quality, rather than interrupting service. Another is that it does not incur rejection cost, because service is not denied. As mentioned before, rejection cost can be considerable when user-level admission control is used.

To express the flexibility of adaptive Web applications, an expanded QoS-contract model is proposed. It assumes that the service exports multiple QoS levels, each with different resource requirements and a different utility to the user. The lowest level, by default, corresponds to request rejection. Its resource requirements are equal to the rejection cost, and it has no utility to the user. The objective is to choose the QoS level delivered to each user class such that utility is maximized under resource constraints.

Several content adaptation architectures have been proposed in the Web QoS literature. They can be roughly classified into two general types, depending on the reason for adaptation: adaptation to network or client limitations, and adaptation to server load. These two types are described below.
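The level-selection objective of the expanded QoS-contract model described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function name choose_levels, the (cost, utility) representation, the single scalar capacity, and all numbers are hypothetical, and exhaustive search is used purely for clarity rather than because the source prescribes it.

```python
from itertools import product


def choose_levels(classes, capacity):
    """Pick one QoS level per user class to maximize total utility.

    classes: one list per user class; each entry is a (cost, utility) pair.
             Index 0 is the rejection level: its cost is the rejection
             cost and its utility is 0 (as in the model described above).
    capacity: total resource budget (a single scalar, by assumption).
    Returns (best_utility, chosen_level_indices), or (None, None) if
    no combination fits within the capacity.
    """
    best_utility, best_choice = None, None
    # Exhaustive search over all level combinations; adequate only for
    # a handful of classes and levels.
    for choice in product(*(range(len(levels)) for levels in classes)):
        cost = sum(classes[c][lvl][0] for c, lvl in enumerate(choice))
        if cost > capacity:
            continue  # violates the resource constraint
        utility = sum(classes[c][lvl][1] for c, lvl in enumerate(choice))
        if best_utility is None or utility > best_utility:
            best_utility, best_choice = utility, choice
    return best_utility, best_choice


# Hypothetical example: two user classes, three levels each
# (rejection, degraded content, full content).
classes = [
    [(1, 0), (4, 5), (8, 9)],  # class A
    [(1, 0), (3, 4), (6, 6)],  # class B
]
best_utility, best_choice = choose_levels(classes, capacity=10)
print(best_utility, best_choice)
```

With these made-up numbers, the budget is best spent by degrading class A and serving class B at full quality. Serving class A at full quality would leave only enough capacity to reject class B, which yields less total utility.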

Adaptation to Network and Client Limitations

In the first type, adaptation is performed online and is sometimes called dynamic distillation or online transcoding; see, for example, the work of Chandra, Ellis, and Vahdat (2000). The reason for such adaptation is to cope with reduced network bandwidth or client-side limitations. Note that the dynamic distillation algorithm itself increases the load on the server. In effect, the algorithm implements a trade-off in which extra computing capacity on the server is used to compress content on the fly to conform to reduced network bandwidth. Alternatively, transcoding or distillation proxies may be introduced into the network. For example, a transcoding proxy can identify a client as a wireless PDA device and convert a requested HTML page into WML for display on the client's limited screen.

Adaptation to client-side limitations can also be done using layered services. In this paradigm, content delivery is broken into multiple layers. The first has very limited bandwidth requirements and produces a rough version of the content. Subsequent layers refine the content iteratively, each requiring progressively more resources. JPEG images, for example, can be delivered in this manner. An adaptive service could control the number of layers delivered to a client depending on the client's available bandwidth. A client with limited bandwidth may receive only a fraction of the layers.

The number of layers to send to the client can be determined either by the server or by the client itself. For example, consider an online video presentation being multicast to the participants of a conference call. The server encodes the transmitted video into multiple layers and creates a multicast group for each layer. Each client then subscribes to receive a fraction of the layers, as permitted by its resource capacity and network connectivity. Such adaptation architectures were initially proposed in the context of streaming media.
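As a rough sketch of the client-side decision in the layered scheme above, the function below picks how many layers to subscribe to for a given bandwidth. The function name layers_to_subscribe, the per-layer rates, and the use of bandwidth as the only constraint are hypothetical. The sketch also assumes that layers are taken in order from the base layer, because each layer refines the ones before it.

```python
def layers_to_subscribe(layer_kbps, available_kbps):
    """Return how many layers, starting from the base layer, fit in the
    client's available bandwidth.

    layer_kbps: bandwidth needed by each layer, base layer first
                (hypothetical figures; one multicast group per layer).
    available_kbps: the client's estimated available bandwidth.
    """
    total = 0
    count = 0
    for rate in layer_kbps:
        # Stop at the first layer that would exceed the budget; later
        # layers are skipped because they refine this one (assumption).
        if total + rate > available_kbps:
            break
        total += rate
        count += 1
    return count


# Hypothetical per-layer rates for a four-layer video encoding.
layer_kbps = [64, 128, 256, 512]
print(layers_to_subscribe(layer_kbps, 200))
```

A result of zero means that even the base layer does not fit, in which case the client cannot receive the stream at all under this simple policy.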

Adaptation to Server Load

In the second type of adaptation, content is adapted to reduce server load. In this case, dynamic distillation or compression cannot be used because the server itself is the bottleneck. Instead, content must be preprocessed a priori. At run time, the server merely chooses which version to send to which client. The server in such an architecture has multiple content trees, each of a different quality. For example, it can have a full content tree; a reduced content tree from which some decorative icons, backgrounds, and long images have been stripped; and a low-quality text-only tree.

A transparent middleware solution has been described that features a software layer interposed between the server processes and the communication subsystem. The layer has access to the HTTP requests received by the server and to the responses sent. It intercepts each request and prepends to the requested URL the name of the "right" content tree from which the request should be served, in accordance with load conditions. To decide on the "right" content tree for each client, the interposed content adaptation layer measures the current degree of server utilization and decides on the extent of adaptation that will prevent both underutilization and overload.

An interesting question is whether load can be adapted in a continuous range when only a small, finite number of content versions (trees) are available. Such continuous adaptation is possible when the number of clients is large. To illustrate this point, consider a server with M discrete service levels (e.g., content trees), where M is a small integer. These levels are numbered 1, ..., M from lowest quality to highest quality. Level 0 is added to denote the special case of request rejection. The admission control algorithm is generalized so that, instead of making a binary decision, it determines a continuous value m in the range [0, M], which we call the degree of degradation. This value modulates server load in a continuous range. In this case, m = 0 means rejecting all requests (minimum load), and m = M means serving all requests at the highest quality (maximum load).

In general, when m happens to be an integer, it uniquely determines the service level (i.e., tree) to be offered to all clients. If m is a fractional number, composed of an integral part I and a fraction F (such that m = I + F), the two integers nearest to m (namely, I and I + 1) determine the two most appropriate service levels at which clients must be served. The fractional part F determines the fraction of clients served at each of the two levels. In effect, m is interpreted to mean that a fraction 1 − F of clients must be served at level I, and a fraction F at level I + 1. The policy can be accurately implemented when the number of clients is large. It ensures that load can be controlled in a continuous range
