Web QOS WL040/Bidgoli-Vol III-Ch-58 July 16, 2003
WEB QUALITY OF SERVICE

A significant body of literature has addressed the issue of guaranteeing temporal QoS attributes in the absence of adequate prior knowledge of operating service
conditions such as load and resource capacity. Until recently, the state of the art in providing acceptable temporal performance to users was over-design.
Throwing money and hardware at a performance prob-
lem eventually ensures that there are enough resources
to service all incoming requests sufficiently fast. This ap-
proach, however, is inadequate for several reasons. First, it
is rather expensive, because more resources are expended
than is strictly necessary. Second, it provides the same
service to all clients. In many cases, however, a service
provider might want to use performance differentiation
as a tool to entice clients to subscribe to a “better” (and
more expensive) service. Third, the server provides only a
best-effort service in that there are no bounds on worst-
case performance. It is sometimes advantageous to be able
to quantitatively state a performance guarantee for which
users can be commensurately charged.
In the following sections, we describe several approaches for QoS guarantees in more detail. We begin with a brief review of the current Web architecture and the
underlying principles of Web server operation. We then
survey the modifications suggested to this architecture to
provide performance guarantees to Web clients.

CURRENT WEB ARCHITECTURE
From an architectural standpoint, the World Wide Web is
a distributed client–server system glued together by the
hypertext transfer protocol (HTTP), which is simply a
request–reply interface that allows clients (browsers) to
download files from servers by (URL) name and allows
one page to reference another, creating a logical mesh
of links. The architecture is completely decentralized. By
creating links from existing content, new content is seamlessly integrated with the rest of the Web.

The HTTP Protocol
The main exchange between clients and servers occurs
using the HTTP protocol. When a client requests a page
from a server, only the text (HTML) portion is initially
downloaded. If the page contains images or other em-
bedded objects, the browser downloads them separately.
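As a rough illustration of this request–reply exchange, the following sketch builds the plain-text request a browser would send for a single URL; the host name and path are invented for the example:

```python
def http_get_request(host: str, path: str, version: str = "1.0") -> str:
    """Build the plain-text GET request a browser sends for one URL."""
    lines = [
        f"GET {path} HTTP/{version}",  # request line: method, path, version
        f"Host: {host}",               # the server the page is requested from
    ]
    return "\r\n".join(lines) + "\r\n\r\n"  # blank line ends the headers

# The HTML page is requested first; each embedded image is then
# fetched with a further, separate request of the same form.
print(http_get_request("www.example.com", "/index.html"))
```

Each such request is answered with a reply carrying a status line, headers, and the requested file.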
At present, two important versions of HTTP are popular, namely HTTP 1.0 and HTTP 1.1. The most frequently cited difference of HTTP 1.1, and the primary motivation for
its existence (Mogul, 1995), is its support for persistent
connections. In HTTP 1.0, each browser request creates a
new TCP connection. Because most Web pages are short,
these connections are short-lived and are closed once the
requested page is downloaded. Unfortunately, TCP is
optimized for long data transfers. Each new TCP connec-
tion begins its life cycle with a connection set-up phase,
followed by a slow-start phase in which connection band-
width gradually ramps up from a low initial value to the
maximum bandwidth the network can support. Unfortu-
nately, short-lived connections, such as those of HTTP
1.0, are closed before reaching the maximum bandwidth.
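The slow-start ramp can be sketched with a deliberately simplified model, assuming an initial window of one segment, doubling per round trip, and no losses:

```python
def slow_start_windows(total_segments: int, initial_window: int = 1):
    """Congestion windows used while sending `total_segments`, assuming
    the window doubles each round trip (no losses, no receiver limit)."""
    windows, window, sent = [], initial_window, 0
    while sent < total_segments:
        window_now = min(window, total_segments - sent)
        windows.append(window_now)           # segments sent this round trip
        sent += window_now
        window *= 2                          # exponential growth in slow start
    return windows

# A short HTTP 1.0 transfer ends while the window is still small:
print(slow_start_windows(10))   # [1, 2, 4, 3]
```

In this model a 10-segment page finishes in four round trips with the window still far below what the network could sustain.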
Hence, transfers are slower than they need to be.

Persistent connections in HTTP 1.1 avoid the above problem by sending all browser requests on the same TCP connection. The connection is reused as long as the browser is downloading additional objects from the same server. This allows TCP to reach a higher connection bandwidth. Additionally, the cost of setting up and tearing down the TCP connection is amortized across multiple
transfers. The debate over which protocol is actually better is still going on. For example, a disadvantage of HTTP
1.1 is its reduced concurrency, because only one TCP con-
nection is used for all objects downloaded from the same
server, instead of multiple concurrent ones. Another prob-
lem with HTTP 1.1 is that the server, having received and
served a request from a client, does not know when to terminate the underlying TCP connection. Ideally, the connection should be kept alive in anticipation of subsequent requests. However, if the client does not intend to
send more requests, keeping the connection open only
wastes server resources. The present default is to wait for
a short period of time (around 30 s) after serving the last
request on a connection. If no additional requests arrive
during that period, the connection is closed. The problem
with this policy is that significant server resources can be
blocked waiting for future requests that may never arrive.
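The persistent-connection trade-off can be sketched with a toy cost model; the round-trip figures below are illustrative assumptions, not measurements:

```python
SETUP_COST = 1.5      # assumed round trips to open each TCP connection
TRANSFER_COST = 1.0   # assumed round trips to move one small object

def page_cost(num_objects: int, persistent: bool) -> float:
    """Round trips to fetch a page's embedded objects."""
    if persistent:
        # HTTP 1.1: one set-up, then every object reuses the connection
        return SETUP_COST + num_objects * TRANSFER_COST
    # HTTP 1.0: a fresh set-up (and slow start) for every object
    return num_objects * (SETUP_COST + TRANSFER_COST)

print(page_cost(10, persistent=False))  # 25.0
print(page_cost(10, persistent=True))   # 11.5
```

The model deliberately omits the server-side cost of holding idle connections open, which is precisely the resource drain that the keep-alive timeout tries to bound.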
It is therefore not obvious that the bandwidth increase of
HTTP 1.1 outweighs its limitations.

Caching and Content Distribution
To improve Web access delays and reduce backbone Web
traffic, caching and Web content distribution services
have gradually emerged. These services attempt to redis-
tribute content around the network backbone so that it is
closer to the clients who access it. The difference between
caching and content distribution lies in a data-pull versus
a data-push model. Whereas caches store content locally
in response to user requests, content distribution proxies
proactively get copies of the content in advance.
There are generally three types of caches, namely proxy
caches, client caches, and server caches. Proxy caches
are typically installed by the ISPs at the interface to
the network backbone. They intercept all Web requests
originating from the ISP’s clients and save copies of the
requested pages when replies are received from the con-
tacted servers. This process is called page caching. A
request to a page that is already cached can be served
directly from the proxy, thereby improving client-side
latency, reducing server load, and minimizing backbone
traffic for which the ISP is responsible to the backbone
provider. An important question is what to do when
the cache becomes full. To maximize its impact, a full cache retains only the most recently requested URLs, evicting the pages that have gone unused the longest. This is known as the least-recently-used (LRU) replacement policy. Several variations and generalizations of this policy have been proposed, e.g., to account for page size, the cost of a page miss, and the importance of the client.
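A minimal sketch of the basic policy, assuming a capacity counted in pages rather than bytes (the class name and interface are invented for illustration):

```python
from collections import OrderedDict

class LRUPageCache:
    """Proxy page cache with least-recently-used replacement."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.pages = OrderedDict()          # URL -> page body, oldest first

    def get(self, url: str):
        if url in self.pages:
            self.pages.move_to_end(url)     # mark as most recently used
            return self.pages[url]          # hit: serve from the proxy
        return None                         # miss: must contact the server

    def put(self, url: str, body: str) -> None:
        self.pages[url] = body
        self.pages.move_to_end(url)
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used
```

The weighted variations mentioned above would replace the eviction rule with one that also scores entries by size, miss cost, or client importance.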
To improve performance further, client browsers lo-
cally cache the most recently requested pages. This cache
is consulted when the page is revisited (e.g., when the client clicks the “back” button), obviating an
extra access to the server. Finally, some server installations