Wormhole Caching with HTTP PUSH Method for a Satellite-Based Web Content Multicast and Replication System

Hua Chen

A&T Systems, Silver Spring, MD 20904

Marc Abrams, Tommy Johnson

Network Research Group, Computer Science Department, Virginia Tech, VA 24061

Anup Mathur, Ibraz Anwar

A&T Systems, Silver Spring, MD 20904

John Stevenson

Internet Service Division, INTELSAT, Washington, DC 20008

Contact email: pchen@ats.com

Abstract

A two-tier Web caching and replication system, called the INTELSAT Internet Delivery System (IDS), is discussed. Based on a Warehouse-Kiosk paradigm, IDS provides global access and Internet wormholes via the fleet of INTELSAT satellites, the largest commercial satellite communications system in the world. Web content such as cacheable HTTP, FTP and streaming objects is fetched or pushed, both actively and reactively, into a central repository cache via intelligent Web agents. Fresh objects are constantly and reliably sent via IP multicast to registered Kiosk caches. Distributed Web caches in the Kiosks offer content to their local users directly with improved quality of service and lower bandwidth cost.

In the IDS architecture, intelligence is added to a conventional Web cache to acquire popular, valuable content into the system in multiple modes and to keep it fresh. The additions include an adaptive refresh agent, an object discovery crawler and a reactive agent, all built on the pull functionality of a traditional Web cache. To efficiently store Web content delivered by multicast, both from the Warehouse to Kiosks and from content providers to the Warehouse, an HTTP PUSH method is proposed as an open protocol for proxies. It follows the message format defined in HTTP 1.1. This method has been integrated into the SQUID Web proxy server to co-exist with other methods like GET. Test results are presented for the performance of this two-way Web cache under various stress tests. Preliminary data collected from the test bed of an upcoming international trial have shown that, as an additional layer complementing existing network resources, IDS is a promising adjunct for the next generation Internet infrastructure to improve network performance.

Keywords

Web caching, SQUID, HTTP Push, IP Multicast, Replication, Wormhole, Satellite, Warehouse, Kiosk

1. Wormhole Caching and Tunneling

The explosive growth in World Wide Web usage has created an exponential demand for communications bandwidth globally. It has been reported that by 2001 the number of Web users will triple to 175 million and the number of Web pages will increase about 20 times to around 4.4 billion [IDC97]. This growth implies that demand for bandwidth will far outpace the rate of network construction in the near future. This phenomenon is shown schematically in Figure 1 by a bent Web plane, where both C, representing a content provider site, and U, representing an end user, are growing rapidly. ISPs are represented by ellipses. In this "big-bang" of rapid Web growth, the challenge is how to move Web content from many C's to many U's efficiently.

 

Figure 1. Wormhole Caching through a Portal W

Traditional unicast-based transport from a single, original server site to individual end users simply cannot scale, leading to the so-called Internet meltdown as Web traffic skyrockets. The prognosis is worse as bandwidth-intensive audio and video streaming content becomes pervasive. To meet bandwidth demands effectively and economically, new thinking and technologies are needed to maximize the capacity and efficiency of existing network services.

Caching is one of the most effective technologies to enhance network performance. The idea is to remove redundant traffic from the network by storing frequently and repeatedly accessed Web content close to users. Client caches have been integrated into almost every modern browser, but they achieve limited benefit because a single user tends to browse new information, which is never in the client cache. Network caches allow a group of clients to share information brought in by the first user to request it. Deployed by ISPs and corporations close to users, such an approach saves network bandwidth, reduces server traffic and, for international traffic, reduces response time. A survey from Forrester Research Inc. [Hannigan97] showed that 90% of Fortune 1000 companies expect to deploy some form of caching system within two years. On the research front, the fourth International Web Caching Workshop since 1996 is now being held. The public opening of the originally proprietary WCCP protocol [Cisco98] by Cisco and the recent proposal of WPAD [Inkt98] by Inktomi, Microsoft, RealNetworks and Sun Microsystems should speed up the broader use of Web caching, both at the user end and at network POPs, in the near future.

While traditional Web caching alone provides a partial solution for the Web infrastructure to beat Moore's law, we believe that more intelligence is required to scale with the growth of users and content providers. Usually, stand-alone caches are installed passively by companies and ISPs at the network edge, represented by square nodes in Figure 1, focusing on the "last-mile" problem between ISPs and users. In Figure 1, a user U at an ISP such as K2 requesting a page from target site C3 can avoid the long-haul transport across the WAN if the C3 page has been cached locally and kept fresh. Even with caches installed at every ISP, the delivery of fresh Web content from many original sites to many ISPs is still a complicated many-to-many transport. The "first time caching" from C's to K's and the subsequent update process are driven by user behavior, and are limited by the Internet's bursty congestion. Since C's and K's are spread all over the growing Web space, this problem will become more severe as Web content becomes more dynamic, as the current trend shows.

In this paper, we propose to beat the expanding, uncertain Internet through wormhole caching, inspired by the idea of wormholes in space to beat the speed of light limitation in quantum gravity. We propose bypassing the slow, unpredictable international paths spanning numerous WAN routers, bridges and switches, through a dedicated multicast communication channel in space. By creating portals overlaying the Web space, Web content can be aggregated and passed through these portals from distributed C's to many K's in a certain time/distance. This shortcut is illustrated by portals W and K's in Figure 1. Rather than having U's connect directly to C's, the C's connect to a geographically close portal W (called a Warehouse) and the U's connect to a geographically close portal K (called a Kiosk). A satellite link connects portals W and K with a single hop. Caching is used at both ends of the link, at W and K. In the initial deployment a single Warehouse is located in North America because of the concentration of Web content in this geographic region. Over time, Warehouses in other geographic regions will be added. Thus a wormhole is created between North America and the rest of the world, making the expanding Web appear much smaller to content providers and end users.

The INTELSAT Internet Delivery System (IDS) is a prototype implementation of the concept of wormhole caching. Initiated as a research and development project to exploit satellite bandwidth and its inherent broadcast nature, IDS has evolved into a multicast Web caching and replication platform. In this platform, Web content is pushed or prefetched, via caching, into a large Warehouse repository located near content providers. Cached Web content is then pushed to subscribing Kiosks distributed around the world via IP multicast over satellite links. A feedback path from the Kiosks back to the Warehouse is used to dynamically construct push channels. We focus on the IDS Web Caching Subsystem in this paper. A detailed description of the IDS system will be published separately [Mathur99].

2. Web Caching in IDS System

2.1 IDS Web Caching Overview

INTELSAT IDS is a Web caching and IP multicast system with content management capability. Based on a Warehouse-Kiosk paradigm, it is an integration of academic research results and recent technologies such as Web caching, IP multicast, Push technology and database management. In the system architecture, shown in Figure 2, there are four system components in both the Warehouse and the Kiosk:

Content Management Subsystem: A Web-based system interface that provides functionality to configure and control the system and to manage Web content, such as content registration, categorization, channelization and statistical analysis of the usage of Web objects in the system.

Persistent Storage Subsystem: A cache-oriented relational database that stores the metadata of Web objects, usage logs and other system information such as content categories and subscription channels.

Multicast Transmission Subsystem: A reliable IP multicast transmission engine with scheduling, bundling and compression/decompression capability. A transmitter is implemented in the Warehouse, and a receiver in the Kiosk.

Web Caching Subsystem: A two-tier Web caching system with intelligent agents and a new, open PUSH method built on a traditional Web cache. At the Warehouse there is a repository cache interfacing with content providers. A two-way Web cache runs at the Kiosk, accepting both pull and push requests.

Figure 2: INTELSAT IDS System Architecture

Depending on its category, IDS acquires and caches Web content in different operating modes, summarized in Table 1. The ultimate goal is to store and deliver fresh Web objects from their origin sites to distributed Kiosks around the world. Popular content (e.g., www.cnn.com), identified by experience or historical statistics, is retrieved proactively by the system. As usual, objects requested by individual users are cached independently by the conventional pull method in the user's home Kiosk. These objects reside in different Kiosks until they are popular enough to become system-wide popular content. Bursty "hot spots" such as an Olympic event site are an example. Being event driven in response to a high volume of end user requests, a "hot spot" is transferred from passive caching to active caching, and vice versa, within the system. On the other hand, a content provider can initiate a transfer of Web content through the true push mode, for things like updates to Web sites and advertisements. Streaming content will usually be transferred by a combination of push and pull operations.

Web Content Category        Caching Mode
Popular Content             Active
Hot Spot                    Reactive
Content Provider Request    Push
User Request                Pull
Real Time Streaming         Media Proxy*

Table 1: Modes to Cache Web Content into IDS (*To be integrated)

A simplified scenario consists of the following operations: 1) Web content is acquired into the Warehouse cache through pull, prefetch or push and is refreshed automatically; 2) the transmitter gathers the fresh content and multicasts it to many distributed Kiosks in a single transfer; 3) the receivers push the content into their local caches; 4) distributed caches respond to user requests with local content.

The Web Caching Subsystem is one of the core components in IDS. The performance of Web caching depends on the value of the cached objects, which in turn determines the effectiveness of the whole system. Deciding what to cache and ensuring content freshness and accuracy are the key tasks of the intelligent Web caching modules, and are addressed next.

2.2 Wormhole Caching Modules

Wormhole Caching is a collection of active and reactive caching modules implemented in the Web Caching Subsystem to support the aforementioned operations and functionality. Built on the pull capability of a traditional Web cache, it consists of several distributed processes or state machines, as shown in Figure 3. The modules in dashed boxes, namely the streaming media proxy and the cache clustering mechanism, are still under development at this writing. Wormhole Caching modules extend first-generation caches to offer a short-cut caching solution for handling next generation Web traffic.

Figure 3. Wormhole Caching Modules in IDS

The Adaptive Refresh Module is the engine that includes multiple active agents that automatically monitor and refresh Web objects in the system according to an Adaptive Refresh Algorithm. A Next Refresh Time (NRT) is computed for each Web object, with the help of the Persistent Storage module, when it enters the IDS. Multiple measures are used automatically to ensure the freshness of static and dynamic objects. For example, the Cache-Control: header is honored for an HTTP 1.1 object, and the NRT is determined by the Expires: header from the original server of the object. For an HTTP 1.0 object, the NRT is calculated heuristically based on the Last-Modified: header and the last cached time, among others, since the Expires: header is rarely used or is used to defeat caching. When the Last-Modified: header is missing, a default cycle is assigned. A similar approach is applied to negative caching situations: a best effort is made with an exponentially growing interval until a ceiling is reached, at which point the object is declared uncacheable. Freshness updates use the "If-Modified-Since" request to check with original servers.
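The refresh rules described above can be sketched as follows. This is only an illustration under stated assumptions, not the IDS Adaptive Refresh Algorithm: the function names, the 10% age factor, the 24-hour default cycle, the 5-minute base retry interval and the 7-day ceiling are all hypothetical values chosen for the example, and Cache-Control: handling is omitted.

```python
from datetime import datetime, timedelta
from typing import Optional

# All constants are illustrative assumptions, not values taken from IDS.
LM_FACTOR = 0.1                       # fraction of object age used as refresh interval
DEFAULT_CYCLE = timedelta(hours=24)   # used when Last-Modified: is missing
CEILING = timedelta(days=7)           # retry ceiling for negative caching


def next_refresh_time(cached_at: datetime,
                      expires: Optional[datetime] = None,
                      last_modified: Optional[datetime] = None) -> datetime:
    """Return a hypothetical Next Refresh Time (NRT) for a cached object."""
    if expires is not None and expires > cached_at:
        # A usable Expires: header determines the NRT directly.
        return expires
    if last_modified is not None and last_modified < cached_at:
        # Heuristic: an object that has been stable for a long time
        # is refreshed less often.
        return cached_at + (cached_at - last_modified) * LM_FACTOR
    # No usable header: fall back to a default cycle.
    return cached_at + DEFAULT_CYCLE


def retry_interval(failures: int,
                   base: timedelta = timedelta(minutes=5)) -> Optional[timedelta]:
    """Exponentially growing retry interval; None once the ceiling is reached,
    meaning the object would be treated as uncacheable."""
    interval = base * (2 ** failures)
    return None if interval >= CEILING else interval
```

An Expires: value in the past (a common way to defeat caching) falls through to the Last-Modified: heuristic in this sketch; a real cache might instead refuse to store such an object.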

The Object Discovery Crawler is an HTML parser that discovers and prefetches embedded cacheable objects in a cached object, up to a specified level and within a predefined boundary; both are assigned as parameters during the content registration process in the Content Management Subsystem. The crawler parses a Web page as it enters the system to discover and evaluate embedded objects such as text, images and Java classes. If a newly found object meets a cost-to-value criterion defined by the system, it is prefetched into the cache.
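A minimal sketch of the discovery step is given below, using Python's standard html.parser module. The class name, the list of tag/attribute pairs and the use of a single host name as the "predefined boundary" are assumptions made for illustration; the level limit and the cost-to-value test described above are not modeled.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class EmbeddedObjectFinder(HTMLParser):
    """Collect URLs of embedded objects and links found in one HTML page."""

    # Tag/attribute pairs to inspect; an assumed, non-exhaustive list.
    URL_ATTRS = {"img": "src", "a": "href", "frame": "src", "embed": "src"}

    def __init__(self, base_url: str, allowed_host: str):
        super().__init__()
        self.base_url = base_url
        self.allowed_host = allowed_host   # stands in for the registered boundary
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser delivers tag and attribute names in lower case.
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                url = urljoin(self.base_url, value)
                inside = urlparse(url).hostname == self.allowed_host
                if inside and url not in self.found:
                    self.found.append(url)


def discover(page_html: str, base_url: str, allowed_host: str) -> list:
    """Return candidate URLs for prefetching, in document order."""
    finder = EmbeddedObjectFinder(base_url, allowed_host)
    finder.feed(page_html)
    return finder.found
```

Each discovered URL would then be evaluated against the system's cost-to-value criterion before being prefetched.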

In IDS, Active Caching is a combination of the popular content registration process and the automatic operations provided by the Adaptive Refresh Module and the Object Discovery Crawler that proactively caches and updates objects, in contrast to the proprietary Active Caching Technology proposed by CacheFlow Inc.[CFlow98].

The Reactive Caching operation consists of distributed processes that collect and analyze usage patterns from each Kiosk/ISP to satisfy user demands dynamically. Each Kiosk periodically sends its aggregated local Web access statistics back to the Warehouse. The Warehouse has the intelligence to react to new activity or a burst event according to a thresholding conversion algorithm. This algorithm takes into account the number of Kiosks and the number of users to decide when a new document is popular. At this point a hot spot page is added to a dynamically constructed push channel, which is multicast as needed from the Warehouse back to the Kiosks. When the page is no longer popular, it is dynamically removed from any push channels to which it belonged.
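The thresholding decision can be illustrated with a short sketch. The paper does not give the actual algorithm or thresholds, so the two constants, the report format and the function names below are hypothetical; the sketch only shows one way that "number of Kiosks" and "number of users" could jointly decide popularity.

```python
from collections import defaultdict

# Assumed thresholds for illustration; the actual IDS values are not published here.
MIN_KIOSKS = 3    # a page must be requested at this many Kiosks
MIN_USERS = 50    # and by this many users in total


def hot_spots(reports):
    """reports: iterable of (kiosk_id, url, user_count) tuples taken from the
    aggregated access statistics sent by the Kiosks.
    Returns the set of URLs that currently qualify as hot spots."""
    kiosks = defaultdict(set)
    users = defaultdict(int)
    for kiosk_id, url, user_count in reports:
        kiosks[url].add(kiosk_id)
        users[url] += user_count
    return {url for url in kiosks
            if len(kiosks[url]) >= MIN_KIOSKS and users[url] >= MIN_USERS}


def update_channel(channel, reports):
    """Compare the current push channel with the latest statistics.
    Returns (pages to add, pages to remove)."""
    hot = hot_spots(reports)
    return hot - channel, channel - hot
```

Requiring both conditions keeps a page that is popular at a single Kiosk in that Kiosk's pull cache instead of multicasting it to every site.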

We propose the HTTP PUSH method to support the multicast delivery of Web content in a two-tier Web caching platform. This method allows direct Web object insertion into the Warehouse cache and the Kiosk cache, in addition to the pull functionality offered by traditional caches. A PUSH-enabled Web client is required to communicate with the PUSH-enabled Web cache using the HTTP protocol. The message format is detailed in the following section.

This mechanism not only provides an effective way to acquire content at both the Warehouse and the Kiosk for IDS, but also opens up a powerful channel for future information dissemination. As pointed out in a Forrester Research report, "Push will revolutionize caching" [Hannigan97]. An HTTP PUSH-enabled cache allows a content provider to initiate delivery, either of an update to existing Web objects or of a new object, through multicast or another protocol. This eliminates the need for a refresh mechanism for the pushed content. Since HTTP PUSH is an open protocol, any server application can proactively send out new content to its target caches under contract, such as a Warehouse.

Transparent caching is achieved by a layer 4 switch that intercepts only Web traffic at an ISP connected to a Kiosk and redirects it to the cache server. This choice avoids the need for end users to reconfigure their browsers when using the Kiosk cache. We use a third party product [Williams98] for this purpose.

On the other hand, we look forward to new network caching standards that will help client software automatically locate and interface with cache servers in the near future. A canonical naming convention has been used on the Virginia Tech campus [Johnson198]. We expect WCCP or WPAD to provide similar transparent functionality.

Since a Web object is a transient entity in a cache, additional persistent storage is required to record the state of objects and so provide the history and life cycle information that an intelligent cache needs to function. This is provided by a cache-oriented relational database management system, which stores metadata and statistics for the Web objects registered in the system. Most of the algorithms supporting the Web Caching Subsystem read and update information in this storage asynchronously.

The scalability module is a cache clustering mechanism that scales the capacity of the cache. This module manages internal loads among cache nodes as well as incoming traffic. The Web object space is partitioned through a hash function for assignment to multiple cache nodes. The module also provides a failover recovery mechanism.
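As a rough illustration of hash-based partitioning, the sketch below maps each URL to one cache node and skips nodes that are marked as down. The module is described as still under development, so the choice of MD5, the modulo assignment and the failover rule are assumptions for the example and not the IDS design.

```python
import hashlib


def node_for_url(url: str, nodes: list) -> str:
    """Map a URL to one cache node by hashing it into the node list."""
    digest = hashlib.md5(url.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def node_with_failover(url: str, nodes: list, alive: set) -> str:
    """Assign the URL among the nodes that are currently alive."""
    candidates = [n for n in nodes if n in alive]
    if not candidates:
        raise RuntimeError("no cache node available")
    return node_for_url(url, candidates)
```

A simple modulo scheme like this remaps many URLs whenever the set of live nodes changes; a production cluster would probably prefer a scheme that moves fewer objects on failure.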

The streaming media proxy is a standalone component integrated in IDS for handling streaming audio and video media. Streaming content will be cached in the Warehouse and multicast to the Kiosks. On replay, data is delivered from a Kiosk to its users, with permission obtained from the content server each time.

In addition to the cache monitoring functionality of a cache manager, this module also includes the definitions of the Access Control List (ACL) and the Push Control List (PCL), the rules for logging, and the relationships with other sibling caches in a Kiosk.

3. Definition of the HTTP PUSH Method

3.1 Objective

Most of the cache engines available at present rely on user requests to pull objects into their storage via methods such as GET and HEAD, as defined in the HTTP protocol. This is natural considering that a conventional cache is a query-driven proxy server for its users. However, in an update-driven information delivery and distribution system, HTTP reply objects are usually available locally, either via a push channel from data providers or via a multicast transmission from distribution channels. In both cases a pull scenario is no longer appropriate, and a push scenario becomes more efficient, if not mandatory, for storing new or updated objects.

We proposed a PUSH method [Chen98] to insert HTTP objects received from multicast or other sources into a targeted cache along a proxy chain quickly and efficiently, without a user request. A cache engine supporting the PUSH method allows a client application to register and store an HTTP object directly in its cache storage.

Figure 4 shows the added functionality from a PUSH method in a proxy chain:

 

 Figure 4. HTTP Push in a Proxy Chain

In the diagram, the client can be a browser and the server is an origin content site. The cache is a proxy server in between. The double link shows where the push content comes from.

3.2 HTTP PUSH Method Definition

The PUSH method is similar to the HTTP 1.1 PUT and POST methods [Fielding97]. Like PUT, a PUSH request contains a header and a body containing an object. It should comply with previous specifications defined for communications between user agents and proxies, such as HTTP 1.0 [Berners-Lee96]. The PUSH method reuses the protocol parameters and follows the same message format as defined in HTTP 1.1.

First of all, a PUSH method has an associated PCL (Push Control List), which defines the list of authorized Web clients. These clients can be identified by their IP addresses, for example. A PUSH request is checked against the push control list before it is accepted and processed.

A PUSH request asks that the entity enclosed in the PUSH body be stored in the cache under the supplied Request-URI. If the Request-URI already exists in the cache, the enclosed entity can either always be considered an updated version of the one residing in the cache, or be compared with the existing object so that only the fresher copy is retained. If the Request-URI does not point to an existing entry in the cache, and that URI is cacheable, the cache engine should create a new entry and store the object. If a new entry is created, the cache must return an HTTP response with code 207 (PUSH created); if the entry already exists, the cache must return an HTTP response with code 200 (OK). A failure is signaled by an HTTP response with code 400 (Bad request).
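The request handling rules above can be summarized in a small sketch. It is a simplified model, not the SQUID implementation: the cache is a plain dictionary, the PCL is a set of IP addresses, and the function name is hypothetical. It takes the first of the two options for existing entries (always treat the pushed entity as the update), and it assumes that a client missing from the PCL is answered with 400, since the text does not name a status code for that case.

```python
def handle_push(cache: dict, pcl: set, client_ip: str, uri: str, entity: bytes,
                cacheable=lambda uri: True) -> int:
    """Process one PUSH request and return the HTTP status code.

    cache      maps Request-URI to the stored entity (a stand-in for cache storage)
    pcl        set of authorized client IP addresses (Push Control List)
    cacheable  hypothetical predicate deciding whether a URI may be cached
    """
    # Check the Push Control List first. Returning 400 here is an assumption.
    if client_ip not in pcl or not uri or not entity:
        return 400
    if uri in cache:
        # Existing entry: treat the pushed entity as the updated version.
        cache[uri] = entity
        return 200          # OK
    if not cacheable(uri):
        return 400          # Bad request
    cache[uri] = entity
    return 207              # PUSH created
```

A cache that chooses the second option would compare freshness information in the pushed reply header with the stored copy before overwriting it.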

The fundamental difference between PUSH and the POST or PUT requests is that PUSH only inserts the document into one cache; thus a PUSH will never go beyond the boundary of the current cache proxy. With PUSH, content dissemination in a proxy farm is driven by other mechanisms, such as ICP [Wessels98] or Cache Digest [Rouss98]. A PUSH request is identical to a PUT request to a destination server if the storage mechanism is not considered.

The PUSH request and response shall obey the message transmission requirements set out in HTTP 1.1. Therefore, the request syntax is:

 

|------PUSH REQUEST ------------------------------------------|

PUSH URL HTTP/version <crlf>

[Request Header]

[Entity body]

|-------------------------------------------------------------------|

Where the "Entity body" holds the HTTP object to be inserted into the proxy, which is an HTTP Reply Header plus an HTML page in general.

The PUSH reply syntax is:

|------PUSH REPLY----------------------------------------------|

HTTP/version Status_Code Explanation <crlf>

[Reply Header]

|-------------------------------------------------------------------|

There is no reply body in the PUSH reply.

3.3 Examples

The following are some examples:

1) PUSH request example:

|------A PUSH REQUEST---------------------------------------|

PUSH http://www.foo.com/page.html HTTP/1.0

Content-length: 4090

HTTP/1.0 200 OK

Content-type: text/html

Content-length: 4061

<HTML>

...

</HTML>

|--------------------------------------------------------------------|

2) PUSH response example:

|------A PUSH REPLY--------------------------------------------|

HTTP/1.0 200 OK

|--------------------------------------------------------------------|
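A PUSH client could compose and read these messages along the following lines. This is a sketch under assumptions: the function names are hypothetical, and the outer Content-length is taken to count the whole entity body (the embedded reply header plus the object), which the examples above do not spell out.

```python
def build_push_request(url: str, body: bytes,
                       content_type: str = "text/html") -> bytes:
    """Compose a PUSH request whose entity body is an HTTP reply header
    followed by the object itself."""
    reply_header = ("HTTP/1.0 200 OK\r\n"
                    f"Content-type: {content_type}\r\n"
                    f"Content-length: {len(body)}\r\n"
                    "\r\n").encode("ascii")
    entity = reply_header + body
    # Assumption: the outer Content-length covers the whole entity body.
    request_header = (f"PUSH {url} HTTP/1.0\r\n"
                      f"Content-length: {len(entity)}\r\n"
                      "\r\n").encode("ascii")
    return request_header + entity


def parse_push_reply(raw: bytes):
    """Return (status code, explanation) from the first line of a PUSH reply."""
    status_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    _version, code, explanation = status_line.split(" ", 2)
    return int(code), explanation
```

Such a request would be written to the proxy's normal HTTP port and the reply parsed from the same connection.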

3.4 Design and Implementation Issues

A client application with PUSH knowledge is required to compose and send a PUSH request. This can be a modified version of a POST or PUT request composer.

A new state machine has been created within the traditional Web cache source tree. The PUSH request is sent to the same HTTP port (default port 3128 in SQUID [Wessels98]) as other methods such as GET, HEAD, POST or PUT. The cache engine receives and parses the request, checks the URL to see whether it is a new entry, and stores the object via the new module. The same module sends back a PUSH acknowledgement. A detailed design, including state diagrams, can be found in [Chen98].

We recommend reusing the data structures of a traditional cache and following the same data flow as much as possible, to keep PUSH as similar to other methods as possible. The design goal is to minimize the impact on the existing cache while optimizing the performance of the new method.

In the IDS prototype phase, the HTTP PUSH method has been implemented in the Harvest-derived [Chank95] SQUID cache. In the commercial phase, under way at this writing, we are working with leading Web cache vendors to implement this method in their caches and integrate it into our system later.

4. Performance of a Two-Way Web Cache

A two-way Web cache is a proxy that supports both push and pull requests. In this section, we focus on the performance of a PUSH-enabled cache to test our implementation of the HTTP PUSH method. We used the aggressive load generated by WebJamma [Johnson298] to find the limits of a two-way Web cache built on Squid Version 1.1.20 software [Wessels98] and general-purpose hardware.

4.1 WebJamma and Test Bench

WebJamma is an artificial HTTP traffic generator written as a research project by the Network Research Group at Virginia Tech. Intended to test and measure the performance of Web proxies and HTTP servers, WebJamma replays a workload by reading HTTP accesses from a log file of URLs, sending parallel HTTP queries and timing the transfers. It maintains a configurable number of parallel requests and keeps them busy continuously until it reaches the end of the log file. WebJamma is used for the performance tests of the Web Caching Subsystem in IDS. For this purpose, it has been extended to support both the normal GET method and the proposed PUSH method.
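The replay idea can be sketched in a few lines. This is not WebJamma's code or interface: the function name, the use of a thread pool and the pluggable fetch callable (which would issue a GET or PUSH against the proxy under test) are assumptions made to keep the example self-contained.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def replay(urls, fetch, parallel=50):
    """Replay a list of URLs with a fixed number of parallel workers.

    fetch is a caller-supplied function that performs one request for a URL.
    Returns the request count, object throughput and mean response time.
    """
    def timed(url):
        t0 = time.perf_counter()
        fetch(url)
        return time.perf_counter() - t0

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        # The pool keeps up to `parallel` requests in flight until the log ends.
        latencies = list(pool.map(timed, urls))
    elapsed = max(time.perf_counter() - start, 1e-9)

    count = len(latencies)
    return {
        "requests": count,
        "objects_per_s": count / elapsed,
        "mean_response_s": sum(latencies) / count if count else 0.0,
    }
```

For example, `replay(urls_from_log, fetch=my_get, parallel=50)` would mirror the 50-way load used in the tests, where `urls_from_log` and `my_get` are placeholders supplied by the caller.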

We have built a two-way cache with commonly available hardware on a Unix platform. The hardware is based on a 300 MHz Pentium II CPU with 512 MB of RAM and 45 GB of storage. One fast UW SCSI driver controls the disk I/O. The network interface is a regular Ethernet 10/100 Base-T adapter. The operating system is Solaris 2.6.

Our goal is to test the cache, not the network, so we have created a controlled environment. A Web server is used only to prefill the cache with Web content to simulate a loaded cache. In the test, the hit rate for GET is held at 100% to eliminate the effect of the network connection to Web servers. Hit rate plays no role for PUSH requests, however, since our current implementation simply considers all pushed objects to be new.

4.2 Throughput for Pull and Push

In Figure 5, we present our test results for a cache holding one million URLs. In the experiment, WebJamma is first attached to the cache locally and is configured to send 50 parallel requests to simulate a stressful load. All URLs contain only numeric IP addresses to avoid DNS lookups, so push and pull have comparable test environments. Since WebJamma runs on the same machine as the cache and simply discards the transferred data, the delay comes mainly from proxy processing. PUSH and GET requests, each with a different URL, are sent to the cache at varying object sizes to measure the object throughput. The highest object throughput in this configuration, 435 objects per second, is achieved in a special case with parallel GET queries for a single URL, demonstrating the "hot object" phenomenon [Chank96]. To our knowledge, the test results for pull are comparable to those of other equivalent single-node caches available at present. We see no degradation due to the PUSH method implementation.

We then set up WebJamma remotely from the cache, connected through a dedicated Ethernet link, and repeated the above experiment. The Push-remote curve in Figure 5 shows the object throughput for PUSH requests at various object sizes.

Average response time and data throughput for push requests are given in Table 2. As expected, data throughput improves for larger objects. This can be explained by the setup overhead of the TCP connection. Data throughput for push requests does not depend on DNS, since objects are injected directly without a DNS lookup. We plan to implement pipelining support for the PUSH method to improve the throughput.

We repeated the same test sets for cache sizes of 100,000 and 10,000 URLs. We see no substantial increase in object throughput compared with Figure 5.

Figure 5: Object Throughput (objects/s) vs Object Size (KB)

Push Results (1M URLs)    1KB             7KB             1MB
Object Throughput         53 objects/s    42 objects/s    3.3 objects/s
Average Response Time     19ms            24ms            303ms
Data Rate                 0.4Mbps         2.3Mbps         26.4Mbps

Table 2. Test Results for the PUSH method
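The rows of Table 2 are related by simple arithmetic, which the following sketch checks. It assumes decimal units (1 KB = 1000 bytes, 1 Mbps = 10^6 bits/s); the function names are introduced here for illustration. The observation that the Average Response Time row matches the reciprocal of the object throughput is a property of the published numbers and is not a statement about how the measurements were taken.

```python
# (label, object size in bytes, objects/s, response time in ms, data rate in Mbps)
TABLE_2 = [
    ("1KB", 1_000, 53, 19, 0.4),
    ("7KB", 7_000, 42, 24, 2.3),
    ("1MB", 1_000_000, 3.3, 303, 26.4),
]


def data_rate_mbps(objects_per_s: float, object_bytes: int) -> float:
    """Data rate = object throughput x object size x 8 bits per byte."""
    return objects_per_s * object_bytes * 8 / 1e6


def time_per_object_ms(objects_per_s: float) -> float:
    """Average time per completed object, in milliseconds."""
    return 1000.0 / objects_per_s


for label, size, tput, _resp_ms, _rate in TABLE_2:
    print(label,
          f"{data_rate_mbps(tput, size):.2f} Mbps",
          f"{time_per_object_ms(tput):.1f} ms per object")
```

Under these assumptions the computed data rates agree with the table to within about 0.05 Mbps, and the time per object rounds to the tabulated response times.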

4.3 Hit Rate for a Two-Way Cache

Hit rate is the percentage of Web objects successfully served by a cache. Since there is no direct user at the Warehouse, hit rate is undefined there. For a Kiosk cache, the hit rate is improved by the HTTP PUSH traffic multicast from the Warehouse, as captured by the Push Boost term in the following formula:

Two-Way Hit Rate = Traditional Hit Rate + Push Boost

Therefore, a two-way cache generally has an improved hit rate compared with an equivalent traditional cache.

It is important to recall that cache hit rate is not only a function of cache size, but also depends on the cache's intelligence, such as its refreshing algorithm. Even with infinite storage, a conventional cache still cannot ensure a 100% hit rate. A cache sharing mechanism helps to improve the hit rate for a hierarchical cache farm, since it effectively maximizes the number of users. The more pull requests reach a conventional cache, the more collective intelligence they bring in. At least one proprietary active caching solution [CFlow98] commercially available at present trades bandwidth for hit rate through constant prefetching. Such solutions can either bombard a particular site too heavily, or are simply too expensive for an ISP in a bandwidth-starved developing country.

Ideally one would like a 100% hit rate, and our wormhole Web caching moves closer to that ultimate goal. As a two-tier replication system, the huge repository cache at the Warehouse is shared by thousands of Kiosk caches. It is the aggregation of many distributed caches. Being push and pull capable, a wormhole cache acquires valuable objects into the system with accuracy, since both users and content providers are involved in decision making and contribute their intelligence. Through active and reactive caching, a wormhole cache offers a unique selection and refreshing process that ensures content freshness based on object information from both its history and its future. The hit rate of a wormhole two-way cache can be estimated by the following formula:

Two-Way Hit Rate = 1 - Delta1 - Delta2 - ...

where each delta represents a portion of the hit rate lost relative to a perfect cache, due to a finite cache size, an imperfect refreshing algorithm, insufficient bandwidth, and so on. It is expected that hit rates at Kiosk caches will be substantially improved compared with that of a standalone cache. Results from the upcoming international trial will be presented separately.

5. Summary

We have presented a description of a multicast Web caching and replication platform recently developed for the INTELSAT Internet Delivery System. There are several distinguishing features of our architecture:

The wormhole concept is that of providing a pipe with relatively constant latency between two points on the globe. Caching is used at both ends of the pipe.

Reactive caching is a major extension to a standalone cache. Since IDS will be deployed globally, the aggregated statistics should cover an exceptionally wide geographic area and range of cultural diversity. The Warehouse cache could become the largest cache in the world. Compared with other cache sharing schemes such as ICP or Cache Digest deployed in a hierarchical WAN environment, we believe in the effectiveness of a central repository augmented with active multicast distribution in our design. Of course, this has to be validated in the upcoming international trial.

HTTP PUSH is proposed as an open protocol. The HTTP PUSH allows caches to implement a distribution model for Web page dissemination. It is used in IDS to contract a Warehouse with many Kiosks efficiently, and to contract many content providers with a Warehouse or several Warehouses. Such a contract simplifies the relationship among content providers, Warehouses and ISPs. With HTTP PUSH, distribution of next generation Web content from many content providers to many ISPs becomes much simpler.

We hope that the proposal of the open HTTP PUSH method will be a first step toward the deployment of an active network, where content providers can deliver content closer to users instead of waiting for redundant requests or constant update queries from active caches.

It is interesting to note that geosynchronous satellite-based delivery over international Internet links is today competitive in overall latency with terrestrial links. Satellite link latency is on the order of a few hundred milliseconds, but so is the latency of today's terrestrial links between North America and various other countries [Habib98]. Thus the use of a geosynchronous satellite to implement a wormhole is competitive in response time with terrestrial links. Utilizing the international coverage of the INTELSAT satellites, IDS maximizes the capacity and efficiency of existing network infrastructure. Many Kiosks share the cost of a single Warehouse operation and multicast transfer. The overall result of combining caching and multicast is a very cost-effective, scalable solution that offers truly worldwide access to the Web.

At this writing, the IDS prototype has been tested at the INTELSAT lab for two months and is being deployed for an international trial joined by ten INTELSAT signatories, most of which are national-level service providers.

Acknowledgement

All IDS team members at A&T Systems and INTELSAT have contributed greatly to this project, notably Thuc Nguyen, Ken Salins, Balaguru Nalathambi, Ravi Vellaleth, Pat Percich, Tokuo Oishi and Lacina Kone. Finally, this work would not have been possible without the constant support of Dr. Ashok Theraja.

References

[Berners-Lee96] Berners-Lee, et al. "Hypertext Transfer Protocol – HTTP/1.0," RFC 1945, MIT/LCS, May 1996.

[CFlow98] CacheFlow White Paper. "Active Web Caching Technology," CacheFlow, 1998.

[Cisco98] Cisco Document. "Web Cache Control Protocol," Cisco Systems, 1998. http://www.cisco.com/univercd/cc/td/doc/product/iaabu/webcache/ce17/ver17/wc17wccp.htm

[Chank95] Anawat Chankhunthod, et al. "A Hierarchical Internet Object Cache," Tech Report 95-611, USC, 1995.

[Chen98] Hua Chen, et al. "HTTP Push: A New Method for Squid-Based Cache Engine," A&T Systems White Paper, June 1998.

[Fielding97] R. Fielding, et al. "Hypertext Transfer Protocol – HTTP/1.1," November 1997.

[Habib98] M. Ahsan Habib, et al. "Latency of International Satellites Links," Technical Report TR98-24, Virginia Tech., December 1998.

[Hannigan97] Brendan Hannigan, et al. "Why Caching Matters," Forrester Research White Paper, October 1997.

[IDC97] IDC White Paper. "Bandwidth Management and Network Caching," International Data Corporation, 1997.

[Inkt98] Inktomi Press Release. "New Network Caching Standard Proposed," Inktomi, December 1998.

[Johnson198] Tommy Johnson, "A Method for Automatically Finding the Friendly Neighborhood Proxy Server," Virginia Tech. White Paper, March 1998.

[Johnson298] Tommy Johnson, "The WebJamma," NRG, Computer Science Department, Virginia Tech White Paper, 1998. http://www.cs.vt.edu/~nrg/webjamma.html

[Mathur99] Anup Mathur, et al. "Architecture of a Second-Generation Satellite-Based Internet Delivery System," to appear in INET99, San Jose, June 1999.

[Rouss98] Alex Rousskov, et al. "Cache Digest," 3rd Int'l Web Cache Workshop, Manchester, June 1998.

[Wessels98] Duane Wessels, et al. "ICP and the Squid Web Cache," IEEE J. Selected Areas in Communications, V16, No.3, April 1998.

[Williams98] Bert Williams, "Transparent Web Caching Solutions," Alteon Networks White Paper, 1998.