Footprint 2.02
Release Details

Contents

1. Introduction

2. Cache Peering Review
2.1 Basic Peering
2.2 Hierarchical Peering
2.3 Proxy-only Peering

3. Coherent Peering Extensions
3.1 Background
3.2 Requirements
3.3 Implementation
3.3.1 Invalidation Tokens
3.3.2 Basic Mechanism
3.4 Optimizations
3.5 Avoiding Invalidations after a Peer Fill
3.6 Avoiding Races and a Note on Atomicity

4. Footprint 2.02 Implementation
4.1 Invalidation Tokens
4.2 Squid Modifications
4.3 Distributor Modifications
4.4 Implementation

5. Software Installation

6. Configuration
6.1 Staged Adoption
6.2 Externally Visible Effects and Operational Considerations

7. Troubleshooting
7.1 Significant Error Messages
7.2 Fixed Defects For Release 251.96
7.3 New Configuration Variables
7.4 New CUI Commands
7.5 Additional Notes

8. Future Enhancements



1. Introduction

Build 251.96 is now the officially released Footprint version 2.02, with over 33 defects fixed since the previous release. The main feature of this release is the activation of an enhancement called Cache Peering. Although this feature has been tested on the B network of approximately 80 computers, you must err on the side of caution when configuring the production network. Also, you must update the master candidates and load them with the appropriate SSL-Proxy software. Section 5, Software Installation, suggests a basic upgrade strategy.

In addition, Section 3. Coherent Peering Extensions describes at the architectural level the various extensions to Squid that support Coherent Peering. Coherent Peering is a mechanism that safely allows caches to be peered while supporting asynchronous invalidation of content.

The intended audience comprises Footprint operations personnel and quality assurance personnel. This document assumes basic familiarity with the Footprint system and hierarchical cache peering concepts.


Note: This document does not address certain issues associated with peering, including interactions between peering and rewritten content. Similarly, detailed technical information about the internals of the peering implementation itself is beyond the scope of this document. Digital Island will address these issues in a separate document.


 

2. Cache Peering Review

This section covers general cache peering concepts and provides some information specific to the Footprint implementation. Generally speaking, a cache provides local copies of a resource that originate from a distant location. The purpose of a cache is to increase the speed of access to data. For the cache to be efficient, the data in question must be requested frequently, and change at a rate less than the request rate.

The higher the ratio of request frequency to change frequency, the more efficient the cache will be. Any cache established for the first time is empty. When receiving a request, the cache checks whether it has a valid copy of the resource. If such a copy exists, the result is called a cache hit, and the cache satisfies the request with its local copy. If the resource does not exist in the cache, the result is a cache miss, and the request is redirected to the origin point. To improve speed and immediacy, Digital Island configures local consumers to request data from the cache rather than from the origin point.
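As a minimal illustration of this logic, the following Python sketch (with hypothetical names; not Footprint code) shows the hit/miss/fill decision:

    # Sketch: cache lookup with hit, miss, and fill (illustrative only).
    def serve(cache, url, fetch_from_origin):
        copy = cache.get(url)
        if copy is not None:
            return copy                    # cache hit: serve the local copy
        copy = fetch_from_origin(url)      # cache miss: go to the origin point
        cache[url] = copy                  # keep a copy for future requests
        return copy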

Footprint uses caching for HTTP (and FTP) resources. Each content distributor has a cache. Using proprietary rendezvous technology, Footprint directs the browser client's requests for resources to the content distributor closest to the requestor. When receiving a request, the cache checks for the presence of a local copy of the resource and, if one is present, sends it to the requestor. Such a resource is called a cached resource. If the resource is not present, however, the cache redirects the request to the customer's Web server for the resource. The customer's Web server is the origin point, that is, the origin server. When a cache retrieves a resource from an origin server, the process is called a cache fill, since the cache now has its own copy of the resource for future requests.

A major complication in caching is the maintenance and refreshing of the cached data. A resource may change on the origin server, but without a provision for freshness maintenance, caches will continue serving copies of the old resource. Out-of-date resources are called stale. Even while up-to-date resources exist at the origin server, the cache would continue serving old copies long after the resource had grown stale.

Footprint provides two basic mechanisms for freshness maintenance: expiration policies and Invalidation on Demand. An expiration policy tells a cache how long (or until when) it may continue to serve a resource without checking the origin server for updates. When a request for an expired resource arrives, the cache makes a refresh request to the origin server to check the resource's freshness. If the resource has changed, the origin server provides either a fresh copy of the resource or a new expiration for the old resource if it remains valid.

Invalidation on Demand is a mechanism with which a customer can trigger the immediate expiration of a resource or set of resources, regardless of whatever expiration policy had been set. When a cache receives a request for an invalidated resource, it performs the aforementioned freshness check with the origin server.

2.1 Basic Peering

The above caching system suffers from a significant scalability problem. Although the number of caches may grow, essentially without limits, there remains only one origin server. If a frequently requested resource on a large number of caches becomes invalidated, or if a burst of requests for a brand new resource arrives, the load on the origin server could exceed its capacity, resulting in slow or failed responses to browser clients.

Basic peering is a mechanism that allows caches to perform their fills from other caches in the network instead of, or in addition to, the origin server. Although the first new copy of the resource must come from the origin server, subsequent copies can be retrieved from other caches as well. This reduces the load on the origin server and reduces the response time for cache misses, in many cases quite significantly.

The most basic form of peering is flat peering, typically used within a cluster of caches at a single location. In this model, each cache is a sibling of the other caches in the system. For example, when a request results in a cache miss, the cache requests the resource from its nearest sibling. Of course, if the sibling does not have the resource, the cache contacts the origin server. Subsequent requests for the same resource, arriving at other caches, can retrieve the resource from the first cache to acquire it. This reduces the load on the origin server by requiring only one fill per cluster for the resource, rather than one fill per cache. It also reduces WAN bandwidth utilization at the cluster, since the peered requests and responses are transmitted over the local LAN and do not leave the cluster. Finally, it improves response time for the second and subsequent cache misses at the cluster.

The above is an example of intra-cluster peering. Footprint also uses flat peering inter-cluster, that is, with cache siblings in remote clusters. This type of peering frequently uses a query mechanism which precedes the actual fill request. When a cache receives a request which results in a miss, it sends a query to each of its siblings and waits for a response. The actual cache fill is performed from the first sibling to respond with a positive indication that it indeed has a copy of the resource.

When a request arrives from a sibling and results in a cache miss, the request must then be forwarded directly to the origin server rather than to any of the other siblings. Otherwise, a forwarding loop could result. This can occur even with the use of the query mechanism, due to the window between the query and the actual request: during this window, the resource may become stale or be evicted even though the cache had given a positive response to the query. There are other mechanisms for avoiding forwarding loops, but the simplest policy is to avoid using siblings to respond to requests arriving from other siblings.
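A minimal sketch of this policy (Python, hypothetical names):

    # Sketch: a request arriving from a sibling never re-queries siblings.
    def fill_source(request_from_sibling, sibling_reported_hit):
        if request_from_sibling:
            return 'origin'                # avoids forwarding loops
        return 'sibling' if sibling_reported_hit else 'origin'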

2.2 Hierarchical Peering

Another complementary form of peering is hierarchical peering. In addition to the above sets of caches with sibling (i.e., flat peering) relationships, there are parent caches in the system. When a request arrives which cannot be satisfied by any of a particular cache's siblings, the cache forwards the request to one or more of its parent caches, rather than to the origin server. A cache's parent might have siblings of its own that it can use to satisfy a cache fill request.

If configured properly, hierarchical peering strictly controls the load on the origin server, which is at the very top of the hierarchy. In this case, the maximum load possible on the origin server is equal to the load placed there by the parent caches at the immediate, next level of the hierarchy. Other requests are handled by caches rather than the origin server.

2.3 Proxy-only Peering

Proxy-only peering is generally used between sibling caches which reside in the same cluster and have a low-cost communication link between them. In this instance, after a sibling is identified as having a copy of the resource, the resource is passed through the cache without the cache retaining its own copy.

This is useful for conserving resources, such as memory and disk space, by keeping only one copy of the resource within a cluster. The negative impacts are longer response times, a higher load on the caches (since they will receive repeated requests for the same resource from their siblings), and the need for another long-distance cache fill whenever the one cache holding a particular resource becomes unavailable.

 

3. Coherent Peering Extensions

This section describes at the architectural level various extensions to Squid that support "coherent peering." Some details of the implementation in Footprint Release 2.02 are also provided. Coherent peering is a mechanism which allows caches to be peered safely while supporting asynchronous invalidation of content.


Note: This document does not address certain issues associated with peering, including interactions between peering and rewritten content. Digital Island will address these issues in a separate document.


3.1 Background

Footprint supports on-demand invalidation of content held in the Footprint caches. Invalidation in the caches is asynchronous in that each cache acts on invalidation commands as they are received and there is no feedback about the completion of invalidations to a central controlling authority.

The problem with asynchronous invalidation in its current implementation is that it does not allow for cache peering. For example, a cache which has already processed an invalidation can request the resource from a peer which has not yet processed it. This could load a potentially stale resource into the cache which had already processed the invalidation, thereby allowing stale content to be served from that cache. The coherent peering mechanism described here avoids that problem while still allowing invalidations to be processed asynchronously.

3.2 Requirements

The basic requirement for coherent peering is that, once an invalidation has been processed by a specific cache, that cache will not load or serve stale content, that is, content which hasn't been refreshed or reloaded from the origin server after the invalidation was started. Customers are expected to update their content before performing the invalidation.

There is no requirement that the algorithm perform absolutely optimally in the face of a rapidly changing invalidation environment. That is, it is acceptable to fill from the origin server even if a particular peer might also have been able to safely satisfy the request. The algorithm must err on the side of safety. There is also no requirement that a resource be invalidated the same number of times a user invalidates it, only that the last invalidation issued by a user be reliably processed. The invalidation mechanism may deliver some invalidations out of order, particularly while an agent is synching with an invalidation source. An earlier-issued invalidation for a given resource may be skipped if it can be reliably determined that a later-issued one has already been processed.

3.3 Implementation

Squid generally supports cache peering using a protocol called Internet Cache Protocol (ICP). This is a datagram-based (UDP) protocol wherein a cache can query its peers to determine whether a resource is present on the peer. Upon receipt of a positive response from a peer, the cache requests the resource from that peer rather than the origin server. If no positive responses are received within a small time-out period (dynamically adjusted based on historical response times), the cache requests the resource from the origin server or from a parent cache.

The coherent peering mechanism extends ICP to allow data about invalidation state to become part of the hit or miss decision made by peers in response to ICP requests. The mechanism also includes transmission and consideration of this state information with the actual peer fill request. This is needed to avoid a race condition:

The invalidation state could change after the ICP request/response cycle is complete and a peer has been selected for the fill.

Peer fills with this mechanism go directly from cache to cache. The distributor core is not involved in the peer fill mechanism itself.

3.3.1 Invalidation Tokens

Invalidation state information is represented by invalidation tokens. These tokens have two parts: a source part and a sequence part. The source part associates the token with a specific invalidation source. This is necessary because invalidations in the Footprint system can originate from multiple sources, since the master in charge of invalidation dissemination can change at any time. The sequence part is a sequence number assigned by the invalidation source to each invalidation such that a higher sequence number is known to represent a later invalidation than a token from the same source with a lower sequence number. Note that there is no requirement that sequence numbers increase monotonically, only that a higher sequence number represents a later-issued invalidation.

To support different invalidation tagging schemes, tokens are defined to contain (currently) up to 32 bytes of data, with the lengths of the source and sequence portions being specified when the token is presented. As long as the total amount of data needed doesn't exceed 32 bytes, the range of sources and sequence numbers can be changed at will to support different invalidation mechanisms implemented in the distributors.

In readable form, tokens are represented as:

<source>:<sequence> 

The source and sequence portions are specified in hexadecimal notation.


Note: In this document, we use simplified, decimal tokens for clarity. In general, for a specific source, the sequence numbers should all be of the same length. If this is not the case, the mechanism considers any longer sequence portion to be greater (that is, to represent a later invalidation) than a shorter one. Again, this should not occur in practice and the case is defined only for the sake of completeness.

Whenever a distributor issues an invalidation to the cache, it provides the token associated with the invalidation. The distributor is also responsible for initializing the known and seen tables in the cache at startup. It is not safe to allow a cache hit for a peer fill until these tables have been initialized.

3.3.2 Basic Mechanism

The basic mechanism involves the maintenance of two sets of invalidation tokens by each cache. The first set is called the known set. This set of tokens represents invalidations such that all invalidations from the same source with the same sequence number or lower are known to have been processed to completion. This set of tokens is updated by the distributor when invalidations are completed by the cache.

The second set is called the seen set. This set of tokens represents invalidations such that any invalidation from the same source with a higher sequence number is known not to have been seen by that particular cache. This set of tokens is updated by the cache as invalidations are started.

The basic question the recipient of a peer fill request must answer is, "Have I not yet processed to completion an invalidation which has already been seen by the sender?" If the answer to that question is yes, the peer should not allow the fill to proceed, or it must retrieve a fresh resource from the origin server or parent cache to satisfy the peer's request.

To accomplish this, the sender sends its complete set of seen invalidations (one token per invalidation source) to the peer along with the ICP request as well as with the actual peer fill request. The peer compares this data with its known invalidations.

For each token sent by the requestor, the recipient looks up the token in its known table (the lookup is based only on the source portion of the token, not the sequence number, of course) and compares its sequence number to the sender's. In order to safely satisfy the peer fill request, there must be no matching token in the recipient's known table whose sequence number is less than the sender's. Further, if no token for a particular source sent by the sender is found in the known table, the peer fill may not proceed.

In a more compact notation, the basic test is:

safe = sender(seen) <= recipient(known) 

This ensures that the sender has not seen an invalidation which is not known to have been processed on the recipient.

For example, assume that the recipient's known table contains the following tokens:

0:10 
1:20 

Assume that the sender has the following in its seen table, which it sends along with its request:

0:9 
1:5 

The recipient is known to have processed all invalidations from source "0", which have sequence numbers of 10 or less, and all invalidations from source "1" with sequence numbers of 20 or less. The sender has not seen (started) any invalidations from source 0 with a sequence number greater than 9, or from source 1 with a sequence number greater than 5. It is currently behind the recipient. The peer fill is safe in this case.

If there is an outstanding invalidation for the resource, it must have a sequence number greater than 10 from source 0, greater than 20 from source 1, or it must be from a brand-new source. Since the sender hasn't seen invalidations beyond 0:9 or 1:5, and hasn't seen any invalidations from a third source, the peer fill is safe. If there is an outstanding invalidation, it will be processed by the sender when it is received.

If the sender has a third source in its seen table, the peer fill is not safe. The resource could have been invalidated with that particular invalidation, and the recipient of the request hasn't yet processed the invalidation. Stale data could be loaded into the sender in this case. The same argument applies if the sender has seen any invalidations with higher sequence numbers than the known-complete invalidations on the recipient.

For example, if the recipient's known table is:

0:10 
1:20 

and the sender's seen table is:

0:12 
1:18 

the peer fill may not proceed.

The resource may have been invalidated with invalidation 0:11 or 0:12, both of which may have already been processed on the sender, but which haven't been processed by the recipient.
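The test and the two worked examples above can be expressed as a short sketch (Python; the name peer_fill_safe and the dictionary representation are illustrative, not taken from the implementation):

    # Sketch: the coherent-peering safety test.
    # Both tables map source -> highest sequence number.
    def peer_fill_safe(sender_seen, recipient_known):
        for source, seen_seq in sender_seen.items():
            known_seq = recipient_known.get(source)
            if known_seq is None or seen_seq > known_seq:
                # The sender has seen an invalidation which is not known
                # to have been processed to completion by the recipient.
                return False
        return True

    assert peer_fill_safe({0: 9, 1: 5}, {0: 10, 1: 20})        # first example: safe
    assert not peer_fill_safe({0: 12, 1: 18}, {0: 10, 1: 20})  # second example: unsafe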

3.4 Optimizations

The basic mechanism can be optimized to allow peer fills to proceed in cases where the basic mechanism would disallow them. One of these optimizations is accomplished by having the cache keep track of the token used to invalidate each individual resource. If the sender of a peer fill request has previously invalidated the resource, it may use the token associated with that invalidation to "prune" its seen table.

For example, if a sender's global seen table contains the following tokens:

0:192 
1:248 

but the last invalidation for the particular resource being requested was invalidated using:

1:42 

as the token, the sender may replace the token from source 1 in its seen table with the token on the resource - since that token contains the highest sequence number from that source which is relevant to that particular resource.

This optimization is possible because the sender has a copy of the resource with which it can associate more information, namely, the invalidation token last used to invalidate the resource or the fact that the resource has never been invalidated by that cache.
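A sketch of this pruning step, using the same illustrative conventions as the sketch in Section 3.3.2:

    # Sketch: replace the global seen token for a source with the token
    # last used (by this cache) to invalidate the requested resource.
    def prune_seen(global_seen, resource_token):
        pruned = dict(global_seen)
        if resource_token is not None:     # None: never invalidated here
            source, seq = resource_token
            pruned[source] = seq
        return pruned

    # The example above: 1:42 replaces 1:248 for this resource's request.
    assert prune_seen({0: 192, 1: 248}, (1, 42)) == {0: 192, 1: 42}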

In general, the mechanism may be optimized by maintaining data about seen invalidations at higher levels of granularity. For instance, if the seen table were maintained on a per-migrator basis, then only the seen tokens for invalidations associated with that particular migrator would need to be sent in peering requests. Or, for even tighter granularity, the seen table could be maintained on a per-resource basis. One attractive aspect of the mechanism is that a sender may elect to perform these optimizations without the knowledge or cooperation of the recipient. The recipient's processing remains the same, driven by data from the sender.

3.5 Avoiding Invalidations after a Peer Fill

The token mechanism may be used to avoid performing a needless invalidation. The case where this arises is one where there is an outstanding invalidation for a resource which has not yet been processed on the sender but has been processed by the recipient. In the normal case, the invalidation will probably be processed on the sender either during or shortly after the peer fill. If the version of the resource provided by the recipient has already been invalidated (and refreshed, of course; stale cache hits are never allowed) by that invalidation, then the sender doesn't need to invalidate that specific resource when it processes the invalidation. Therefore, the recipient provides the last token used to invalidate the resource as part of its peer fill response. If an incoming invalidation on the sender matches this token, the invalidation may be safely skipped by the sender for that resource only.
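A sketch of this check (Python, hypothetical names; the cache entry is modeled as a dictionary):

    # Sketch: skip an incoming invalidation when the peer fill response
    # already reported that token as the last one applied to the resource.
    def apply_invalidation(entry, incoming_token):
        if entry.get('last_fill_token') == incoming_token:
            return False               # already reflected in the copy; skip
        entry['stale'] = True          # normal invalidation processing
        return True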

3.6 Avoiding Races and a Note on Atomicity

There is a race between the ICP request, incoming invalidations, and the actual peer fill request. It's possible that the state information sent with the ICP request will have changed by the time the actual peer fill request is sent. To handle this, the state information is sent along with the actual peer fill request and the recipient checks again whether it may safely satisfy the request. If the answer is "no," the recipient refreshes the resource itself directly with the origin server or a parent cache and provides the fresh resource to the sender. The recipient never uses a sibling peer in this case. This avoids "flapping" and limits the latency associated with the fill.

A note on atomicity: When a fill or refresh is needed, a new cache entry is created and placed in the cache's table as the first step. This entry must have the basic data needed to determine whether an invalidation would invalidate the resource. If the entry is hit by an invalidation before sufficient data is received to determine whether the invalidation can be skipped, then the entry must be immediately expired. Only the browser clients currently requesting the resource will receive that version of the resource. New requests will go through the peer fill mechanism as before.

If it is not possible to associate enough information with the new cache entry to ensure that an incoming invalidation will hit it if appropriate, then invalidations which arrive while the fill is in progress must be attached to the resource and be replayed against it after sufficient data is received. This is necessary in the case where metadata on the reply is needed to determine if an invalidation hits a particular resource.

 

4. Footprint 2.02 Implementation

This section provides information on the Footprint 2.02 implementation of coherent peering.

4.1 Invalidation Tokens

The previous text described invalidation tokens in general. In version 2.02, invalidation tokens total 20 bytes, as follows:

For the source portion,

4 byte IP address followed by 8 byte journal date 

For the sequence portion:

8 byte sequence number 

So, if a particular invalidation source has IP address 63.209.70.231 and its journal started on July 5th, 2000 at 17:13:34.000 GMT (which is 962,817,214,000 milliseconds since the epoch), the source portion of the token would be:

3fd146e7000000e02c60c630 

If that source issued an invalidation with a sequence number of 9722, the entire token would be:

3fd146e7000000e02c60c630:00000000000025fa 

The journal date must be included in the source identification because if a journal is lost, the sequence numbers reset to 1.
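A sketch of constructing such a token (Python; illustrative only, not the shipped code):

    import socket
    import struct

    # Sketch: build a 2.02 token from the source IP address, the journal
    # date in milliseconds since the epoch, and the sequence number.
    def make_token(ip, journal_ms, seq):
        source = socket.inet_aton(ip) + struct.pack('>Q', journal_ms)  # 4 + 8 bytes
        return source.hex() + ':' + struct.pack('>Q', seq).hex()       # 8-byte sequence

    # Reproduces the example above:
    assert make_token('63.209.70.231', 962817214000, 9722) == \
        '3fd146e7000000e02c60c630:00000000000025fa'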

4.2 Squid Modifications

Most of the modifications for coherent peering are in the Squid cache itself. The cache maintains the known and seen tables internally. The known table is only updated on demand by the distributor (see below). The seen table is updated when invalidation requests are received.

To support this, the FPMCP invalidate command has been extended to take an additional argument:

tok=<token>

This is the token associated with the invalidation.

To initialize the known and seen tables and to enable peering requests and responses, a new FPMCP command peerstate has been added. The possible arguments to the peerstate command are:

request=on|off 

Enable (on) or disable (off) peering requests. Peering requests are disabled when the cache first comes up. It is not safe to enable peering requests until the seen table has been initialized.

response=on|off 

Enable (on) or disable (off) peering responses. When peering responses are disabled, all ICP queries are responded to with a miss indication and received peer requests are not allowed to hit in the cache. It is not safe to enable peering responses until the known table has been initialized.

setknown=[<tok>[[,<tok>]...]] 

The known table is set to contain only the given tokens. If no tokens are provided, the known table is cleared.

setseen=[<tok>[[,<tok>]...]] 

The seen table is set to contain only the given tokens. If no tokens are provided, the seen table is cleared.

mergeknown=[<tok>[[,<tok>]...]]

The given tokens (if any; passing none is a no-op but is supported) are merged with the existing known table, as follows:

Tokens in the known table that have the same source portion as an incoming token are replaced by the incoming token; incoming tokens from new sources are added.

mergeseen=[<tok>[[,<tok>]...]]

The given tokens (if any; passing none is a no-op but is supported) are merged with the existing seen table, as follows:

Tokens in the seen table that have the same source portion as an incoming token and a lower sequence number are replaced by the incoming token; incoming tokens from new sources are added.

pmergeseen=[<tok>[[,<tok>]...]]

The given tokens are used to perform a pruning merge of the seen table, as follows:

Tokens in the seen table that have the same source portion as an incoming token and a lower sequence number are replaced by the incoming token.
Any tokens in the seen table that have no corresponding (source-matched) token in the incoming token set are removed.
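The three merge operations can be sketched as follows (Python; tables map source to sequence number, per the semantics described above; illustrative only):

    # Sketch: peerstate merge semantics.
    def merge_known(table, incoming):
        table.update(incoming)        # same-source entries replaced; new sources added

    def merge_seen(table, incoming):
        for src, seq in incoming.items():
            if src not in table or seq > table[src]:
                table[src] = seq      # replace only if higher; new sources added

    def pmerge_seen(table, incoming):
        for src in list(table):
            if src not in incoming:
                del table[src]        # prune sources absent from the incoming set
        merge_seen(table, incoming)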

Squid's ICP implementation has been extended to provide a new ICP op code, as follows:

ICP_QUERY_INV. 

This stands for "ICP query with invalidation tokens." The message contains invalidation tokens which are checked against the cache's known table in order to determine whether a "hit" response is allowed. (The normal check for a cache hit is of course also performed.) Also, in response to ICP_QUERY_INV, the ICP implementation will always respond with a miss if peering responses haven't been enabled using the peerstate command.

Squid's peer selection algorithm has been modified such that it is disabled if peering requests have not been enabled with the peerstate command.

A new header is defined for peer requests, as follows:

X-WR-PEER: [tok=<tok>[[,<tok>]...]]

This header serves two purposes on requests:

Identifies that the request is a peer fill request.
Provides the tokens, if any, which should be compared with the recipient's known table to determine if a peer cache fill is safe.

On replies, the header syntax is:

X-WR-PEER: [tok=<tok>]

The header in this case is used to communicate the invalidation token last used to invalidate the resource on the recipient back to the sender.
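For illustration (using the simplified decimal tokens of Section 3), a peer fill request whose sender has seen tokens 0:9 and 1:5 would carry:

X-WR-PEER: tok=0:9,1:5

and a reply reporting that the resource was last invalidated on the recipient with token 1:5 would carry:

X-WR-PEER: tok=1:5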

Finally, the extended status codes have been modified to contain peering information. The previously unused second decimal digit of the extended status code is now encoded as a 3-bit bitfield (max value of 7), as follows:

bit 1 - Set if the resource was last filled/refreshed from a peer.
bit 2 - Set if the resource is being served to a peer.
bit 3 - Set if the peer fill request forced a cache miss due to the presented tokens.
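For illustration, the field could be decoded as in the following sketch (Python; the constant names are hypothetical, bit 1 is assumed to be the least-significant bit, and extraction of the digit from the full status code is left to the caller):

    # Sketch: interpret the 3-bit peering field carried in the second
    # decimal digit of the extended status code (a value from 0 to 7).
    FILLED_FROM_PEER = 0x1   # bit 1
    SERVED_TO_PEER   = 0x2   # bit 2
    FORCED_MISS      = 0x4   # bit 3

    def peer_flags(digit):
        assert 0 <= digit <= 7
        return {
            'filled_from_peer': bool(digit & FILLED_FROM_PEER),
            'served_to_peer':   bool(digit & SERVED_TO_PEER),
            'forced_miss':      bool(digit & FORCED_MISS),
        }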

4.3 Distributor Modifications

The distributor has been modified, as follows:

Initialize the known and seen tables at startup and enable peering requests/responses. The tables are initialized based on data in the distributor's existing InvalidatedTable (part of the journalling mechanism). The FPMCP peerstate command is used for this purpose.

Construct/provide the token associated with an invalidation when issuing an invalidate command.

Update the known table when an invalidation completes and it represents the next invalidation in sequence or causes a hole to be closed. This is also done using data from the distributor's InvalidatedTable.

Reinitialize the known and seen tables when a distributor is removed from the Successor List. (Invalidations from that distributor are no longer considered in peering decisions.)

Reinitialize the known and seen tables if a journal is lost on a distributor in the Successor List.

Reinitialize the known and seen tables if a complete cache flush is needed (due to missed, unrecoverable invalidations).

4.4 Implementation

Footprint uses a heavily modified version of the Squid proxy cache. Squid has a relatively feature-rich peering capability which implements flat and hierarchical peering with querying and a proxy-only option. Squid uses ICP (Internet Cache Protocol, documented in RFC 2186 and RFC 2187) as the querying mechanism. (This does not appear to be a standards-track protocol at this time. The RFCs are in the "informational" category and were last updated in 1997.)

Footprint's support for invalidation on demand, which is one of the features added to Squid, made it impossible to use Squid's peering implementation as is. (There were other barriers as well, but this was the major one.) The basic problem arises when one cache which has processed an invalidation attempts a fill from a peer which has yet to process the invalidation. The result is that the requesting cache gets a stale copy of the resource stuck in its cache, since it already processed the invalidation. Obviously, any peering solution must address this issue.

The Footprint peering implementation leverages Squid peering capabilities by extending the ICP protocol to include information about invalidation state, and by extending FPMCP (Footprint Managed Cache Protocol) to allow invalidation state information to be maintained by the caches themselves. This information allows a cache which receives an ICP query to determine whether it may safely service the request. The implementation supports direct cache-to-cache peering. There is no distributor involved in handling peer fill requests or queries.

 

5. Software Installation


Note: The final software installation instructions are not yet available. The procedures that follow in this section are incomplete and may not reflect the final installation requirements.


To install Footprint's release 2.02, perform the following steps:

1. Stop the secondary master.
2. Install new ssl-proxy software (104) via pkgadd.
3. Update the master to 251.96.
4. Restart the secondary master.

If the secondary master restarts as required, continue through the following steps:

5. Stop the tertiary master(s).
6. Install new ssl-proxy software (104).
7. Update to 251.96.
8. Start the tertiary master(s).

When the tertiary master(s) restarts correctly, perform the final steps:

9. Stop the primary master.
10. Install new ssl-proxy software (104).
11. Update to 251.96.
12. Start the primary master.

 

6. Configuration


Note: Some of this data may change prior to deployment due to issues discovered during the QA cycle, implementation of extra features, removal of restrictions, etc. This section is intended for illustrative purposes only. Detailed and current information will be provided prior to deployment.


Peering is configured using Squid's existing peer configuration facilities. Note that Squid provides more configuration options than will be tested and approved by engineering. Only the approved configuration options should be used.

The peer configuration resides in the squid.conf configuration file. A peer definition looks like this:

cache_peer <hostname> <type> <HTTP port> <ICP port> [options]

The hostname is the name of the peer. The type indicates whether the peer is a parent or sibling. The HTTP port is the port on which HTTP requests should be sent to the peer (note that this will probably not be port 80 initially due to complications with the Alteon configuration and VIPs) and the ICP port is the port on which ICP requests should be sent. (The port data and recommended configurations will be provided prior to deployment.) The only option allowed currently is "proxy-only" which implements the proxy-only behavior described previously.

To maintain consistency in the squid.conf file across the network, canonical neighbor names should be used for the sibling caches.


Note: This is leveraging work which is being done to ease the current NTP configuration.


A canonical neighbor name is a name like agent03, which within a cluster resolves to the address of the 03 agent in that cluster. A prototype squid.conf peering section for intra-cluster peering may look like the following:

cache_peer agent01 sibling 8803 8809 
cache_peer agent03 sibling 8803 8809 
cache_peer agent05 sibling 8803 8809 
cache_peer agent07 sibling 8803 8809 
cache_peer agent09 sibling 8803 8809 
cache_peer agent11 sibling 8803 8809 

On the actual 01 agent, the agent01 line needs to be commented out, and on the 03 agent, the agent03 line needs to be commented out, etc. (Note: this restriction will probably be removed before deployment.) However, a single peering section for 01, 03, 05, 07, 09 and 11 agents will be valid network-wide if the canonical neighbor name scheme is adopted. There is no requirement to comment out lines which refer to non-existent agents within a cluster. If the name doesn't resolve, Squid will not use that peer. If the name resolves to a non-responsive IP address, Squid will quickly detect the non-responsive peer and won't expect to receive any responses from it.

The initial parent cache configuration is expected to be static and will apply to all agents. The current plan is to set up a relatively small set of parent caches (on the order of 10 to 20) for use by the entire network. These may also use canonical names which are aliases for the selected parent cache distributors. The parent configuration section may look like:

cache_peer parent1.fp.sandpiper.net parent 8803 8809 
cache_peer parent2.fp.sandpiper.net parent 8803 8809 
cache_peer parent3.fp.sandpiper.net parent 8803 8809 
[...] cache_peer parent10.fp.sandpiper.net parent 8803 8809

As mentioned previously, the initial peering configuration will be essentially static. Future releases are expected to support dynamic reconfiguration of peering in response to changing network conditions.

There's a new access control list (ACL) called peer_request. A peer request uses a special header and elicits a more detailed response from the cache, including the Footprint metadata associated with the resource. This ACL may be used to prevent the data from being requested by unauthorized sources. Also, the existing icp_access ACL will need to be configured to enable peering. Details on both ACLs will be provided prior to deployment.

 

6.1 Staged Adoption

Although the release will support peering, adoption will be staged. Initially, only intra-cluster peering will be configured, and then only within a few clusters. Once the system has proven itself and operational experience has been gathered, some parents will be added. Eventually, peering will be adopted network-wide. There is no requirement to adopt peering simply because the release is deployed (it contains other important features and fixes), and a mixed network of older agents which don't support peering and newer agents which do is transparently supported.

6.2 Externally Visible Effects and Operational Considerations

Beyond improved performance and reliability for cache fills and reduced load on origin servers, the principal externally visible effect of peering will be in request logging:

When a resource is filled from or refreshed from another distributor, the distributor ID logged in the access log will be the ID of the peer, not the ID of the local distributor. (This behavior is non-optional.)
The second digit of the extended status code will contain information about peering. This is a bitfield, as follows (PRELIMINARY):
Bit 1 - Resource was last filled/refreshed from a peer.
Bit 2 - Resource is being served to a peer.
Bit 3 - Resource is a forced cache miss due to invalidation state. (This occurs when a request which otherwise could have been serviced from the cache is instead forwarded to a parent or an origin server because the invalidation state indicates it is unsafe to service the peer fill request; this should be very rare.)

This feature is optional and is enabled via an entry in squid.conf. The expected default is for the feature to be enabled. The only reason to disable it is if there are consumers of logs which will be confused by the data in the second digit, and engineering is currently evaluating this.

Logging of ICP queries: This is an existing Squid feature which may be disabled by placing the following in squid.conf:

 log_icp_queries off 

Because these appear in the Squid access log directly and would confuse log analysis, we expect to disable them for deployment.

Another externally visible effect appears in the audit log. When an invalidation is issued, the FPMCP command to the cache is logged in log group 'c'. This command will now include an extra argument which is the invalidation token associated with the invalidation being performed. This is part of the mechanism which allows peering to be used at all.

A new command has been added to the FPMCP protocol and when issued it will be logged in the cache.log (as all the FPMCP commands are). The command is peerstate and is used to enable peering and set up the tables used to maintain invalidation state.

The FPMCP protocol number has been bumped to 2. Backward compatibility exists in both Squid and the distributor core: it is possible to use the new distributor core with an older version of Squid, and to use a new version of Squid with the prior release of the distributor core.

There's a new header called X-WR-PEER. This header is used on peer fill requests to:

Identify that the request is indeed a peer fill request
Provide the invalidation state information necessary to service the request. The header is also used on the response to provide invalidation state information back to the requesting cache.

Finally, there is one very important operational consideration (which exists in today's deployed network, but which bears repeating and is even more important in the context of peering): If an agent is removed from the Successor List, it is important to determine whether that agent had issued any invalidations which were not processed to completion by the network. If so, those invalidations need to be re-issued from another master; if they are not, resources which would have been invalidated may become stale in some caches, even caches which had processed the invalidation. This is because removal from the Successor List causes the caches to de-allocate their state information about invalidations originating from that source. If there were no outstanding invalidations from the removed agent, there is no issue. There is also no correctness issue if a down agent is left on the Successor List, although it may adversely affect peering performance if there were outstanding invalidations.

 

7. Troubleshooting

Over 33 defects have been fixed in this version. The following are some issues to watch for when peering is enabled. Note that this section describes failure modes which have been anticipated in the design and which will be tested for during qualification. None of the problems below are expected to occur in normal operation:

Stale resources: The major barrier to the adoption of peering was the lack of a mechanism to handle invalidation on demand while also supporting peering. The issue (as described previously) was that in some instances it was possible for a stale resource to get stuck in a particular cache even though it had processed an invalidation. Although the issue has been addressed, it's possible there is a bug in the design or implementation. Of course, this and all issues listed here will be tested for carefully as the peering solution is qualified.

Peering loops: A peering loop occurs when a cache asks a sibling for a resource, the sibling doesn't have it and so asks another, different sibling, and that sibling turns around and asks the original requestor. The design doesn't allow this to occur, but again, there could be a bug. If this were to occur, the effect would be a flurry of ICP activity until the original request finally timed out. (It would probably show up as 5xx-series errors on the NOC page.)

Memory leaks: There are new data structures being maintained by Squid. Again, leaks will be carefully tested for but after initial adoption, monitoring Squid's memory usage is a good practice (as it is for any new release).

7.1 Significant Error Messages

Obtain information on the new error messages at the following location:

http://comanche.sandpiper.net/leapfrog/doc/release_notes.cgi?build=251.91-251.96&option=NewMessages 

7.2 Fixed Defects For Release 251.96

Obtain information on the fixed defects for the release 251.96 at the following location:

http://comanche.sandpiper.net/leapfrog/doc/bugs_by_build.cgi?build=251.91-251.96 

Note: You must log in as ops with the password ops.


7.3 New Configuration Variables

Obtain information on the new configuration variables at the following location:

http://comanche.sandpiper.net/leapfrog/doc/release_notes.cgi?build=251.91-251.96&option=NewConfigVar 

7.4 New CUI Commands


Note: The masterMigrator command has been removed.


Go to noc.digisle.com to learn about the following commands:

peer
iqueue
lastload
subscriber
refresh (modified)

7.5 Additional Notes

4231: SubscriberTable bindings.

4262: New character for invalidation string.

4290: New value added to the SubscriberTable/resource headers.

4321: You may get the following email alert even though the distributor is not configured as a peer. This is a generic alert and as such, may or may not apply to cache peering. You should find out why squid was restarted, however, and take appropriate action.

Alarm from:

distributor/60107 at a60107.sandpiper.net/211.36.242.167

Cache reset: *** CACHE RESTARTED ***

Successful (but peer responses are disabled)

 

8. Future Enhancements

Future directions for cache peering include:

Better gathering/reporting of peering statistics.

Monitoring tools.

More flexible configuration capability, including on-the-fly reconfiguration, both human-driven and automatic, in response to changing network status.

More intelligent selection of parent caches, perhaps by using the distributor core for inter-cluster peering rather than going directly from cache to cache.

Qualification and adoption of more of Squid's existing peering features.

Tighter granularity on the invalidation state information so that invalidations issued by one customer don't affect the ability to do peer cache fills for another customer.
