Contents
1. Introduction
2. Cache Peering Review
2.1 Basic Peering
2.2 Hierarchical Peering
2.3 Proxy-only Peering
3. Coherent Peering Extensions
3.1 Background
3.2 Requirements
3.3 Implementation
3.3.1 Invalidation Tokens
3.3.2 Basic Mechanism
3.4 Optimizations
3.5 Avoiding Invalidations after a Peer Fill
3.6 Avoiding Races and a Note on Atomicity
4. Footprint 2.02 Implementation
4.1 Invalidation Tokens
4.2 Squid Modifications
4.3 Distributor Modifications
4.4 Implementation
5. Software Installation
6. Configuration
6.1 Staged Adoption
6.2 Externally Visible Effects and Operational Considerations
7. Troubleshooting
7.1 Significant Error Messages
7.2 Fixed Defects For Release 251.96
7.3 New Configuration Variables
7.4 New CUI Commands
7.5 Additional Notes
Build 251.96 is now the officially released Footprint version 2.02, with over 33 defects fixed since the previous release. The main feature of this release is the activation of an enhancement called "Cache Peering." Although this feature has been tested on the B network of approximately 80 computers, you must err on the side of caution when configuring the production network. Also, you must update the master candidates and load them with the appropriate SSL-Proxy software. Section 5, Software Installation, suggests a basic upgrade strategy.
In addition, Section 3. Coherent Peering Extensions describes at the architectural level the various extensions to Squid that support Coherent Peering. Coherent Peering is a mechanism that safely allows caches to be peered while supporting asynchronous invalidation of content.
The intended audience comprises Footprint operations personnel and quality assurance personnel. This document assumes basic familiarity with the Footprint system and hierarchical cache peering concepts.
Note: This document does not address certain issues associated with peering, including interactions between peering and rewritten content. Similarly, detailed technical information about the internals of the peering implementation itself is beyond the scope of this document. Digital Island will address these issues in a separate document.
This section covers general cache peering concepts and provides some information
specific to the Footprint implementation. Generally speaking, a cache provides
local copies of a resource that originate from a distant location. The purpose
of a cache is to increase the speed of access to data. For the cache to be efficient,
the data in question must be requested frequently, and change at a rate less
than the request rate.
The higher the ratio of request frequency to change frequency, the more efficient
the cache will be. Any cache established for the first time is empty. When receiving
a request, the cache checks whether it has a valid copy of the resource. If such
a copy exists, it is called a cache hit, and the cache satisfies the request
from its local copy. If the resource does not exist in the cache, this cache
miss redirects the request to the origin point. To improve
speed and immediacy, Digital Island configures local consumers to request data
from the cache rather than from the origin point.
Footprint uses caching for HTTP (and FTP) resources. Each content distributor has a cache. Using proprietary rendezvous technology, Footprint directs the browser client's requests for resources to the content distributor closest to the requestor. When receiving a request, the cache checks for the presence of a local copy of the resource and, if one is present, sends it to the requestor. Such a resource is called a cached resource. If the resource is not present, however, the cache redirects the request to the customer's Web server for the resource. The customer's Web server is the origin point, that is, the origin server. When a cache retrieves a resource from an origin server, the process is called a cache fill, since the cache now has its own copy of the resource for future requests.
A major complication in caching is the maintenance and refreshing of the cached data. A resource may change on the origin server, but without a provision for freshness maintenance, caches will continue serving copies of the old resource. Out-of-date resources are called stale. Even while the up-to-date resource exists at the origin server, the cache would continue serving old copies long after the resource has grown stale.
Footprint provides two basic mechanisms for freshness maintenance: expiration policies and Invalidation on Demand. An expiration policy tells a cache how long (or until when) it may continue to serve a resource without checking the origin server for updates. When a request for an expired resource arrives, the cache makes a refresh request to the origin server to check the resource's freshness. If the resource has changed, the origin server provides either a fresh copy of the resource or a new expiration for the old resource if it remains valid.
Invalidation on Demand is a mechanism with which a customer can trigger the immediate expiration of a resource or set of resources, regardless of whatever expiration had previously been set. When a cache receives a request for an invalidated resource, it performs the aforementioned freshness check with the origin server.
The above caching system suffers from a significant scalability problem. Although the number of caches may grow, essentially without limits, there remains only one origin server. If a frequently requested resource on a large number of caches becomes invalidated, or if a burst of requests for a brand new resource arrives, the load on the origin server could exceed its capacity, resulting in slow or failed responses to browser clients.
Basic peering is a mechanism that allows caches to perform their fills from other caches in the network instead of, or in addition to, the origin server. Although the first new copy of the resource must come from the origin server, subsequent copies can be retrieved from other caches as well. This reduces the load on the origin server and reduces the response time for cache misses, in many cases quite significantly.
The most basic form of peering is flat peering, typically used within a cluster of caches at a single location. In this model, each cache is a sibling of the other caches in the system. For example, when a request results in a cache miss, the cache requests the resource from its nearest sibling. Of course, if the sibling does not have the resource, the cache contacts the origin server. Subsequent requests for the same resource, arriving at other caches, can retrieve the resource from the first cache to acquire it. This reduces the load on the origin server by requiring only one fill per cluster for the resource, rather than one fill per cache. It also reduces the WAN bandwidth utilization at the cluster, since the peered requests and responses are transmitted over the local LAN and do not leave the cluster. Finally, it improves response time for the second and subsequent cache misses at the cluster.
The above is an example of intra-cluster peering. Footprint also uses flat peering between clusters (inter-cluster peering), that is, with cache siblings in remote clusters. This type of peering frequently uses a query mechanism which precedes the actual fill request. When a cache receives a request which results in a miss, it sends a query to each of its siblings and waits for a response. The actual cache fill is performed against the first sibling to respond with a positive indication that it indeed has a copy of the resource.
When a request arrives from a sibling and results in a cache miss, the request must be forwarded directly to the origin server rather than to any of the other siblings. Otherwise, a forwarding loop could result. This can occur even with the use of the query mechanism, due to the window between the query and the actual request. During this window, the resource may expire or be evicted after the cache has provided a positive response to the query. There are other mechanisms for avoiding forwarding loops, but the simplest policy is to avoid using siblings to respond to requests arriving from other siblings.
Another complementary form of peering is hierarchical peering. In addition to the above sets of caches with sibling (i.e. flat peering) relationships, there are parent caches in the system. When a request arrives, which cannot be satisfied by any of a particular cache's siblings, the cache forwards the request to one or more of its parent caches, rather than to the origin server. A cache's parent might have siblings of its own that it can use to satisfy a cache fill request.
If configured properly, hierarchical peering strictly controls the load on the origin server, which is at the very top of the hierarchy. In this case, the maximum load possible on the origin server is equal to the load placed there by the parent caches at the immediate, next level of the hierarchy. Other requests are handled by caches rather than the origin server.
Proxy-only peering is generally used between sibling caches which reside in the same cluster and which have a low-cost communication link between them. In this instance, after a sibling with a copy of the resource is identified, the resource is passed through the cache without the cache retaining its own copy of the resource.
This is useful for conserving resources, such as memory and disk space, by keeping only one copy of the resource within a cluster. The negative impacts are longer response times, a higher load on the caches (since they will receive repeated requests for the same resource from their siblings), and the possibility that the one cache holding a particular resource becomes unavailable, forcing another long-distance cache fill.
This section describes at the architectural level various extensions to Squid that support "coherent peering." Some details of the implementation in Footprint Release 2.02 are also provided. Coherent peering is a mechanism which allows caches to be peered safely while supporting asynchronous invalidation of content.
Note: This document does not address certain issues associated with peering, including interactions between peering and rewritten content. Digital Island will address these issues in a separate document.
Footprint supports on-demand invalidation of content held in the Footprint caches. Invalidation in the caches is asynchronous in that each cache acts on invalidation commands as they are received and there is no feedback about the completion of invalidations to a central controlling authority.
The problem with asynchronous invalidation in its current implementation is that it does not allow for cache peering. For example, a cache which has already processed an invalidation can request the resource from a peer cache that has not yet processed it. This could load a potentially stale resource into the cache which had already processed the invalidation, thereby allowing stale content to be served from that cache. The coherent peering mechanism described here avoids that problem while still allowing invalidations to be processed asynchronously.
The basic requirement for coherent peering is that, once an invalidation has been processed by a specific cache, that cache will not load or serve stale content, that is, content which hasn't been refreshed or reloaded from the origin server after the invalidation was started. Customers are expected to update their content before performing the invalidation.
There is no requirement that the algorithm perform absolutely optimally in the face of a rapidly changing invalidation environment. That is, it is acceptable to fill from the origin server even if a particular peer may also have been able to safely satisfy the request. The algorithm must err on the side of safety. There is also no requirement that a resource be invalidated the same number of times a user invalidates it, only that the last invalidation issued by a user is reliably processed. The invalidation mechanism may deliver some invalidations out of order, particularly while an agent is synching with an invalidation source. An earlier-issued invalidation for a given resource may be skipped if it can be reliably determined that a later-issued one has already been processed.
Squid generally supports cache peering using a protocol called Internet Cache Protocol (ICP). This is a datagram-based (UDP) protocol wherein a cache can query its peers to determine whether a resource is present on the peer. Upon receipt of a positive response from a peer, the cache requests the resource from that peer rather than the origin server. If no positive responses are received within a small time-out period (dynamically adjusted based on historical response times), the cache requests the resource from the origin server or from a parent cache.
The coherent peering mechanism extends ICP to allow data about invalidation state to become part of the hit or miss decision made by peers in response to ICP requests. The mechanism also includes transmission and consideration of this state information with the actual peer fill request. This is needed to avoid a race condition:
The invalidation state could change after the ICP request/response cycle is complete and a peer has been selected for the fill.
Peer fills with this mechanism proceed directly from cache to cache. The distributor core is not involved in the peer fill mechanism itself.
Invalidation state information is represented by invalidation tokens. These tokens have two parts: a source part and a sequence part. The source part associates the token with a specific invalidation source. This is necessary because invalidations in the Footprint system can originate from multiple sources, since the master in charge of invalidation dissemination can change at any time. The sequence part is a sequence number assigned by the invalidation source to each invalidation such that a higher sequence number is known to represent a later invalidation than a token from the same source with a lower sequence number. Note that there is no requirement that sequence numbers increase monotonically, only that a higher sequence number represents a later-issued invalidation.
To support different invalidation tagging schemes, tokens are defined to contain (currently) up to 32 bytes of data, with the lengths of the source and sequence portions being specified when the token is presented. As long as the total amount of data needed doesn't exceed 32 bytes, the range of sources and sequence numbers can be changed at will to support different invalidation mechanisms implemented in the distributors.
In readable form, tokens are represented as:
<source>:<sequence>
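The two-part token and its readable form can be sketched as follows. This is a hypothetical illustration; the class and method names are not Footprint's actual code.

```python
# Hypothetical sketch of the two-part invalidation token described above.
from typing import NamedTuple

class Token(NamedTuple):
    source: str    # identifies the invalidation source
    sequence: int  # higher numbers represent later-issued invalidations

    @classmethod
    def parse(cls, text: str) -> "Token":
        # Readable form is "<source>:<sequence>", e.g. "0:10".
        source, sequence = text.rsplit(":", 1)
        return cls(source, int(sequence))

    def __str__(self) -> str:
        return f"{self.source}:{self.sequence}"
```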
Whenever a distributor issues an invalidation to the cache, it provides the
token associated with the invalidation. The distributor is also responsible
for initializing the known and seen tables in the cache at startup. It is not
safe to allow a cache hit for a peer fill until these tables have been initialized.
The basic mechanism involves the maintenance of two sets of invalidation tokens
by each cache. The first set is called the known set. This set of tokens
represents invalidations such that all invalidations from the same source with
the same sequence number or lower are known to have been processed to completion.
This set of tokens is updated by the distributor when invalidations are completed
by the cache.
The second set is called the seen set. This set of tokens represents
invalidations such that any invalidation from the same source with a higher
sequence number is known not to have been seen by that particular cache. This
set of tokens is updated by the cache as invalidations are started.
The basic question the recipient of a peer fill request must answer is, "Have
I not yet processed to completion an invalidation which has already been seen
by the sender?" If the answer to that question is yes, the peer should not allow
the fill to proceed, or it must retrieve a fresh resource from the origin server
or parent cache to satisfy the peer's request.
To accomplish this, the sender sends its complete set of seen invalidations
(one token per invalidation source) to the peer along with the ICP request as
well as with the actual peer fill request. The peer compares this data with
its known invalidations.
For each token sent by the requestor, the recipient looks up the token in
its known table (the lookup is based only on the source portion of the token,
not the sequence number of course) and compares its sequence number to the sender's.
In order to safely satisfy the peer fill request, there must be no tokens in
the recipient's known table which have a sequence number less than the sender's.
Further, if no token for a particular source sent by the sender is found in
the known table, then the peer fill may not proceed.
In a more compact notation, the basic test is: for every token S in the sender's
seen set, there must be a token K in the recipient's known table with
source(K) = source(S) and sequence(K) >= sequence(S). This ensures that the
sender has not seen an invalidation which is not known to have been processed
on the recipient.
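The basic test can be sketched in Python. This is a hypothetical illustration (the function name and the modeling of each table as a dictionary mapping a source to a sequence number are assumptions, not Footprint's code):

```python
# Hypothetical sketch of the peer-fill safety test described above.
# Both tables map an invalidation source to a sequence number.
def fill_is_safe(sender_seen: dict[str, int],
                 recipient_known: dict[str, int]) -> bool:
    for source, seen_seq in sender_seen.items():
        known_seq = recipient_known.get(source)
        if known_seq is None:
            # The sender has seen invalidations from a source the
            # recipient knows nothing about: the fill may not proceed.
            return False
        if known_seq < seen_seq:
            # The sender has seen an invalidation the recipient has not
            # yet processed to completion.
            return False
    return True

# Using the examples from the text: known {0:10, 1:20} covers seen
# {0:9, 1:5}, but not seen tokens from an unknown third source.
assert fill_is_safe({"0": 9, "1": 5}, {"0": 10, "1": 20})
assert not fill_is_safe({"2": 1}, {"0": 10, "1": 20})
```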
For example, assume that the recipient's known table contains the tokens 0:10
and 1:20, and that the sender has the tokens 0:9 and 1:5 in its seen table,
which it sends along with its request.
The recipient is known to have processed all invalidations from source "0",
which have sequence numbers of 10 or less, and all invalidations from source
"1" with sequence numbers of 20 or less. The sender has not seen (started)
any invalidations from source 0 with a sequence number greater than 9, or from
source 1 with a sequence number greater than 5. It is currently behind the recipient.
The peer fill is safe in this case.
If there is an outstanding invalidation for the resource, it must have a sequence
number greater than 10 from source 0, 20 from source 1, or it must be from a
brand new source. Since the sender hasn't processed invalidations beyond 0:9
or 1:5, and hasn't started any invalidations from a third source, the peer fill
is safe. If there is an outstanding invalidation, it will be processed by the
sender when it is received.
If the sender has a third source in its seen table, the peer fill is
not safe. The resource could have been invalidated with that particular invalidation,
and the recipient of the request hasn't yet processed the invalidation. Stale
data could be loaded into the sender in this case. The same argument applies
if the sender has seen any invalidations with higher sequence numbers than the
known-complete invalidations on the recipient.
For example, if the recipient's known table again contains 0:10 and 1:20,
and the sender's seen table contains 0:12 and 1:5, the peer fill may not proceed.
The resource may have been invalidated with invalidation 0:11 or 0:12, both
of which may have already been processed on the sender, but which haven't been
processed by the recipient.
The basic mechanism can be optimized to allow peer fills to proceed in cases
where the basic mechanism would disallow them. One of these optimizations is
accomplished by having the cache keep track of the token used to invalidate
each individual resource. If the sender of a peer fill request has previously
invalidated the resource, it may use the token associated with that invalidation
to "prune" its seen table.
For example, if a sender's global seen table contains the following
tokens:
but the last invalidation for the particular resource being requested was invalidated
using:
as the token, the sender may replace the token from source 1 in its seen table
with the token on the resource - since that token contains the highest sequence
number from that source which is relevant to that particular resource.
This optimization is possible because the sender has a copy of the resource
with which it can associate more information, namely, the invalidation token
last used to invalidate the resource or the fact that the resource has never
been invalidated by that cache.
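The pruning step can be sketched as follows. This is a hypothetical illustration with made-up token values; the function name and table representation are assumptions:

```python
# Hypothetical sketch of the seen-table pruning optimization. If the sender
# last invalidated the requested resource with a known token, that token
# bounds the sequence numbers relevant to this resource, so the global seen
# entry for that source may be replaced for this request only.
def prune_seen(global_seen: dict[str, int],
               resource_token: tuple[str, int]) -> dict[str, int]:
    source, sequence = resource_token
    pruned = dict(global_seen)
    if source in pruned and pruned[source] > sequence:
        pruned[source] = sequence  # lower the bound sent with this request
    return pruned
```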
In general, the mechanism may be optimized by maintaining data about seen
invalidations at higher levels of granularity. For instance, if the seen
table were maintained on a per-migrator basis, then only the seen tokens
for invalidations associated with that particular migrator would need to be
sent in peering requests. Or, for even tighter granularity, the seen
table could be maintained on a per-resource basis. One attractive aspect of
the mechanism is that a sender may elect to perform these optimizations without
the knowledge or cooperation of the recipient. The recipient's processing remains
the same, driven by data from the sender.
The token mechanism may be used to avoid performing a needless invalidation.
The case where this arises is one where there is an outstanding invalidation
for a resource which has not yet been processed on the sender but has been processed
by the recipient. In the normal case, the invalidation will probably be processed
on the sender either during or shortly after the peer fill. If the version of
the resource provided by the recipient has already been invalidated (and refreshed,
of course; stale cache hits are never allowed) by that invalidation, then the
sender doesn't need to invalidate that specific resource when it processes the
invalidation. Therefore, the recipient provides the last token used to invalidate
the resource as part of its peer fill response. If an incoming invalidation
on the sender matches this token, the invalidation may be safely skipped by
the sender for that resource only.
There is a race between the ICP request, incoming invalidations, and the actual
peer fill request. It's possible that the state information sent with the ICP
request will have changed by the time the actual peer fill request is sent.
To handle this, the state information is sent along with the actual peer fill
request and the recipient checks again whether it may safely satisfy the request.
If the answer is "no," the recipient refreshes the resource itself directly
with the origin server or a parent cache and provides the fresh resource to
the sender. The recipient never uses a sibling peer in this case. This avoids
"flapping" and limits the latency associated with the fill.
A note on atomicity: When a fill or refresh is needed, a new
cache entry is created and placed in the cache's table as the first step. This
entry must have the basic data needed to determine whether an invalidation would
invalidate the resource. If the entry is hit by an invalidation before
sufficient data is received to determine whether the invalidation can be skipped,
then the entry must be immediately expired. Only the browser clients currently
requesting the resource will receive that version of the resource. New requests
will go through the peer fill mechanism as before.
If it is not possible to associate enough information with the new cache entry
to ensure that an incoming invalidation will hit it if appropriate, then invalidations
which arrive while the fill is in progress must be attached to the resource
and be replayed against it after sufficient data is received. This is necessary
in the case where metadata on the reply is needed to determine if an invalidation
hits a particular resource.
This section provides information on the Footprint 2.02 implementation of coherent
peering.
The previous text described invalidation tokens in general. In version 2.02,
invalidation tokens total 20 bytes, as follows:
For the source portion, the token contains the IP address of the invalidation
source together with the start time of that source's journal (in milliseconds
since the epoch). For the sequence portion, the token contains the sequence
number assigned to the invalidation.
So, if a particular invalidation source has IP address 63.209.70.231 and its
journal started on July 5th, 2000 at 17:34:34.000 GMT (which is 962,818,474,000
milliseconds since the epoch), the source portion of the token would be:
If that source issued an invalidation with a sequence number of 9722, the entire
token would be:
The journal date must be included in the source identification because if a
journal is lost, the sequence numbers reset to 1.
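One plausible packing of such a 20-byte token is sketched below: a 4-byte IPv4 address plus an 8-byte millisecond timestamp for the source portion, then an 8-byte sequence number. The field widths, byte order, and function name are assumptions here, not Footprint 2.02's actual encoding.

```python
# Hypothetical 20-byte token layout: 4-byte IPv4 address + 8-byte journal
# start time (ms since the epoch) as the source, then an 8-byte sequence
# number. Widths and byte order are assumptions, not Footprint's spec.
import socket
import struct
from datetime import datetime, timezone

def pack_token(ip: str, journal_start: datetime, sequence: int) -> bytes:
    journal_ms = int(journal_start.timestamp() * 1000)
    source = socket.inet_aton(ip) + struct.pack(">Q", journal_ms)
    return source + struct.pack(">Q", sequence)

token = pack_token("63.209.70.231",
                   datetime(2000, 7, 5, 17, 34, 34, tzinfo=timezone.utc),
                   9722)
assert len(token) == 20
```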
Most of the modifications for coherent peering are in the Squid cache itself.
The cache maintains the known and seen tables internally. The known table is
only updated on demand by the distributor (see below). The seen table is updated
when invalidation requests are received.
To support this, the FPMCP invalidate command has been extended to take an
additional argument:
This is the token associated with the invalidation.
To initialize the known and seen tables and to enable peering requests and
responses, a new FPMCP command peerstate has been added. The possible
arguments to the peerstate command are:
Enable (on) or disable (off) peering requests. Peering requests are disabled
when the cache first comes up. It is not safe to enable peering requests until
the seen table has been initialized.
Enable (on) or disable (off) peering responses. When peering responses are
disabled, all ICP queries are responded to with a miss indication and received
peer requests are not allowed to hit in the cache. It is not safe to enable
peering responses until the known table has been initialized.
The known table is set to contain only the given tokens. If no tokens are provided,
the known table is cleared.
The seen table is set to contain only the given tokens. If no tokens are provided,
the seen table is cleared.
The given tokens (if any; passing none is a no-op but is supported) are merged
with the existing known table: an incoming token replaces any token in the
known table that has the same source portion.
The given tokens (if any; passing none is a no-op but is supported) are merged
with the existing seen table: an incoming token replaces a token in the seen
table when it has the same source portion and a higher sequence number.
The given tokens are used to perform a pruning merge of the seen table:
an incoming token replaces the token in the seen table that has the same source
portion, even when the incoming sequence number is lower, supporting the pruning
optimization described previously.
Squid's ICP implementation has been extended to provide a new ICP op code,
ICP_QUERY_INV.
This stands for "ICP query with invalidation tokens." The message contains
invalidation tokens which are checked against the cache's known table in order
to determine whether a "hit" response is allowed. (The normal check for a cache
hit is of course also performed.) Also, in response to ICP_QUERY_INV, the ICP
implementation will always respond with a miss if peering responses haven't
been enabled using the peerstate command.
Squid's peer selection algorithm has been modified such that it is disabled
if peering requests have not been enabled with the peerstate command.
A new header is defined for peer requests. On requests, this header serves two
purposes: it identifies the request as a peer fill request, and it carries the
sender's seen invalidation tokens. On replies, the same header is used to
communicate the invalidation token last used to invalidate the resource on the
recipient back to the sender.
Finally, the extended status codes have been modified to contain peering information.
The previously unused second decimal digit of the extended status code is now
encoded as a 3-bit bitfield (maximum value of 7), as follows:
bit 1 - Set if resource was last filled/refreshed from a peer.
The distributor has been modified, as follows:
Initialize the known and seen tables at startup and enable
peering requests/responses. The tables are initialized based on data in the
distributor's existing InvalidatedTable (part of the journalling mechanism).
The FPMCP peerstate command is used for this purpose.
Construct/provide the token associated with an invalidation when issuing
an invalidate command.
Update the known table when an invalidation completes and it represents
the next invalidation in sequence or causes a hole to be closed. This is also
done using data from the distributor's InvalidatedTable.
Reinitialize the known and seen tables when a distributor
is removed from the Successor List. (Invalidations from that distributor are
no longer considered in peering decisions.)
Reinitialize the known and seen tables if a journal is lost
on a distributor in the Successor List.
Reinitialize the known and seen tables if a complete cache
flush is needed (due to missed, unrecoverable invalidations).
Footprint uses a heavily modified version of the Squid proxy cache. Squid has
a relatively feature-rich peering capability which implements flat and hierarchical
peering with querying and a proxy-only option. Squid uses ICP (Internet Cache
Protocol, documented in RFC 2186 and RFC 2187) as the querying mechanism. (This
does not appear to be a standards-track protocol at this time. The RFCs are
in the "informational" category and were last updated in 1997.)
Footprint's support for invalidation on demand, which is one of the features
added to Squid, made it impossible to use Squid's peering implementation as
is. (There were other barriers as well, but this was the major one.) The basic
problem arises when one cache which has processed an invalidation attempts a
fill from a peer which has yet to process the invalidation. The result is that
the requesting cache gets a stale copy of the resource stuck in its cache, since
it already processed the invalidation. Obviously, any peering solution must
address this issue.
The Footprint peering implementation leverages Squid peering capabilities by
extending the ICP protocol to include information about invalidation state,
and by extending FPMCP (Footprint Managed Cache Protocol) to allow invalidation
state information to be maintained by the caches themselves. This information
allows a cache which receives an ICP query to determine whether it may safely
service the request. The implementation supports direct cache-to-cache peering.
There is no distributor involved in handling peer fill requests or queries.
Note:
As of yet, we do not have the final software installation
instructions. The procedures that follow in this section are
incomplete and may or may not satisfy the installation parameters.
To install Footprint's release 2.02, perform the following steps:
1. Stop the secondary master.
If the secondary master restarts as required, continue through the following steps:
5. Stop the tertiary master(s).
When the tertiary master(s) restart correctly, perform the final steps:
9. Stop the primary master.
Note:
Some of this data may change prior to deployment due to issues
discovered during the QA cycle, implementation of extra features,
removal of restrictions, etc. This section is intended for
illustrative purposes only. Detailed and current information
will be provided prior to the deployment.
Peering is configured using Squid's existing peer configuration facilities.
Note that Squid provides more configuration options than will be tested and
approved by engineering. Only the approved configuration options should be used.
The peer configuration resides in the squid.conf configuration file. A peer
definition looks like this:
The hostname is the name of the peer. The type indicates whether
the peer is a parent or sibling. The HTTP port is the port on which HTTP
requests should be sent to the peer (note that this will probably not be port
80 initially due to complications with the Alteon configuration and VIPs) and
the ICP port is the port on which ICP requests should be sent. (The port
data and recommended configurations will be provided prior to deployment.) The
only option allowed currently is "proxy-only" which implements the proxy-only
behavior described previously.
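As a hedged sketch, a peer definition of this shape might look like the following, using Squid's standard cache_peer directive. The hostname, ports, and option shown here are placeholders, not the approved Footprint values:

```
# hostname        type     HTTP-port  ICP-port  options
cache_peer peer-host.example sibling 8080 3130 proxy-only
```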
To maintain consistency in the squid.conf file across the network, canonical
neighbor names should be used for the sibling caches.
Note:
This is leveraging work which is being done to ease the current NTP
configuration.
A canonical neighbor name is like agent03, which within a cluster resolves
to the address of the 03 agent within that cluster. A prototype squid.conf peering
section for intra-cluster peering may look like the following:
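A sketch of such a section, using the canonical neighbor names for the 01 through 11 agents mentioned below; the ports and the proxy-only option are placeholder assumptions:

```
cache_peer agent01 sibling 8080 3130 proxy-only
cache_peer agent03 sibling 8080 3130 proxy-only
cache_peer agent05 sibling 8080 3130 proxy-only
cache_peer agent07 sibling 8080 3130 proxy-only
cache_peer agent09 sibling 8080 3130 proxy-only
cache_peer agent11 sibling 8080 3130 proxy-only
```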
On the actual 01 agent, the agent01 line needs to be commented out,
and on the 03 agent, the agent03 line needs to be commented out, etc. (Note:
this restriction will probably be removed before deployment.) However, a single
peering section for 01, 03, 05, 07, 09 and 11 agents will be valid network-wide
if the canonical neighbor name scheme is adopted. There is no requirement to
comment out lines which refer to non-existent agents within a cluster. If the
name doesn't resolve, Squid will not use that peer. If the name resolves to
a non-responsive IP address, Squid will quickly detect the non-responsive peer
and won't expect to receive any responses from it.
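As a concrete illustration of the comment-out rule above, the shared intra-cluster section on the 01 agent might look like the following. (This is a sketch only; the sibling ports 8803/8809 mirror the examples elsewhere in this document, and the final port assignments will be provided before deployment.)

```
# Local agent: commented out on the 01 agent itself
#cache_peer agent01 sibling 8803 8809
cache_peer agent03 sibling 8803 8809
cache_peer agent05 sibling 8803 8809
cache_peer agent07 sibling 8803 8809
cache_peer agent09 sibling 8803 8809
cache_peer agent11 sibling 8803 8809
```

The same file can then be distributed network-wide, with only the local agent's line toggled per host.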
The initial parent cache configuration is expected to be static and will apply
to all agents. The current plan is to set up a relatively small set of parent
caches (on the order of 10 to 20) for use by the entire network. These may also
use canonical names which are aliases for the selected parent cache distributors.
The parent configuration section may look like:
As mentioned previously, the initial peering configuration will be essentially
static. Future releases are expected to support dynamic reconfiguration of peering
in response to changing network conditions.
There's a new access control list (ACL) called peer_request. A peer request
uses a special header and elicits a more detailed response from the cache, including
the Footprint metadata associated with the resource. This ACL may be used to
prevent the data from being requested by unauthorized sources. Also, the existing
icp_access ACL will need to be configured to enable peering. Details on both
ACLs will be provided prior to deployment.
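A hedged sketch of what such an ACL configuration might look like, using standard Squid directives. The address range and ACL name here are assumptions for illustration only; the approved peer_request and icp_access settings will be supplied by engineering prior to deployment.

```
# Hypothetical sketch only -- approved details will be provided before deployment
acl peers src 10.1.0.0/16      # addresses of the peer caches (assumed range)
icp_access allow peers
icp_access deny all
```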
Although the release will support peering, the adoption will be staged. Initially,
only intra-cluster peering will be configured and then only within a few clusters.
Once the system has proven itself and operational experience has been gathered,
some parents will be added. Eventually, peering will be adopted network-wide. There is no requirement to adopt peering simply due to the deployment of the release (which contains other important features and fixes); a mixed network of older agents which don't support peering and newer agents which do is transparently supported.
Beyond improved performance and reliability for cache fills and reduced load
on origin servers, the principal externally visible effect of peering will be
in request logging:
When a resource is filled from or refreshed from another distributor, the distributor ID logged in the access log will be the ID of the peer, not the ID of the local distributor. (This behavior is non-optional.)
The logging of peering information in the second digit of the extended status code is optional and is enabled via an entry in squid.conf. The expected default is for the feature to be enabled. The only reason to disable it would be log consumers that could be confused by the data in the second digit, and engineering is currently evaluating this.
Logging of ICP queries: This is an existing Squid feature which may be disabled by placing:
log_icp_queries off
in squid.conf. Because ICP queries appear in the Squid access log directly and would confuse log analysis, we expect to disable them for deployment.
Another externally visible effect appears in the audit log. When an invalidation
is issued, the FPMCP command to the cache is logged in log group 'c'. This command
will now include an extra argument which is the invalidation token associated
with the invalidation being performed. This is part of the mechanism which allows
peering to be used at all.
A new command has been added to the FPMCP protocol and when issued it will
be logged in the cache.log (as all the FPMCP commands are). The command is peerstate
and is used to enable peering and set up the tables used to maintain invalidation
state.
The FPMCP protocol number has been bumped to 2. Backward compatibility exists in both Squid and the distributor core: it is possible to use the new distributor core with an older version of Squid, and to use a new version of Squid with the prior release of the distributor core.
There's a new header called X-WR-PEER. This header is used on peer fill requests to:
Identify that the request is indeed a peer fill request.
Provide the invalidation state information necessary to service the request.
The header is also used on the response to provide invalidation state information back to the requesting cache.
Finally, there is one very important operational consideration (which exists
in today's deployed network but which bears repeating and is even more important
in the context of peering): If an agent is removed from the SuccessorList, it is
important to determine whether that agent had issued any invalidations which
were not processed to completion by the network. If so, those invalidations
need to be re-issued from another master. If not, then resources which would
have been invalidated may become stale in some caches, even caches which had
processed the invalidation. This is because removal from the SuccessorList causes
the caches to de-allocate their state information about invalidations originating
from that source. If there were no outstanding invalidations from the removed
agent, there's no issue. There's also no correctness issue if a down agent is
left on the SuccessorList, although it may adversely affect peering performance
if there were outstanding invalidations.
Over 33 defects have been fixed in this version.
These are some issues to watch for when peering is enabled. Note that this section
talks about failure modes which have been anticipated in the design and which
will be tested for during qualification. None of the problems below are expected
to occur in normal operation:
Stale resources: The major barrier to the adoption of peering was
the lack of a mechanism to handle invalidation on demand while also supporting
peering. The issue (as described previously) was that in some instances it
was possible for a stale resource to get stuck in a particular cache even
though it had processed an invalidation. Although the issue has been addressed,
it's possible there is a bug in the design or implementation. Of course, this
and all issues listed here will be tested for carefully as the peering solution
is qualified.
Peering loops: A peering loop occurs when a cache asks a sibling for a resource, the sibling doesn't have it and asks another, different sibling, and that sibling turns around and asks the original requestor. The design doesn't allow this to occur, but again there could be a bug. If this were to occur, the effect would be a flurry of ICP activity until the original request finally timed out. (It would probably show up as 5xx series errors on the NOC page.)
Memory leaks: There are new data structures being maintained by
Squid. Again, leaks will be carefully tested for but after initial adoption,
monitoring Squid's memory usage is a good practice (as it is for any new release).
Obtain information on the new error messages at the following location:
Obtain information on the fixed defects for the release 251.96 at the following
location:
Note:
You must log in as ops with password ops.
Obtain information on the new configuration variables at the following location:
Note:
The masterMigrator command has been removed.
Go to noc.digisle.com to learn about the following commands:
peer
4231: SubscriberTable bindings.
4262: New character for invalidation string.
4290: New value added to the SubscriberTable/resource headers.
4321: You may get the following email alert even though the distributor is not configured as a peer. This is a generic alert and, as such, may or may not apply to cache peering. You should find out why Squid was restarted, however, and take appropriate action.
Alarm from: distributor/60107 at a60107.sandpiper.net/211.36.242.167
Cache reset: *** CACHE RESTARTED ***
Successful (but peer responses are disabled)
Future directions for cache peering include:
Better gathering/reporting of peering statistics.
Monitoring tools.
More flexible configuration capability, including on-the-fly reconfiguration, both human-driven and automatic, in response to changing network status.
More intelligent selection of parent caches, perhaps by using the distributor
core for inter-cluster peering rather than going directly from cache to cache.
Qualification and adoption of more of Squid's existing peering features.
Tighter granularity on the invalidation state information so that invalidations
issued by one customer don't affect the ability to do peer cache fills for
another customer.
Note:
(In this document, we use simplified, decimal tokens for clarity.) In general, for a specific source, the sequence numbers should all be of the same length. If this is not the case, the mechanism considers any longer sequence portion to be greater (that is, to represent a later invalidation) than a shorter one. Again, this should not occur in practice and the case is defined only for the sake of completeness.
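The length rule above can be sketched as follows, using the simplified decimal tokens of this document. The helper name is hypothetical, not part of the actual implementation.

```python
# Compare two sequence-number portions from the same source.
# Normally both portions have the same length; a longer portion is
# treated as greater (a later invalidation) than a shorter one, a case
# defined only for completeness.
def seq_greater(a: str, b: str) -> bool:
    """Return True if sequence portion a represents a later invalidation than b."""
    if len(a) != len(b):      # abnormal case: longer portion wins
        return len(a) > len(b)
    return a > b              # same length: lexicographic compare suffices

print(seq_greater("12", "9"))   # True: longer portion is considered later
print(seq_greater("05", "10"))  # False: same length, "05" < "10"
```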
3.3.2 Basic Mechanism
safe = recipient(known) <= sender(seen)

sender seen    recipient known
0:10           0:9
1:20           1:5     (safe: every known token is covered by a seen token)

0:10           0:12
1:20           1:18    (not safe: the recipient knows of later invalidations)
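Assuming the safe-fill test requires that, for every source, the sender has seen a sequence number at least as high as the one the recipient knows about, the check can be sketched as follows (hypothetical helper name, integer sequence numbers for clarity):

```python
# Peer fill safety check: safe only if, for every source in the
# recipient's known table, the sender's seen table has a sequence
# number at least as high. A source the sender has never seen at all
# makes the fill unsafe.
def fill_is_safe(sender_seen: dict, recipient_known: dict) -> bool:
    return all(sender_seen.get(src, -1) >= seq
               for src, seq in recipient_known.items())

print(fill_is_safe({0: 10, 1: 20}, {0: 9, 1: 5}))    # True: safe
print(fill_is_safe({0: 10, 1: 20}, {0: 12, 1: 18}))  # False: not safe
```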
3.4 Optimizations
0:192
1:248
1:42
3.5 Avoiding Invalidations after a Peer Fill
3.6 Avoiding Races and a Note on Atomicity
4. Footprint 2.02 Implementation
4.1 Invalidation Tokens
A token source is a 4-byte IP address followed by an 8-byte journal date, and a full token is the source followed by an 8-byte sequence number:
source: 3fd146e7000000e02c60c630
token:  3fd146e7000000e02c60c630:00000000000025fa
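A sketch of splitting a token under the layout described above (parse_token is a hypothetical helper, not part of the product): the source is 24 hex characters (4-byte IP plus 8-byte journal date) and the sequence number is 16 hex characters.

```python
# Split a token into its IP, journal-date, and sequence components.
def parse_token(token: str):
    source, seq = token.split(":")
    assert len(source) == 24 and len(seq) == 16
    ip = source[:8]            # 4-byte IP address, hex
    journal_date = source[8:]  # 8-byte journal date, hex
    return ip, journal_date, seq

print(parse_token("3fd146e7000000e02c60c630:00000000000025fa"))
```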
4.2 Squid Modifications
tok=<token>
request=on|off
response=on|off
setknown=[<tok>[,<tok>...]]
setseen=[<tok>[,<tok>...]]
mergeknown=[<tok>[,<tok>...]]
mergeseen=[<tok>[,<tok>...]]
pmergeseen=[<tok>[,<tok>...]]
Any tokens in the seen table are removed if they do not have corresponding
(source match) tokens in the incoming token set.
ICP_QUERY_INV.
X-WR-PEER: [tok=<tok>[,<tok>...]]
Provides the tokens, if any, which should be compared with the recipients
known table to determine if a peer cache fill is safe.
X-WR-PEER: [tok=
Bit 2 - Set if resource is being served to a peer.
Bit 3 - Set if the peer fill request forced a cache miss due to the presented
tokens.
4.3 Distributor Modifications
4.4 Implementation
5. Software Installation
2. Install new ssl-proxy software (104) via pkgadd.
3. Update the master to 251.96.
4. Restart the secondary master.
6. Install new ssl-proxy software (104).
7. Update to 251.96.
8. Start the tertiary master(s).
10. Install new ssl-proxy software (104).
11. Update to 251.96.
12. Start the primary master.
6. Configuration
cache_peer <hostname> <type> <HTTP port> <ICP port>
cache_peer agent01 sibling 8803 8809
cache_peer agent03 sibling 8803 8809
cache_peer agent05 sibling 8803 8809
cache_peer agent07 sibling 8803 8809
cache_peer agent09 sibling 8803 8809
cache_peer agent11 sibling 8803 8809
cache_peer parent1.fp.sandpiper.net parent 8803 8809
cache_peer parent2.fp.sandpiper.net parent 8803 8809
cache_peer parent3.fp.sandpiper.net parent 8803 8809
[...]
cache_peer parent10.fp.sandpiper.net parent 8803 8809
6.1 Staged Adoption
6.2 Externally Visible Effects and Operational Considerations
The second digit of the extended status code will contain information about
peering. This is a bitfield as follows (PRELIMINARY):
Resource was last filled/refreshed from peer
Resource is being served to a peer
Resource is a forced cache miss due to invalidation
state (occurs when a request which otherwise could have been serviced
from the cache is instead forwarded to a parent or an origin server because
the invalidation state indicates it is unsafe to service the peer fill
request; this should be very rare)
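If the three flags occupy bit values 1, 2, and 4 of the digit (an assumption for illustration; the document marks this field layout as PRELIMINARY), composing the digit could be sketched as:

```python
# Compose the peering bitfield for the second digit of the extended
# status code. The bit-to-value mapping here is an assumption.
def peering_digit(filled_from_peer: bool, served_to_peer: bool,
                  forced_miss: bool) -> int:
    return ((1 if filled_from_peer else 0)
            | (2 if served_to_peer else 0)
            | (4 if forced_miss else 0))

print(peering_digit(True, False, False))  # 1
print(peering_digit(True, True, False))   # 3
```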
log_icp_queries off
Provide the invalidation state information necessary to service the request.
The header is also used on the response to provide invalidation state information
back to the requesting cache.
7. Troubleshooting
7.1 Significant Error Messages
http://comanche.sandpiper.net/leapfrog/doc/release_notes.cgi?build=251.91-251.96&option=NewMessages
7.2 Fixed Defects For Release 251.96
http://comanche.sandpiper.net/leapfrog/doc/bugs_by_build.cgi?build=251.91-251.96
7.3 New Configuration Variables
http://comanche.sandpiper.net/leapfrog/doc/release_notes.cgi?build=251.91-251.96&option=NewConfigVar
7.4 New CUI Commands
iqueue
lastload
subscriber
refresh (modified)
7.5 Additional Notes
8. Future Enhancements