At the time of this writing, popular key/value servers include Memcached, Redis, and Riak. While these tools all have different usage focuses, they share a storage model based on retrieving a value by key; as such, they are all potentially suitable for caching, particularly Memcached, which is first and foremost designed for caching.
With a caching system in mind, dogpile.cache provides an interface to a particular Python API targeted at that system.
A dogpile.cache configuration is built from two main components: a CacheRegion, which provides the caching API used by application code, and a backend, which communicates with the particular cache service.
dogpile.cache includes a Pylibmc backend. A basic configuration looks like:
from dogpile.cache import make_region

region = make_region().configure(
    'dogpile.cache.pylibmc',
    expiration_time = 3600,
    arguments = {
        'url': ["127.0.0.1"],
    }
)

@region.cache_on_arguments()
def load_user_info(user_id):
    return some_database.lookup_user_by_id(user_id)
Above, we create a CacheRegion using the make_region() function, then apply the backend configuration via the CacheRegion.configure() method, which returns the region. The name of the backend is the only argument required by CacheRegion.configure() itself, in this case dogpile.cache.pylibmc. However, in this specific case, the pylibmc backend also requires that the URL of the memcached server be passed within the arguments dictionary.
The configuration is separated into two sections. Upon construction via make_region(), the CacheRegion object is available, typically at module import time, for use in decorating functions. Additional configuration details passed to CacheRegion.configure() are typically loaded from a configuration file and therefore not necessarily available until runtime, hence the two-step configuration process.
Key arguments passed to CacheRegion.configure() include expiration_time, which is the expiration time passed to the Dogpile lock, and arguments, which are arguments used directly by the backend - in this case we are using arguments that are passed directly to the pylibmc module.
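As a minimal sketch of that two-step pattern (the configure_cache() helper and the settings dictionary are hypothetical, and some_database is reused from the example above):
from dogpile.cache import make_region

# constructed at import time; the region can already be used
# to decorate functions before it is configured
region = make_region()

@region.cache_on_arguments()
def load_user_info(user_id):
    return some_database.lookup_user_by_id(user_id)

def configure_cache(settings):
    # called later, once runtime configuration is available
    region.configure(
        'dogpile.cache.pylibmc',
        expiration_time=int(settings.get('cache.expiration_time', 3600)),
        arguments={'url': [settings.get('cache.memcached_url', '127.0.0.1')]},
    )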
The make_region() function currently calls the CacheRegion constructor directly; CacheRegion itself is a front end to a particular cache backend.
Once you have a CacheRegion, the CacheRegion.cache_on_arguments() method can be used to decorate functions, but the cache itself can’t be used until CacheRegion.configure() is called. That method configures the CacheRegion with a backend and returns the CacheRegion itself.
The CacheRegion can also be configured from a dictionary, using the CacheRegion.configure_from_config() method, which takes a configuration dictionary and a key prefix.
Example:
local_region = make_region()
memcached_region = make_region()

# regions are ready to use for function
# decorators, but not yet for actual caching

# later, when config is available
myconfig = {
    "cache.local.backend": "dogpile.cache.dbm",
    "cache.local.arguments.filename": "/path/to/dbmfile.dbm",
    "cache.memcached.backend": "dogpile.cache.pylibmc",
    "cache.memcached.arguments.url": "127.0.0.1, 10.0.0.1",
}

local_region.configure_from_config(myconfig, "cache.local.")
memcached_region.configure_from_config(myconfig, "cache.memcached.")
The CacheRegion object is our front-end interface to a cache. It includes the following methods:
CacheRegion.get() returns a value from the cache, based on the given key.
If the value is not present, the method returns the token NO_VALUE. NO_VALUE evaluates to False, but is distinct from None, so that a cached value of None can be told apart from a cache miss.
By default, the creation time of the retrieved value is compared against the current time (as reported by time.time()) using the configured expiration time of the CacheRegion, or alternatively an expiration time supplied via the expiration_time argument. If the value is stale, it is ignored and the NO_VALUE token is returned. Passing the flag ignore_expiration=True bypasses the expiration time check.
Changed in version 0.3.0: CacheRegion.get() now checks the value’s creation time against the expiration time, rather than returning the value unconditionally.
The method also interprets the cached value in terms of the current “invalidation” time as set by the invalidate() method. If a value is present, but its creation time is older than the current invalidation time, the NO_VALUE token is returned. Passing the flag ignore_expiration=True bypasses the invalidation time check.
New in version 0.3.0: Support for the CacheRegion.invalidate() method.
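For example, a get() call can be checked against the NO_VALUE token before falling back to a fresh lookup (a sketch; the key and the some_database lookup are illustrative only):
from dogpile.cache.api import NO_VALUE

value = region.get("user:42")
if value is NO_VALUE:
    # cache miss, or the value failed the expiration / invalidation checks
    value = some_database.lookup_user_by_id(42)
    region.set("user:42", value)

# a per-call expiration time or ignore_expiration=True may also be passed
value_even_if_stale = region.get("user:42", ignore_expiration=True)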
CacheRegion.get_or_create() returns a cached value based on the given key, creating it if necessary.
If the value does not exist or is considered to be expired based on its creation time, the given creation function may or may not be used to recreate the value and persist the newly generated value in the cache.
Whether or not the function is used depends on whether the dogpile lock can be acquired. If it can’t, a different thread or process is already running a creation function for this key against the cache. In that case, if no previous value is available, the method blocks until the lock is released and a new value is available; if a previous value is available, that value is returned immediately without blocking.
If the invalidate() method has been called, and the retrieved value’s timestamp is older than the invalidation timestamp, the value is unconditionally prevented from being returned. The method will attempt to acquire the dogpile lock to generate a new value, or will wait until the lock is released to return the new value.
Changed in version 0.3.0: The value is unconditionally regenerated if the creation time is older than the last call to invalidate().
See also
CacheRegion.cache_on_arguments() - applies get_or_create() to any function using a decorator.
CacheRegion.get_or_create_multi() - multiple key/value version
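Used directly, CacheRegion.get_or_create() takes the key, a creation function, and optionally a per-call expiration time; a brief sketch, again using the hypothetical some_database:
def create_user_info():
    return some_database.lookup_user_by_id(42)

user_info = region.get_or_create(
    "user:42",
    create_user_info,
    expiration_time=300,
)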
CacheRegion.set() places a new value in the cache under the given key.
CacheRegion.delete() removes a value from the cache.
This operation is idempotent; it can safely be called multiple times, or on a non-existent key.
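A brief illustration of both methods (the key and value here are arbitrary):
region.set("greeting", "hello world")

# deleting twice, or deleting a key that was never set, is safe
region.delete("greeting")
region.delete("greeting")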
CacheRegion.cache_on_arguments() is a function decorator that will cache the return value of the function, using a key derived from the function itself and its arguments.
The decorator internally makes use of the CacheRegion.get_or_create() method to access the cache and conditionally call the function. See that method for additional behavioral details.
E.g.:
@someregion.cache_on_arguments()
def generate_something(x, y):
    return somedatabase.query(x, y)
The decorated function can then be called normally, where data will be pulled from the cache region unless a new value is needed:
result = generate_something(5, 6)
The function is also given an attribute invalidate(), which provides for invalidation of the value. Pass to invalidate() the same arguments you’d pass to the function itself to represent a particular value:
generate_something.invalidate(5, 6)
Another attribute, set(), is added to provide extra caching possibilities relative to the function. This is a convenience method for CacheRegion.set() which will store a given value directly without calling the decorated function. The value to be cached is passed as the first argument, followed by the arguments which would normally be passed to the function:
generate_something.set(3, 5, 6)
The above example is equivalent to calling generate_something(5, 6), if the function were to produce the value 3 as the value to be cached.
New in version 0.4.1: Added set() method to decorated function.
Similar to set() is refresh(). This attribute will invoke the decorated function, populate the cache with the newly produced value, and return that value:
newvalue = generate_something.refresh(5, 6)
New in version 0.5.0: Added refresh() method to decorated function.
Lastly, the get() method returns either the value cached for the given key, or the token NO_VALUE if no such key exists:
value = generate_something.get(5, 6)
New in version 0.5.3: Added get() method to decorated function.
The default key generation will use the name of the function, the module name for the function, the arguments passed, as well as an optional “namespace” parameter in order to generate a cache key.
Given a function one inside the module myapp.tools:
@region.cache_on_arguments(namespace="foo")
def one(a, b):
    return a + b
Above, calling one(3, 4) will produce a cache key as follows:
myapp.tools:one|foo|3 4
The key generator will ignore an initial argument of self or cls, making the decorator suitable (with caveats) for use with instance or class methods. Given the example:
class MyClass(object):
    @region.cache_on_arguments(namespace="foo")
    def one(self, a, b):
        return a + b
Above, calling MyClass().one(3, 4) will again produce the same cache key of myapp.tools:one|foo|3 4 - the name self is skipped.
The namespace parameter is optional, and is used normally to disambiguate two functions of the same name within the same module, as can occur when decorating instance or class methods as below:
class MyClass(object):
    @region.cache_on_arguments(namespace='MC')
    def somemethod(self, x, y):
        ""

class MyOtherClass(object):
    @region.cache_on_arguments(namespace='MOC')
    def somemethod(self, x, y):
        ""
Above, the namespace parameter disambiguates between somemethod on MyClass and MyOtherClass. Python class declaration mechanics otherwise prevent the decorator from having awareness of the MyClass and MyOtherClass names, as the function is received by the decorator before it becomes an instance method.
The function key generation can be entirely replaced on a per-region basis using the function_key_generator argument present on make_region() and CacheRegion. It defaults to function_key_generator().
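As a sketch of what a replacement might look like (my_key_generator is a hypothetical name; the generator receives the namespace and the function being decorated and returns a callable that builds a key from the call arguments):
from dogpile.cache import make_region

def my_key_generator(namespace, fn, **kw):
    namespace = namespace or ''
    fname = fn.__name__

    def generate_key(*args):
        # e.g. produce "one|foo|3_4" instead of the default space-separated form
        return "%s|%s|%s" % (fname, namespace, "_".join(str(a) for a in args))

    return generate_key

region = make_region(function_key_generator=my_key_generator)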
Backends are located using the setuptools entrypoint system. To make life easier for writers of ad-hoc backends, a helper function is included which registers any backend in the same way as if it were part of the existing sys.path.
For example, to create a backend called DictionaryBackend, we subclass CacheBackend:
from dogpile.cache.api import CacheBackend, NO_VALUE

class DictionaryBackend(CacheBackend):
    def __init__(self, arguments):
        self.cache = {}

    def get(self, key):
        return self.cache.get(key, NO_VALUE)

    def set(self, key, value):
        self.cache[key] = value

    def delete(self, key):
        self.cache.pop(key)
Then make sure the class is available underneath the entrypoint dogpile.cache. If we did this in a setup.py file, it would be in setup() as:
entry_points="""
[dogpile.cache]
dictionary = mypackage.mybackend:DictionaryBackend
"""
Alternatively, if we want to register the plugin in the same process space without bothering to install anything, we can use register_backend:
from dogpile.cache import register_backend
register_backend("dictionary", "mypackage.mybackend", "DictionaryBackend")
Our new backend would be usable in a region like this:
from dogpile.cache import make_region

region = make_region("myregion")
region.configure("dictionary")

region.set("somekey", "somevalue")
data = region.get("somekey")
The values passed to and returned from the backend are instances of CachedValue. This is a tuple subclass of length two, of the form:
(payload, metadata)
Where “payload” is the thing being cached, and “metadata” is information we store in the cache - a dictionary which currently has just the “creation time” and a “version identifier” as key/values. If the cache backend requires serialization, pickle or similar can be used on the tuple - the “metadata” portion will always be a small and easily serializable Python structure.
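As a sketch, a serializing variation of the DictionaryBackend above might pickle the whole tuple on the way in and rebuild the CachedValue on the way out (the class name and storage dictionary are illustrative only):
import pickle

from dogpile.cache.api import CacheBackend, CachedValue, NO_VALUE

class PickledDictionaryBackend(CacheBackend):
    def __init__(self, arguments):
        self.cache = {}

    def get(self, key):
        serialized = self.cache.get(key)
        if serialized is None:
            return NO_VALUE
        # restore the (payload, metadata) pair as a CachedValue
        payload, metadata = pickle.loads(serialized)
        return CachedValue(payload, metadata)

    def set(self, key, value):
        # "value" arrives as a CachedValue of the form (payload, metadata)
        self.cache[key] = pickle.dumps(tuple(value))

    def delete(self, key):
        self.cache.pop(key, None)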
The ProxyBackend is a decorator class provided to easily augment existing backend behavior without having to extend the original class. Using a decorator class is also advantageous as it allows us to share the altered behavior between different backends.
Proxies are added to the CacheRegion object using the CacheRegion.configure() method. Only the overridden methods need to be specified and the real backend can be accessed with the self.proxied object from inside the ProxyBackend.
For example, a simple class to log all calls to .set() would look like this:
from dogpile.cache.proxy import ProxyBackend
import logging

log = logging.getLogger(__name__)

class LoggingProxy(ProxyBackend):
    def set(self, key, value):
        log.debug('Setting Cache Key: %s' % key)
        self.proxied.set(key, value)
ProxyBackend can be configured to optionally take arguments (as long as the ProxyBackend.__init__() method is called properly, either directly or via super()). In the example below, the RetryDeleteProxy class accepts a retry_count parameter on initialization. In the event of an exception on delete(), it will retry this many times before returning:
from dogpile.cache.proxy import ProxyBackend

class RetryDeleteProxy(ProxyBackend):
    def __init__(self, retry_count=5):
        super(RetryDeleteProxy, self).__init__()
        self.retry_count = retry_count

    def delete(self, key):
        retries = self.retry_count
        while retries > 0:
            retries -= 1
            try:
                self.proxied.delete(key)
                return
            except:
                pass
The wrap parameter of CacheRegion.configure() accepts a list which can contain any combination of instantiated proxy objects and uninstantiated proxy classes. Putting the two examples above together would look like this:
from dogpile.cache import make_region

retry_proxy = RetryDeleteProxy(5)

region = make_region().configure(
    'dogpile.cache.pylibmc',
    expiration_time = 3600,
    arguments = {
        'url': ["127.0.0.1"],
    },
    wrap = [LoggingProxy, retry_proxy]
)
In the above example, the LoggingProxy object would be instantiated by the CacheRegion and applied to wrap requests on behalf of the retry_proxy instance; that proxy in turn wraps requests on behalf of the original dogpile.cache.pylibmc backend.
New in version 0.4.4: Added support for the ProxyBackend class.