Huey’s API

Note

The Django API is a slightly simplified version of the general Python API. For details on using it, see the Django API section below.

Most end-users will interact with the API using the two decorators in huey.decorators:

  • queue_command()
  • periodic_command()

Each decorator takes an Invoker instance – the Invoker is responsible for coordinating with the various backends (the message queue, the result store if you’re using one, scheduling commands, etc). The API documentation will follow the structure of the huey API, starting with the highest-level interfaces (the decorators) and eventually discussing the lowest-level interfaces, the BaseQueue and BaseDataStore objects.

Function decorators and helpers

huey.decorators.queue_command(invoker[, retries=0[, retry_delay=0]])

Function decorator that marks the decorated function for processing by the consumer. Calls to the decorated function will do the following:

  1. Serialize the function call into a message suitable for storing in the queue
  2. Enqueue the message for execution by the consumer
  3. If a ResultStore has been configured, return an AsyncData instance which can retrieve the result of the function, or None if not using a result store.

Note

The Invoker can be configured to execute the function immediately by instantiating it with always_eager = True – this is useful for running in debug mode or when you do not wish to run the consumer.

Here is how you might use the queue_command decorator:

# assume that we've created an invoker alongside the rest of the
# config
from config import invoker
from huey.decorators import queue_command

@queue_command(invoker)
def count_some_beans(num):
    # do some counting!
    return 'Counted %s beans' % num

Now, whenever you call this function in your application, the actual processing will occur when the consumer dequeues the message and your application will continue along on its way.

Without a result store:

>>> res = count_some_beans(1000000)
>>> res is None
True

With a result store:

>>> res = count_some_beans(1000000)
>>> res
<huey.queue.AsyncData object at 0xb7471a4c>
>>> res.get()
'Counted 1000000 beans'
Parameters:
  • invoker – an Invoker instance
  • retries – number of times to retry the task if an exception occurs
  • retry_delay – number of seconds to wait between retries
Return type:

decorated function

The return value of any calls to the decorated function depends on whether the invoker is configured with a result store. If a result store is configured, the decorated function will return an AsyncData object which can fetch the result of the call from the result store – otherwise it will simply return None.
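
To make the call flow above concrete, here is a minimal sketch that runs without huey installed. ToyInvoker and toy_queue_command are hypothetical stand-ins, not huey's classes: the sketch returns a plain task id where huey would return an AsyncData object, and it assumes that eager mode simply returns the function's value.

```python
import pickle
import uuid
from collections import deque


class ToyInvoker(object):
    """Hypothetical stand-in for huey's Invoker, for illustration only."""

    def __init__(self, result_store=None, always_eager=False):
        self.queue = deque()              # plays the role of the message queue
        self.result_store = result_store  # a dict, or None for "no result store"
        self.always_eager = always_eager


def toy_queue_command(invoker):
    """Toy decorator that mimics the three steps described above."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            if invoker.always_eager:
                # Eager mode: skip the queue and run the function right away.
                return func(*args, **kwargs)
            task_id = uuid.uuid4().hex
            # 1. Serialize the call into a message.
            message = pickle.dumps((task_id, func.__name__, args, kwargs))
            # 2. Enqueue it for a consumer to pick up later.
            invoker.queue.append(message)
            # 3. Hand back a handle only when results can be stored.
            return task_id if invoker.result_store is not None else None
        return wrapper
    return decorator


plain = ToyInvoker()


@toy_queue_command(plain)
def count_some_beans(num):
    return 'Counted %s beans' % num


res = count_some_beans(1000000)  # nothing runs yet; the call is only queued
```

Calling the decorated function leaves one pickled message in plain.queue and, with no result store, hands back None.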

The queue_command decorator also does one other important thing – it adds a special function onto the decorated function, which makes it possible to schedule the execution for a certain time in the future:

{decorated func}.schedule(args=None, kwargs=None, eta=None, convert_utc=True)

Use the special .schedule() function to schedule the execution of a queue command for a given time in the future:

import datetime

# get a datetime object representing one hour in the future
in_an_hour = datetime.datetime.now() + datetime.timedelta(seconds=3600)

# schedule "count_some_beans" to run in an hour
count_some_beans.schedule(args=(100000,), eta=in_an_hour)
Parameters:
  • args – arguments to call the decorated function with
  • kwargs – keyword arguments to call the decorated function with
  • eta – a datetime instance specifying the time at which the function should be executed
  • convert_utc – whether the eta should be converted from local time to UTC, defaults to True
Return type:

like calls to the decorated function, will return an AsyncData object if a result store is configured, otherwise returns None
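
The convert_utc flag implies a local-to-UTC conversion of the eta. As a rough sketch of what such a conversion can look like (local_to_utc is a hypothetical helper written for this example; huey's own conversion may differ), using only the standard library:

```python
import datetime
import time


def local_to_utc(dt):
    """Convert a naive local-time datetime to a naive UTC datetime."""
    # mktime() interprets the time tuple as local time (sub-second
    # precision is dropped in the process).
    timestamp = time.mktime(dt.timetuple())
    utc = datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)
    return utc.replace(tzinfo=None)


in_an_hour = datetime.datetime.now() + datetime.timedelta(seconds=3600)
eta_utc = local_to_utc(in_an_hour)
```

If your eta is already expressed in UTC, passing convert_utc=False avoids applying a conversion like this a second time.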

{decorated func}.command_class

Store a reference to the command class for the decorated function.

>>> count_some_beans.command_class
commands.queuecmd_count_beans
huey.decorators.periodic_command(invoker, validate_datetime)

Function decorator that marks the decorated function for execution by the consumer at a specific interval. Unlike functions decorated with queue_command(), which are enqueued for the consumer, functions decorated with periodic_command execute normally when called directly. The decorator only marks the function as one the consumer should run periodically.

Note

By default, the consumer will not execute periodic_command functions. To enable this, simply add PERIODIC = True to your configuration.

The validate_datetime parameter is a function which accepts a datetime object and returns a boolean indicating whether the decorated function should execute at that time. The consumer will send a datetime to the function every minute, giving it the same granularity as the Linux crontab, which it was designed to mimic.
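
Since validate_datetime is just a callable, you can write one by hand. The sketch below is one such function; the name weekdays_at_nine and the schedule it encodes are made up for illustration.

```python
import datetime


def weekdays_at_nine(dt):
    """Return True only at 09:00 on Monday through Friday."""
    # datetime.weekday() numbers Monday as 0 and Sunday as 6.
    return dt.weekday() < 5 and dt.hour == 9 and dt.minute == 0
```

A function like this would be passed as the second argument, for example periodic_command(invoker, weekdays_at_nine).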

For simplicity, there is a special function crontab(), which can be used to quickly specify intervals at which a function should execute. It is described below.

Here is an example of how you might use the periodic_command decorator and the crontab helper:

from config import invoker
from huey.decorators import periodic_command, crontab

@periodic_command(invoker, crontab(minute='*/5'))
def every_five_minutes():
    # this function gets executed every 5 minutes by the consumer
    print("It's been five minutes")

Note

Because functions decorated with periodic_command are meant to be executed at intervals in isolation, they should not take any required parameters nor should they be expected to return a meaningful value. This is the same regardless of whether or not you are using a result store.

Parameters:
  • invoker – an Invoker instance
  • validate_datetime – a callable which takes a datetime and returns a boolean indicating whether the decorated function should execute at that time
Return type:

decorated function

Like queue_command(), the periodic command decorator adds several helpers to the decorated function. These helpers allow you to “revoke” and “restore” the periodic command, effectively enabling you to pause it or prevent its execution.

{decorated_func}.revoke([revoke_until=None[, revoke_once=False]])

Prevent the given periodic command from executing. When no parameters are provided the function will not execute again.

This function can be called multiple times, but each call will overwrite the limitations of the previous.

Parameters:
  • revoke_until (datetime) – Prevent the execution of the command until the given datetime. If None it will prevent execution indefinitely.
  • revoke_once (bool) – If True will only prevent execution the next time it would normally execute.
# skip the next execution
every_five_minutes.revoke(revoke_once=True)

# pause the command indefinitely
every_five_minutes.revoke()

# pause the command for 24 hours
every_five_minutes.revoke(datetime.datetime.now() + datetime.timedelta(days=1))
{decorated_func}.is_revoked([dt=None])

Check whether the given periodic command is revoked. If dt is specified, it will check if the command is revoked for the given datetime.

Parameters:dt (datetime) – If provided, checks whether command is revoked at the given datetime
{decorated_func}.restore()

Clear any revoked status and run the command normally.
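
One way to picture how revoke(), is_revoked() and restore() interact is as a small piece of state that the consumer consults each time the command comes due. The class below is a hypothetical model written for this explanation, not huey's implementation; in particular, should_run() is an invented name for the check a consumer might perform.

```python
import datetime


class RevokeState(object):
    """Hypothetical model of the revoke flags; not huey's implementation."""

    def __init__(self):
        self.restore()

    def revoke(self, revoke_until=None, revoke_once=False):
        # Each call overwrites whatever the previous call set.
        self.revoked = True
        self.revoke_until = revoke_until
        self.revoke_once = revoke_once

    def restore(self):
        self.revoked = False
        self.revoke_until = None
        self.revoke_once = False

    def is_revoked(self, dt=None):
        if not self.revoked:
            return False
        if self.revoke_until is not None:
            dt = dt or datetime.datetime.now()
            return dt < self.revoke_until
        return True

    def should_run(self, dt):
        """Called by a pretend consumer each time the command comes due."""
        if not self.is_revoked(dt):
            return True
        if self.revoke_once:
            self.restore()  # skip this one execution only
        return False
```

In this model a revoke_once revocation clears itself after a single skipped run, while a revoke_until revocation lapses on its own once the given datetime has passed.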

If you want access to the underlying command class, it is stored as an attribute on the decorated function:

{decorated_func}.command_class

Store a reference to the command class for the decorated function.

huey.decorators.crontab(month='*', day='*', day_of_week='*', hour='*', minute='*')

Convert a “crontab”-style set of parameters into a test function that will return True when a given datetime matches the parameters set forth in the crontab.

Acceptable inputs:

  • “*” = every distinct value
  • “*/n” = run every “n” times, e.g. hour='*/4' matches 0, 4, 8, 12, 16, 20
  • “m-n” = run at every value from m through n
  • “m,n” = run on m and n
Return type:a test function that takes a datetime and returns a boolean
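
The input formats above can be illustrated with a small matcher. This is a sketch of the idea only, not huey's crontab(): field_matches and toy_crontab are hypothetical names, only the hour and minute fields are handled, and “*/n” is assumed to mean "divisible by n", which fits zero-based fields such as hours and minutes.

```python
import datetime


def field_matches(spec, value):
    """Return True if `value` satisfies one crontab-style field `spec`."""
    for part in str(spec).split(','):
        if part == '*':
            return True                      # every distinct value
        if part.startswith('*/'):
            if value % int(part[2:]) == 0:   # every n-th value
                return True
        elif '-' in part:
            low, high = part.split('-')
            if int(low) <= value <= int(high):
                return True                  # inclusive range m..n
        elif int(part) == value:
            return True                      # single listed value
    return False


def toy_crontab(hour='*', minute='*'):
    """Build a test function from hour and minute specs."""
    def validate_datetime(dt):
        return field_matches(hour, dt.hour) and field_matches(minute, dt.minute)
    return validate_datetime


every_four_hours = toy_crontab(hour='*/4', minute='0')
```

The returned function has the same shape as the validate_datetime callable that periodic_command() expects: it takes a datetime and returns a boolean.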

The Invoker and AsyncData classes

class huey.queue.Invoker(queue[, result_store=None[, task_store=None[, store_none=False[, always_eager=False]]]])

The Invoker ties together your application’s queue and result store, and supplies some options to configure how tasks are executed and how their results are stored.

Applications will have at least one Invoker instance, as it is required by the function decorators. Typically it should be instantiated along with the Queue, or wherever you create your configuration.

Example:

from huey.backends.redis_backend import RedisBlockingQueue, RedisDataStore
from huey.queue import Invoker

queue = RedisBlockingQueue('test-queue', host='localhost', port=6379)
result_store = RedisDataStore('results', host='localhost', port=6379)

# Create an invoker instance, which points at the queue and result store
# which are used by the application's Configuration object
invoker = Invoker(queue, result_store=result_store)
class huey.queue.AsyncData(invoker, command)

Although you will probably never instantiate an AsyncData object yourself, they are returned by any calls to queue_command() decorated functions (provided the invoker is configured with a result store). The AsyncData talks to the result store and is responsible for fetching results from tasks. Once the consumer finishes executing a task, the return value is placed in the result store, allowing the producer to retrieve it.

Working with the AsyncData class is very simple:

>>> from main import count_some_beans
>>> res = count_some_beans(100)
>>> res # <--- what is "res" ?
<huey.queue.AsyncData object at 0xb7471a4c>

>>> res.get() # <--- get the result of this task, assuming it executed
'Counted 100 beans'

What happens when data isn’t available yet? Let’s assume the next call takes about a minute to calculate:

>>> res = count_some_beans(10000000) # let's pretend this is slow
>>> res.get() # data is not ready, so returns None

>>> res.get() is None # data still not ready
True

>>> res.get(blocking=True, timeout=5) # block for 5 seconds
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/charles/tmp/huey/src/huey/huey/queue.py", line 46, in get
    raise DataStoreTimeout
huey.exceptions.DataStoreTimeout

>>> res.get(blocking=True) # no timeout, will block until it gets data
'Counted 10000000 beans'
get([blocking=False[, timeout=None[, backoff=1.15[, max_delay=1.0[, revoke_on_timeout=False]]]]])

Attempt to retrieve the return value of a task. By default, it will simply ask for the value, returning None if it is not ready yet. If you want to wait for a value, you can specify blocking = True – this will loop, backing off up to the provided max_delay until the value is ready or until the timeout is reached. If the timeout is reached before the result is ready, a DataStoreTimeout exception will be raised.

Parameters:
  • blocking – boolean, whether to block while waiting for task result
  • timeout – number of seconds to block for (used with blocking=True)
  • backoff – amount by which to back off (increase) the delay each time no result is found
  • max_delay – maximum amount of time to wait between iterations when attempting to fetch result.
  • revoke_on_timeout (bool) – if a timeout occurs, revoke the task
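
The blocking behaviour described above boils down to a polling loop with a growing delay. The following is a sketch under stated assumptions rather than huey's code: blocking_get and its fetch argument are hypothetical, the DataStoreTimeout class is a local stand-in for huey.exceptions.DataStoreTimeout, the initial delay of 0.1 seconds is invented, and the delay is assumed to be multiplied by backoff on each miss.

```python
import time


class DataStoreTimeout(Exception):
    """Local stand-in for huey.exceptions.DataStoreTimeout."""


def blocking_get(fetch, timeout=None, backoff=1.15, max_delay=1.0,
                 initial_delay=0.1, sleep=time.sleep, clock=time.time):
    """Poll `fetch` until it returns something other than None."""
    start = clock()
    delay = initial_delay
    while True:
        value = fetch()
        if value is not None:
            return value
        if timeout is not None and clock() - start >= timeout:
            raise DataStoreTimeout()
        sleep(delay)
        # Wait a little longer next time, but never more than max_delay.
        delay = min(delay * backoff, max_delay)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.
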
revoke()

Revoke the given command. Unless it is in the process of executing, it will be revoked and the command will not run.

in_an_hour = datetime.datetime.now() + datetime.timedelta(seconds=3600)

# run this command in an hour
res = count_some_beans.schedule(args=(100000,), eta=in_an_hour)

# oh shoot, I changed my mind, do not run it after all
res.revoke()

Configuration

class huey.bin.config.BaseConfiguration

Applications using huey should subclass BaseConfiguration when specifying the configuration options to use. BaseConfiguration is where the queue, result store, and many other settings are configured. The configuration is then used by the consumer to access the queue. All configuration settings are class attributes.

QUEUE

An instance of a Queue class, which must be a subclass of BaseQueue. Tells consumer what queue to pull messages from.

RESULT_STORE

An instance of a DataStore class, which must be a subclass of BaseDataStore, or None. Tells consumer where to store results of messages.

TASK_STORE

An instance of a DataStore class, which must be a subclass of BaseDataStore, or None. Tells consumer where to serialize the schedule of pending tasks in the event the consumer is shut down unexpectedly.

PERIODIC = False

A boolean value indicating whether the consumer should enqueue periodic tasks

THREADS = 1

Number of worker threads to run

LOGFILE = None
LOGLEVEL = logging.INFO
BACKOFF = 1.15
INITIAL_DELAY = .1
MAX_DELAY = 10
UTC = True

Whether to run using local now() or utcnow() when determining times to execute periodic commands and scheduled commands.

Queues and DataStores

Huey communicates with two types of data stores – queues and datastores. Thinking of them as python datatypes, a queue is sort of like a list and a datastore is sort of like a dict. Queues are FIFOs that store tasks – producers put tasks in on one end and the consumer reads and executes tasks from the other. DataStores are key-based stores that can store arbitrary results of tasks keyed by task id. DataStores can also be used to serialize task schedules so in the event your consumer goes down you can bring it back up and not lose any tasks that had been scheduled.

Huey, like just about a zillion other projects, uses a “pluggable backend” approach: the interface is defined on a couple of classes, BaseQueue and BaseDataStore, and you can write an implementation for any datastore you like. The project ships with backends that talk to Redis, a fast key-based datastore, but the sky’s the limit when it comes to what you want to interface with. Below is an outline of the methods that must be implemented on each class.

Base classes

class huey.backends.base.BaseQueue(name, **connection)

Queue implementation – any connections that must be made should be created when instantiating this class.

Parameters:
  • name – A string representation of the name for this queue
  • connection – Connection parameters for the queue
blocking = False

Whether the backend blocks when waiting for new results. If False, the backend will be polled at intervals; if True, it will read and wait.

write(data)

Write data to the queue - has no return value.

Parameters:data – a string
read()

Read data from the queue, returning None if no data is available – an empty queue should not raise an Exception!

Return type:a string message or None if no data is present
flush()

Optional: Delete everything in the queue – used by tests

__len__()

Optional: Return the number of items in the queue – used by tests
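
As an illustration of the queue interface, here is a minimal in-memory backend. It is a sketch only: MemoryQueue is a hypothetical class, and it is written as a plain class so that it runs without huey installed, whereas a real backend would subclass huey.backends.base.BaseQueue.

```python
from collections import deque


class MemoryQueue(object):
    """In-memory queue exposing the BaseQueue methods described above."""

    blocking = False  # read() returns immediately, so the consumer polls

    def __init__(self, name, **connection):
        self.name = name
        self.connection = connection  # unused here; kept to mirror the signature
        self._items = deque()

    def write(self, data):
        self._items.append(data)

    def read(self):
        # An empty queue yields None rather than raising an exception.
        try:
            return self._items.popleft()
        except IndexError:
            return None

    def flush(self):
        self._items.clear()

    def __len__(self):
        return len(self._items)
```

Messages come back in the order they were written, matching the FIFO behaviour described earlier.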

class huey.backends.base.BaseDataStore(name, **connection)

Data store implementation – any connections that must be made should be created when instantiating this class.

Parameters:
  • name – A string representation of the name for this data store
  • connection – Connection parameters for the data store
put(key, value)

Store the value using the key as the identifier

get(key)

Retrieve the value stored at the given key, returning the special value EmptyData if no data exists at that key. This is to differentiate between “no data” and a stored None value.

Warning

After a result is fetched it should be removed from the store!

flush()

Remove all keys
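
A dict-backed sketch shows how the EmptyData marker and the remove-after-fetch rule fit together. Both classes below are hypothetical stand-ins defined for this example; a real backend would subclass huey.backends.base.BaseDataStore and use huey's own EmptyData.

```python
class EmptyData(object):
    """Stand-in marker meaning "nothing is stored at this key"."""


class MemoryDataStore(object):
    """In-memory store exposing the BaseDataStore methods described above."""

    def __init__(self, name, **connection):
        self.name = name
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        # pop() returns the value and removes it in a single step, so a
        # fetched result does not linger in the store.
        return self._data.pop(key, EmptyData)

    def flush(self):
        self._data.clear()
```

Because a missing key yields EmptyData rather than None, a task that legitimately returned None can still be told apart from one that has not finished.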

Redis implementation

All the following use the python redis driver written by Andy McCurdy.

class huey.backends.redis_backend.RedisQueue(name, **connection)

Does a simple RPOP to pull messages from the queue, meaning that it polls.

Parameters:
  • name – the name of the queue to use
  • connection – keyword arguments passed directly into the redis.Redis class
class huey.backends.redis_backend.RedisBlockingQueue(name, **connection)

Does a BRPOP to pull messages from the queue, meaning that it blocks on reads.

Parameters:
  • name – the name of the queue to use
  • connection – keyword arguments passed directly into the redis.Redis class
class huey.backends.redis_backend.RedisDataStore(name, **connection)

Stores results in a redis hash using HSET, HGET and HDEL

Parameters:
  • name – the name of the data store to use
  • connection – keyword arguments passed directly into the redis.Redis class

Django API

Good news, the django api is considerably simpler! This is because django has very specific conventions for how things should be configured. If you’re using django you don’t have to worry about invokers or configuration objects – simply configure the queue and result store in the settings and use the decorators and management command to run the consumer.

Function decorators and helpers

huey.djhuey.decorators.queue_command()

Identical to the queue_command() described above, except that it takes no parameters.

from huey.djhuey.decorators import queue_command

@queue_command
def count_some_beans(how_many):
    return 'Counted %s beans' % how_many
huey.djhuey.decorators.periodic_command(validate_datetime)

Identical to the periodic_command() described above, except that it does not take an invoker as its first argument.

from huey.djhuey.decorators import periodic_command, crontab

@periodic_command(crontab(minute='*/5'))
def every_five_minutes():
    # this function gets executed every 5 minutes by the consumer
    print("It's been five minutes")

Configuration

All configuration occurs in the django settings module. Settings are configured using the same names as those in the python api with the exception that queues and data stores can be specified using a string module path, and connection keyword-arguments are specified using a dictionary.

Example configuration:

HUEY_CONFIG = {
    'QUEUE': 'huey.backends.redis_backend.RedisQueue',
    'QUEUE_CONNECTION': {
        'host': 'localhost',
        'port': 6379
    },
    'THREADS': 4,
}

Required settings

QUEUE (string or Queue instance)

Either a queue instance or a string pointing to the module path and class name of the queue. If a string is used, you may also need to specify connection parameters.

Example: huey.backends.redis_backend.RedisQueue

Optional settings

PERIODIC (boolean), default = False
Determines whether the consumer will enqueue periodic commands. If you are running multiple consumers, only one of them should be configured to enqueue periodic commands.
THREADS (int), default = 1
Number of worker threads to use when processing jobs

LOGFILE (string), default = None

LOGLEVEL (int), default = logging.INFO

BACKOFF (numeric), default = 1.15
How much to increase delay when no jobs are present
INITIAL_DELAY (numeric), default = 0.1
Initial amount of time to sleep when waiting for jobs
MAX_DELAY (numeric), default = 10
Max amount of time to sleep when waiting for jobs
ALWAYS_EAGER, default = False
Whether to skip enqueue-ing and run in-band (useful for debugging)
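
To see how BACKOFF, INITIAL_DELAY and MAX_DELAY might combine, consider the sketch below. It is an assumption about the consumer's behaviour, not a copy of it: next_delay is a hypothetical helper, the delay is assumed to be multiplied by BACKOFF after each empty poll, and it is assumed to reset to INITIAL_DELAY once a job is found.

```python
def next_delay(current, found_job, initial_delay=0.1, backoff=1.15, max_delay=10):
    """Return how long to sleep before the next poll of the queue."""
    if found_job:
        # Work arrived, so go back to polling quickly.
        return initial_delay
    # Nothing to do: wait a bit longer, capped at max_delay.
    return min(current * backoff, max_delay)


delay = 0.1
for _ in range(3):  # three empty polls in a row
    delay = next_delay(delay, found_job=False)
```

Under these assumptions an idle consumer slows its polling gradually and never sleeps longer than MAX_DELAY between checks.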