This section describes the Collector class and some of the shortcut functions in more depth.
The Collector needs two things to monitor a variable: a name and a collector function that returns the variable's current value.
When you create a new collector, you must pass a tuple (name, collector_func) for each variable you want to monitor:
>>> from collectors import Collector
>>> a = 1
>>> b = 2
>>>
>>> def get_b(factor):
...     return factor * b
...
>>> c = Collector(
...     ('a', lambda: a),
...     ('b', get_b)
... )
A variable's name must be a string and should also be a valid Python identifier. The collector function can be any callable, and it may even take a parameter.
By default, the Collector creates a Python list for each variable to hold all monitored values. This list is called the variable's series here.
The series for a variable is accessible either by index or as an attribute (this is why name should be a valid identifier):
>>> c
([], [])
>>> c[0] == c.a, c[1] == c.b
(True, True)
Each time the Collector (or its collect() method) is called, it calls every collector function in the order in which they were originally passed to it and appends each return value to the corresponding variable's series. If a collector function needs a parameter, you must pass it as a keyword argument named after the variable:
>>> c
([], [])
>>> c(b=4) # c.collect(b=4) would do the same
>>> c
([1], [8])
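The behaviour described above can be approximated with a small stand-in class. This is only a sketch under assumptions: MiniCollector and its internals are hypothetical names and are not taken from the Collectors source. It shows the call order, the keyword forwarding, and access to a series by index or by attribute.

```python
class MiniCollector:
    """Hypothetical stand-in for Collector; not the library's implementation."""

    def __init__(self, *specs):
        # Each spec is a (name, collector_func) tuple, kept in the order given.
        self._specs = list(specs)
        self._series = [[] for _ in specs]

    def collect(self, **kwargs):
        for (name, func), series in zip(self._specs, self._series):
            if name in kwargs:
                # A keyword named after the variable is forwarded to its function.
                series.append(func(kwargs[name]))
            else:
                series.append(func())

    # Calling the instance does the same as calling collect().
    __call__ = collect

    def __getitem__(self, index):
        return self._series[index]

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails.
        if name in ('_specs', '_series'):
            raise AttributeError(name)
        for (spec_name, _), series in zip(self._specs, self._series):
            if spec_name == name:
                return series
        raise AttributeError(name)


a, b = 1, 2
c = MiniCollector(('a', lambda: a), ('b', lambda factor: factor * b))
c(b=4)
print(c.a, c.b)
```

Because the series are looked up by name in __getattr__, a name that is not a valid identifier would still be reachable by index, but not as an attribute.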
Collectors includes some shortcut functions that save you typing. They are defined in collectors.shortcuts but can also be imported directly from collectors for even less typing. ;-)
In most cases you’ll probably end up using Collectors like this:
>>> class Spam(object):
...     def __init__(self, a, b, c):
...         self.a = a
...         self.b = b
...         self.c = c
...
...         self.collector = Collector(
...             ('a', lambda: self.a),
...             ('b', lambda: self.b),
...             ('c', lambda: self.c),
...         )
Setting up a Collector like this is very tedious and repetitive. The shortcut get() allows you to create these tuples much faster:
>>> from collectors import get
>>> class Spam(object):
...     def __init__(self, a, b, c):
...         self.a = a
...         self.b = b
...         self.c = c
...
...         self.collector = Collector(get(self, 'a', 'b', 'c'))
You must pass an object and the names of attributes to get(). For each attribute, it generates a tuple ('attr', lambda: getattr(obj, 'attr')) for you.
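A rough sketch of how such tuples could be generated is shown here. The name get_sketch is hypothetical, and the code is not the library's implementation; how the real get() packages its result for Collector is not covered. The point it illustrates is that each function must be bound to its own attribute name.

```python
def get_sketch(obj, *attr_names):
    """Hypothetical equivalent of get(): build (name, func) tuples for obj."""
    # name=name binds the current attribute name as a default argument.
    # A plain closure would see only the last value of the loop variable.
    return [(name, lambda name=name: getattr(obj, name)) for name in attr_names]


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


p = Point(1, 2)
specs = get_sketch(p, 'x', 'y')
p.x = 10  # the functions read the attribute at call time, not at creation
values = {name: func() for name, func in specs}
print(values)
```

The functions look up the attribute each time they are called, so later changes to the object are picked up by the next collection.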
If you want to monitor the same attributes for many Spam instances with only one Collector, there is another shortcut called get_objects() that works in a similar way:
>>> from collectors import get_objects
>>> class Spam(object):
...     def __init__(self, id):
...         self.id = '_%d' % id
...         self.a, self.b = 0, 0
...
>>> spams = [Spam(i) for i in range(10)]
>>> collector = Collector(get_objects(spams, 'id', 'a', 'b'))
Like get(), get_objects() creates a (name, func) tuple for each listed attribute of every passed object. In contrast to get(), you must also name an id attribute whose value is prefixed to each name to make the names distinguishable. Since the names become attributes of the Collector instance, the id values must not be pure integers.
In the above example, collector would have the attributes _0_a, _0_b, _1_a and so forth.
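The naming scheme can be sketched as follows. The function get_objects_sketch is a hypothetical name, and joining the id and the attribute name with an underscore is an assumption inferred from the example names _0_a and _0_b; the library's own code may differ.

```python
def get_objects_sketch(objs, id_attr, *attr_names):
    """Hypothetical equivalent of get_objects(): prefix each name with an id."""
    specs = []
    for obj in objs:
        prefix = getattr(obj, id_attr)
        for attr in attr_names:
            # Assumed separator: id and attribute name joined by '_'.
            name = '%s_%s' % (prefix, attr)
            # Default arguments bind the current obj and attr to each function.
            specs.append((name, lambda obj=obj, attr=attr: getattr(obj, attr)))
    return specs


class Item:
    def __init__(self, number):
        self.id = '_%d' % number  # leading underscore keeps the name an identifier
        self.a, self.b = 0, 0


items = [Item(i) for i in range(2)]
specs = get_objects_sketch(items, 'id', 'a', 'b')
names = [name for name, _ in specs]
print(names)
```

With a purely numeric id such as 0, the resulting name 0_a would not be a valid identifier, so it could not be used for attribute access.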
Sometimes you might want to save calculation results on the fly, where you would otherwise use lambda x: x as a collector function. A shortcut for that is manual():
>>> from collectors import Collector, manual
>>> collector = Collector(('val', manual))
>>> for i in range(10):
...     collector(val=i)
...
>>> collector.val
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
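Since the text equates this shortcut with lambda x: x, its effect can be sketched with an identity function. The name manual_sketch is hypothetical, and a plain list stands in for the series. The example records a running total that exists only inside the loop.

```python
def manual_sketch(value):
    """Hypothetical equivalent of manual: return the value it is given."""
    return value


series = []
total = 0
for i in range(5):
    total += i
    # The intermediate result is handed in explicitly rather than read
    # from an object by a collector function.
    series.append(manual_sketch(total))
print(series)
```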
You can of course freely mix shortcut functions and “normal” tuples:
>>> def foo():
...     return spams[0].a + spams[0].b
...
>>> collector = Collector(
...     ('foo', foo),
...     ('bar', manual),
...     get(spams[0], 'a', 'b'),
...     get_objects(spams, 'id', 'a'),
... )
By default, Collector stores all collected values in plain Python lists, but it is also able to store them in various other formats like PyTables/HDF5 or MS Excel. The next section explains the various storage classes and how you can create your own.