Paraphrasing the Emacs documentation, let us say that hooks are an important mechanism for customizing an application. A hook is basically a list of functions to be called on some well-defined occasion (this is called running the hook).
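The idea of a hook as a list of functions run on a given occasion can be sketched in plain Python. This is only an illustration of the concept: `HookRegistry`, `register` and `run` are hypothetical names, not part of the CubicWeb API.

```python
from collections import defaultdict


class HookRegistry:
    """Toy registry mapping an event name to a list of functions (illustration only)."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, event, func):
        # Add a function to the list attached to the event.
        self._hooks[event].append(func)

    def run(self, event, *args):
        # "Running the hook": call every registered function in registration order.
        return [func(*args) for func in self._hooks[event]]


registry = HookRegistry()
registry.register('before_add_entity', lambda name: 'checking %s' % name)
registry.register('before_add_entity', lambda name: 'logging %s' % name)
results = registry.run('before_add_entity', 'Person')
```

Running an event nobody registered for simply calls nothing.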
In CubicWeb, hooks are subclasses of the Hook class. They are selected over a set of pre-defined events (and possibly more conditions, hooks being selectable appobjects like views and components). They should implement a __call__() method that will be called when the hook is triggered.
There are two families of events: data events (before / after any individual update of an entity or a relation in the repository) and server events (such as server startup or shutdown). In a typical application, most of the hooks are defined over data events.
Hooks may also register Operation instances, which will be fired when the transaction is committed or rolled back.
The purpose of data event hooks is usually to complement the data model as defined in the schema, which is static by nature and only provides a restricted built-in set of dynamic constraints, with dynamic or value-driven behaviours. For instance they can serve the following purposes:
A data event hook is functionally equivalent to a database trigger, except that database trigger definition languages are not standardized, hence not portable (for instance, PL/SQL is tied to Oracle, PostgreSQL uses its own PL/pgSQL dialect, and neither works with SQL Server or SQLite).
Hint
It is good practice to write unit tests for each hook. See an example in Unit test by example.
Operations are subclasses of the Operation class that may be created by hooks and scheduled to happen just before (or after) the precommit, postcommit or rollback event. Hooks are fired immediately on data operations, and it is sometimes necessary to delay the actual work until all other hooks have run. Also, while the order of execution of hooks is data dependent (and thus hard to predict), it is possible to force an order on operations.
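A minimal sketch of this deferral, in plain Python rather than with the CubicWeb classes: the hypothetical `ToyTransaction` below lets hooks schedule callables while data is modified, and only runs them, in a forced order, at commit time. The `priority` argument is an assumption made for the illustration.

```python
class ToyTransaction:
    """Toy transaction collecting deferred operations (not the CubicWeb API)."""

    def __init__(self):
        self.pending = []   # (priority, insertion rank, callable)
        self.log = []

    def add_operation(self, priority, func):
        # Hooks call this immediately; the work itself is delayed until commit.
        self.pending.append((priority, len(self.pending), func))

    def commit(self):
        # Operations run in a predictable order, whatever the order of the hooks.
        for _priority, _rank, func in sorted(self.pending, key=lambda op: op[:2]):
            func(self)
        self.pending = []


def on_relation_added(tx, eid):
    # Hook: fired right away, only records what has to be checked later.
    tx.log.append('hook %s' % eid)
    tx.add_operation(2, lambda tx: tx.log.append('check %s' % eid))


tx = ToyTransaction()
on_relation_added(tx, 12)
tx.add_operation(1, lambda tx: tx.log.append('cleanup'))
on_relation_added(tx, 7)
tx.commit()
```

The hooks leave their trace in `tx.log` first; the deferred work follows at commit, sorted by priority and then by insertion rank.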
Operations may be used to:
Hooks are mostly defined and used to handle dataflow operations. This means that as data gets in (entities added or updated, relations set or unset), specific events are issued and the hooks matching these events are called.
You can get the event that triggered a hook by accessing its :attr:`event` attribute.
Hooks called on server start/maintenance/stop events (e.g. ‘server_startup’, ‘server_maintenance’, ‘server_shutdown’) have a repo attribute, but their `_cw` attribute is None. server_startup is called on regular startup, while server_maintenance is called on cubicweb-ctl upgrade or shell commands. server_shutdown is called in both cases.
Hooks called on backup/restore events (e.g. ‘server_backup’, ‘server_restore’) have repo and timestamp attributes, but their `_cw` attribute is None.
Hooks called on session events (e.g. ‘session_open’, ‘session_close’) have no special attribute.
It is sometimes convenient to explicitly enable or disable some hooks, for instance if you want to disable some integrity checking hooks. This can be controlled more finely through the category class attribute, which is a string giving a category name. One can then use the hooks_control context manager to explicitly enable or disable some categories.
Context manager to control activated hooks categories.
If mode is `session.HOOKS_DENY_ALL`, given hooks categories will be enabled.
If mode is `session.HOOKS_ALLOW_ALL`, given hooks categories will be disabled.
with hooks_control(self.session, self.session.HOOKS_ALLOW_ALL, 'integrity'):
    # ... do stuff with all but 'integrity' hooks activated

with hooks_control(self.session, self.session.HOOKS_DENY_ALL, 'integrity'):
    # ... do stuff with none but 'integrity' hooks activated
The existing categories are:
Nothing prevents you from inventing new categories and using the hooks_control context manager to filter them in or out.
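The two modes can be pictured with the following plain-Python sketch. It does not use the CubicWeb session: `ToySession`, `toy_hooks_control` and `is_enabled` are hypothetical names that only mimic the behaviour described above (deny all then enable the listed categories, or allow all then disable them).

```python
from contextlib import contextmanager

HOOKS_ALLOW_ALL = 'allow all'
HOOKS_DENY_ALL = 'deny all'


class ToySession:
    """Stand-in for a session, holding the current hooks mode (illustration only)."""

    def __init__(self):
        self.mode = HOOKS_ALLOW_ALL
        self.categories = frozenset()

    def is_enabled(self, category):
        if self.mode == HOOKS_DENY_ALL:
            # Everything is denied except the listed categories.
            return category in self.categories
        # Everything is allowed except the listed categories.
        return category not in self.categories


@contextmanager
def toy_hooks_control(session, mode, *categories):
    previous = (session.mode, session.categories)
    session.mode, session.categories = mode, frozenset(categories)
    try:
        yield session
    finally:
        # Restore the previous state on exit, even if an exception was raised.
        session.mode, session.categories = previous


session = ToySession()
with toy_hooks_control(session, HOOKS_ALLOW_ALL, 'integrity'):
    inside_allow = (session.is_enabled('integrity'), session.is_enabled('security'))
with toy_hooks_control(session, HOOKS_DENY_ALL, 'integrity'):
    inside_deny = (session.is_enabled('integrity'), session.is_enabled('security'))
```

A custom category name would be filtered in exactly the same way, since the sketch only compares strings.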
Accept if parameters specified as initializer arguments are specified in named arguments given to the selector.
Parameter: *expected – parameters (e.g. basestring) which are expected to be found in named arguments (kwargs)
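The behaviour described here can be sketched as a small callable. This is an illustrative re-implementation under the assumption that a selector returns a positive score when it accepts and 0 otherwise; `expects_kwargs` is a hypothetical name, not the CubicWeb selector itself.

```python
class expects_kwargs:
    """Toy selector: accept when every expected name is among the named arguments."""

    def __init__(self, *expected):
        self.expected = frozenset(expected)

    def __call__(self, cls, req, **kwargs):
        # Score 1 when all expected parameters were given, 0 otherwise.
        return 1 if self.expected.issubset(kwargs) else 0


selector = expects_kwargs('rtype', 'eidfrom')
accepted = selector(None, None, rtype='boss', eidfrom=1, eidto=2)
rejected = selector(None, None, rtype='boss')
```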
Base class for hook.
Hooks being appobjects like views, they have a __regid__ and a __select__ class attribute. Like all appobjects, hooks have the self._cw attribute which represents the current session. In entity hooks, a self.entity attribute is also present.
The events tuple is used by the base class selector to dispatch the hook on the right events. It is possible to dispatch on multiple events at once if needed (though take care, as hook attributes may vary as described above).
Note
Do not forget to extend the base class selectors as in

.. sourcecode:: python

    class MyHook(Hook):
        __regid__ = 'whatever'
        __select__ = Hook.__select__ & is_instance('Person')

else your hooks will be called madly, whatever the event.
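Why the base selector matters can be seen in a toy model of selector composition. The sketch below assumes selectors are callables returning a score, combined with `&` so that a zero score from either side rejects the object; `Selector`, `on_events` and `is_type` are hypothetical helpers, not CubicWeb's.

```python
class Selector:
    """Toy selector wrapping a scoring function; `&` requires both sides to accept."""

    def __init__(self, score):
        self.score = score

    def __call__(self, **context):
        return self.score(**context)

    def __and__(self, other):
        def both(**context):
            left, right = self(**context), other(**context)
            # A zero on either side rejects; otherwise scores add up.
            return left + right if left and right else 0
        return Selector(both)


def on_events(*events):
    return Selector(lambda event, **_: 1 if event in events else 0)


def is_type(etype):
    return Selector(lambda entity_type, **_: 1 if entity_type == etype else 0)


base = on_events('before_add_entity')
extended = base & is_type('Person')   # like Hook.__select__ & is_instance('Person')
replaced = is_type('Person')          # base selector forgotten

ok = extended(event='before_add_entity', entity_type='Person')
wrong_event = extended(event='server_startup', entity_type='Person')
mad = replaced(event='server_startup', entity_type='Person')
```

In this model, the selector that dropped the base part still accepts on an unrelated event, which is the situation the note warns about.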
Base class for operations.
Operation may be instantiated in the hooks’ __call__ method. It always takes a session object as first argument (accessible as .session from the operation instance), and optionally all keyword arguments needed by the operation. These keyword arguments will be accessible as attributes from the operation instance.
An operation is triggered on connections pool events related to commit / rollback transactions. Possible events are:
‘precommit’:
the transaction is being prepared for commit. You can freely do any heavy computation, raise an exception if the commit can’t go through, or even add some new operations during this phase. If you do anything which has to be reverted if the commit fails afterwards (e.g. altering the file system), you’ll have to support the ‘revertprecommit’ event to revert things by yourself.
‘revertprecommit’:
if an operation failed while being pre-committed, this event is triggered for all operations which had their ‘precommit’ event already fired to let them revert things (including the operation which made the commit fail)
‘rollback’:
the transaction has been rolled back, either:
- intentionally
- because a ‘precommit’ event failed, in which case all operations are rolled back once ‘revertprecommit’ has been called
‘postcommit’:
the transaction is over. All the ORM entities accessed by the earlier transaction are invalid. If you need to work on the database, you need to start a new transaction, for instance using a new internal session, which you will need to commit (and close!).
For an operation to support an event, one has to implement the <event name>_event method with no arguments.
Notice that the order of operations may be important, and is controlled according to the output of the insert_index method (whose implementation varies according to the base hook class used).
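The sequence of events can be sketched with a toy commit loop. This is a simplified model written for illustration, not CubicWeb's actual implementation: `toy_commit`, `fire` and the `Recorder` operation are hypothetical, and the loop only looks up the `<event name>_event` methods an operation chooses to implement.

```python
def fire(operation, event):
    # Call <event>_event if the operation supports it, ignore the event otherwise.
    method = getattr(operation, event + '_event', None)
    if method is not None:
        method()


def toy_commit(operations):
    """Return True if the toy transaction committed, False if it was rolled back."""
    processed = []
    try:
        for operation in operations:
            processed.append(operation)
            fire(operation, 'precommit')
    except Exception:
        # Let every operation whose precommit was fired (including the failing
        # one) revert its side effects, then roll everything back.
        for operation in processed:
            fire(operation, 'revertprecommit')
        for operation in operations:
            fire(operation, 'rollback')
        return False
    for operation in operations:
        fire(operation, 'postcommit')
    return True


class Recorder:
    def __init__(self, name, log, fail=False):
        self.name, self.log, self.fail = name, log, fail

    def precommit_event(self):
        self.log.append(self.name + ':precommit')
        if self.fail:
            raise ValueError('cannot commit')

    def revertprecommit_event(self):
        self.log.append(self.name + ':revertprecommit')

    def rollback_event(self):
        self.log.append(self.name + ':rollback')


log = []
committed = toy_commit([Recorder('a', log), Recorder('b', log, fail=True),
                        Recorder('c', log)])
```

Here operation `c` never sees ‘precommit’ or ‘revertprecommit’ because `b` failed first, but it still receives ‘rollback’.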
We will use a very simple example to show hooks usage. Let us start with the following schema.
class Person(EntityType):
    age = Int(required=True)
We would like to add a range constraint over a person’s age. Let’s write a hook (supposing yams cannot handle this natively, which is wrong). It shall be placed into mycube/hooks.py. If this file were to grow too much, we could easily have a mycube/hooks/... package containing hooks in various modules.
from cubicweb import ValidationError
from cubicweb.selectors import is_instance
from cubicweb.server.hook import Hook


class PersonAgeRange(Hook):
    __regid__ = 'person_age_range'
    __select__ = Hook.__select__ & is_instance('Person')
    events = ('before_add_entity', 'before_update_entity')

    def __call__(self):
        if 'age' in self.entity.cw_edited:
            if 0 <= self.entity.age <= 120:
                return
            msg = self._cw._('age must be between 0 and 120')
            raise ValidationError(self.entity.eid, {'age': msg})
In our example the base __select__ is augmented with an is_instance selector matching the desired entity type.
The events tuple is used to specify that our hook should be called before the entity is added or updated.
Then in the hook’s __call__ method, we:
Now let’s augment our schema with a new Company entity type with some relations to Person (in ‘mycube/schema.py’).
class Company(EntityType):
    name = String(required=True)
    boss = SubjectRelation('Person', cardinality='1*')
    subsidiary_of = SubjectRelation('Company', cardinality='*?')
We would like to constrain the company’s bosses to have a minimum (legal) age. Let’s write a hook for this, which will be fired when the boss relation is established (still supposing we could not specify that kind of thing in the schema).
from cubicweb.server.hook import Hook, match_rtype


class CompanyBossLegalAge(Hook):
    __regid__ = 'company_boss_legal_age'
    __select__ = Hook.__select__ & match_rtype('boss')
    events = ('before_add_relation',)

    def __call__(self):
        # self.eidto is the eid of the relation's object, i.e. the boss
        boss = self._cw.entity_from_eid(self.eidto)
        if boss.age < 18:
            msg = self._cw._('the minimum age for a boss is 18')
            raise ValidationError(self.eidfrom, {'boss': msg})
Note
We use the match_rtype selector to select the proper relation type.
The essential difference with respect to an entity hook is that there is no self.entity, but self.eidfrom and self.eidto hook attributes which represent the subject and object eid of the relation.
Suppose we want to check that the subsidiary_of relation contains no cycle. This is best achieved in an operation, since all relations are likely to be set at commit time.
from cubicweb.server.hook import Hook, Operation, match_rtype


def check_cycle(session, eid, rtype, role='subject'):
    # walk up the relation, remembering every eid already seen
    parents = set([eid])
    parent = session.entity_from_eid(eid)
    while parent.related(rtype, role):
        # entities=True yields entities rather than result set rows
        parent = parent.related(rtype, role, entities=True)[0]
        if parent.eid in parents:
            # translate the message first, then substitute the relation name
            msg = session._('detected %s cycle') % rtype
            raise ValidationError(eid, {rtype: msg})
        parents.add(parent.eid)


class CheckSubsidiaryCycleOp(Operation):

    def precommit_event(self):
        check_cycle(self.session, self.eidto, 'subsidiary_of')


class CheckSubsidiaryCycleHook(Hook):
    __regid__ = 'check_no_subsidiary_cycle'
    __select__ = Hook.__select__ & match_rtype('subsidiary_of')
    events = ('after_add_relation',)

    def __call__(self):
        CheckSubsidiaryCycleOp(self._cw, eidto=self.eidto)
As in hooks, ValidationError can be raised in operations. Other exceptions are usually programming errors.
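The traversal performed by `check_cycle` can be tried outside CubicWeb on a plain mapping. In this sketch, `find_cycle` and the `parent_of` dictionary (child eid to parent eid) are hypothetical stand-ins for the entity and its `related()` lookup; it assumes each node has at most one parent, mirroring the way `check_cycle` follows only the first related entity.

```python
def find_cycle(parent_of, start):
    """Follow parent links from `start`; return True if a node is seen twice."""
    seen = {start}
    current = start
    while current in parent_of:
        current = parent_of[current]
        if current in seen:
            return True
        seen.add(current)
    return False


# 1 -> 2 -> 3 is a chain; 10 -> 11 -> 12 -> 10 loops back.
parent_of = {1: 2, 2: 3, 10: 11, 11: 12, 12: 10}
chain_has_cycle = find_cycle(parent_of, 1)
loop_has_cycle = find_cycle(parent_of, 10)
```

Keeping the set of visited nodes is what guarantees termination: without it, the loop over a cyclic relation would never end.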
In the above example, our hook will instantiate an operation each time the hook is called, i.e. each time the subsidiary_of relation is set. There is an alternative method to schedule an operation from a hook, using the set_operation() function.
from cubicweb.server.hook import set_operation


class CheckSubsidiaryCycleHook(Hook):
    __regid__ = 'check_no_subsidiary_cycle'
    events = ('after_add_relation',)
    __select__ = Hook.__select__ & match_rtype('subsidiary_of')

    def __call__(self):
        # accumulate eids; a single operation is created for the transaction
        set_operation(self._cw, 'subsidiary_cycle_detection', self.eidto,
                      CheckSubsidiaryCycleOp)


class CheckSubsidiaryCycleOp(Operation):

    def precommit_event(self):
        for eid in self.session.transaction_data['subsidiary_cycle_detection']:
            # the relation type is fixed here, since the operation is not
            # given any rtype attribute
            check_cycle(self.session, eid, 'subsidiary_of')
Here, we call set_operation() so that we simply accumulate the eids of the entities to check at the end in a single CheckSubsidiaryCycleOp operation. Values are stored in a set associated with the ‘subsidiary_cycle_detection’ transaction data key. The set initialization and operation creation are handled by set_operation().
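The accumulation pattern can be sketched without CubicWeb as follows. `toy_set_operation`, `ToySession` and `CollectOp` are hypothetical names that imitate the behaviour described above: the first call creates the set and the single operation, later calls only add values.

```python
class ToySession:
    def __init__(self):
        self.transaction_data = {}
        self.pending_operations = []


class CollectOp:
    """Toy operation remembering its session, as described for Operation."""

    def __init__(self, session):
        self.session = session
        session.pending_operations.append(self)


def toy_set_operation(session, datakey, value, opcls):
    try:
        # The set already exists: an operation has been scheduled before.
        session.transaction_data[datakey].add(value)
    except KeyError:
        # First call for this key: schedule one operation and create the set.
        opcls(session)
        session.transaction_data[datakey] = {value}


session = ToySession()
for eid in (3, 5, 3, 8):
    toy_set_operation(session, 'subsidiary_cycle_detection', eid, CollectOp)
```

Four calls leave a single pending operation and a set of three distinct eids, so each entity is checked only once at commit time.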
A more realistic example can be found in the advanced tutorial chapter Step 2: security propagation in hooks.
Never, ever use the entity.foo = 42 notation to update an entity. It will not work. To update an entity attribute or relation, use the set_attributes() and set_relations() methods.
‘before_*’ hooks give you access to the old attribute (or relation) values. You can also hijack the values actually being edited in the case of an entity modification. Needing either of these will definitely guide your choice.
Otherwise the question is: do I need to do things before or after the actual modification? If the answer is “it doesn’t matter”, use an ‘after’ event.
When a hook which is responsible for maintaining the consistency of the data model detects an error, it must use a specific exception named ValidationError. Raising anything but a (subclass of) ValidationError is a programming error. Raising it entails aborting the current transaction.
This exception is used to convey enough information up to the user interface. Hence its constructor is different from the default Exception constructor. It accepts, positionally:
In hooks, you can use the added_in_transaction() or deleted_in_transaction() methods of the session object to check whether an eid has been created or deleted during the hook’s transaction.
This is useful to enable or disable some behaviour when an entity is being added or deleted.
if self._cw.deleted_in_transaction(self.eidto):
    return
Relations which are defined in the schema as inlined (see Relation type for details) are inserted in the database at the same time as entity attributes. This may have some side effects: for instance, when creating an entity and setting an inlined relation in the same RQL query, the relation will already exist in the database when ‘before_add_relation’ is run for it (which is usually not the case).