Advanced Topics

As with all frameworks, Dejavu can't cover every need out-of-the-box. However, Dejavu has been specifically designed to be hackable. In particular, the creation of new Storage Managers is a well-defined process. Read on if there's a feature you need that you might consider building yourself.

Subclassing Sandbox

Okay, I lied. There's not much I can think of that you'd want to do with Sandboxes. Most things I can think of would be better implemented as Storage Manager middleware. But if you think of any, let me know.

Arena Hacking

The most common modification to an Arena object is to use it as a dumping-ground for other application data. Since the Arena should persist for the lifetime of the application's process, it can serve as a decent top-level Application namespace. Since the only mandatory argument to Sandbox.__init__ is an Arena, you can pass Sandboxes around in your code and always have access to the global Arena. If you use another application framework for your front-end, just stick a reference to it in your Arena and vice-versa. Python dynamic attributes to the rescue again!

Passing through SQL

If you want to keep writing SQL, there's nothing stopping you from doing so. If nothing else, it can be a handy way to prototype or migrate an application, and then replace the SQL with Dejavu API calls later on. You'll need to make your deployers aware that you're using SQL directly (and which DB's your SQL runs on), so they don't try to deploy your application with an unsupported store.

To avoid stale data, you should probably flush any sandboxes before running your query, especially if it updates data. You should also run the following snippet to flush CachingProxy or BurnedProxy SM's:

for store in arena.stores.itervalues():
    if hasattr(store, "sweep_all"):
        store.sweep_all()

Once you've got a clean slate, obtain a reference to the StorageManager. It should always be a subclass of dejavu.storage.db.StorageManagerDB:

>>> s = arena.stores['MControl']    # or...
>>> s = arena.stores.itervalues[0]  # or...
>>> s = arena.storage(cls)
>>> s.__class__.__mro__
(<class 'dejavu.storage.storeado.StorageManagerADO_MSAccess'>,
 <class 'dejavu.storage.storeado.StorageManagerADO'>,
 <class 'dejavu.storage.db.StorageManagerDB'>,
 <class 'dejavu.storage.StorageManager'>,
 <type 'object'>)

Then, use the extension methods built into StorageManager classes' "db" attribute to get data:

>>> rows, cols = s.db.fetch("SELECT djvFields.Value, Count(djvCity.ID) AS NumCities "
                            "FROM djvFields LEFT JOIN djvCity ON djvFields.ID = "
                            "djvCity.Field GROUP BY djvFields.Value")
>>> cols     # [(name, type), (name, type), ...]
[(u'Value', 202), (u'NumCities', 3)]
>>> rows
[(u'Baja California', 3), (u'Ciudad Juarez', 1), (u'Puerto Penasco', 0), (u'Yucatan Peninsula', 1)]

...or update:

>>> s.execute("UPDATE djvFields SET ShortCode = Left(Value, 1) WHERE ShortCode Is Null;")

There are a lot of other things you can do with the builtin Database, Table, Column, and SQLDecompiler classes. Feel free to open them up in an interactive session and explore. All of the RDBMS Storage Managers are built on top of an independent SQL layer (in storage/geniusql.py) that knows nothing about units or arenas (but it does understand Expressions).

Custom Storage Managers

In most cases, you will add new functionality to Dejavu itself by creating a custom Storage Manager, whether for a new backend, or a custom middleware component. Storage Managers must conform to a simple interface for creating, destroying, and recalling Units. They are free to implement that functionality however they like.

As you can see in the code, the storage.StorageManager base class requires you to override most of its methods.

Generic Database Wrappers

Writing a Storage Manager for a database is relatively straightforward, mostly because Dejavu doesn't have complicated storage interfaces or demands. If you find your application depends heavily upon using advanced features of a particular database, or upon hand-crafted SQL, then Dejavu is not for you or your application. A Dejavu SM module for a database usually includes:

  1. Adapters, which coerce values from Python types to database types and back again. Base classes for DB Adapters can be found in dejavu.storage.db.
  2. An SQLDecompiler, which converts dejavu Expression objects (essentially, Python lambdas) into SQL.
  3. A subclass of geniusql.Database, which handles requests to SELECT data using the above two components, as well as ALTER TABLE, etc.
Adapters

Generally, you will end up with three kinds of Adapters (subclasses of storage.Adapter): one for converting Dejavu types to your database types, another for the reverse (storage.db.AdapterFromDB), and probably a third to insert Dejavu values (with proper quoting, etc.) into SQL statements for your database (storage.db.AdapterToSQL). The Adapter class provides a single public method, coerce(self, value, dbtype, pytype), which takes any value and attempts to return a new value.

adapter.coerce() handles a request by calling a sibling method (that is, a method of the same subclass). Therefore, you need to add methods to your Adapter for each Python type you wish to support. For example, if you wish to coerce Python ints to INTEGER, you need to add the following method to your Adapter subclass:

    def coerce_int_to_any(self, value):
        return str(value)
Methods are named coerce_type1_to_type2, where 'type1' and 'type2' are type names, one of them a Python type and the other a database type. If your type name has dots in it, they will be converted to underscores. If either of the type names is 'any', that method will be used if no more-specific coercion method exists. Again, you can most likely use methods in the base Adapter classes provided.

Your coercion method should receive a single value and return that value, coerced to a type. An outbound adapter coerces from Python types to database types. You supply a Dejavu UnitProperty value to coerce, and the appropriate coercion method will be selected based upon the type() of that value. An inbound adapter, on the other hand, coerces from DB types to Python types. Call coerce with your database value and the valuetype argument, which is then used to call the appropriate coercion method. That method returns the value, coerced to type(valuetype), which the UnitProperty expects.

If coerce cannot find a method for the appropriate Python type, it errors, and rightly so. Don't let these errors pass silently! An earlier version of Dejavu had a "default" coercion method, which was a Bad Idea. Don't replicate it.

Decompiler

The SQLDecompiler is the tricky bit of any Storage Manager. You must receive a Unit class and an Expression, and produce valid SQL for your database from both. For example, given:

unitClass = Things
expr = logic.Expression(lambda x: x.Group == 3)
...your decompiler should produce something like:
"SELECT * FROM [djvThings] WHERE [djvThings].[Group] = 3"

The above example may seem trivial to you, but add in proper quoting, diverse datatypes (like dates and decimals), complex operators (like 'in', 'Like', and 'ieq'), logic functions (like today() and iscurrentweek()), null queries, and just-in-time keyword args, and it becomes complex very quickly. You are, in effect, writing a mini-ORM.

But, don't despair. Dejavu provides you with tools to make this task easier:

  1. The most important tool is geniusql.SQLDecompiler, a complete base class. You should be able to tweak it for most databases with a couple of SQL syntax changes.
  2. SQLDecompiler is built on a simple Visitor-style base class, codewalk.LambdaDecompiler. More complicated extensions are easily added to this base class; each bytecode in the Expression (Python lambda) gets its own method call.
  3. You don't have to handle globals or cell references within the lambda--when the lambda gets wrapped in an Expression, all free variables are converted to constants.
  4. You aren't forced to handle every possible operator, function, or term in SQL. The base SQLDecompiler doesn't; when it encounters a function it can't handle, for example, it punts by flagging the SQL as imperfect. This signals the Storage Manager to run each Unit through the lambda (in pure Python) before yielding it back to the caller. In fact, you can start writing your Storage Manager without a decompiler at all! Just return all stored Units of the given class and use the Expression to filter whole Units. Then, when your SM works, add a decompiler.
Database/Table/IndexSet

You'll need a subclass of geniusql.Database. Override the container methods (like __setitem__ and __delitem__). For most popular databases, these are pretty straightforward. Some notes:

Database SM's should also define the methods create_database() and drop_database(), if possible.

Use dejavu/test/zoo_fixture.py to test your new Storage Manager. Copy one of the (very short) test_store* modules for the other SM's, and make the necessary changes for your SM. All of the heavy lifting of the tests is done in zoo_fixture.

Legacy Database Wrappers

Sometimes you do not have complete control over the database you want to reference. In that case, you should probably still write a custom Storage Manager, Adapters, and a Decompiler. Often, you can get away with providing a simple column-to-Unit mapping to use as you decompile. I've built one, for example, to wrap The Raiser's Edge (third-party fundraising software). My Dejavu model manages directory records and income without regard for the underlying database; a custom Storage Manager maps between that ideal model and the Raiser's Edge API. This allows me to integrate data from RE with our custom inventory, invoice, and scheduling software.

One of the more important parts of wrapping existing tables is getting your pretty Python names mapped to ugly database names. Do this by making a custom Database: override the column_name and table_name methods to do the mapping.

Other Serialization Mechanisms

sockets

There's a sockets module in the storage package. It does simple serialization of Units across a socket, so you can run Dejavu in its own process, separate from your front end. I had to do this with a third-party database, which couldn't handle web-traffic threading models. Here's a snippet of how to use it (from that app):

def query(self, cmd, unitType='', data=None):
    if isinstance(data, dejavu.Unit):
        data = stream(data)
    elif data is None:
        data = ''
    else:
        data = pickle.dumps(data)
    response = self.socket.query(":".join((cmd, unitType, data)))
    return response