Introducing Formish

Note

This introduction was written for a blog post. Please send any feedback or wishes for expansion to developers@ish.io

Matt Goodall and I had previously worked with the Nevow web framework for a lot of our software development work. We had developed a form library, called Formal, that had been reasonably well accepted and that had worked extremely well for most of our applications. However, working with Nevow/Twisted meant that there were a lot of things we couldn’t do that we would have liked.

When we moved over to a new project, one of the things we wanted to bring with us was the way we worked with forms. The library, now called formish, is just about ready for production use and I wanted to demonstrate some of its features and discuss some of the philosophy behind it.

A big goal for the form library was to break the problem down into its discrete components:

  • Templating
  • Validation
  • Data Schema
  • Data Conversion

The libraries that address each of these should work independently of one another and hopefully could be used in a project that has nothing to do with forms (apart from the templating bit).

Here is a diagram that tries to show the data flow through the widget structure.

[Diagram: formish data flow (_static/images/graphic/formish-dataflow.png)]

We’ve tried to keep the data flow as layered as possible. In this diagram the rounded boxes represent the changing data representation and the vertical dotted lines mark the different library code that handles each transformation. Don’t worry too much about ‘pre_parse_request’; at the moment it’s used to pre-munge some aspects of the request data to make the process as symmetric as possible.

Data Schema

Any form has to work on a data structure of some sort and it makes sense for this not to be tied in any way to web data. We built a fairly lightweight schema library called schemaish, which defines data structures and allows metadata (such as field titles and descriptions) to be assigned. Validation can also be added to each node in a schema (see later). Here are a few examples of creating schema structures.

import schemaish

schema_item = schemaish.Integer()

# or

my_schema = schemaish.Structure()
my_schema.add( 'name', schemaish.String() )
my_schema.add( 'age', schemaish.Integer() )

# or

class MyStructure(schemaish.Structure):
    name = schemaish.String()
    age = schemaish.Integer()

Each of these schemas can be validated; however, we don’t have any validators yet. We looked for a reusable validator library that was simple in execution (i.e. just callable!) but there weren’t any that really fitted the bill. We tried FormEncode, but it seems to conflate validation and conversion, which we don’t think is quite right (personal opinion, of course).

Validatish

Validation should be simple: just call a validator with a value and it should either succeed or raise an exception. Obviously some validators need to be configured, so you should either pass some configuration variables to a function or instantiate a validator object that has a __call__ method.

In our validatish package, we created two submodules, with the function validators in one (validatish.validate) and the class validators (which use the function validators internally) in another (validatish.validator).

Here is an example of a function based validator

def is_string(v):
    """ checks that the value is an instance of basestring """
    if v is None:
        return
    msg = "must be a string"
    if not isinstance(v,basestring):
        raise Invalid(msg)

And here is its matching class-based version. We recommend using the class-based validators all of the time for consistency (you can’t use a function-based validator directly if it needs configuring, because schemaish expects a callable that takes a single argument).

class String(Validator):
    def __call__(self, v):
        validate.is_string(v)

Note

If a value is None, then the validation is not applied (applying it would imply a required constraint as well).

So, now we can pass a validator into one of our schema instances

>>> import schemaish
>>> from validatish import validator

>>> schema = schemaish.String(validator=validator.String())

>>> try:
...     schema.validate(10)
...     print 'success!'
... except schemaish.Invalid, e:
...     print 'error',e.error_dict
...
error {'': 'must be a string'}

>>> try:
...     schema.validate('foo')
...     print 'success!'
... except schemaish.Invalid, e:
...     print 'error',e.error_dict
...
success!

Note

Validators do not return any value on success (or more correctly they return None).

If we apply validators to multiple items in a structure, we can validate them all in one go.

>>> import schemaish
>>> from validatish import validator

>>> schema = schemaish.Structure()
>>> schema.add('name', schemaish.String(validator=validator.String()))
>>> schema.add('age', schemaish.Integer(validator=validator.Range(min=18)))

>>> try:
...     schema.validate({'name': 6, 'age': 17})
...     print 'success!'
... except schemaish.Invalid, e:
...     print 'error',e.error_dict
...
error {'age': 'must be greater than 18', 'name': 'must be a string'}

>>> try:
...     schema.validate({'name': 'John Drake', 'age': 28})
...     print 'success!'
... except schemaish.Invalid, e:
...     print 'error',e.error_dict
...
success!
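To show how a structure might gather every field's failure into a single error_dict, here is a minimal sketch in plain Python. It does not use schemaish or validatish; Invalid, validate_structure and the two validators are hypothetical stand-ins, and the error messages are our own.

```python
class Invalid(Exception):
    """Stand-in exception carrying a dict of field name -> message."""

    def __init__(self, error_dict):
        Exception.__init__(self, error_dict)
        self.error_dict = error_dict


def must_be_string(value):
    # None is skipped so that "required" stays a separate concern
    if value is not None and not isinstance(value, str):
        raise ValueError('must be a string')


def at_least_18(value):
    if value is not None and value < 18:
        raise ValueError('must be greater than or equal to 18')


def validate_structure(validators, data):
    """Run every field validator and collect all failures before raising."""
    errors = {}
    for name, validator in validators.items():
        try:
            validator(data.get(name))
        except ValueError as e:
            errors[name] = str(e)
    if errors:
        raise Invalid(errors)


validators = {'name': must_be_string, 'age': at_least_18}
try:
    validate_structure(validators, {'name': 6, 'age': 17})
    collected = {}
except Invalid as e:
    collected = e.error_dict  # both failures are reported together
```

The point of collecting rather than stopping at the first failure is that a form can then show every problem to the user in one pass.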

Because validators are just callables, they are very easy to write and adding validators to groups of items or sequences is simple. We’ve implemented Any and All validators (thanks Ian!) that work similarly to FormEncode’s to allow grouping of rules. We’re hoping to expand on the validators but not until we have a requirement (either from ourselves or from someone hoping to use the package). We’ve learned from experience to plan ahead but not to build ahead of requirements.
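Because a validator is only a callable, grouping rules needs very little machinery. The following is a rough sketch of how All and Any style compositors could be written; it is not the validatish implementation, and Invalid, message_for and the three small validators are names made up for the illustration.

```python
class Invalid(Exception):
    """Stand-in for the exception a validator raises on failure."""


def is_integer(v):
    if v is not None and not isinstance(v, int):
        raise Invalid('must be an integer')


def is_string(v):
    if v is not None and not isinstance(v, str):
        raise Invalid('must be a string')


def is_positive(v):
    if v is not None and v <= 0:
        raise Invalid('must be positive')


class All(object):
    """Passes only if every wrapped validator passes."""

    def __init__(self, *validators):
        self.validators = validators

    def __call__(self, value):
        for validator in self.validators:
            validator(value)  # the first failure propagates


class Any(object):
    """Passes as soon as one wrapped validator passes."""

    def __init__(self, *validators):
        self.validators = validators

    def __call__(self, value):
        messages = []
        for validator in self.validators:
            try:
                validator(value)
                return
            except Invalid as e:
                messages.append(str(e))
        raise Invalid(' or '.join(messages))


def message_for(validator, value):
    """Return the failure message, or None when the value is accepted."""
    try:
        validator(value)
    except Invalid as e:
        return str(e)
    return None


positive_integer = All(is_integer, is_positive)
integer_or_string = Any(is_integer, is_string)
```

Since All and Any are themselves callables taking a single value, they can be nested inside each other or passed anywhere a plain validator is expected.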

The Form and the Widget

So far, everything we’ve shown has had nothing to do with forms. Let’s change that. First of all we need to define the form. This is fairly simple with formish because most of the work has been done in schemaish.

Using the form definition with age and name from above, we create a form by passing in the schema.

>>> import formish
>>> form = formish.Form(schema)

And that is it. If you want to render the form now, you just call it (we’ve implemented a default Mako renderer for testing).

>>> form()
'\n<form id="form" action="" class="formish-form" method="post" enctype="multipart/form-data" accept-charset="utf-8">\n\n  <input type="hidden" name="_charset_" />\n  <input type="hidden" name="__formish_form__" value="form" />\n\n<div id="form-name-field" class="field string input">\n\n<label for="form-name">Name</label>\n\n\n<div class="inputs">\n\n<input id="form-name" type="text" name="name" value="" />\n\n</div>\n\n\n\n\n\n</div>\n\n<div id="form-age-field" class="field integer input">\n\n<label for="form-age">Age</label>\n\n\n<div class="inputs">\n\n<input id="form-age" type="text" name="age" value="" />\n\n</div>\n\n\n\n\n\n</div>\n\n\n  <div class="actions">\n      <input type="submit" id="form-action-submit" name="submit" value="Submit" />\n  </div>\n\n</form>\n\n'
Let’s tidy that up a little:
<form id="form" action="" class="formish-form" method="post" enctype="multipart/form-data" accept-charset="utf-8">
  <input type="hidden" name="_charset_" />
  <input type="hidden" name="__formish_form__" value="form" />
  <div id="form-name-field" class="field string input">
    <label for="form-name">Name</label>
    <div class="inputs">
      <input id="form-name" type="text" name="name" value="" />
    </div>
    <span class="error"></span>
  </div>
  <div id="form-age-field" class="field integer input">
    <label for="form-age">Age</label>
    <div class="inputs">
      <input id="form-age" type="text" name="age" value="" />
    </div>
    <span class="error"></span>
  </div>
  <div class="actions">
    <input type="submit" id="form-action-submit" name="submit" value="Submit" />
  </div>
</form>

Without defining any widgets, formish just uses some defaults. Let’s take a look at the default widget to find out what it is doing.

class Widget(object):

    _template = None

    def __init__(self, **k):
        self.converter_options = k.get('converter_options', {})
        self.css_class = k.get('css_class', None)
        self.converttostring = True
        if not self.converter_options.has_key('delimiter'):
            self.converter_options['delimiter'] = ','

    def pre_render(self, schema_type, data):
        string_data = string_converter(schema_type).from_type(data)
        if string_data is None:
            return ['']
        return [string_data]

    def pre_parse_request(self, schema_type, request_data):
        return request_data

    def convert(self, schema_type, request_data):
        return string_converter(schema_type).to_type(request_data[0])

    def __call__(self, field):
        return field.form.renderer('/formish/widgets/%s.html'%self._template, {'f':field})

This is the base class, which shows how widgets work. First of all we have a couple of attributes to do with converter options (which we’ll come back to in a moment). The four methods are at the heart of formish, though.

pre_render

Before a widget is rendered, the input data is converted from its schema type to raw request data. The data passed to pre_render is just that field’s data.

convert

Takes the request data for the field and converts it to the schema type.

__call__

And finally, if you want the widget to render, just call it! That’s it. So we have a path from data -> request data and back from request data -> data.

Oh, I left out one.

pre_parse_request

When a field is submitted, the request data can be munged to try to enforce some sort of symmetry between input request data and output request data. This is only really used for file uploads where the field storage is extracted to a temporary location before passing the request data to convert. So, for most cases just ignore this.
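The round trip described by these methods can be sketched without formish at all. In the toy below, SketchWidget and IntegerConverter are hypothetical stand-ins (the real widget takes a schema type and looks its converter up); the only aim is to show typed data going out as a list of strings and coming back as typed data.

```python
class IntegerConverter(object):
    """Toy converter between an int (or None) and its string form."""

    def from_type(self, value):
        return '' if value is None else str(value)

    def to_type(self, value):
        value = value.strip()
        return None if value == '' else int(value)


class SketchWidget(object):
    """Mirrors the three data-handling methods of the base widget."""

    def __init__(self, converter):
        self.converter = converter

    def pre_render(self, data):
        # typed data -> request-style data (always a list of strings)
        return [self.converter.from_type(data)]

    def pre_parse_request(self, request_data):
        # nothing to tidy up for a simple text input
        return request_data

    def convert(self, request_data):
        # request-style data -> typed data
        return self.converter.to_type(request_data[0])


widget = SketchWidget(IntegerConverter())
rendered = widget.pre_render(28)
parsed = widget.convert(widget.pre_parse_request(rendered))
```

Keeping the request side as a list of strings, even for a single input, is what lets widgets with several inputs follow the same path.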

Convertish

You can see from the example that the main conversion process is done using string_converter. This is one of the converter types in convertish and maps any of the schemaish types into a consistent string representation. It does so using peak.rules (although we could be convinced otherwise) and each string_converter implements a from_type and a to_type. For example:

class IntegerToStringConverter(Converter):
    cast = int

    def from_type(self, value, converter_options={}):
        if value is None:
            return ''
        return str(value)

    def to_type(self, value, converter_options={}):
        if value == '':
            return None
        value = value.strip()
        try:
            value = self.cast(value)
        except ValueError:
            raise ConvertError("Not a valid number")
        return value

So we short circuit None values [1], strip the data and cast it to the right type and raise a conversion exception if it fails.
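To illustrate how a call like string_converter(schema_type) can pick the right converter, here is a small sketch that swaps peak.rules for a plain dictionary keyed on the schema class. Integer, Boolean, the two converter classes and the registry are all hypothetical stand-ins, not the convertish code.

```python
class ConvertError(Exception):
    """Stand-in for the conversion failure exception."""


class Integer(object):
    """Stand-in for a schema type."""


class Boolean(object):
    """Stand-in for a schema type."""


class IntegerToString(object):
    def from_type(self, value):
        return '' if value is None else str(value)

    def to_type(self, value):
        value = value.strip()
        if value == '':
            return None
        try:
            return int(value)
        except ValueError:
            raise ConvertError('Not a valid number')


class BooleanToString(object):
    def from_type(self, value):
        if value is None:
            return ''
        return 'True' if value else 'False'

    def to_type(self, value):
        value = value.strip()
        if value == '':
            return None
        if value not in ('True', 'False'):
            raise ConvertError('%r should be either True or False' % value)
        return value == 'True'


# A plain dict stands in for rule-based dispatch on the schema type
_registry = {Integer: IntegerToString, Boolean: BooleanToString}


def string_converter(schema_type):
    """Find a converter for the schema instance, honouring subclasses."""
    for cls in type(schema_type).__mro__:
        if cls in _registry:
            return _registry[cls]()
    raise LookupError('no string converter for %r' % (schema_type,))
```

Walking the class hierarchy means a subclass of a schema type picks up its parent's converter unless it registers its own.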

The widget templates

So we now have a form defined and an example of a simple widget. Let’s take a look at how formish renders its widgets and the other bits involved in creating a form. We render a form by calling it, so form() produces the templated output. Calling a form just passes the form to the form.html template, which is as follows. We only have Mako templates at the minute, but we’ve designed formish with simple templating features in mind, so adding other templating languages should be simple.

form.html

${form.header()|n}
${form.metadata()|n}
${form.fields()|n}
${form.actions()|n}
${form.footer()|n}

So the form template just calls each individual part. Here are the templates for each part (I’ve combined them together and separated them by comments to save on space).

form_header.html

<%
if form.action_url:
    action_url = form.action_url
else:
    action_url = ''
%>
<form id="${form.name}" action="${action_url}" class="formish-form"
     method="post" enctype="multipart/form-data" accept-charset="utf-8">

form_metadata.html

<input type="hidden" name="_charset_" />
<input type="hidden" name="__formish_form__" value="${form.name}" />

form_fields.html

%for f in form.fields:
${f()|n}
%endfor

form_actions.html

<div class="actions">
%if form._actions == []:
  <input type="submit" id="${form.name}-action-submit" name="submit" value="Submit" />
%else:
  %for action in form._actions:
  <input type="submit" id="${form.name}-action-${action.name}"
       name="${action.name}" value="${action.label}" />
  %endfor
%endif
</div>

form_footer.html

</form>

The most complicated part is probably the actions because of the default submit action applied if no explicit actions are given.

Most values are available as attributes on the form such as form.name and action.label.

More interesting is how each field is rendered.

<div id="${field.cssname}-field" class="${field.classes}">
${field.label()|n}
${field.inputs()|n}
${field.error()|n}
${field.description()|n}
</div>

So each field is built in the same way as the main form. Here are the parts used.

field_label.html

<%page args="field" />
% if field.widget._template != 'Hidden':
<label for="${field.cssname}">${field.title}</label>
%endif

field_inputs.html

<%page args="field" />
<div class="inputs">
${field.widget()|n}
</div>

field_error.html

<%page args="field" />
% if field.error:
<span class="error">${unicode(field.error)}</span>
% endif

field_description.html

<%page args="field" />
% if str(field.description) != '':
<span class="description">${field.description}</span>
% endif

Here we can see that each part of the template uses the field’s attributes and methods to render itself. Finally, here is the standard Input widget.

Input/widget.html

<%page args="field" />
<input id="${field.cssname}" type="text"
       name="${field.name}" value="${field.value[0]}" />

This does seem a little excessive though.. Why have all of these little template components? Doesn’t it make things more complicated? (note to self.. stop talking to yourself)

Actually for most users, there will be no exposure to these components. However, as soon as you want to create a custom form template, instead of adding ${form()} to your template to get totally automatic form production, you can do the following.

${form.header()|n}
${form.metadata()|n}

${form['firstName']()|n}

<div id="${form['surname'].cssname}-field" class="${form['surname'].classes}">
  <strong>${form['surname'].description}</strong>
  <em>${form['surname'].error}</em>
  ${form['surname'].widget()|n}
</div>

${form.actions()|n}
${form.footer()|n}

This allows you to pick your own level of control, from totally automatic, through partial overriding, to totally custom form components.

Each part of the form can be overridden by using a local formish template directory, which allows you to provide your own suite of templates.

We’re hoping to add the ability to pass in which fields to render and also individual custom templates too. Something like the following:

${form.fields(form.fieldlist[:4])|n}

<div id="${form['surname'].cssname}-field" class="${form['surname'].classes}">
  <strong>${form['surname'].description}</strong>
  <em>${form['surname'].error}</em>
  ${form['surname'].widget()|n}
</div>


${form.fields(form.fieldlist[6:])|n}

What we’re doing here is just passing the names of the fields we want to render to form.fields. In its simplest form it would be form.fields( ['name','age'] ) but we could easily use list comprehensions, filters, etc.
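Since this part of the API is still only a wish, the following is just a sketch of the idea in plain Python: render whichever fields are named, in the order given. FieldSketch and render_fields are hypothetical and have nothing to do with the eventual formish implementation.

```python
class FieldSketch(object):
    """Hypothetical stand-in for a form field: calling it renders it."""

    def __init__(self, name):
        self.name = name

    def __call__(self):
        return '<input name="%s" />' % self.name


def render_fields(fields, names=None):
    """Render the named fields in the order given (all fields by default)."""
    by_name = dict((f.name, f) for f in fields)
    if names is None:
        names = [f.name for f in fields]
    return '\n'.join(by_name[name]() for name in names)


fieldlist = ['name', 'age', 'email', 'comments']
fields = [FieldSketch(n) for n in fieldlist]

# A slice, as in the template example above
first_two = render_fields(fields, fieldlist[:2])

# A list comprehension used as a filter
without_email = render_fields(fields, [n for n in fieldlist if n != 'email'])
```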

If you want to see a few more examples of formish capabilities, have a look at http://ish.io:8891.

Using your form

In order to use the form that you’ve created, you pass the request.POST data to it and either check the results or pass failure and success callbacks. Here is an example using the callbacks (in this case self.html is called on failure and self.thanks on success).

class SimpleSchema(schemaish.Structure):
    email = schemaish.String(validator=schemaish.All(schemaish.NotEmpty, schemaish.Email))
    first_names = schemaish.String(validator=schemaish.NotEmpty)
    last_name = schemaish.String(validator=schemaish.NotEmpty)
    comments = schemaish.String()


def get_form():
    form = formish.Form(SimpleSchema())
    form['comments'].widget = formish.TextArea()
    return form

class Root(resource.Resource):

    @resource.GET()
    @templating.page('test.html')
    def html(self, request, form=None):
        if form is None:
            form = get_form()
        return {'form': form}

    @resource.POST()
    def POST(self, request):
        return get_form().validate(request, self.html, self.thanks)

    @templating.page('thanks.html')
    def thanks(self, request, data):
        return {'data': data}

These examples use the restish WSGI framework, but because the form just works with dictionaries it’s simple to integrate into any web framework.
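To make the callback pattern concrete without any web framework, here is a minimal sketch that works on a plain dictionary of posted strings. The validate function, its converters argument and the two callbacks are hypothetical; they only mirror the shape of the call above (failure callback first, then success).

```python
def to_int(text):
    """Convert a posted string to an int with a friendly error message."""
    try:
        return int(text.strip())
    except ValueError:
        raise ValueError('Not a valid number')


def validate(request_data, converters, failure, success):
    """Convert each posted value, then hand off to one of the callbacks."""
    data, errors = {}, {}
    for name, convert in converters.items():
        try:
            data[name] = convert(request_data.get(name, ''))
        except ValueError as e:
            errors[name] = str(e)
    if errors:
        return failure(errors)
    return success(data)


def on_failure(errors):
    # in a real application this would re-render the form with its errors
    return ('form', errors)


def on_success(data):
    # in a real application this would store the data and say thanks
    return ('thanks', data)


converters = {'name': str, 'age': to_int}
ok = validate({'name': 'John Drake', 'age': '28'}, converters, on_failure, on_success)
bad = validate({'name': 'John Drake', 'age': 'old'}, converters, on_failure, on_success)
```

Nothing here knows about requests or responses, which is why the same flow can sit behind any framework that can hand over a dictionary.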

What else is there?

Well, we’ve spent a lot of time trying to get file upload fields to work in a friendly fashion so that we can have nested sequences of file uploads with temporary storage, image resizing and caching right out of the box. We’ve worked hard to make sequences work well and have tested nested lists of nested lists of structures and file uploads, selects, etc. If you can think of a data structure made of lists and dictionaries, formish will represent it. Sequences currently use jQuery to add, remove and reorder, although we’ll have non-javascript support in the next few weeks.

Anything else interesting?

Date Parts Widget

Well, a dateparts widget is vaguely interesting as the converter methodology doesn’t work using the standard string converter.

The widget would have to convert the three fields into a string date representation first before passing it to convertish to cast it to the correct schema type and then to validatish for validation.

However, we would then have a widget doing conversion, which we were hoping to avoid. The only reason we would be forced into doing this is the string_converter choice, and we can use any type of converter we like. For our DateParts widget we have used a DateTupleConverter, which means that the widget just passes the three values as a tuple to convertish, which can raise convert errors against individual widget input boxes if required.
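The idea of a tuple-based converter can be sketched as follows. This is not the convertish DateTupleConverter; DatePartsConverter and ConvertError are hypothetical stand-ins, and for brevity the sketch raises one error for the whole date rather than against an individual input box.

```python
import datetime


class ConvertError(Exception):
    """Stand-in for the conversion failure exception."""


class DatePartsConverter(object):
    """Converts between a date and a (year, month, day) tuple of strings."""

    def from_type(self, value):
        if value is None:
            return ('', '', '')
        return (str(value.year), str(value.month), str(value.day))

    def to_type(self, value):
        year, month, day = [part.strip() for part in value]
        if year == month == day == '':
            return None  # nothing entered at all
        try:
            return datetime.date(int(year), int(month), int(day))
        except ValueError:
            # covers non-numeric parts, partly filled input and impossible dates
            raise ConvertError('Invalid date')


converter = DatePartsConverter()
as_parts = converter.from_type(datetime.date(2008, 2, 3))
as_date = converter.to_type(('2008', '2', '3'))
```

The widget stays free of conversion logic: it only gathers the three input values into a tuple and hands them over.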

Fancy Converters

Because we can apply a widget to structures or sequences of items, we thought “How about a schema that is a sequence of sequences? This sounds like a csv. Let’s map a TextArea to this and apply a SequenceOfSequences converter”. So the following gives you a csv TextArea following the same layered patterns shown in the diagram at the start of this post.

This produces the widget shown here. In this instance, the field is expecting an isoformat date for the third item in the tuple, so the following data would work.

1,2,2008-2-3
4,5,2008-1-3

Note

I mentioned converter_options as one of the parameters that each widget can take. This can be used in the conversion process to guide the type of conversion. In the csv parsing case, you can tell the converter what separator to use for instance.
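As a rough illustration of a sequence-of-sequences conversion guided by converter_options, the sketch below turns TextArea-style text into a list of typed tuples using only the standard library. rows_from_text, parse_date and the casts argument are hypothetical; the real converter works from the schema rather than from a list of casting functions.

```python
import csv
import datetime
import io


def parse_date(text):
    """Parse a year-month-day string such as 2008-2-3 into a date."""
    year, month, day = [int(part) for part in text.split('-')]
    return datetime.date(year, month, day)


def rows_from_text(text, casts, converter_options=None):
    """Split the text into rows and cast each cell with the matching function."""
    options = converter_options or {}
    delimiter = options.get('delimiter', ',')  # same default as the base widget
    rows = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        rows.append(tuple(cast(cell.strip()) for cast, cell in zip(casts, row)))
    return rows


casts = (int, int, parse_date)
rows = rows_from_text("1,2,2008-2-3\n4,5,2008-1-3\n", casts)
semicolon_rows = rows_from_text("1;2;2008-2-3", casts, {'delimiter': ';'})
```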

File Upload

File uploads are notoriously difficult to use in forms. The persistence of uploaded data before the form is finished is messy, and a consistent preview that works both for this temporary persistence and for your final store once you’ve implemented it is not straightforward. Formish needs three things for file uploads (and provides defaults for all of them).

FileHandler

The filehandler’s job is to persist file uploads up until a form is successfully completed. The FileUpload widget asks the filehandler to store the file and to remove it after the process has finished. If you want to access the file, it will also give you a direct path to it and a mimetype.

class TempFileHandler(FileHandlerMinimal):
    """
    File handler using python tempfile module to store file
    """

    def __init__(self):
        self.prefix = tempfile.gettempprefix()
        self.tempdir = tempfile.gettempdir()


    def store_file(self, fieldstorage):
        """
        Given a filehandle, store the file and return an identifier, in this
        case the original filename
        """
        fileno, filename = tempfile.mkstemp( \
                        suffix='%s-%s'% (uuid.uuid4().hex,fieldstorage.filename))
        filehandle = os.fdopen(fileno, 'wb')
        filehandle.write(fieldstorage.value)
        filehandle.close()
        filename = ''.join( filename[(len(self.tempdir)+len(self.prefix)+1):] )
        return filename

    def delete_file(self, filename):
        """
        remove the tempfile
        """
        filename = '%s/%s%s'% (self.tempdir, self.prefix, filename)
        os.remove(filename)

    def get_path_for_file(self, filename):
        """
        given the filename, get the path for the temporary file
        """
        return '%s/%s%s'% (self.tempdir, self.prefix, filename)

    def get_mimetype(self, filename):
        """
        use python-magic to guess the mimetype or use application/octet-stream
        if no guess
        """
        mimetype = magic.from_file('%s/%s%s'%(self.tempdir,self.prefix,filename),mime=True)
        return mimetype or 'application/octet-stream'
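The store, look up and delete life cycle can be tried out with the standard library alone. The sketch below is not the formish handler: SketchTempFileHandler is a hypothetical class that takes a file name and raw bytes instead of a fieldstorage, keeps its files in a private temporary directory, and swaps python-magic for the mimetypes module (which guesses from the file extension only).

```python
import mimetypes
import os
import tempfile
import uuid


class SketchTempFileHandler(object):
    """Keeps uploads in a private temporary directory until the form succeeds."""

    def __init__(self):
        self.tempdir = tempfile.mkdtemp()

    def store_file(self, original_name, content):
        # a random prefix keeps two uploads of the same file name apart
        identifier = '%s-%s' % (uuid.uuid4().hex, os.path.basename(original_name))
        with open(self.get_path_for_file(identifier), 'wb') as f:
            f.write(content)
        return identifier

    def get_path_for_file(self, identifier):
        return os.path.join(self.tempdir, identifier)

    def get_mimetype(self, identifier):
        guess, _encoding = mimetypes.guess_type(identifier)
        return guess or 'application/octet-stream'

    def delete_file(self, identifier):
        os.remove(self.get_path_for_file(identifier))


handler = SketchTempFileHandler()
identifier = handler.store_file('photo.png', b'not really a png')
path = handler.get_path_for_file(identifier)
mimetype = handler.get_mimetype(identifier)
```

Once the form has been processed successfully, handler.delete_file(identifier) removes the temporary copy.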

If you want to serve the files in your web application (and the default FileUpload widget includes a facility for an image_preview box) then you’ll need to use TempFileHandlerWeb, which includes a resource_root and a fileaccessor.

class TempFileHandlerWeb(TempFileHandler):
    """
    Same as the temporary file handler but includes ability to include a resource and a url generator (if you want access to the temporary files on the website, e.g. for previews)
    """

    def __init__(self, resource_root='/filehandler', urlfactory=None, default_url=None):
        TempFileHandler.__init__(self)
        # default_url is accepted as a keyword argument so that the name is defined
        self.default_url = default_url
        self.resource_root = resource_root
        self.urlfactory = urlfactory

    def get_url_for_file(self, identifier):
        """
        Generate a url given an identifier
        """
        if self.urlfactory is not None:
            return self.urlfactory(identifier)
        return '%s/%s'% (self.resource_root, identifier)

This extends the basic tempfile handler to allow you to pass a urlfactory and a root resource for where your assets are mounted.

urlfactory

When an attribute is stored in a database, an image is often represented by a uuid of some sort, or possibly by a directory and a file. urlfactory is used to take the identifier (let’s say ‘foo/bar’) and convert it into something that the file system can work with.
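A urlfactory is just a callable that takes an identifier. As a small sketch, the function below repeats the get_url_for_file logic as a standalone function and pairs it with a hypothetical make_urlfactory helper; the '/assets' mount point is made up for the example.

```python
def make_urlfactory(base_url):
    """Build a urlfactory that mounts every identifier under base_url."""
    def urlfactory(identifier):
        return '%s/%s' % (base_url.rstrip('/'), identifier.lstrip('/'))
    return urlfactory


def get_url_for_file(identifier, resource_root='/filehandler', urlfactory=None):
    """Use the urlfactory when one is given, else fall back to resource_root."""
    if urlfactory is not None:
        return urlfactory(identifier)
    return '%s/%s' % (resource_root, identifier)


default_url = get_url_for_file('foo/bar')
custom_url = get_url_for_file('foo/bar', urlfactory=make_urlfactory('/assets/'))
```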

So here is a simple FileUpload widget

schema = schemaish.Structure()
schema.add( 'myFile', schemaish.File() )

form = formish.Form(schema)
form['myFile'].widget = formish.FileUpload(
                           filehandler=formish.TempFileHandlerWeb(),
                           originalurl='/images/nouploadyet.png',
                           show_image_preview=True
                           )

To set up a file resource handler at /filehandler, you could use the following (if you are using restish)

@resource.child()
def filehandler(self, request, segments):
    db = collection.CouchishDB(request)
    fa = couchish.FileAccessor(db)
    fh = formish.TempFileHandler()
    return FileResource(fileaccessor=fa,filehandler=fh)

This looks a little more complicated, and it is. This resource needs to serve files that are already in your application’s persistent storage (the FileAccessor here) and also provide a way of accessing temporary files that have been uploaded while the form is possibly being posted repeatedly before finally succeeding. Don’t worry though: if you’re happy with using temporary file storage for a while, your resource could look like this

@resource.child()
def filehandler(self, request, segments):
    return FileResource()

Then at some point when you get your storage implemented, you could add your own custom fileaccessor.

We’ve got a few file examples working at http://ish.io:8891. (Please don’t upload any multi-megabyte files. I haven’t got a validator on it yet :-)

I don’t think I’ve explained file uploads as well as I could so perhaps I’ll refine this at a later date.

The way forward

At this point we’re using these forms in production code and they are holding up quite well and are very easy to customise. The next steps are probably additional templating customisation options (apply custom template snippets at the template level or at the code level), partial auto filling of forms (like the slicing mentioned above), getting sequences to work without javascript and then adding another templating language (probably Jinja?)

Slightly bigger pieces of work would be trying to implement some form of WSGI asset inclusion (for js, css, images) and also html injection for js and css snippets. This will take a bit of thinking about, but the ToscaWidgets / lxml approach looks interesting.

Other goals..

  • Multi page forms..
  • relational validation (this required only if that not - is possible now but would like to make more intuitive)
  • GET form submissions
  • immutable and plain html versions of templates

Note

Please send any feedback to developers@ish.io

Footnotes

[1] By default the formish widgets equate the empty string with None. This means that if you put a default value of ‘’ into a form, you will get None back. If you want to override this behaviour, set the widget’s empty attribute to something else (e.g. for a date field you might set it to datetime.date.today()).