Home | Trees | Indices | Help |
|
---|
|
object --+ | Cleaner
Instances cleans the document of each of the possible offending elements. The cleaning is controlled by attributes; you can override attributes in a subclass, or set them in the constructor. ``scripts``: Removes any ``<script>`` tags. ``javascript``: Removes any Javascript, like an ``onclick`` attribute. ``comments``: Removes any comments. ``style``: Removes any style tags or attributes. ``links``: Removes any ``<link>`` tags ``meta``: Removes any ``<meta>`` tags ``page_structure``: Structural parts of a page: ``<head>``, ``<html>``, ``<title>``. ``processing_instructions``: Removes any processing instructions. ``embedded``: Removes any embedded objects (flash, iframes) ``frames``: Removes any frame-related tags ``forms``: Removes any form tags ``annoying_tags``: Tags that aren't *wrong*, but are annoying. ``<blink>`` and ``<marque>`` ``remove_tags``: A list of tags to remove. ``allow_tags``: A list of tags to include (default include all). ``remove_unknown_tags``: Remove any tags that aren't standard parts of HTML. ``safe_attrs_only``: If true, only include 'safe' attributes (specifically the list from `feedparser <http://feedparser.org/docs/html-sanitization.html>`_). ``add_nofollow``: If true, then any <a> tags will have ``rel="nofollow"`` added to them. This modifies the document *in place*.
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
|
|||
Inherited from |
|
|||
Inherited from |
|
|
|
Depending on the browser, stuff like ``e x p r e s s i o n(...)`` can get interpreted, or ``expre/* stuff */ssion(...)``. This checks for attempt to do stuff like this. Typically the response will be to kill the entire style; if you have just a bit of Javascript in the style another rule will catch that and remove only the Javascript from the style; this catches more sneaky attempts. |
Home | Trees | Indices | Help |
|
---|
Generated by Epydoc 3.0beta1 on Sun Sep 2 18:12:39 2007 | http://epydoc.sourceforge.net |