Embedded documentation from code

The following is embedded documentation selected from the Roundup code base. You can see the same information using the help function after importing the modules.

Client class

class roundup.cgi.client.Client(instance, request, env, form=None, translator=None)

Instantiate to handle one CGI request.

See inner_main for request processing.

Client attributes at instantiation:

  • “path” is the PATH_INFO inside the instance (with no leading ‘/’)

  • “base” is the base URL for the instance

  • “form” is the cgi form, an instance of FieldStorage from the standard cgi module

  • “additional_headers” is a dictionary of additional HTTP headers that should be sent to the client

  • “response_code” is the HTTP response code to send to the client

  • “translator” is a TranslationService instance

  • “clientnonce” is a unique value for this client connection. Can be used as a nonce for CSP headers and to sign javascript code presented to the browser. This is different from the CSRF nonces and can not be used for anti-csrf measures.

During the processing of a request, the following attributes are used:

  • “db”

  • “_error_message” holds a list of error messages

  • “_ok_message” holds a list of OK messages

  • “session” is deprecated in favor of session_api (XXX remove)

  • “session_api” is the interface to store data in session

  • “user” is the current user’s name

  • “userid” is the current user’s id

  • “template” is the current :template context

  • “classname” is the current class context name

  • “nodeid” is the current context item id

Note: _error_message and _ok_message should not be modified directly. Use add_ok_message and add_error_message instead. By default, these escape the message added to avoid XSS security issues.

User Identification:

Users that are absent in session data are anonymous and are logged in as that user. This typically gives them all Permissions assigned to the Anonymous Role.

Every user is assigned a session. “session_api” is the interface to work with session data.

Special form variables:

Note that in various places throughout this code, special form variables of the form :<name> are used. The colon (“:”) part may actually be either “:” or “@”.

Set a cookie value to be sent in HTTP headers

Parameters:

  • name: cookie name

  • value: cookie value

  • expire: cookie expiration time (seconds). If value is empty (meaning “delete cookie”), expiration time is forced in the past and this argument is ignored. If None, the cookie will expire at end-of-session. If omitted, the cookie will be kept for a year.

  • path: cookie path (optional)
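
The expiry rules for the expire parameter can be sketched as a small standalone helper. This is an illustration only, not Roundup's code: the name cookie_lifetime, the _OMITTED sentinel and the return convention (negative for "delete", None for a session cookie) are assumptions made for the example.

```python
ONE_YEAR = 86400 * 365   # seconds; used when "expire" is omitted
_OMITTED = object()      # hypothetical sentinel meaning "argument not given"


def cookie_lifetime(value, expire=_OMITTED):
    """Return the lifetime in seconds to advertise for a cookie.

    A negative result means "already expired" (delete the cookie),
    None means a session cookie, a positive number is a lifetime.
    """
    if not value:
        # Empty value means "delete cookie"; the expire argument is ignored.
        return -1
    if expire is _OMITTED:
        return ONE_YEAR
    # May be None, in which case the cookie ends with the browser session.
    return expire
```

Using a sentinel rather than None as the default keeps "omitted" and "explicitly None" distinguishable, which the description above requires.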

authenticate_bearer_token(challenge)

Authenticate the bearer token. Refactored from determine_user() to allow it to be overridden if needed.

check_anonymous_access()

Check that the Anonymous user is actually allowed to use the web interface and short-circuit all further processing if they’re not.

clean_sessions()

Deprecated XXX remove

clean_up()

Remove expired sessions and One Time Keys.

Do it only once an hour.

determine_charset()

Look for client charset in the form parameters or browser cookie.

If no charset requested by client, use storage charset (utf-8).

If the charset is found, and differs from the storage charset, recode all form fields of type ‘text/plain’

determine_context(dre=re.compile('([^\\d]+)0*(\\d+)'))

Determine the context of this page from the URL:

The URL path after the instance identifier is examined. The path is generally only one entry long.

  • if there is no path, then we are in the “home” context.

  • if the path is “@@file” (or “_file”), then the additional path entry specifies the filename of a static file we’re to serve up from the instance “html” directory. Raises a SendStaticFile exception.(*)

  • if there is something in the path (eg “issue”), it identifies the tracker class we’re to display.

  • if the path is an item designator (eg “issue123”), then we’re to display a specific item.

  • if the path starts with an item designator and is longer than one entry, then we’re assumed to be handling an item of a FileClass, and the extra path information gives the filename that the client is going to label the download with (ie “file123/image.png” is nicer to download than “file123”). This raises a SendFile exception.(*)

Both of the “*” types of contexts stop before we bother to determine the template we’re going to use. That’s because they don’t actually use templates.

The template used is specified by the :template CGI variable, which defaults to:

  • only classname supplied: “index”

  • full item designator supplied: “item”

We set:

self.classname - the class to display, can be None

self.template - the template to render the current context with

self.nodeid - the nodeid of the class we’re displaying
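
The path rules above can be sketched with the same regular expression that appears as the default dre argument. This is a simplified illustration, not the method itself: classify_path and the kind labels it returns are hypothetical names, and where the real method raises SendStaticFile or SendFile the sketch only returns a label.

```python
import re

# Same pattern as the default "dre" argument in the signature above:
# a non-digit class name, optional leading zeros, then the item id.
DRE = re.compile(r'([^\d]+)0*(\d+)')


def classify_path(path):
    """Return (kind, classname, nodeid, template) for a URL path."""
    parts = [p for p in path.split('/') if p]
    if not parts:
        # Template choice for the home context is not described above.
        return ('home', None, None, None)
    if parts[0] in ('@@file', '_file'):
        return ('static-file', None, None, None)
    match = DRE.match(parts[0])
    if match is None:
        # Only a class name was supplied: default template is "index".
        return ('class', parts[0], None, 'index')
    classname, nodeid = match.group(1), match.group(2)
    if len(parts) > 1:
        # Item designator plus extra path: a FileClass download.
        return ('file-download', classname, nodeid, None)
    # Full item designator: default template is "item".
    return ('item', classname, nodeid, 'item')
```

Note that the 0* in the pattern drops leading zeros, so “issue0042” and “issue42” name the same item.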

determine_language()

Determine the language

determine_user(is_api=False)

Determine who the user is

expire_exposed_keys(method)

A nonce is used with a method it should not be. If the nonce exists, report to admin so they can fix the nonce leakage and destroy it. (Nonces used in a get are more exposed than those used in a post.) If nonce exists in the database, report the referer and origin headers to try to find where this comes from so it can be fixed. If nonce doesn’t exist just ignore it. If we reported invalid nonces, somebody could spam us with a ton of invalid keys and fill up the logs.

Use ?@csrf=key in a GET, HEAD, or OPTIONS request to test this code. Python’s http server library will not parse Content sent via one of these methods. So smuggle it via a query string when testing.

handle_action()

Determine whether there should be an Action called.

The action is defined by the form variable :action which identifies the method on this object to call. The actions are defined in the “actions” sequence on this class.

Actions may return a page (by default HTML) to return to the user, bypassing the usual template rendering.

We explicitly catch Reject and ValueError exceptions and present their messages to the user.

handle_csrf(api=False)

Handle csrf token lookup and validate current user and session

If the config.ini setting: WEB_USE_TOKENLESS_CSRF_PROTECTION is enabled, this routine returns the result from handle_csrf_tokenless() and doesn’t use Nonces at all.

This implements (or tries to implement) the Session-Dependent Nonce from https://seclab.stanford.edu/websec/csrf/csrf.pdf.

Changing this to an HMAC(sessionid,secret) will remove the need for saving a fair amount of state on the server (one nonce per form per page). If you have multiple forms/page this can lead to abandoned csrf tokens that have to time out and get cleaned up. But you lose per form tokens which may be an advantage. Also the HMAC is constant for the session, so provides more occasions for it to be exposed.

This only runs on post (or put and delete for future use). Nobody should be changing data with a get.

A session token lifetime is settable in config.ini. A future enhancement to the creation routines should allow for the requester of the token to set the lifetime.

The unique session key and user id is stored with the token. The token is valid if the stored values match the current client’s userid and session.

If a user logs out, the csrf keys are invalidated since no other connection should have the same session id.

At least to start I am reporting anti-csrf to the user. If it’s an attacker who can see the site, they can see the @csrf fields and can probably figure out that they need to supply valid headers. Or they can just read this code 8-). So hiding it doesn’t seem to help, although reporting does arguably show the enforcement settings. Given the newness of this code, notifying the user and having them notify the admins for debugging seems to be an advantage.

handle_csrf_tokenless()

Modern way to handle csrf prevention quoted from:

and is reformatted with added commentary in []’s:

“In summary, to protect against CSRF applications (or, rather, libraries and frameworks) should reject cross-origin non-safe browser requests. The most developer-friendly way to do so is using primarily Fetch metadata, which requires no extra instrumentation or configuration.

  1. Allow all GET, HEAD, or OPTIONS requests. These are safe methods, and are assumed not to change state at various layers of the stack already.

  2. If the Origin header matches an allow-list [see *1 below] of trusted origins, allow the request.

    Trusted origins should be configured as full origins (e.g. https://example.com) and compared by simple equality with the header value.

  3. If the Sec-Fetch-Site header is present: if its value is same-origin or none, allow the request; otherwise, reject the request.

    This secures all major up-to-date browsers for sites hosted on trustworthy (HTTPS or localhost) origins.

  4. If neither the Sec-Fetch-Site nor the Origin headers are present, allow the request.

    These requests are not from (post-2020) browsers, and can’t be affected by CSRF.

  5. If the Origin header’s host (including the port) matches the Host header, allow the request, otherwise reject it.

    This is either a request to an HTTP origin, or by an out-of-date browser.

The only false positives (unnecessary blocking) of this algorithm are requests to non-trustworthy (plain HTTP) origins that go through a reverse proxy that changes the Host header. That edge case can be worked around by adding the origin [see *2 below] to the allow-list.

There are no false negatives in modern browsers, but pre-2023 browsers will be vulnerable to HTTP→HTTPS requests, because the Origin fallback is scheme-agnostic. HSTS can be used to mitigate that (in post-2020 browsers), but note that out-of-date browsers are likely to have more pressing security issues.”

Local Notes

*1. The allow list of trusted origins is obtained from the tracker’s config.ini file in two places:

  1. the web setting of the tracker section

  2. the allowed_api_origins setting in the web section

*2. I am not sure what is meant in this section. If the reverse proxy changes the ORIGIN header, then setting allowed_api_origins is a remedy. However the HOST header is only used in step 5 and is compared to the ORIGIN header not to a list of possible origins so….

The GET/HEAD/OPTIONS requests are scanned for @csrf tokens. If any are found, they are removed from the database. The @csrf token removal code can be deleted when @csrf token support is removed.
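
The five quoted steps can be sketched as a single function. This is a minimal illustration under stated assumptions, not the code of handle_csrf_tokenless(): the name allow_request, the trusted_origins argument and the use of a plain dict keyed by canonical header names are all hypothetical.

```python
from urllib.parse import urlsplit

SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def allow_request(method, headers, trusted_origins):
    """Return True if the request passes the quoted five-step check."""
    # Step 1: safe methods are always allowed.
    if method.upper() in SAFE_METHODS:
        return True
    origin = headers.get("Origin")
    # Step 2: exact match against the allow-list of full origins.
    if origin is not None and origin in trusted_origins:
        return True
    # Step 3: trust Fetch metadata when the browser sends it.
    fetch_site = headers.get("Sec-Fetch-Site")
    if fetch_site is not None:
        return fetch_site in ("same-origin", "none")
    # Step 4: neither header present, so not a (post-2020) browser.
    if origin is None:
        return True
    # Step 5: fall back to comparing the Origin host:port with Host.
    return urlsplit(origin).netloc == headers.get("Host")
```

The order of the checks matters: an allow-listed Origin is accepted before Sec-Fetch-Site is consulted, so a trusted cross-site origin is not rejected by step 3.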

handle_range_header(length, etag)

Handle the ‘Range’ and ‘If-Range’ headers.

‘length’ – the length of the content available for the resource.

‘etag’ – the entity tag for this resource.

returns – If the request headers (including ‘Range’ and ‘If-Range’) indicate that only a portion of the entity should be returned, then the return value is a pair ‘(offset, length)’ indicating the first byte and number of bytes of the content that should be returned to the client. In addition, this method will set ‘self.response_code’ to indicate Partial Content. In all other cases, the return value is ‘None’. If appropriate, ‘self.response_code’ will be set to indicate ‘REQUESTED_RANGE_NOT_SATISFIABLE’. In that case, the caller should not send any data to the client.
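
To show what the ‘(offset, length)’ pair looks like, here is a simplified sketch that parses a single byte range. It is not the method itself: byte_range is a hypothetical helper, it ignores ‘If-Range’ and the etag, does not touch any response code, and returns None for anything it does not handle (including multiple ranges).

```python
def byte_range(header, length):
    """Turn a 'Range' header value into (offset, length) or None."""
    if not header or not header.startswith("bytes="):
        return None
    spec = header[len("bytes="):].strip()
    if "," in spec:
        return None  # multiple ranges are out of scope for this sketch
    first, sep, last = spec.partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form "bytes=-N": the last N bytes of the content.
            count = int(last)
            if count <= 0:
                return None
            count = min(count, length)
            return (length - count, count)
        start = int(first)
        end = int(last) if last else length - 1
    except ValueError:
        return None
    if start >= length or end < start:
        return None  # range cannot be satisfied
    end = min(end, length - 1)
    return (start, end - start + 1)
```

The second element is a byte count, not an end position, which matches the pair described above.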

header(headers=None, response=None)

Put up the appropriate header.

http_split(content)

Split an HTTP list.

‘content’ – A string, giving a list of items.

returns – A sequence of strings, containing the elements of the list.

http_strip(content)

Remove HTTP Linear White Space from ‘content’.

‘content’ – A string.

returns – ‘content’, with all leading and trailing LWS removed.

inner_main()

Process a request.

The most common requests are handled like so:

  1. look for charset and language preferences, set up user locale (see determine_charset, determine_language)

  2. figure out who we are, defaulting to the “anonymous” user (see determine_user)

  3. figure out what the request is for - the context (see determine_context)

  4. handle any requested action (item edit, search, …) (see handle_action)

  5. render a template, resulting in HTML output

In some situations, exceptions occur:

  • HTTP Redirect (generally raised by an action)

  • SendFile (generally raised by determine_context): serve up a FileClass “content” property

  • SendStaticFile (generally raised by determine_context): serve up a file from the tracker “html” directory

  • Unauthorised (generally raised by an action): the action is cancelled, the request is rendered and an error message is displayed indicating that permission was not granted for the action to take place

  • templating.Unauthorised (templating action not permitted): raised by an attempted rendering of a template when the user doesn’t have permission

  • NotFound (raised wherever it needs to be): percolates up to the CGI interface that called the client

is_origin_header_ok(api=False, credentials=False)

Determine if origin is valid for the context

Header is ok (return True) if ORIGIN is missing and it is a GET. Header is ok if ORIGIN matches the base url. If this is an API call:

  • Header is ok if ORIGIN matches an element of allowed_api_origins.

  • Header is ok if allowed_api_origins includes ‘*’ as first element and credentials is False.

Otherwise header is not ok.

In a credentials context, if we match * we will return header is not ok. All credentialed requests must be explicitly matched.
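
The rules above can be written out as a short decision function. This is a sketch, not the method: origin_ok and its arguments are hypothetical, and comparing against a precomputed base_origin string is an assumption about what “matches the base url” means.

```python
def origin_ok(origin, method, base_origin, allowed_api_origins,
              api=False, credentials=False):
    """Return True if the Origin header is acceptable for this request."""
    if origin is None:
        # A missing Origin header is only acceptable on a GET.
        return method == "GET"
    if origin == base_origin:
        return True
    if api:
        if origin in allowed_api_origins:
            return True
        # A leading '*' allows any origin, but never with credentials.
        if (allowed_api_origins and allowed_api_origins[0] == "*"
                and not credentials):
            return True
    return False
```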

main()

Wrap the real main in a try/finally so we always close off the db.

make_user_anonymous()

Make us anonymous

This method used to handle non-existence of the ‘anonymous’ user, but that user is mandatory now.

opendb(username)

Open the database and set the current user.

Opens a database once. On subsequent calls only the user is set on the database object if instance.optimize is set. If we are in “Development Mode” (cf. roundup_server) then the database is always re-opened.

reauth(exception)

Processing for a Reauth exception raised from an auditor.

Can be overridden by code in tracker’s interfaces.py.

renderContext()

Return a PageTemplate for the named page

renderFrontPage(message)

Return the front page of the tracker.

selectTemplate(name, view)

Choose existing template for the given combination of classname (name parameter) and template request variable (view parameter) and return its name.

View can be a single template or two templates separated by a vbar ‘|’ character. If the Client object has a non-empty _error_message attribute, the right hand template (error template) will be used. If the _error_message is empty, the left hand template (ok template) will be used.

In most cases the name will be “classname.view”, but if “view” is None, then template name “classname” will be returned.

If “classname.view” template doesn’t exist, the “_generic.view” is used as a fallback.
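
The selection rules can be sketched as follows. This is an illustration only: select_template is a hypothetical function, the set of existing template names stands in for whatever lookup the tracker really performs, and raising LookupError when nothing matches is an assumption.

```python
def select_template(classname, view, has_errors, existing):
    """Pick a template name from 'classname' and a 'view' request value."""
    if view and "|" in view:
        # "ok|error": the right-hand template is used when errors exist.
        ok_view, error_view = view.split("|", 1)
        view = error_view if has_errors else ok_view
    if not view:
        return classname
    name = "%s.%s" % (classname, view)
    if name in existing:
        return name
    generic = "_generic.%s" % view
    if generic in existing:
        return generic
    raise LookupError("no template for %s" % name)
```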

[ ] cover with tests

send_error_to_admin(subject, html, txt)

Send traceback information to admin via email. We send both the formatted html (with more information) and the text version of the traceback. We use multipart/alternative so the receiver can choose which version to display.

serve_file(designator, dre=re.compile('([^\\d]+)(\\d+)'))

Serve the file from the content property of the designated item.

serve_static_file(file)

Serve up the file named from the templates dir

setHeader(header, value)

Override or delete a header to be returned to the user’s browser.

setTranslator(translator=None)

Replace the translation engine

‘translator’

is TranslationService instance. It must define methods ‘translate’ (TAL-compatible i18n), ‘gettext’ and ‘ngettext’ (gettext-compatible i18n).

If omitted, create default TranslationService.

setVary(header)

Vary header will include the new header. This will append if Vary exists.

standard_message(to, subject, body, author=None)

Send a standard email message from Roundup.

  • “to” - recipients list

  • “subject” - Subject

  • “body” - Message

  • “author” - (name, address) tuple or None for admin email

Arguments are passed to the Mailer.standard_message code.

write_file(filename)

Send the contents of ‘filename’ to the user. Send an acceptable pre-compressed version of the file if it is newer than the uncompressed version.
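
The “newer than the uncompressed version” check can be sketched by comparing modification times. This is a minimal illustration: choose_file is a hypothetical helper, and the “.gz” suffix and the shape of the accepted-encodings argument are assumptions, not details taken from write_file.

```python
import os

# Hypothetical mapping from a content encoding to a file suffix.
SUFFIXES = {"gzip": ".gz"}


def choose_file(filename, accepted_encodings):
    """Return (path, encoding) for the file that should be sent."""
    for encoding in accepted_encodings:
        suffix = SUFFIXES.get(encoding)
        if suffix is None:
            continue
        candidate = filename + suffix
        # Only use the pre-compressed copy if it is newer than the
        # original; a stale copy would serve outdated content.
        if (os.path.exists(candidate)
                and os.path.getmtime(candidate) > os.path.getmtime(filename)):
            return candidate, encoding
    return filename, None
```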

CGI Action class

Action class and selected derived classes.

Action

class roundup.cgi.actions.Action(client)
examine_url(url)

Return URL validated to be under self.base and properly escaped

If url not properly escaped or validation fails raise ValueError.

To try to prevent XSS attacks, validate that the url that is passed in is under self.base for the tracker. This is used to clean up “__came_from” and “__redirect_to” form variables used by the LoginAction and NewItemAction actions.

The url that is passed in must be a properly url quoted argument, i.e. all characters that are not valid according to RFC3986 must be % encoded. The scheme should be lower case.

It parses the passed url into components.

It verifies that the scheme is http or https (so a redirect can force https even if normal access to the tracker is via http). Validates that the network component is the same as in self.base. Validates that the path component in the base url starts the url’s path component. If not, it raises ValueError. If everything validates:

For each component, Appendix A of RFC 3986 says the following are allowed:

pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
query         = *( pchar / "/" / "?" )
unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded   = "%" HEXDIG HEXDIG
sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                 / "*" / "+" / "," / ";" / "="

Checks all parts with a regexp that matches any run of 0 or more allowed characters. If the component doesn’t validate, raise ValueError. Don’t attempt to urllib_.quote it. Either it’s correct as it comes in or it’s a ValueError.

Finally paste the whole thing together and return the new url.
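
The validation steps can be sketched with the standard library. This is an illustration under stated assumptions, not examine_url itself: check_url is a hypothetical function, the base URL is passed in explicitly, and the character classes are transcribed from the grammar quoted above (with “/” added so a whole path can be checked at once).

```python
import re
from urllib.parse import urlsplit, urlunsplit

# pchar from RFC 3986 Appendix A, plus "/" so a full path can be matched.
_PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/]|%[0-9A-Fa-f]{2})"
_PATH_RE = re.compile(_PCHAR + "*")
_QUERY_RE = re.compile("(?:" + _PCHAR + r"|\?)*")


def check_url(url, base):
    """Return url if it lies under base and is properly escaped."""
    parsed, parsed_base = urlsplit(url), urlsplit(base)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("unsupported scheme")
    if parsed.netloc != parsed_base.netloc:
        raise ValueError("different network location")
    if not parsed.path.startswith(parsed_base.path):
        raise ValueError("outside the base path")
    if not _PATH_RE.fullmatch(parsed.path):
        raise ValueError("badly escaped path")
    for part in (parsed.query, parsed.fragment):
        if not _QUERY_RE.fullmatch(part):
            raise ValueError("badly escaped query or fragment")
    # Nothing is re-quoted: the URL is correct as given or rejected.
    return urlunsplit(parsed)
```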

execute()

Execute the action specified by this object.

gettext(msgid)

Return the localized translation of msgid

handle()

Action handler procedure

hasPermission(permission, classname=[], itemid=None, property=None)

Check whether the user has ‘permission’ on the current class.

permission()

Check whether the user has permission to execute this action.

True by default. If the permissionType attribute is a string containing a simple permission, check whether the user has that permission. Subclasses must also define the name attribute if they define permissionType.

Despite having this permission, users may still be unauthorised to perform parts of actions. It is up to the subclasses to detect this.

LoginAction

class roundup.cgi.actions.LoginAction(client)
handle()

Attempt to log a user in.

Sets up a session for the user which contains the login credentials.

rateLimitLogin(username, is_api=False, update=True)

Implement rate limiting of logins by login name.

  • username - username attempting to log in. May or may not be valid.

  • is_api - set to False for login via html page, to “xmlrpc” for the xmlrpc api, or to “rest” for the rest api.

  • update - if False, will raise RateLimitExceeded without updating the stored value. Default is True, which updates the stored value. Used to deny access when a user successfully logs in but doesn’t have a valid attempt available.

Login rates for a user are split based on html vs api login.

Set self.client.db.config.WEB_LOGIN_ATTEMPTS_MIN to 0 to disable for web interface. Set self.client.db.config.API_LOGIN_ATTEMPTS to 0 to disable for the API.

By setting LoginAction.limitLogin, the admin can override the HTML web page rate limiter if they need to change the interval from 1 minute.

verifyLogin(username, password, is_api=False)

Authenticate the user with rate limits.

All logins (valid and failing) from a web page calling the LoginAction method are rate limited to the config.ini configured value in 1 minute. (The interval can be changed; see the rateLimitLogin method.)

API logins are only rate limited if they fail. Successful api logins are rate limited using the api_calls_per_interval and api_interval_in_sec settings in config.ini.

Once a user receives a rate limit notice, they must wait the recommended time to try again as the account is locked out for the recommended time. If a user tries to log in while locked out, they will get a 429 rejection even if the username and password are correct.
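
The behaviour described above (separate limits per login name and per interface, an update flag, and 0 meaning disabled) can be illustrated with a simple sliding-window limiter. This is not Roundup's rate limiter, whose algorithm is not shown here: LoginLimiter, its arguments and the stand-in RateLimitExceeded class are all hypothetical.

```python
import time


class RateLimitExceeded(Exception):
    """Stand-in exception raised when no login attempts are left."""


class LoginLimiter:
    def __init__(self, attempts, interval=60.0, clock=time.monotonic):
        self.attempts = attempts   # 0 disables rate limiting
        self.interval = interval   # window length in seconds
        self.clock = clock
        self._seen = {}            # (username, kind) -> attempt timestamps

    def check(self, username, kind="html", update=True):
        """Raise RateLimitExceeded if username has used up its attempts."""
        if self.attempts == 0:
            return
        now = self.clock()
        key = (username, kind)
        # Keep only the attempts that fall inside the current window.
        recent = [t for t in self._seen.get(key, [])
                  if now - t < self.interval]
        self._seen[key] = recent
        if len(recent) >= self.attempts:
            raise RateLimitExceeded(username)
        if update:
            recent.append(now)
```

Keying on (username, kind) keeps html and api logins in separate buckets, and update=False checks the limit without consuming an attempt.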

verifyPassword(userid, givenpw)

Verify the password that the user has supplied. Optionally migrate to new password scheme if configured

ReauthAction

class roundup.cgi.actions.ReauthAction(client)

Allow an auditor to require change verification with user’s password.

When changing sensitive information (e.g. passwords) it is useful to ask for a validated authorization. This makes sure that the user is present by typing their password.

In an auditor adding:

if 'password' in newvalues and not getattr(db, 'reauth_done', False):
   raise Reauth()

will present the user with an authorization page when the password is changed. The page is generated from the _generic.reauth.html template by default.

Once the user enters their password and submits the page, the password will be verified using: roundup.cgi.actions:LoginAction::verifyPassword(). If the password is correct the original change is done.

To prevent the auditor from triggering another Reauth, the attribute “reauth_done” is added to the db object. As a result, the getattr call will return True and not raise Reauth.

You get one reauth for the submitted change. Note you cannot Reauth multiple properties separately. If you need to auth multiple properties separately, you need to reject the change and force the user to submit each sensitive property separately. For example:

  if 'password' in newvalues and 'realname' in newvalues:
     raise Reject('Changing the password and the realname '
                  'at the same time is not allowed. Please '
                  'submit two changes.')

  if 'password' in newvalues and not getattr(db, 'reauth_done', False):
     raise Reauth()

  if 'realname' in newvalues and not getattr(db, 'reauth_done', False):
     raise Reauth()

Limitations: Handling file inputs requires JavaScript on the browser.

See also: client.py:Client:reauth() which can be changed using interfaces.py in your tracker.

handle()

Handle a form with a reauth request.

verifyPassword()

Verify the reauth password/token

This can be overridden using interfaces.py.

The default implementation uses the LoginAction::verifyPassword() method.

Templating Utils class

class roundup.cgi.templating.TemplatingUtils(client)

Utilities for templating

embed_form_fields(excluded_fields=None)

Used to create a hidden input field for each client.form element

Parameters:

excluded_fields – these fields will not have a hidden field created for them. Value can be a string or multiple strings contained in something with a __contains__ dunder method: tuple, list, set….

File input fields are represented by a <pre> tag with base64 encoded contents and attributes to store the filename and mimetype. It requires JavaScript on the browser to turn these <pre> tags back into files that can be submitted with the form.
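
The hidden-field generation for ordinary (non-file) fields can be sketched like this. It is an illustration only: hidden_fields is a hypothetical function, the form is assumed to be a plain dict of strings, and file inputs (the <pre> handling described above) are left out.

```python
from html import escape


def hidden_fields(form, excluded_fields=None):
    """Return one hidden <input> per form element, skipping exclusions."""
    if excluded_fields is None:
        excluded = ()
    elif isinstance(excluded_fields, str):
        # Wrap a single name so "in" does not do a substring test.
        excluded = (excluded_fields,)
    else:
        excluded = excluded_fields
    lines = []
    for name, value in form.items():
        if name in excluded:
            continue
        lines.append('<input type="hidden" name="%s" value="%s">' % (
            escape(name, quote=True), escape(str(value), quote=True)))
    return "\n".join(lines)
```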

expandfile(name, values=None, optional=False)

Read a file and replace token placeholders.

Given a file name and a dict of tokens and replacements, read the file from the tracker template directory. Then replace all tokens of the form ‘%(token_name)s’ with the values in the dict. If the values dict is set to None, it acts like readfile(). In addition to values passed into the method, the value for the tracker base directory taken from TRACKER_WEB is available as the ‘base’ token. The client_nonce used for Content Security Policy (CSP) is available as ‘client_nonce’. If a token is not in the dict, an empty string is returned and an error log message is logged. See readfile for a usage example.
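
The token replacement can be sketched with Python’s ‘%’ formatting and a dict subclass that supplies an empty string for unknown tokens. This is an illustration, not the method: expand works on a string rather than a file, and the base and client_nonce arguments are hypothetical stand-ins for values the real method takes from the tracker and the client.

```python
import logging

logger = logging.getLogger(__name__)


class _Tokens(dict):
    def __missing__(self, key):
        # Unknown token: log the problem and substitute nothing.
        logger.error("no value supplied for token %r", key)
        return ""


def expand(text, values=None, base="", client_nonce=""):
    """Replace %(token_name)s placeholders in text."""
    if values is None:
        return text  # behaves like a plain read, no substitution
    tokens = _Tokens(values)
    tokens.setdefault("base", base)
    tokens.setdefault("client_nonce", client_nonce)
    return text % tokens
```

One consequence of ‘%’ formatting is that a literal percent sign in the text has to be written as ‘%%’ whenever values is not None.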

html_calendar(request)

Generate a HTML calendar.

request - the roundup.request object
  • @template : name of the template

  • form : name of the form to store back the date

  • property : name of the property of the form to store back the date

  • date : date marked as current value on calendar

  • display : when browsing, specifies year and month

html will simply be a table.

html_quote(html)

HTML-quote the supplied text.

readfile(name, optional=False)

Used to inline a file from the template directory.

Used to inline file content into a template. If file is not found in the template directory and optional=False, it reports an error to the user via a NoTemplate exception. If optional=True it returns an empty string when it can’t find the file.

Useful for inlining JavaScript kept in an external file where you can use linters/minifiers and other tools on it.

A TAL example:

<script tal:attributes="nonce request/client/client_nonce"
tal:content="python:utils.readfile('mylibrary.js')"></script>

This method does not expand any tokens in the file. See expandfile() for replacing tokens in the file.

set_http_response(code)

Set the HTTP response code to the integer code. Example:

<tal:x
 tal:replace="python:utils.set_http_response(404);"
/>

will make the template return code 404 (not found).

url_quote(url)

URL-quote the supplied text.

Logcontext Module

Generate and store thread local logging context including unique trace id for request, request source etc. to be logged.

Trace id generator can use nanoid or uuid.uuid4 stdlib function. Nanoid is preferred if nanoid is installed using pip. Nanoid is faster and generates a shorter id.

If nanoid is installed in the tracker’s lib subdirectory, it must be enabled using the tracker’s interfaces.py by adding:

# if nanoid is installed in the tracker's lib directory and
# if you want to change the length of the nanoid from 12
# to 14 chars use:
from functools import partial
from nanoid import generate
import roundup.logcontext
# change 14 to 12 to get the default nanoid size.
roundup.logcontext.idgen=partial(generate, size=14)

# If nanoid is installed and you want to use the shortened uuid
# add this to interfaces.py:
import roundup.logcontext
roundup.logcontext.idgen=roundup.logcontext.short_uuid

If you are wrapping a static method, you need to apply staticmethod around the calls to gen_trace_id or store_trace_reason. If you don’t, the static method gets the self argument prepended, which breaks the call. For example, to wrap the ‘serve’ staticmethod from WhiteNoise:

WhiteNoise.serve = staticmethod(store_trace_reason(
         extract="'whitenoise ' + args[1]['REQUEST_URI']")(
         gen_trace_id()(WhiteNoise.serve)))

roundup.logcontext.gen_trace_id() → Callable

Decorator to generate a trace id (nanoid or encoded uuid4) as contextvar

The logging routine uses this to label every log line. All logs with the same trace_id should be generated from a single request.

This decorator is applied to an entry point for a request. Different methods of invoking Roundup have different entry points. As a result, this decorator can be called multiple times as some entry points can traverse another entry point used by a different invocation method. It will not set a trace_id if one is already assigned.

If a uuid4() is used as the id, the uuid4 integer is encoded into a 62 character alphabet (A-Za-z0-9) to shorten the log line.

This decorator may produce duplicate (colliding) trace_id’s when used with multiple processes on some platforms where uuid.uuid4().is_safe is unknown. Probability of a collision is unknown.

If nanoid is used to generate the id, it is 12 chars long and uses a 64 char ascii alphabet, the 62 above with ‘_’ and ‘-’. The shorter nanoid has < 1% chance of collision in ~4 months when generating 1000 id’s per second.

See the help text for the module to change how the id is generated.
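
The “set once per request” behaviour can be sketched with the standard contextvars module. This is an illustration, not the decorator in roundup.logcontext: gen_trace_id_sketch, the trace_id variable and the default id generator are hypothetical, and resetting the variable when the outermost call returns is a choice made for the example.

```python
import contextvars
import functools
import uuid

trace_id = contextvars.ContextVar("trace_id", default=None)


def gen_trace_id_sketch(idgen=lambda: uuid.uuid4().hex):
    """Decorator factory: assign a trace id unless one is already set."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if trace_id.get() is not None:
                # An outer entry point already assigned the id.
                return func(*args, **kwargs)
            token = trace_id.set(idgen())
            try:
                return func(*args, **kwargs)
            finally:
                trace_id.reset(token)
        return wrapper
    return decorator
```

Because the inner call sees the id set by the outer one, nested entry points share a single trace id for the whole request.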

roundup.logcontext.get_context_dict()

Return dict of context var tuples {“var_name”: “var_value”, …}

roundup.logcontext.get_context_info()

Return list of context var tuples [(var_name, var_value), …]

roundup.logcontext.idgen() → str

Variable used for setting the id generator.

roundup.logcontext.set_processName(name)

Decorator to set the processName used in the LogRecord

roundup.logcontext.short_uuid() → str

Encode a UUID integer in a shorter form for display.

A uuid is long. Make a shorter version that takes less room in a log line and is easier to store.
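
Encoding the uuid4 integer in the 62 character alphabet (A-Za-z0-9) mentioned earlier can be sketched as a base conversion. This is an illustration, not the library function: encode_base62 and short_uuid_sketch are hypothetical names, and the ordering of the alphabet is an assumption.

```python
import string
import uuid

# 62 characters: A-Z, a-z, 0-9 (the ordering here is an assumption).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits


def encode_base62(number):
    """Encode a non-negative integer using the 62 character alphabet."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def short_uuid_sketch():
    """Return a random uuid4 as a base 62 string."""
    return encode_base62(uuid.uuid4().int)
```

A 128-bit value needs at most 22 base 62 characters, compared with 32 hexadecimal digits for the usual uuid form.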

roundup.logcontext.store_trace_reason(location='unset', extract=None)

Decorator finds and stores a reason trace was started in contextvar.

Record the url path for a regular web triggered request. Record the message id for an email triggered request. Record a roundup-admin command/action for roundup-admin request.

(*) There are multiple entry points to the code. Some entry points call through other entry points. As a result this can be called multiple times in one request. Because the reason can be stored from multiple locations depending on where this is called, it is called with a location hint to identify the caller (faster than looking up the stack).

If a reason has already been stored (and it’s not “missing”), it tries to extract it again and verifies it’s the same as the stored reason. If it’s not the same it logs an error. This safety check may be removed in a future version of Roundup.