SlipGURU Dipartimento di Informatica e Scienze dell'Informazione Università Degli Studi di Genova

core Package

Provides core low–level functionality of KDVS.

_version Module

Provides version for the release of KDVS.

kdvs.core._version.version = '2.0.0'

Major number for main changes. Minor number for new features. Release number for bug fixes and minor updates. Status = ‘alpha’/‘beta’/None.

action Module

Provides specification for actions executed by execution environment.

class kdvs.core.action.ActionSpec(input_vars, output_vars, func, args, kwargs)

Bases: object

Creates new action to be submitted to execution environment. Action specification encapsulates arbitrary Python callable and its arguments. In addition, if requested, the execution environment can check that certain variables are present before the execution and that certain variables have been produced after it.

Parameters :

input_vars : list/tuple

names of the variables that shall be present in the environment before the execution of the action

output_vars : list/tuple

names of the variables that shall be present in the environment after the execution of the action

func : callable

function to be executed

args : iterable

any additional positional arguments passed to function

kwargs : dict

any additional keyword arguments passed to function

as_tuple()

Return all the elements of action specification as tuple: (input_vars, output_vars, func, args, kwargs)

config Module

Provides mechanisms for configuration of KDVS. Configuration files are normal Python scripts that are evaluated with ‘execfile’. Since KDVS uses many options in many configuration files, this module provides utilities for evaluating and merging configurations.
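The idea of evaluating a configuration script into a dictionary of variables can be illustrated roughly as follows. This is a minimal sketch, not the KDVS implementation: it uses `exec`, the Python 3 counterpart of ‘execfile’, and the helper name `evaluate_cfg_source` and the sample variables are hypothetical.

```python
def evaluate_cfg_source(source, filename="<cfg>"):
    """Evaluate configuration source code and return its top-level variables."""
    namespace = {}
    exec(compile(source, filename, "exec"), namespace)
    # Drop names injected by the interpreter (e.g. __builtins__).
    return {k: v for k, v in namespace.items() if not k.startswith("__")}


# Hypothetical configuration content; a real file would be read from disk.
sample_cfg = "log_name = 'kdvs.log'\nmax_workers = 4\n"
cfg_vars = evaluate_cfg_source(sample_cfg)
```

Because the script is executed as ordinary Python code, a configuration file may compute values, but it should only come from a trusted source.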

kdvs.core.config.getDefaultCfgFilePath(defpath=None)

Return absolute path of default configuration file for KDVS.

kdvs.core.config.getDefaultDataRootPath(defpath=None)

Return absolute path of default data directory for KDVS.

kdvs.core.config.getGODataPath(defpath=None)

Return absolute path of default Gene Ontology (GO) data directory for KDVS.

kdvs.core.config.getVisDataPath(defpath=None)

Return absolute path of default visualization data directory for KDVS.

kdvs.core.config.mergeCfg(parent_cfg, child_cfg)

Merge parent and child config variables to produce final config variables, as follows. If the variable exists only in parent config or child config, write it to the final config. If the variable exists in both parent and child configs, retain the one from child config.

It is used to merge variables obtained from default configuration file and specified user configuration file.

Parameters :

parent_cfg : dict

dictionary of variables obtained from parent configuration file

child_cfg : dict

dictionary of variables obtained from child configuration file

Returns :

final_cfg : dict

merged configuration variables
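The merge rule described above amounts to a dictionary update in which the child wins on clashes. A minimal sketch, assuming plain dictionaries; the function name `merge_cfg_sketch` and the sample variables are hypothetical and do not come from KDVS:

```python
def merge_cfg_sketch(parent_cfg, child_cfg):
    """Return a new dict; child values override parent values on key clashes."""
    final_cfg = dict(parent_cfg)   # start from every parent variable
    final_cfg.update(child_cfg)    # add child-only variables, override shared ones
    return final_cfg


# Illustrative variables only.
default_vars = {"log_level": 20, "data_root": "~/.kdvs/"}
user_vars = {"log_level": 10, "output_dir": "results"}
merged = merge_cfg_sketch(default_vars, user_vars)
```

Copying the parent first leaves both input dictionaries unmodified, which keeps the default configuration reusable.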

kdvs.core.config.evaluateDefaultCfg()

Evaluate default configuration file of KDVS and return its configuration variables.

Returns :

default_cfg_vars : dict

dictionary of variables obtained from default configuration file

kdvs.core.config.evaluateUserCfg(cfg_file, ignore_default_cfg=False)

Evaluate specified user configuration file of KDVS, perform merge with default configuration, and return final configuration variables.

Parameters :

cfg_file : string

path to user configuration file

ignore_default_cfg : bool

ignores default configuration file; no merge is performed and all variables come only from user configuration file; should be used with caution

Returns :

user_cfg : dict

dictionary of variables obtained from user configuration file

db Module

Provides layer for all DB operations performed by KDVS.

class kdvs.core.db.DBManager(arbitrary_data_root=None, provider=None, rootdbid=None)

Bases: object

General manager of all DB operations performed by KDVS. It provides:

  • automatic handling of meta–database that contains information of all used subordinated databases
  • automated opening/closing of multiple subordinated databases

Parameters :

arbitrary_data_root : string/None

path to directory containing all database objects managed by this manager instance; also, all new database objects will be created here; if None, default path ‘~/.kdvs/’ will be used

provider : DBProvider/None

concrete DBProvider instance that provides internal details about requested database system; if None, default provider for SQLite3 is used

rootdbid : string/None

custom ID for meta–database; if not specified, the default one will be used

getDB(db_id)

Obtain connection for subordinated database with requested ID. If database does not exist, create it. If underlying provider accepts it, special ID ‘memdb’ may be used for single in–memory database.

Parameters :

db_id : string

ID for requested database

Returns :

handle : connection (depends on provider)

connection to the requested subordinated database

getDBloc(db_id)

Obtain location for subordinated database with requested ID. If database with requested ID has not yet been created, return None. The meaning of ‘location’ depends on the provider. For instance, in case of SQLite3, single database is stored in single file, and ‘location’ is the absolute path to that file.

Parameters :

db_id : string

ID for requested database

Returns :

location : (depends on provider)

location of the requested subordinated database, in the sense of provider; or None if database does not exist

close(dbname=None)

Closes requested subordinated database managed by this manager instance. If None is requested, then all databases are closed.

Parameters :

dbname : string/None

ID for requested database to close; if None, all databases will be closed

dep Module

Provides mechanism for dynamic verification of dependencies in KDVS.

kdvs.core.dep.verifyDepModule(modname)

Verify that requested module is importable from within KDVS.

Parameters :

modname : string

module name to be verified

Returns :

module_instance : module

instance of the verified module, as present in ‘sys.modules’

Raises :

Error :

if requested module is not importable from within KDVS
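A check of this kind can be sketched with the standard `importlib` module. This is only an illustration under stated assumptions: `DependencyError` is a hypothetical stand-in for the KDVS Error class, and `verify_dep_module_sketch` is not the actual implementation.

```python
import importlib
import sys


class DependencyError(Exception):
    """Hypothetical stand-in for kdvs.core.error.Error."""


def verify_dep_module_sketch(modname):
    """Import `modname` and return the module object registered in sys.modules."""
    try:
        importlib.import_module(modname)
    except ImportError as e:
        raise DependencyError("module %r is not importable: %s" % (modname, e))
    return sys.modules[modname]


json_module = verify_dep_module_sketch("json")
```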

env Module

Provides functionality of encapsulated execution environment for small dependent computational tasks (dubbed actions). Actions are encapsulated Python callables that can be submitted to environment for execution. Actions are executed in submission order. Actions can communicate through shared variables that are kept within the environment instance. In addition, environment can verify if certain variables are present before or after the execution of specific action.

class kdvs.core.env.ExecutionEnvironment(expected_cfg, env_cfg)

Bases: kdvs.core.util.Configurable

Implements basic execution environment, where set of actions is executed in order, and common set of environment variables is available for any action to store to and retrieve data from. The environment is configurable.

Parameters :

expected_cfg : dict

expected configuration of environment

env_cfg : dict

provided environment configuration as read from configuration file

Raises :

Error :

if provided configuration does not conform to expected configuration

actions()

Returns list of actions submitted for execution by this environment.

Returns :

actions : iterable of ActionSpec

all actions submitted for execution by this environment

addAction(action_spec)

Submit instance of action specification (ActionSpec) for execution by this environment.

addCallable(action_callable, action_kwargs=None, action_input=None, action_output=None)

Submit specified callable for execution by this environment.

Parameters :

action_callable : callable

callable to be executed

action_kwargs : dict/None

any keyword arguments passed to callable

action_input : list/tuple/None

names of the variables that shall be present in the environment before the execution of the action

action_output : list/tuple/None

names of the variables that shall be present in the environment after the execution of the action

clearActions()

Clear all currently submitted actions.

addVar(varkey, var, replace=False)

Add new variable to execution environment, with possible replacement. Variables are managed in a dictionary, therefore all rules for adding items to standard dictionaries apply here as well.

Parameters :

varkey : object

hashable key for new variable

var : object

value for new variable

replace : bool

if True, already existing variable will be replaced

Raises :

ValueError :

if variable already exists and replacement was not requested

delVar(varkey)

Remove requested variable from execution environment. If variable does not exist, do nothing.

Parameters :

varkey : object

key of variable to be removed

updateVars(vardict, replace=False)

Add all variables from requested dictionary to execution environment.

Parameters :

vardict : dict

dictionary of variables to be added

replace : bool

if True, any already existing variable will be replaced

Raises :

ValueError :

if one of new variables already exists (replacements are not requested here)

See also

addVar

var(varkey)

Retrieve value of variable present in execution environment.

Parameters :

varkey : object

key of variable to be retrieved

Returns :

var : object

value of requested variable

Raises :

ValueError :

if variable does not exist

varkeys()

Retrieve keys of all existing variables present in execution environment.

Returns :

varkeys : iterable

keys of variables present in execution environment

execute(verify_io=True, dump_env=False)

Execute all actions submitted so far in submission order (FIFO), performing explicit garbage collection between action runs. When any action raises an exception during its run, the whole execution is stopped and diagnostic information is returned. If all actions finish without exception, the submitted actions are also cleared.

Parameters :

verify_io : bool

if environment shall perform verification of input and output for every action executed; True by default

dump_env : bool

if True and an action raises an exception, all environment variables are included in the returned diagnostic information

Returns :

diagnostic : tuple/None

None if all actions were executed silently, otherwise the following information is returned:
  • number of action that has thrown an exception
  • total number of actions to be executed
  • failed action details, as tuple (action_func_callable, args, kwargs)
  • thrown exception details, as tuple (exception instance, result of ‘sys.exc_info’)
  • details of actions already executed before failed action, as iterable of tuples (action_func_callable, args, kwargs)
  • details of actions to be executed after failed action, as iterable of tuples (action_func_callable, args, kwargs)
  • if environment dump was requested, all environment variables present at the moment the exception was raised

See also

sys.exc_info()

Notes

Dumping an environment that executes many long, interlinked actions using many environment variables produces a very large amount of diagnostic information.

postActionCallback()

Callback function called after each successful action execution. By default it does nothing.
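The execution model described for this class (FIFO execution, shared variables, optional input/output verification, diagnostics on failure) can be sketched in a heavily simplified form. Everything in the block is hypothetical: `MiniEnv` is not a KDVS class, the convention that each callable receives the variable dictionary as its first argument is an assumption, and the returned diagnostic tuple contains only part of the information listed for execute().

```python
import gc
import sys


class MiniEnv(object):
    """Hypothetical, simplified stand-in for an execution environment."""

    def __init__(self):
        self._vars = {}
        self._actions = []

    def add_callable(self, func, kwargs=None, inputs=(), outputs=()):
        self._actions.append((tuple(inputs), tuple(outputs), func, kwargs or {}))

    def execute(self, verify_io=True):
        total = len(self._actions)
        for number, (inputs, outputs, func, kwargs) in enumerate(self._actions):
            try:
                if verify_io:
                    self._check(inputs, "input")
                func(self._vars, **kwargs)  # assumed calling convention
                if verify_io:
                    self._check(outputs, "output")
            except Exception as exc:
                # Stop at the first failure and report where it happened.
                return (number, total, (func, kwargs), (exc, sys.exc_info()))
            gc.collect()  # explicit garbage collection between actions
        self._actions = []  # all actions succeeded: clear them
        return None

    def _check(self, names, kind):
        missing = [n for n in names if n not in self._vars]
        if missing:
            raise ValueError("missing %s variables: %s" % (kind, missing))


def load(env_vars, value):
    env_vars["raw"] = value


def double(env_vars):
    env_vars["doubled"] = env_vars["raw"] * 2


env = MiniEnv()
env.add_callable(load, {"value": 21}, outputs=["raw"])
env.add_callable(double, inputs=["raw"], outputs=["doubled"])
result = env.execute()
```

In this sketch `result` is None when every action succeeds; an action whose declared input variable is absent stops the run and yields the diagnostic tuple instead.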

class kdvs.core.env.LoggedExecutionEnvironment(expected_cfg, env_cfg)

Bases: kdvs.core.env.ExecutionEnvironment

Implements execution environment that utilizes logging.

Parameters :

expected_cfg : dict

expected configuration of environment

env_cfg : dict

provided environment configuration as read from configuration file

Raises :

Error :

if provided configuration does not conform to expected configuration

execute()

Execute actions and log any diagnostic information obtained.

postActionCallback()

Called after each successful action execution. Flushes pending logger output.

error Module

Provides unified error handling.

exception kdvs.core.error.Error

Bases: exceptions.Exception

Specific KDVS exception.

exception kdvs.core.error.Warn

Bases: exceptions.Warning

Specific KDVS warning.

log Module

Provides logging utilities for KDVS.

class kdvs.core.log.Logger(name=None, level=20, handler=None, formatter=None, add_atts=None)

Bases: object

Abstract logger for KDVS system.

Parameters :

name : string/None

name of the logger; if None, random name will be used

level : integer

level granularity of this logger, INFO by default

handler : Handler/None

handler for the logger; if None, null handler will be used (that does not emit anything)

formatter : Formatter/None

formatter for the logger; if None, standard Formatter will be used

add_atts : dict/None

any custom attributes to be associated with the logger

stdAttrs()

Return dictionary of standard attributes for the logger.

addAttrs()

Return dictionary of associated custom attributes for the logger.

class kdvs.core.log.RotatingFileLogger(name=None, level=20, path=None, maxBytes=10485760, backupCount=5)

Bases: kdvs.core.log.Logger

Logger that uses rotating file mechanism. By design, the default logger that KDVS uses.

Parameters :

name : string/None

name of the logger; if None, random name will be used

level : integer

level granularity of this logger, INFO by default

path : string/None

path to log file; if None, random log file will be created in current directory, as per os.getcwd()

maxBytes : integer

maximum size of log file in bytes before it gets rotated; 10 MB by default

backupCount : integer

maximum number of old log files kept in rotation cycle; 5 by default
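The defaults listed above map directly onto the standard library's rotating handler. The following sketch uses only `logging` and `logging.handlers`; it does not reproduce how KDVS wires its logger internally, and the logger name, log file name, and format string are assumptions chosen for illustration.

```python
import logging
import logging.handlers
import os
import tempfile

# Hypothetical log location inside a fresh temporary directory.
log_path = os.path.join(tempfile.mkdtemp(), "kdvs_sketch.log")

# Rotate after 10 MB (10485760 bytes) and keep at most 5 old files.
handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=10485760, backupCount=5)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("kdvs_sketch")
logger.setLevel(logging.INFO)   # numeric value 20, the documented default level
logger.addHandler(handler)

logger.info("rotating log initialised")
handler.flush()
```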

class kdvs.core.log.StreamLogger(name=None, level=20, stream=sys.stdout)

Bases: kdvs.core.log.Logger

Logger that directs messages to specified stream.

Parameters :

name : string/None

name of the logger; if None, random name will be used

level : integer

level granularity of this logger, INFO by default

stream : file

Stream to direct messages to; sys.stdout by default

class kdvs.core.log.NullHandler

Bases: logging.Handler

Null handler used for compatibility with Python 2.6. It consumes given messages without emitting them.

emit(*args)

flush()

close()

provider Module

Contains set of useful providers for various generic classes of objects handled by KDVS.

class kdvs.core.provider.DBProvider

Bases: object

Abstract class for providers of database services. All methods must be implemented in subclasses.

connect(*args, **kwargs)

Appropriately connect to specified database and return connection object.

getConnectionFunc()

Return appropriate low–level connection function.

getOperationalError()

Return appropriate instance of OperationalError.

getTextColumnType()

Return appropriate type for DB column that contains unformatted text data.

checkTableExistence(*args, **kwargs)

Perform appropriate check if table is present in the database.

getEngine()

Get appropriate information about DB engine, as the following dictionary: {‘name’ : name, ‘version’ : version}.

class kdvs.core.provider.SQLite3DBProvider

Bases: kdvs.core.provider.DBProvider

Provider for SQLite3 database based on sqlite3.

connect(*args, **kwargs)

Connect to specified database. All positional arguments are passed to the sqlite3.connect() function. Keyword arguments may be supplied by the user; currently the following are recognized:

  • ‘unicode_strings’ (boolean)

    if True, all strings returned by sqlite3 will be Unicode, and normal strings otherwise; sets global attribute sqlite3.text_factory appropriately; False by default (may be omitted)

Parameters :

args : iterable

positional arguments passed directly to appropriate connection function

kwargs : dict

keyword arguments to be used by the user

Raises :

Error :

if could not connect to specified database for whatever reason; essentially, re–raise OperationalError with details

getConnectionFunc()

Return the sqlite3.connect() function.

getOperationalError()

Returns sqlite3.OperationalError instance.

getTextColumnType()

Return ‘TEXT’ as the default column type for unformatted text content. See SQLite documentation for more details.

checkTableExistence(*args)

Check if specific table exists in given database. The check is performed as a query of ‘sqlite_master’ table. See SQLite documentation for more details.

Parameters :

conn : sqlite3.Connection

opened connection to database

tablename : string

name of the table to be checked

Raises :

Error :

if could not check table existence; essentially, re–raise OperationalError with details
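A query against ‘sqlite_master’ of the kind mentioned above might look as follows. This is a sketch using the standard `sqlite3` module, not the provider's actual code; the function name, the boolean return value, and the sample table are assumptions.

```python
import sqlite3


def table_exists_sketch(conn, tablename):
    """Return True if a table called `tablename` exists in the database."""
    cur = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
        (tablename,))
    return cur.fetchone() is not None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genes (id TEXT, symbol TEXT)")  # illustrative table
has_genes = table_exists_sketch(conn, "genes")
has_other = table_exists_sketch(conn, "other")
```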

getEngine()

Get version information of the SQLite engine as dictionary {‘name’ : name, ‘version’ : version}. Version information is obtained as the result of SQLite function ‘sqlite_version’ executed from in–memory database. See SQLite documentation for more details.

Raises :

Error :

if could not get engine version; essentially, re–raise OperationalError with details
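Obtaining the engine version from an in–memory database can be sketched with the standard `sqlite3` module. The function name and the value used for ‘name’ are assumptions; only the dictionary layout follows the description above.

```python
import sqlite3


def sqlite_engine_info_sketch():
    """Ask an in-memory database for the version of the SQLite library."""
    conn = sqlite3.connect(":memory:")
    try:
        (version,) = conn.execute("SELECT sqlite_version()").fetchone()
    finally:
        conn.close()
    # The 'name' value is an assumed label, not taken from KDVS.
    return {"name": "SQLite", "version": version}


engine = sqlite_engine_info_sketch()
```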

kdvs.core.provider.fileProvider(filename, *args, **kwargs)

Return opened file object suitable for use with context manager, regardless of file type. Provides transparent handling of compressed files.

Parameters :

filename : string

path to specific file

args : iterable

any positional arguments passed into opener function

kwargs : dictionary

any keyword arguments passed into opener function

Returns :

file_object : file_like

opened file object, suitable for use with context manager
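Transparent handling of compressed files can be sketched as a small dispatcher over the standard openers. The source does not state how the file type is detected; choosing the opener by file extension is an assumption of this sketch, and `open_any_sketch` is a hypothetical name.

```python
import bz2
import gzip
import os
import tempfile


def open_any_sketch(filename, mode="rb"):
    """Pick an opener from the file extension (an assumed detection rule)."""
    if filename.endswith(".gz"):
        return gzip.open(filename, mode)
    if filename.endswith(".bz2"):
        return bz2.BZ2File(filename, mode)
    return open(filename, mode)


# Write and read back the same bytes through each opener.
tmp_dir = tempfile.mkdtemp()
contents = {}
for name in ("data.txt", "data.txt.gz", "data.txt.bz2"):
    path = os.path.join(tmp_dir, name)
    with open_any_sketch(path, "wb") as fh:
        fh.write(b"probe\tvalue\n")
    with open_any_sketch(path) as fh:
        contents[name] = fh.read()
```

Each returned object works in a `with` statement, so calling code does not need to know whether the file is compressed.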

class kdvs.core.provider.fpGzipFile(filename=None, mode=None, compresslevel=9, fileobj=None, mtime=None)

Bases: gzip.GzipFile

Wrapper class to allow opening gzip-ed files with context manager.

Constructor for the GzipFile class.

At least one of fileobj and filename must be given a non-trivial value.

The new class instance is based on fileobj, which can be a regular file, a StringIO object, or any other object which simulates a file. It defaults to None, in which case filename is opened to provide a file object.

When fileobj is not None, the filename argument is only used to be included in the gzip file header, which may include the original filename of the uncompressed file. It defaults to the filename of fileobj, if discernible; otherwise, it defaults to the empty string, and in this case the original filename is not included in the header.

The mode argument can be any of ‘r’, ‘rb’, ‘a’, ‘ab’, ‘w’, or ‘wb’, depending on whether the file will be read or written. The default is the mode of fileobj if discernible; otherwise, the default is ‘rb’. Be aware that only the ‘rb’, ‘ab’, and ‘wb’ values should be used for cross-platform portability.

The compresslevel argument is an integer from 1 to 9 controlling the level of compression; 1 is fastest and produces the least compression, and 9 is slowest and produces the most compression. The default is 9.

The mtime argument is an optional numeric timestamp to be written to the stream when compressing. All gzip compressed streams are required to contain a timestamp. If omitted or None, the current time is used. This module ignores the timestamp when decompressing; however, some programs, such as gunzip, make use of it. The format of the timestamp is the same as that of the return value of time.time() and of the st_mtime member of the object returned by os.stat().

class kdvs.core.provider.fpBzip2File

Bases: bz2.BZ2File

Wrapper class to allow opening bzip2-ed files with context manager.

kdvs.core.provider.RECOGNIZED_FILE_PROVIDERS = (<type 'file'>, <class 'kdvs.core.provider.fpGzipFile'>, <class 'kdvs.core.provider.fpBzip2File'>)

File providers currently recognized by KDVS.

util Module

Provides various utility classes and functions.

kdvs.core.util.quote(s, quote='"')

Surrounds requested string with given “quote” strings.

Parameters :

s : string

string to quote

quote : string

string to be appended to the beginning and the end of requested string

Returns :

quoted_str : string

quote + s + quote

kdvs.core.util.CommentSkipper(seq, comment=None)

Generator that discards strings prefixed with given comment string.

Parameters :

seq : iterable

sequence of strings

comment : string/None

comment prefix; if None, do not perform any discarding at all

Returns :

ncomseq : iterable

filtered input sequence of strings without commented ones
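The filtering behaviour described above can be sketched as a plain generator. This is an illustration, not the KDVS source; the name `comment_skipper_sketch` and the sample lines are hypothetical.

```python
def comment_skipper_sketch(seq, comment=None):
    """Yield strings from `seq`, dropping those that start with `comment`."""
    for line in seq:
        if comment is not None and line.startswith(comment):
            continue
        yield line


lines = ["# header", "a\t1", "# note", "b\t2"]
kept = list(comment_skipper_sketch(lines, comment="#"))
untouched = list(comment_skipper_sketch(lines))  # comment=None: nothing is dropped
```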

kdvs.core.util.isListOrTuple(obj)

Return True if given object is an instance of list or tuple, and False otherwise.

kdvs.core.util.isTuple(obj)

Return True if given object is an instance of tuple, and False otherwise.

kdvs.core.util.isIntegralNumber(obj)

Return True if given object is an instance of integer number, and False otherwise.

kdvs.core.util.className(obj)

Return class name of the object.

kdvs.core.util.emptyGenerator()

Generator that yields nothing.

kdvs.core.util.emptyCall()

Empty callable.

kdvs.core.util.revIndices(sliceobj)

Internal function. Revert given slice object.

class kdvs.core.util.NPGenFromTxtWrapper(dbresult, delim=' ', id_col_idx=None)

Bases: object

Wraps DBResult instance into a generator that can be passed to numpy.loadtxt() family of functions. It exposes appropriate methods accepted by numpy.loadtxt() depending on the numpy version: ‘readline’ (for numpy < 1.6.0), ‘next’ (for numpy >= 1.6.0).

Parameters :

dbresult : DBResult

DBResult object to be wrapped

delim : string

requested delimiter for ‘loadtxt’–type function, space by default

id_col_idx : integer/None

index of column that contains ID; if not None, the content of that column will be discarded by the generator

See also

numpy.genfromtxt

next()

readline()

kdvs.core.util.getSerializableContent(obj)

Produce representation of the object that is as serializable as possible (in the sense of pickle).

Parameters :

obj : object

object to be serialized

Returns :

ser_obj : object

serialized object (in terms of pickle) or serializable surrogate that identifies the origin of the original object.

kdvs.core.util.serializeObj(obj, out_fh, protocol=None)

Serialize given input object to opened file–like handle. Self–contained function, can be used as depfunc with PPlusJobContainer.

Parameters :

obj : object

object to be serialized

out_fh : file-like

file-like object that accepts serialized object

protocol : integer/None

protocol used by pickle/cPickle in serialization of the input object; if None, the highest possible is used

kdvs.core.util.deserializeObj(in_fh)

Deserialize object from opened file–like handle. Self–contained function, can be used as depfunc with PPlusJobContainer.

Parameters :

in_fh : file-like

file-like object that contains serialized object

Returns :

obj : object

deserialized object

See also

pickle, cPickle
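A serialize/deserialize round trip through a file–like handle can be sketched with the standard `pickle` module. The function names are hypothetical and the sketch only assumes the behaviour described above, including the use of the highest protocol when none is given.

```python
import io
import pickle


def serialize_obj_sketch(obj, out_fh, protocol=None):
    """Pickle `obj` into an opened binary file-like handle."""
    if protocol is None:
        protocol = pickle.HIGHEST_PROTOCOL
    pickle.dump(obj, out_fh, protocol)


def deserialize_obj_sketch(in_fh):
    """Load one pickled object from an opened binary file-like handle."""
    return pickle.load(in_fh)


buf = io.BytesIO()                       # in-memory stand-in for a real file
original = {"name": "example", "values": [1, 2, 3]}
serialize_obj_sketch(original, buf)
buf.seek(0)                              # rewind before reading back
restored = deserialize_obj_sketch(buf)
```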

kdvs.core.util.serializeTxt(lines, out_fh)

Serialize given lines of text to opened file–like handle. Self–contained function, can be used as depfunc with PPlusJobContainer.

Parameters :

lines : iterable

lines of text to be serialized

out_fh : file-like

file-like object that accepts serialized object

Notes

Currently it just writes text lines with ‘writelines’. Override for more detailed control.

kdvs.core.util.pprintObj(obj, out_fh, indent=2)

Pretty prints, in the sense of pprint.pprint(), the representation of given input object to opened file–like handle. Self–contained function, can be used as depfunc with PPlusJobContainer.

Parameters :

obj : object

object to be pretty printed

out_fh : file-like

file-like object that accepts pretty printed object

indent : integer

indentation used by pretty printer; 2 spaces by default

kdvs.core.util.writeObj(obj, out_fh)

Write given input object to opened file–like handle. Self–contained function, can be used as depfunc with PPlusJobContainer.

Parameters :

obj : object

object to be written

out_fh : file-like

file-like object that accepts written object

Notes

Currently it just writes object with ‘write’. The exact behavior depends on the given input object and file-like object. The function does not check for any error. Override for more detailed control.

kdvs.core.util.isDirWritable(directory)

Return True if given directory is writable, and False otherwise.

kdvs.core.util.importComponent(qualified_name, qualifier='.')

Dynamically import given module.

Parameters :

qualified_name : string

fully qualified module path as resolvable by Python, e.g. ‘kdvs.core.util.importComponent’.

qualifier : string

qualifier symbol used in given module path; dot (‘.’) by default

Returns :

module : module

imported module instance, as leaf (not root) in Python module hierarchy; see __import__() for more details

Notes

Importing is done relatively to ROOT_IMPORT_PATH only. This function is used for detailed dynamic imports from within KDVS.

kdvs.core.util.getFileNameComponent(path)

Return filename component of given file path.

kdvs.core.util.resolveIndexes(dsv, indexes_info)

Internal helper function. Return correct indexes for given DSV instance.

kdvs.core.util.pairwise(iterable)

Taken from itertools recipes. Split given iterable of elements into overlapping pairs of elements as follows: (s0,s1,s2,s3,...) -> (s0,s1),(s1,s2),(s2, s3),...
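For reference, the well-known `pairwise` recipe from the itertools documentation can be written as below (shown here in Python 3 form with the builtin `zip`). Whether KDVS uses exactly this code is not confirmed by the source; the name `pairwise_sketch` is hypothetical.

```python
from itertools import tee


def pairwise_sketch(iterable):
    """(s0, s1, s2, s3, ...) -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)
    next(b, None)      # advance the second iterator by one element
    return zip(a, b)


pairs = list(pairwise_sketch(["s0", "s1", "s2", "s3"]))
```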

class kdvs.core.util.Parametrizable(ref_parameters=(), **kwargs)

Bases: object

Abstract class that can be parametrized. During instantiation, given parameters can be compared to reference ones, and Error is thrown when they do not match.

Parameters :

ref_parameters : iterable

reference parameters to be checked against during instantiation; empty tuple by default

kwargs : dict

actual parameters supplied during instantiation; they will be checked against reference ones

Raises :

Error :

if some supplied parameters are not on the reference list

class kdvs.core.util.Configurable(expected_cfg={}, actual_cfg={})

Bases: object

Abstract class that can be configured. During instantiation, given configuration can be compared to reference one, and Error is thrown when they do not match. Expected configuration is a dictionary of objects {‘par1’ : partype1obj, ‘par2’ : partype2obj, ...}, where partypeXobj is the example object of expected type. For instance, if partypeXobj is {}, it will be checked if the parameter value is a dictionary. Object types are checked with isinstance().

Parameters :

expected_cfg : dict

expected configuration; empty dictionary by default

actual_cfg : dict

actual supplied configuration; empty dictionary by default

Raises :

Error :

if any element is missing from expected configuration

Error :

if expected type of any element is wrong

keys()

Return all keys of actual supplied configuration.
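The type check by example object described for this class can be sketched as follows. The sketch rests on assumptions: `ConfigError` is a hypothetical stand-in for the KDVS Error class, an element counts as missing when a key of the expected configuration is absent from the supplied one, and the configuration keys are illustrative.

```python
class ConfigError(Exception):
    """Hypothetical stand-in for kdvs.core.error.Error."""


def check_cfg_sketch(expected_cfg, actual_cfg):
    """Compare supplied configuration against example objects of expected type."""
    for key, example in expected_cfg.items():
        if key not in actual_cfg:
            raise ConfigError("missing configuration element: %r" % (key,))
        # The example object only contributes its type.
        if not isinstance(actual_cfg[key], type(example)):
            raise ConfigError("element %r should be of type %s"
                              % (key, type(example).__name__))
    return True


expected = {"paths": {}, "workers": 0, "label": ""}
accepted = check_cfg_sketch(
    expected, {"paths": {"root": "/tmp"}, "workers": 4, "label": "run1"})
```

Using an example object such as {} or 0 keeps the expected configuration readable while still carrying enough information for an isinstance() check.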

class kdvs.core.util.Constant(srepr)

Bases: object

Class that represents string constants.

Parameters :

srepr : object

Textual representation of string constant.

Notes

In current implementation, it represents ‘text’ with ‘<text>’.
