SlipGURU Dipartimento di Informatica e Scienze dell'Informazione Università Degli Studi di Genova

go Package

Provides concrete functionalities related to processing of Gene Ontology (GO) information as prior knowledge source.

GeneOntology Module

Provides concrete implementation of Gene Ontology (GO) manager that manages individual GO terms as prior knowledge concepts. Also provides various constants specific for GO.

kdvs.fw.impl.pk.go.GeneOntology.GO_BP_DS = 'BP'

Internal KDVS symbol that refers to Biological Process (BP) domain of GO.

kdvs.fw.impl.pk.go.GeneOntology.GO_MF_DS = 'MF'

Internal KDVS symbol that refers to Molecular Function (MF) domain of GO.

kdvs.fw.impl.pk.go.GeneOntology.GO_CC_DS = 'CC'

Internal KDVS symbol that refers to Cellular Component (CC) domain of GO.

kdvs.fw.impl.pk.go.GeneOntology.GO_DOMAINS = {'CC': 'cellular_component', 'MF': 'molecular_function', 'BP': 'biological_process'}

Descriptive identifiers for GO domains.

kdvs.fw.impl.pk.go.GeneOntology.GO_DS = ['CC', 'MF', 'BP']

Symbols of GO domains recognized by KDVS.

kdvs.fw.impl.pk.go.GeneOntology.GO_ROOT_TERMS = {'CC': 'GO:0005575', 'MF': 'GO:0003674', 'BP': 'GO:0008150'}

Root terms of GO domains.

kdvs.fw.impl.pk.go.GeneOntology.GO_EVIDENCE_CODES = {'IC': 'inferred by curator', 'NAS': 'non-traceable author statement', 'ISM': 'inferred from sequence model', 'TAS': 'traceable author statement', 'ISS': 'inferred from sequence or structural similarity', 'IEP': 'inferred from expression pattern', 'IPI': 'inferred from physical interaction', 'ND': 'no biological data available', 'IMP': 'inferred from mutant phenotype', 'ISO': 'inferred from sequence orthology', 'EXP': 'inferred from experiment', 'IEA': 'inferred from electronic annotation', 'IGC': 'inferred from genomic context', 'ISA': 'inferred from sequence', 'NR': 'not recorded', 'RCA': 'inferred from reviewed computational analysis', 'IGI': 'inferred from genetic interaction', 'IDA': 'inferred from direct assay'}

Mapping of recognized GO evidence codes: {code : description}.

kdvs.fw.impl.pk.go.GeneOntology.GO_INV_EVIDENCE_CODES = {'inferred from expression pattern': 'IEP', 'non-traceable author statement': 'NAS', 'inferred by curator': 'IC', 'inferred from genetic interaction': 'IGI', 'inferred from physical interaction': 'IPI', 'inferred from direct assay': 'IDA', 'inferred from sequence': 'ISA', 'no biological data available': 'ND', 'inferred from sequence or structural similarity': 'ISS', 'inferred from genomic context': 'IGC', 'inferred from reviewed computational analysis': 'RCA', 'inferred from experiment': 'EXP', 'traceable author statement': 'TAS', 'inferred from mutant phenotype': 'IMP', 'inferred from sequence model': 'ISM', 'not recorded': 'NR', 'inferred from electronic annotation': 'IEA', 'inferred from sequence orthology': 'ISO'}

Inverted mapping of recognized GO evidence codes: {description : code}.
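
The relationship between the two evidence code mappings can be illustrated with a short sketch. It uses only a subset of the codes listed above; the helper `normalize_code` is hypothetical and only shows one plausible way the 'UNK' placeholder could be applied, not how KDVS does it internally.

```python
# Subset of the codes from GO_EVIDENCE_CODES, copied from the listing above.
evidence_codes = {
    'EXP': 'inferred from experiment',
    'IDA': 'inferred from direct assay',
    'IEA': 'inferred from electronic annotation',
    'TAS': 'traceable author statement',
}

# Invert {code: description} into {description: code}.
# This is lossless only because every description is unique.
inv_evidence_codes = {desc: code for code, desc in evidence_codes.items()}

UNKNOWN_EV_CODE = 'UNK'  # same value as GO_UNKNOWN_EV_CODE


def normalize_code(code):
    """Hypothetical helper: keep recognized codes, map the rest to 'UNK'."""
    return code if code in evidence_codes else UNKNOWN_EV_CODE
```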

kdvs.fw.impl.pk.go.GeneOntology.GO_UNKNOWN_EV_CODE = 'UNK'

KDVS-specific artificial ‘unknown evidence code’, used when an unrecognized evidence code is encountered.

kdvs.fw.impl.pk.go.GeneOntology.GO_DEF_RECOGNIZED_RELATIONS = ['is_a', 'part_of', 'regulates', 'positively_regulates', 'negatively_regulates']

Default inter-term relations recognized by KDVS.

kdvs.fw.impl.pk.go.GeneOntology.GO_OBOXML_ROOT_TAG = 'obo'

Standard root tag of GO release encoded as OBO-XML file.

kdvs.fw.impl.pk.go.GeneOntology.isGOID(goid)

Return True if goid is a valid GO term ID, False otherwise.

kdvs.fw.impl.pk.go.GeneOntology.GO_num2id(num)

Resolve numerical part of GO term ID into full GO term ID.

Parameters :

num : integer/string

supposed numerical part of GO term ID

Returns :

termID : string

full GO term ID

Raises :

Error :

if numerical part does not resolve to valid GO term ID

kdvs.fw.impl.pk.go.GeneOntology.GO_id2num(goid, numint=True)

Extract numerical part of full GO term ID.

Parameters :

goid : string

full GO term ID

numint : boolean

if True, return numerical part converted to integer; if False, return numerical part as string; True by default

Returns :

num : integer/string

numerical part of full GO term ID
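
The behaviour of the three ID helpers can be approximated with the standalone sketch below. It assumes that a full GO term ID is the prefix 'GO:' followed by exactly seven digits. The function names `is_go_id`, `num_to_id` and `id_to_num` are hypothetical stand-ins, and `ValueError` is used in place of the KDVS `Error` class; this is not the KDVS implementation.

```python
import re

# Assumption: a full GO term ID is 'GO:' followed by exactly seven digits.
_GO_ID_RE = re.compile(r'GO:[0-9]{7}')


def is_go_id(goid):
    """Return True if goid looks like a full GO term ID."""
    return isinstance(goid, str) and _GO_ID_RE.fullmatch(goid) is not None


def num_to_id(num):
    """Build a full GO term ID from an integer or numeric string."""
    candidate = 'GO:%07d' % int(num)  # int() itself rejects non-numeric input
    if not is_go_id(candidate):
        raise ValueError('cannot resolve %r to a GO term ID' % (num,))
    return candidate


def id_to_num(goid, numint=True):
    """Extract the numerical part, as int (default) or zero-padded string."""
    if not is_go_id(goid):
        raise ValueError('%r is not a GO term ID' % (goid,))
    digits = goid[3:]  # drop the 'GO:' prefix
    return int(digits) if numint else digits
```

Round-tripping is the main property worth checking: `num_to_id(id_to_num(goid))` gives back `goid` for any ID accepted by `is_go_id`.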

class kdvs.fw.impl.pk.go.GeneOntology.GOManager

Bases: kdvs.fw.PK.PKCManager

Concrete prior knowledge manager that parses GO release encoded in OBO-XML file and keeps track of all individual GO terms (i.e. prior knowledge concepts). The following content is exposed through public attributes after load() method finishes successfully:

  • terms – dictionary of individual term data
  • synonyms – SetBDMap of synonymous terms
  • termsPlainHierarchy – SetBDMap of term hierarchy {parent : children} independent of relations
  • termsRelationsHierarchy – {relation_name : SetBDMap} mapping of term hierarchies grouped by recognized relations
  • obsolete_terms – iterable of obsolete term IDs
  • valid_terms – iterable of valid term IDs
  • domain2validTerms – valid terms grouped by domains

configure(domains=None, recognized_relations=None)

Configure this manager.

Parameters :

domains : iterable of string/None

iterable of GO domains that this manager will recognize; if None, all domains are recognized; None by default

recognized_relations : iterable of string/None

iterable of inter–term relations that this manager will recognize; if None, default relations will be recognized (GO_DEF_RECOGNIZED_RELATIONS); None by default

load(fh, root_tag=None)

Read GO release from OBO-XML file and build all data structures. XML parsing is done with xml.etree.ElementTree (xml.etree.cElementTree if possible).

Parameters :

fh : file–like

opened file handle of the OBO-XML file that contains the encoded GO release; the file handle must come from a recognized KDVS file provider

root_tag : string/None

root XML tag of OBO-XML file that will be accepted; if None, default root tag (GO_OBOXML_ROOT_TAG) will be used; None by default

Raises :

Error :

if requested root tag has not been found

Error :

if file handle comes from unrecognized file provider

Error :

if parsing of OBO-XML is interrupted with an error (re–raised ElementTree exception)
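
As a rough illustration of the root tag check and term extraction, the sketch below parses a tiny in-memory document with xml.etree.ElementTree. The element layout (`term`, `id`, `name`, `namespace`, `is_a`) is an assumption about the OBO-XML format, the function `load_terms` is hypothetical, and `ValueError` stands in for the KDVS `Error` class. It does not reproduce the data structures that `load()` actually builds.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for a GO release; the tag layout is an assumption.
OBOXML = b"""<obo>
  <term>
    <id>GO:0008150</id>
    <name>biological_process</name>
    <namespace>biological_process</namespace>
  </term>
  <term>
    <id>GO:0009987</id>
    <name>cellular process</name>
    <namespace>biological_process</namespace>
    <is_a>GO:0008150</is_a>
  </term>
</obo>"""


def load_terms(fh, root_tag='obo'):
    """Parse fh and return {term ID: term data}; reject unexpected root tags."""
    root = ET.parse(fh).getroot()  # may raise ET.ParseError on malformed XML
    if root.tag != root_tag:
        raise ValueError('root tag %r not found' % (root_tag,))
    terms = {}
    for term in root.iter('term'):
        terms[term.findtext('id')] = {
            'name': term.findtext('name'),
            'namespace': term.findtext('namespace'),
            'is_a': [e.text for e in term.findall('is_a')],
        }
    return terms


terms = load_terms(io.BytesIO(OBOXML))
```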

getPKC(conceptID)

Get PKC instance for specified concept ID (i.e. GO term ID). This method resolves synonymous GO terms.

Parameters :

conceptID : string

full GO term ID for valid (i.e. not obsolete) term

Returns :

pkc : PriorKnowledgeConcept / None

PKC instance that corresponds to specified GO term, or None if the conceptID has not been found

Raises :

Warn :

if a synonymous term resolves to more than one direct term

dump()

Build dictionary dump of all the information produced by this manager, if possible in textual format, and return it. The dump dictionary contains representations of data structures keyed by the names of relevant public attributes. For bi–directional mappings, forward and backward parts are separated into ‘fwd’ and ‘bwd’ subkeyed parts.
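
The ‘fwd’/‘bwd’ split can be pictured with a minimal bi-directional set mapping. The class `TinySetBDMap` below is a hypothetical stand-in, not the KDVS SetBDMap, and the exact textual form of the real dump is not specified here; sorted lists are an assumption made for readability.

```python
from collections import defaultdict


class TinySetBDMap(object):
    """Hypothetical bi-directional mapping: key -> set of values and back."""

    def __init__(self):
        self.fwd = defaultdict(set)  # key -> values
        self.bwd = defaultdict(set)  # value -> keys

    def add(self, key, value):
        self.fwd[key].add(value)
        self.bwd[value].add(key)

    def dump(self):
        # Sets are converted to sorted lists so the dump is plain and stable.
        return {
            'fwd': {k: sorted(v) for k, v in self.fwd.items()},
            'bwd': {k: sorted(v) for k, v in self.bwd.items()},
        }


hierarchy = TinySetBDMap()
hierarchy.add('GO:0008150', 'GO:0009987')  # parent -> child
hierarchy.add('GO:0008150', 'GO:0065007')

# The dump is keyed by the name of the public attribute it represents.
dump = {'termsPlainHierarchy': hierarchy.dump()}
```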

getSynonyms(termID)

Get recognized synonymous term IDs for specified GO term ID.

Parameters :

termID : string

full GO term ID

Returns :

synonyms : iterable of string / None

iterable of synonymous full GO term IDs, or None if not found

getDescendants(parentTermID, depth=False)

Follow term hierarchy and return descendant terms for specified GO term ID. Optionally, return numerical depth information.

Parameters :

parentTermID : string

full GO term ID

depth : boolean

if True, return descendant terms with additional depth information; if False, return only descendant terms without depth information; False by default

Returns :

descendants : iterable of string

if ‘depth’ is False; iterable of full GO term IDs of descendant terms; parent term is included at the beginning

descendants_with_depth : iterable of (string, int) tuples

if ‘depth’ is True; iterable of the following tuples: (full GO term ID of descendant term, level of depth as integer relative to the parent); parent term is included at the beginning with depth 0; subsequent depths are positive integers
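
The return shapes described above can be sketched with a breadth-first walk over a plain {parent : children} dictionary. The function `get_descendants` and the toy hierarchy are hypothetical; the traversal order and the handling of terms reachable by several paths (here: first visit wins) are assumptions, not documented KDVS behaviour.

```python
from collections import deque

# Toy {parent: children} hierarchy used only for illustration.
children = {
    'GO:0008150': ['GO:0009987', 'GO:0065007'],
    'GO:0065007': ['GO:0050789'],
}


def get_descendants(children, parent_id, depth=False):
    """Return the parent followed by its descendants, optionally with depth."""
    seen = {parent_id}
    queue = deque([(parent_id, 0)])  # the parent itself sits at depth 0
    result = []
    while queue:
        term, level = queue.popleft()
        result.append((term, level) if depth else term)
        for child in children.get(term, ()):
            if child not in seen:  # a term is reported once, at first visit
                seen.add(child)
                queue.append((child, level + 1))
    return result
```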

getAncestors(childTermID, depth=False)

Follow term hierarchy and return ancestor terms for specified GO term ID. Optionally, return numerical depth information.

Parameters :

childTermID : string

full GO term ID

depth : boolean

if True, return ancestor terms with additional depth information; if False, return only ancestor terms without depth information; False by default

Returns :

ancestors : iterable of string

if ‘depth’ is False; iterable of full GO term IDs of ancestor terms; child term is included at the beginning

ancestors_with_depth : iterable of (string, int) tuples

if ‘depth’ is True; iterable of the following tuples: (full GO term ID of ancestor term, level of depth as integer relative to the child); child term is included at the beginning with depth 0; subsequent depths are negative integers
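
The negative depth convention for ancestors can be sketched in the same spirit, this time walking a {child : parents} dictionary upwards. The function `get_ancestors` and the toy data are hypothetical, and the breadth-first order is an assumption rather than documented KDVS behaviour.

```python
from collections import deque

# Toy {child: parents} hierarchy used only for illustration.
parents = {
    'GO:0050789': ['GO:0065007'],
    'GO:0065007': ['GO:0008150'],
    'GO:0009987': ['GO:0008150'],
}


def get_ancestors(parents, child_id, depth=False):
    """Return the child followed by its ancestors, optionally with depth."""
    seen = {child_id}
    queue = deque([(child_id, 0)])  # the child itself sits at depth 0
    result = []
    while queue:
        term, level = queue.popleft()
        result.append((term, level) if depth else term)
        for parent in parents.get(term, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, level - 1))  # going up: depth decreases
    return result
```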
