g:Profiler Python package documentation
This is the documentation for the official g:Profiler Python package. The package contains both a
module for inclusion into a Python codebase and the gprofiler.py
command-line tool for querying
g:GOSt. Invocation of the latter is not documented here, but executing gprofiler.py --help
yields a manual.
Synopsis:
from gprofiler import GProfiler
gp = GProfiler("MyTool/0.1")
gp.gprofile("sox2")
GProfiler class
class gprofiler.GProfiler(user_agent, base_url='http://biit.cs.ut.ee/gprofiler/', output_type=10, want_header=False)
A class representing the g:Profiler toolkit. Contains methods for querying the g:GOSt, g:Convert and g:Orth tools. Please see the g:Profiler web tool for extensive documentation on all the options to the methods.
user_agent
- Required. (String) A short user agent string for your tool.
base_url
- (String) An absolute URL of the g:Profiler instance to use; the stable release by default.
output_type
- Controls the data structure returned from the methods. One of:
  GProfiler.OUTPUT_TYPE_FORMATTED
  - Default. Returns a list (lines) of lists (fields), with each field cast into its proper type or None for “N/A” values.
  GProfiler.OUTPUT_TYPE_LINES
  - Returns a list containing the raw lines from g:Profiler.
want_header
- Prepend the header (column names) as the first row of output; False by default.
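When want_header is enabled together with the default formatted output, the first row holds the column names, so each data row can be paired with them. The following is a minimal sketch of that idea. The helper name rows_to_dicts and the sample rows are hypothetical and are not part of the package or real g:Profiler output.

```python
def rows_to_dicts(rows):
    """Pair each data row with the header row that want_header=True prepends."""
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]


# Made-up rows for illustration only; real column names come from g:Profiler.
sample = [
    ["column_a", "column_b"],
    ["x", 1],
    ["y", None],  # None stands for an "N/A" value in formatted output
]
records = rows_to_dicts(sample)
print(records)
```

Keying rows by column name avoids relying on column positions, which this page does not document.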
Options common to several methods:
query
- Required. (String | List) The query, given either as a space-separated string or as a list of genes, proteins or other biological entities.
organism
- (String) The organism name in g:Profiler format.
region_query
- (Boolean) The query consists of chromosomal regions.
numeric_ns
- (String) Namespace to use for fully numeric IDs.
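The two accepted query forms carry the same information. The sketch below only illustrates that equivalence. The helper as_query_list is hypothetical and is not how the package itself processes queries; "pax6" is just a second example identifier.

```python
def as_query_list(query):
    """Return the query as a list of identifiers, whichever form it was given in."""
    if isinstance(query, str):
        # A string query separates identifiers with whitespace.
        return query.split()
    return list(query)


# Both forms describe the same two-identifier query.
from_string = as_query_list("sox2 pax6")
from_list = as_query_list(["sox2", "pax6"])
assert from_string == from_list
```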
gconvert(query, organism='hsapiens', target='ENSG', region_query=False, numeric_ns=None)
Query g:Convert.
target
- (String) The target namespace.
gorth(query, source_organism='hsapiens', target_organism='mmusculus', region_query=False, numeric_ns=None)
Query g:Orth.
source_organism, target_organism
- The source and target organism IDs, in g:Profiler format.
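As a rough sketch of how gconvert and gorth might be used together, the function below runs both on the same query, using only the keyword arguments listed in the signatures above. The function name human_to_mouse is hypothetical, the results are returned as delivered, and nothing is assumed about their columns.

```python
def human_to_mouse(gp, genes):
    """Run g:Convert and g:Orth on the same human query.

    `gp` is expected to be a GProfiler instance (or anything that offers
    gconvert and gorth methods with the documented keyword arguments).
    """
    converted = gp.gconvert(genes, organism="hsapiens", target="ENSG")
    orthologs = gp.gorth(
        genes,
        source_organism="hsapiens",
        target_organism="mmusculus",
    )
    return converted, orthologs


# Intended use (requires the gprofiler package and network access):
#   from gprofiler import GProfiler
#   gp = GProfiler("MyTool/0.1")
#   converted, orthologs = human_to_mouse(gp, ["sox2"])
```

Passing the GProfiler object in as an argument keeps the function easy to exercise with a stand-in object during testing.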
gprofile(query, organism='hsapiens', all_results=False, ordered=False, region_query=False, exclude_iea=False, underrep=False, evcodes=False, hier_sorting=False, hier_filtering=None, max_p_value=1.0, min_set_size=None, max_set_size=None, min_isect_size=None, max_isect_size=None, correction_method=None, domain_size=None, numeric_ns=None, custom_bg=None, src_filter=None)
Query g:GOSt.
all_results
- (Boolean) All results, including those deemed not significant.
ordered
- (Boolean) Ordered query.
exclude_iea
- (Boolean) Exclude electronic GO annotations.
underrep
- (Boolean) Measure underrepresentation.
evcodes
- (Boolean) Request evidence codes in output as the final column.
hier_sorting
- (Boolean) Sort output into subgraphs.
hier_filtering
- (Boolean) Hierarchical filtering.
max_p_value
- (Float) Custom p-value threshold.
min_set_size
- (Int) Minimum size of functional category.
max_set_size
- (Int) Maximum size of functional category.
min_isect_size
- (Int) Minimum size of query / functional category intersection.
max_isect_size
- (Int) Maximum size of query / functional category intersection.
correction_method
- Algorithm used for multiple testing correction, one of:
  GProfiler.THR_GSCS
  - Default. g:SCS.
  GProfiler.THR_FDR
  - Benjamini-Hochberg FDR.
  GProfiler.THR_BONFERRONI
  - Bonferroni.
domain_size
- Statistical domain size, one of:
  GProfiler.DOMAIN_ANNOTATED
  - Default. Only annotated genes.
  GProfiler.DOMAIN_KNOWN
  - All known genes.
custom_bg
- (String | List) Custom statistical background.
src_filter
- (List) A list of data source ID strings, e.g. ["GO:BP", "KEGG"].
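To show how several gprofile options combine, here is a hedged sketch of a wrapper that restricts a query to two data sources, applies a custom p-value threshold, and excludes electronic GO annotations. The wrapper name filtered_enrichment and its default threshold of 0.05 are illustrative choices, not package defaults. Only keyword arguments from the signature above are used.

```python
def filtered_enrichment(gp, genes, alpha=0.05, correction_method=None):
    """Query g:GOSt with a custom threshold and a restricted set of sources.

    `gp` is expected to be a GProfiler instance. `correction_method` may be
    left as None (the documented default) or set to one of the GProfiler.THR_*
    constants.
    """
    return gp.gprofile(
        genes,
        organism="hsapiens",
        exclude_iea=True,                 # skip electronic GO annotations
        max_p_value=alpha,                # custom p-value threshold
        correction_method=correction_method,
        src_filter=["GO:BP", "KEGG"],     # data source IDs from the example above
    )


# Intended use (requires the gprofiler package and network access):
#   from gprofiler import GProfiler
#   gp = GProfiler("MyTool/0.1")
#   rows = filtered_enrichment(gp, ["sox2"], correction_method=GProfiler.THR_FDR)
```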