An interface to submit queries to an existing Gemini database and iterate over the results of the query.
We create a GeminiQuery object by specifying the database to which to connect:
from gemini import GeminiQuery
gq = GeminiQuery("my.db")
We can then issue a query against the database and iterate through the results by using the run() method:
gq.run("select chrom, start, end from variants")
for row in gq:
    print row
Instead of printing the entire row, one can access and print specific columns:
gq.run("select chrom, start, end from variants")
for row in gq:
    print row['chrom']
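The access-by-column-name pattern can be tried without a Gemini installation. The following is a minimal sketch, assuming only that the data lives in a SQLite table; it uses the standard-library sqlite3 module and a hypothetical two-row variants table, not GeminiQuery itself, and the rows it yields are sqlite3.Row objects rather than Gemini rows.

```python
import sqlite3

# In-memory toy table (hypothetical data); a real variants table has many more columns.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be indexed by column name
conn.execute('create table variants (chrom text, start integer, "end" integer)')
conn.executemany("insert into variants values (?, ?, ?)",
                 [("chr1", 100, 101), ("chr2", 500, 502)])

# Iterate over the result set and pick out a single column by name.
chroms = []
for row in conn.execute("select chrom, start, end from variants"):
    chroms.append(row["chrom"])
print(chroms)
```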
Also, all of the underlying numpy genotype arrays are always available:
gq.run("select chrom, start, end from variants")
for row in gq:
    gts = row.gts
    print row['chrom'], gts
    # yields "chr1" ['A/G' 'G/G' ... 'A/G']
The run() method also accepts a genotype filter:
query = "select chrom, start, end from variants"
gt_filter = "gt_types.NA20814 == HET"
gq.run(query, gt_filter)
for row in gq:
    print row
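To make the filter syntax concrete, here is a rough sketch of one way an expression such as "gt_types.NA20814 == HET" could be evaluated against a row. It is not Gemini's implementation: the helper names rewrite_filter and passes, the sample layout, and the numeric genotype codes are all assumptions made for illustration.

```python
import re

# Assumed genotype-type codes, for illustration only.
HOM_REF, HET, UNKNOWN, HOM_ALT = 0, 1, 2, 3

# Hypothetical sample layout: sample name -> offset in the genotype arrays.
sample_to_idx = {"NA20814": 0, "NA19000": 1, "NA18500": 2}


def rewrite_filter(gt_filter, sample_to_idx):
    """Replace 'gt_types.<sample>' with 'gt_types[<offset>]'."""
    def to_offset(match):
        return "gt_types[%d]" % sample_to_idx[match.group(1)]
    return re.sub(r"gt_types\.(\w+)", to_offset, gt_filter)


def passes(gt_filter, gt_types, sample_to_idx):
    """Evaluate the rewritten filter against one row's genotype types."""
    names = {"gt_types": gt_types, "HOM_REF": HOM_REF, "HET": HET,
             "UNKNOWN": UNKNOWN, "HOM_ALT": HOM_ALT}
    # eval() is tolerable in a toy example; never use it on untrusted input.
    return bool(eval(rewrite_filter(gt_filter, sample_to_idx),
                     {"__builtins__": {}}, names))


# Two hypothetical rows of per-sample genotype types.
rows = [[HET, HOM_REF, HOM_ALT], [HOM_REF, HET, HET]]
kept = [r for r in rows if passes("gt_types.NA20814 == HET", r, sample_to_idx)]
print(kept)
```

Only the first row is kept, because it is the only one in which the sample at the offset assigned to NA20814 is heterozygous.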
Lastly, one can use the sample_to_idx and idx_to_sample dictionaries to gain access to sample-level genotype information either by sample name or by sample index:
# grab dict mapping sample to genotype array indices
smp2idx = gq.sample_to_idx
query = "select chrom, start, end from variants"
gt_filter = "gt_types.NA20814 == HET"
gq.run(query, gt_filter)
# print a header listing the selected columns
print gq.header
for row in gq:
    # access a NUMPY array of the sample genotypes.
    gts = row['gts']
    # use the smp2idx dict to access sample genotypes
    idx = smp2idx['NA20814']
    print row, gts[idx]
Execute a query against a Gemini database. The user may specify:
- (reqd.) an SQL query.
- (opt.) a genotype filter.
Return a header describing the columns that were selected in the query issued to a GeminiQuery object.
Return a dictionary mapping sample names to genotype array offsets:
gq = GeminiQuery("my.db")
s2i = gq.sample2index
print s2i['NA20814']
# yields 1088
Return a dictionary mapping genotype array offsets to sample names:
gq = GeminiQuery("my.db")
i2s = gq.index2sample
print i2s[1088]
# yields "NA20814"
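The two dictionaries are inverses of each other. The sketch below shows that relationship with plain Python dictionaries; the mapping is hypothetical stand-in data (only the NA20814 offset is taken from the examples above), and no Gemini database is opened.

```python
# Hypothetical name -> offset mapping, standing in for gq.sample2index.
sample2index = {"NA20814": 1088, "NA19000": 0, "NA18500": 1}

# Inverting it gives the offset -> name mapping, as gq.index2sample would.
index2sample = {idx: name for name, idx in sample2index.items()}

# A round trip through both dictionaries returns the original sample name.
for name in sample2index:
    assert index2sample[sample2index[name]] == name

print(index2sample[1088])
```

The inversion is only well defined because each sample occupies exactly one offset in the genotype arrays.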