This module provides access to the Saccharomyces cerevisiae genome from Python. Sequences can be accessed as Bio.SeqRecord objects provided by Biopython.
Bases: str
This function provides nicer output of strings in the IPython shell.
Returns True if a gene is not expressed in the same direction on the chromosome as the gene immediately upstream.
Tandem genes:
Gene1 Gene2
----> ---->
Bidirectional genes:
Gene1 Gene2
----> <----
Gene1 Gene2
<---- ---->
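The strand of a yeast gene is encoded in the last letter of its systematic name (W for the Watson strand, C for the Crick strand), so the tandem/bidirectional distinction can be illustrated without any sequence data. The following is only a minimal sketch, not pygenome's implementation; `strand` and `same_direction` are hypothetical helper names introduced for illustration.

```python
import re

# Systematic ORF names look like YJR048W: Y, a chromosome letter (A-P),
# the arm (L or R), a three-digit number, the strand (W or C) and an
# optional suffix such as -A.
_ORF = re.compile(r"^Y[A-P][LR]\d{3}([WC])(?:-[A-Z])?$")


def strand(systematic_name):
    """Return 'W' or 'C' for a systematic ORF name (hypothetical helper)."""
    match = _ORF.match(systematic_name.upper())
    if match is None:
        raise ValueError("not a systematic ORF name: %r" % systematic_name)
    return match.group(1)


def same_direction(gene, neighbour):
    """True for a tandem pair (same strand), False for a bidirectional pair."""
    return strand(gene) == strand(neighbour)


print(same_direction("YAL040C", "YAL039C"))  # True: both on the Crick strand
print(same_direction("YAR007C", "YAR008W"))  # False: opposite strands
```

Comparing strands this way only says whether two genes point the same way; deciding which gene is the immediate neighbour still requires the annotated gene order.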
Returns the coding sequence associated with a standard name (e.g. CYC1) or a systematic name (e.g. YJR048W).
See also: cds_genbank_accession, cds_pydna_code
>>> from pygenome import sg
>>> sg.cds("TDH3")
SeqRecord(seq=Seq('ATGGTTAGAGTTGCTATTAACGGTTTCGGTAGAATCGGTAGATTGGTCATGAGA...TAA', IUPACAmbiguousDNA()), id='<unknown id>', name='<unknown name>', description='<unknown description>', dbxrefs=[])
>>> len(sg.cds("TDH3"))
999
>>> sg.cds("YJR048W")
SeqRecord(seq=Seq('ATGACTGAATTCAAGGCCGGTTCTGCTAAGAAAGGTGCTACACTTTTCAAGACT...TAA', IUPACAmbiguousDNA()), id='BK006943.2', name='BK006943', description='TPA: Saccharomyces cerevisiae S288c chromosome X.', dbxrefs=[])
>>> len(sg.cds("YJR048W"))
330
>>>
Same as the cds method, but returns a string representing a portion of a Genbank file.
>>> from pygenome import sg
>>> sg.cds_genbank_accession("TDH3")
'BK006941.2 REGION: complement(882812..883810)'
>>>
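The returned string has a regular shape: an accession, the word REGION:, and a 1-based inclusive range that is wrapped in complement(...) for features on the reverse strand. A small parser makes the coordinates usable in code. This is a sketch that assumes every return value follows the format of the examples in this document; `Region` and `parse_region` are hypothetical names and not part of pygenome.

```python
import re
from collections import namedtuple

# Hypothetical container for the parsed fields.
Region = namedtuple("Region", "accession start end complement")

_REGION = re.compile(
    r"^(?P<acc>\S+) REGION: "
    r"(?P<comp>complement\()?(?P<start>\d+)\.\.(?P<end>\d+)\)?$"
)


def parse_region(text):
    """Split 'ACCESSION REGION: [complement(]start..end[)]' into its parts."""
    match = _REGION.match(text.strip())
    if match is None:
        raise ValueError("unrecognised region string: %r" % text)
    return Region(
        accession=match.group("acc"),
        start=int(match.group("start")),
        end=int(match.group("end")),
        complement=match.group("comp") is not None,
    )


tdh3 = parse_region("BK006941.2 REGION: complement(882812..883810)")
print(tdh3.accession, tdh3.complement)
print(tdh3.end - tdh3.start + 1)  # 999, matching len(sg.cds("TDH3"))
```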
Returns the chromosome associated with the given id, which can be a number (1 for chromosome I) or a letter ("A" for chromosome I).
See also: chromosomes
Some of the yeast chromosomes return large sequences:
------- --------
chr A   230218
chr B   316620
chr C   813184
chr D   576874
chr E   1531933
chr F   1090940
chr G   270161
chr H   439888
chr I   562643
chr J   666816
chr K   745751
chr L   924431
chr M   1078177
chr N   1091291
chr O   784333
------- --------
>>> from pygenome import sg
>>> len(sg.chromosome(1))
230218
>>> sg.chromosome(1)
SeqRecord(seq=Seq('CCACACCACACCCACACACCCACACACCACACCACACACCACACCACACCCACA...GGG', IUPACAmbiguousDNA()), id='BK006935.2', name='BK006935', description='TPA: Saccharomyces cerevisiae S288c chromosome I.', dbxrefs=[])
>>> len(sg.chromosome(16))
948066
>>> len(sg.chromosome("A"))
230218
>>>
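The examples show that a number and a letter can address the same chromosome (1 and "A" both give 230218 bp). A helper that normalises either form could look like the sketch below. It assumes that the letters A to P correspond to chromosomes 1 to 16, which this document only demonstrates for "A"; `chromosome_number` is a hypothetical name, not a pygenome function.

```python
import string

# Assumption: letters A-P map onto chromosomes 1-16, as in the chromosome
# letter of systematic gene names.
_LETTERS = string.ascii_uppercase[:16]


def chromosome_number(chromosome_id):
    """Normalise 1-16 or 'A'-'P' to an integer (hypothetical helper)."""
    if isinstance(chromosome_id, int):
        number = chromosome_id
    else:
        letter = str(chromosome_id).strip().upper()
        if len(letter) != 1 or letter not in _LETTERS:
            raise ValueError("unknown chromosome id: %r" % chromosome_id)
        number = _LETTERS.index(letter) + 1
    if not 1 <= number <= 16:
        raise ValueError("chromosome number out of range: %r" % chromosome_id)
    return number


print(chromosome_number("A"))  # 1
print(chromosome_number("p"))  # 16
print(chromosome_number(16))   # 16
```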
Returns a generator containing all yeast chromosomes in the form of Bio.SeqRecord objects
See also: chromosome
>>> from pygenome import sg
>>> next(sg.chromosomes())
SeqRecord(seq=Seq('CCACACCACACCCACACACCCACACACCACACCACACACCACACCACACCCACA...GGG', IUPACAmbiguousDNA()), id='BK006935.2', name='BK006935', description='TPA: Saccharomyces cerevisiae S288c chromosome I.', dbxrefs=[])
>>>
Downloads the sequence files from the Saccharomyces Genome Database (www.sgd.org). This is typically only done once.
Returns the systematic name of the gene downstream of gene. This is defined as the gene on the chromosome located 3’ of the transcription stop point of gene. The gene can be given as a standard name (e.g. CYC1) or a systematic name (e.g. YJR048W).
>>> from pygenome import sg
>>> sg.downstream_gene("RFA1")
'YAR003W'
>>> sg.systematic_name("RFA1")
'YAR007C'
>>> sg.downstream_gene("CYC3")
'YAL040C'
>>> sg.systematic_name("CYC3")
'YAL039C'
>>>
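Which neighbour counts as downstream depends on the strand: for a gene on the Watson strand, 3’ points toward higher chromosome coordinates, while for a gene on the Crick strand it points toward lower ones. The sketch below illustrates that rule on a toy gene order built only from the names that appear in the examples in this document. It is not how pygenome stores or looks up genes, and `GENE_ORDER`, `downstream` and `upstream` are hypothetical names.

```python
# Toy gene order along part of chromosome I, lowest coordinate first.
# Only genes named in this document's examples are listed; this is not
# real annotation data.
GENE_ORDER = ["YAR003W", "YAR007C", "YAR008W"]


def _neighbour(gene, toward_three_prime):
    """Return the adjacent gene in the 3' or 5' direction, or None."""
    index = GENE_ORDER.index(gene)
    on_watson = gene.endswith("W")
    # Watson strand: 3' is toward higher coordinates (next list entry).
    # Crick strand: 3' is toward lower coordinates (previous list entry).
    step = 1 if on_watson == toward_three_prime else -1
    position = index + step
    if not 0 <= position < len(GENE_ORDER):
        return None
    return GENE_ORDER[position]


def downstream(gene):
    return _neighbour(gene, toward_three_prime=True)


def upstream(gene):
    return _neighbour(gene, toward_three_prime=False)


print(downstream("YAR007C"))  # YAR003W
print(upstream("YAR007C"))    # YAR008W
```

The results agree with the RFA1 (YAR007C) examples for downstream_gene and upstream_gene.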
Function wrapping the decorated function.
Same as the intergenic_sequence method, but returns a string representing a portion of a Genbank file.
>>> from pygenome import sg
>>> sg.intergenic_sequence_genbank_accession("YGR192C", "YGR193C")
'BK006941.2 REGION: 883811..884508'
Same as the promoter_genbank method, but returns a string representing a portion of a Genbank file.
>>> from pygenome import sg
>>> sg.promoter_genbank_accession("TDH3")
'BK006941.2 REGION: complement(883811..884508)'
>>>
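A complement(...) location means the feature is read from the reverse strand, so the corresponding slice of the chromosome has to be reverse-complemented. The sketch below shows this on a short toy sequence using plain strings rather than SeqRecord objects; `reverse_complement` and `extract` are hypothetical helpers written for this illustration.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(dna.upper()))


def extract(sequence, start, end, complement=False):
    """Slice a 1-based, inclusive range; reverse-complement if requested."""
    fragment = sequence[start - 1:end]
    return reverse_complement(fragment) if complement else fragment


toy = "ATGCCGTA"
print(extract(toy, 2, 5))                   # TGCC
print(extract(toy, 2, 5, complement=True))  # GGCA
```

Note that the promoter range in the example above (883811..884508) starts at the base immediately after the last base of the TDH3 coding region (882812..883810), with both read on the complement strand.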
Returns the systematic name associated with a standard name.
>>> from pygenome import sg
>>> sg.systematic_name("GAL1")
'YBR020W'
>>> sg.systematic_name("CYC1")
'YJR048W'
>>> sg.systematic_name("TDH3")
'YGR192C'
>>>
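Conceptually, this is a lookup from standard to systematic names. The sketch below uses only the name pairs that appear in the examples of this document and recognises systematic names by their pattern. Passing a systematic name through unchanged is an assumption made for the sketch, not documented behaviour of sg.systematic_name, and `to_systematic` is a hypothetical name.

```python
import re

# Name pairs taken from the examples in this document; a real table would
# come from the genome annotation.
STANDARD_TO_SYSTEMATIC = {
    "GAL1": "YBR020W",
    "CYC1": "YJR048W",
    "TDH3": "YGR192C",
    "RFA1": "YAR007C",
    "CYC3": "YAL039C",
}

# Y, chromosome letter A-P, arm L/R, three digits, strand W/C, optional suffix.
_SYSTEMATIC = re.compile(r"^Y[A-P][LR]\d{3}[WC](?:-[A-Z])?$")


def to_systematic(name):
    """Map a standard name to its systematic name (hypothetical helper)."""
    name = name.strip().upper()
    if _SYSTEMATIC.match(name):
        # Assumption: systematic names are returned unchanged.
        return name
    if name in STANDARD_TO_SYSTEMATIC:
        return STANDARD_TO_SYSTEMATIC[name]
    raise KeyError(name)


print(to_systematic("GAL1"))     # YBR020W
print(to_systematic("yjr048w"))  # YJR048W
```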
Returns True if a gene is expressed in the same direction on the chromosome as the gene immediately upstream.
Tandem genes:
Gene1 Gene2
----> ---->
Bidirectional genes:
Gene1 Gene2
----> <----
Returns the systematic name of the gene upstream of gene. This is defined as the gene on the chromosome located 5’ of the transcription start point of gene. The gene can be given as a standard name (e.g. CYC1) or a systematic name (e.g. YJR048W).
>>> from pygenome import sg
>>> sg.systematic_name("RFA1")
'YAR007C'
>>> sg.upstream_gene("RFA1")
'YAR008W'
>>> sg.systematic_name("CYC3")
'YAL039C'
>>>