Welcome to cobilib’s documentation!

Contents:

Brief

A library for analysing codon usage bias with the quasispecies model.

Summary

Routines

All included routines::
CodonUsage Fitnessfunction Model __builtins__ __doc__ __file__ __name__ __package__ c_codon_mut_dist calculate_CAI_dic calculate_NC calculate_RF calculate_RF_dic calculate_RSCU calculate_RSCU_dic change_amino_acid_code codon_hist_index codon_index codon_mut_dist codon_table codons compute_distance compute_optimum compute_optimum_two_step compute_steady_state config_from_file decomposition euclidean_distance hessian_em highly_expressed_genes_by_file highly_expressed_genes_by_id init_fitnessfunctions init_fitnessmatrix_for_amino_acids init_fitnessmatrix_for_codons init_list_of_genes is_ambig_codon kl_distance load_fasta load_fasta_from_url load_genbank load_plaintext make_codon_histogram make_codon_histogram_dic make_codon_histogram_dic_combined make_evolmatrix make_jc69_mutationmatrix make_mutationmatrix mds_from_fasta minimize number_of_codons optimize optimize_dic optimize_sequence parametric_run_from_config plot_all_reduction_methods plot_mds plot_pca py_codon_mut_dist relative_cosine_distance relative_euclidean_distance relative_hellinger_distance relative_jeffrey_divergence relative_minkowski_distance remove_stopcodons_and_one_codon_amino_acids remove_stopcodons_and_one_codon_amino_acids_dic run_model sample_run setup_parser transition

Examples

class cobilib.Fitnessfunction(description, parameter, values, interpolation, filename='')[source]

A class for fitness functions.

cobilib.c_codon_mut_dist(a, b)[source]

Counts the nucleotides that stay unchanged, the transitions and the transversions when one codon mutates into another. This version is implemented in C with scipy.weave; a pure Python version that is easier to read is available as py_codon_mut_dist.

Parameters :

a : str

First codon

b : str

Second codon

Returns :

results : list

List of three doubles: `results[0]` is the number of unchanged nucleotides, `results[1]` the number of transitions and `results[2]` the number of transversions.
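The counting rule can be sketched in a few lines of plain Python. This is not the code of c_codon_mut_dist or py_codon_mut_dist; the name `count_codon_changes` is hypothetical, and the sketch assumes the usual definition in which a transition swaps two purines (A, G) or two pyrimidines (C, T), while every other change is a transversion.

```python
# Hypothetical helper, not part of cobilib.
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")

def count_codon_changes(a, b):
    """Return [unchanged, transitions, transversions] for codons a and b."""
    if len(a) != 3 or len(b) != 3:
        raise ValueError("codons must have exactly three nucleotides")
    unchanged = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == y:
            unchanged += 1
        elif {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            # both letters belong to the same chemical class
            transitions += 1
        else:
            transversions += 1
    # doubles, to mirror the documented return type
    return [float(unchanged), float(transitions), float(transversions)]

print(count_codon_changes("ATG", "ACG"))  # T -> C is a transition
print(count_codon_changes("AAA", "CAA"))  # A -> C is a transversion
```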

cobilib.calculate_RF(codon_hist)[source]

Calculates relative codon frequency for a codon

cobilib.calculate_RF_dic(codon_hist)[source]

Calculates relative codon frequency for each gene in codon histogram

Tested against a run on testdata2.ffn and checked by hand: for amino acid A (codons 52 to 55) the result should be 0.24, 0.3066, 0.12, 0.33.

>>> calculate_RF_dic(  (make_codon_histogram_dic( load_fasta("testdata2.ffn")))[0] ) # doctest: +NORMALIZE_WHITESPACE
{'fid|18348942|locus|VBIEscCol44059_0001|': array([ 0.375     ,  0.625     ,  0.14102564,  0.11538462,  0.15555556,
        0.26666667,  0.13333333,  0.2       ,  0.57894737,  0.42105263,
        0.        ,  0.        ,  0.25      ,  0.75      ,  1.        ,
        1.        ,  0.07692308,  0.14102564,  0.03846154,  0.48717949,
        0.12      ,  0.16      ,  0.04      ,  0.68      ,  0.58333333,
        0.41666667,  0.36363636,  0.63636364,  0.3902439 ,  0.43902439,
        0.04878049,  0.09756098,  0.68421053,  0.31578947,  0.        ,
        1.        ,  0.14285714,  0.60714286,  0.        ,  0.25      ,
        0.58823529,  0.41176471,  0.65517241,  0.34482759,  0.11111111,
        0.13333333,  0.        ,  0.02439024,  0.29508197,  0.2295082 ,
        0.09836066,  0.37704918,  0.24      ,  0.30666667,  0.12      ,
        0.33333333,  0.675     ,  0.325     ,  0.72340426,  0.27659574,
        0.35714286,  0.35714286,  0.16071429,  0.125     ]), 'fid|129049020348348942|locus|VBIEscCol44059_0001|': array([ 0.375     ,  0.625     ,  0.14102564,  0.11538462,  0.15555556,
        0.26666667,  0.13333333,  0.2       ,  0.57894737,  0.42105263,
        0.        ,  0.        ,  0.25      ,  0.75      ,  1.        ,
        1.        ,  0.07692308,  0.14102564,  0.03846154,  0.48717949,
        0.12      ,  0.16      ,  0.04      ,  0.68      ,  0.58333333,
        0.41666667,  0.36363636,  0.63636364,  0.3902439 ,  0.43902439,
        0.04878049,  0.09756098,  0.68421053,  0.31578947,  0.        ,
        1.        ,  0.14285714,  0.60714286,  0.        ,  0.25      ,
        0.58823529,  0.41176471,  0.65517241,  0.34482759,  0.11111111,
        0.13333333,  0.        ,  0.02439024,  0.29508197,  0.2295082 ,
        0.09836066,  0.37704918,  0.24      ,  0.30666667,  0.12      ,
        0.33333333,  0.675     ,  0.325     ,  0.72340426,  0.27659574,
        0.35714286,  0.35714286,  0.16071429,  0.125     ])}
cobilib.calculate_RSCU(codon_hist)[source]

Returns codon_rscu, the relative synonymous codon usage (RSCU), for a gene

cobilib.calculate_RSCU_dic(codon_hist)[source]

Returns codon_rscu, the relative synonymous codon usage (RSCU), for each gene in the codon histogram

Tested like calculate_RF_dic and checked against genomes.urv.es/optimizer

>>> calculate_RSCU_dic(  (make_codon_histogram_dic( load_fasta("testdata2.ffn")))[0] )
{'fid|18348942|locus|VBIEscCol44059_0001|': array([ 0.75      ,  1.25      ,  0.84615385,  0.69230769,  0.93333333,
        1.6       ,  0.8       ,  1.2       ,  1.15789474,  0.84210526,
        0.        ,  0.        ,  0.5       ,  1.5       ,  3.        ,
        1.        ,  0.46153846,  0.84615385,  0.23076923,  2.92307692,
        0.48      ,  0.64      ,  0.16      ,  2.72      ,  1.16666667,
        0.83333333,  0.72727273,  1.27272727,  2.34146341,  2.63414634,
        0.29268293,  0.58536585,  2.05263158,  0.94736842,  0.        ,
        1.        ,  0.57142857,  2.42857143,  0.        ,  1.        ,
        1.17647059,  0.82352941,  1.31034483,  0.68965517,  0.66666667,
        0.8       ,  0.        ,  0.14634146,  1.18032787,  0.91803279,
        0.39344262,  1.50819672,  0.96      ,  1.22666667,  0.48      ,
        1.33333333,  1.35      ,  0.65      ,  1.44680851,  0.55319149,
        1.42857143,  1.42857143,  0.64285714,  0.5       ]), 'fid|129049020348348942|locus|VBIEscCol44059_0001|': array([ 0.75      ,  1.25      ,  0.84615385,  0.69230769,  0.93333333,
        1.6       ,  0.8       ,  1.2       ,  1.15789474,  0.84210526,
        0.        ,  0.        ,  0.5       ,  1.5       ,  3.        ,
        1.        ,  0.46153846,  0.84615385,  0.23076923,  2.92307692,
        0.48      ,  0.64      ,  0.16      ,  2.72      ,  1.16666667,
        0.83333333,  0.72727273,  1.27272727,  2.34146341,  2.63414634,
        0.29268293,  0.58536585,  2.05263158,  0.94736842,  0.        ,
        1.        ,  0.57142857,  2.42857143,  0.        ,  1.        ,
        1.17647059,  0.82352941,  1.31034483,  0.68965517,  0.66666667,
        0.8       ,  0.        ,  0.14634146,  1.18032787,  0.91803279,
        0.39344262,  1.50819672,  0.96      ,  1.22666667,  0.48      ,
        1.33333333,  1.35      ,  0.65      ,  1.44680851,  0.55319149,
        1.42857143,  1.42857143,  0.64285714,  0.5       ])}
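The alanine entries in the two doctests (positions 52 to 55) can be checked by hand from the counts the Kazusa table in this page lists for the same gene (GCU 18, GCC 23, GCA 9, GCG 25). The sketch below is not cobilib code; the function names are hypothetical, and it assumes the common definitions in which the relative frequency is a codon's share within its synonymous family and RSCU is the observed count divided by the count expected under uniform usage.

```python
# Alanine codon counts read off the Kazusa table (U written as T).
ala_counts = {"GCT": 18, "GCC": 23, "GCA": 9, "GCG": 25}

def relative_frequencies(counts):
    """Share of each codon within one synonymous family."""
    total = sum(counts.values())
    if total == 0:
        return {codon: 0.0 for codon in counts}
    return {codon: n / total for codon, n in counts.items()}

def rscu(counts):
    """Observed count divided by the count expected under uniform usage."""
    total = sum(counts.values())
    if total == 0:
        return {codon: 0.0 for codon in counts}
    expected = total / len(counts)
    return {codon: n / expected for codon, n in counts.items()}

rf = relative_frequencies(ala_counts)
ru = rscu(ala_counts)
for codon in ala_counts:
    print(codon, round(rf[codon], 4), round(ru[codon], 4))
```

For a family of four codons the RSCU value is simply four times the relative frequency, which is the relation between the alanine entries of the two arrays above.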
cobilib.change_amino_acid_code(code_name)[source]

Different organisms use different genetic codes. Given code_name, which must be a key of genetic_codes, the global dictionaries that translate codons into amino acids are reset to the requested genetic code:

genetic_codes['The Standard Code']                                        = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
genetic_codes['The Vertebrate Mitochondrial Code']                        = 'FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG'
genetic_codes['The Yeast Mitochondrial Code']                             = 'FFLLSSSSYY**CCWWTTTTPPPPHHQQRRRRIIMMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
genetic_codes['The Mold, Protozoan, and Coelenterate Mitochondrial Code'] = 'FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
genetic_codes['The Invertebrate Mitochondrial Code']                      = 'FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG'
genetic_codes['The Ciliate, Dasycladacean and Hexamita Nuclear Code']     = 'FFLLSSSSYYQQCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
genetic_codes['The Echinoderm and Flatworm Mitochondrial Code']           = 'FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNNKSSSSVVVVAAAADDEEGGGG'
genetic_codes['The Euplotid Nuclear Code']                                = 'FFLLSSSSYY**CCCWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
genetic_codes['The Bacterial, Archaeal and Plant Plastid Code']           = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
genetic_codes['The Alternative Yeast Nuclear Code']                       = 'FFLLSSSSYY**CC*WLLLSPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
genetic_codes['The Ascidian Mitochondrial Code']                          = 'FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSGGVVVVAAAADDEEGGGG'
genetic_codes['The Alternative Flatworm Mitochondrial Code']              = 'FFLLSSSSYYY*CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNNKSSSSVVVVAAAADDEEGGGG'
genetic_codes['The Blepharisma Nuclear Code']                             = 'FFLLSSSSYY*QCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
genetic_codes['Chlorophycean Mitochondrial Code']                         = 'FFLLSSSSYY*LCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
genetic_codes['Trematode Mitochondrial Code']                             = 'FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNNKSSSSVVVVAAAADDEEGGGG'
genetic_codes['Scenedesmus obliquus mitochondrial Code']                  = 'FFLLSS*SYY*LCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
genetic_codes['Thraustochytrium Mitochondrial Code']                      = 'FF*LSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
genetic_codes['Pterobranchia mitochondrial code']                         = 'FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSKVVVVAAAADDEEGGGG'
code_name : str

Name of the code; must be a key of genetic_codes

Examples

After changing the genetic code to something other than the standard code, the global amino_acids variable should have changed to the new code.

>>> _ = change_amino_acid_code('The Mold, Protozoan, and Coelenterate Mitochondrial Code')
>>> amino_acids
'FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
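How a 64-letter code string maps onto codons can be sketched as follows. This is an illustration rather than cobilib's implementation: `build_codon_table` is a hypothetical name, and the codon ordering (bases in the order T, C, A, G, first base varying slowest) is an assumption, chosen because it places the alanine codons at positions 52 to 55 as stated for calculate_RF_dic.

```python
from itertools import product

BASES = "TCAG"  # assumed ordering of the bases within the code strings
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# Two of the code strings listed above, copied verbatim.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERTEBRATE_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"

def build_codon_table(code_string):
    """Map every codon to the amino acid letter at the same position."""
    if len(code_string) != len(CODONS):
        raise ValueError("a genetic code string must have 64 letters")
    return dict(zip(CODONS, code_string))

standard = build_codon_table(STANDARD)
mito = build_codon_table(VERTEBRATE_MITO)

# TGA is a stop codon in the standard code but codes for W in the
# vertebrate mitochondrial code.
print(standard["TGA"], mito["TGA"])
```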

cobilib.codon_mut_dist(a, b)

Counts the nucleotides that stay unchanged, the transitions and the transversions when one codon mutates into another. It is implemented in C with scipy.weave; a pure Python version that is easier to read is available as py_codon_mut_dist.

Parameters :

a : str

First codon

b : str

Second codon

Returns :

results : list

List of three doubles: `results[0]` is the number of unchanged nucleotides, `results[1]` the number of transitions and `results[2]` the number of transversions.

cobilib.compute_steady_state(evolmatrix, aa_hist)[source]

The steady state is the eigenvector belonging to the largest eigenvalue of the evolution matrix for a specific amino acid
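A minimal NumPy sketch of this idea is given below. It is not the body of compute_steady_state: the helper name `dominant_eigenvector` is hypothetical, the aa_hist argument is ignored, and the 2 x 2 matrix is a toy example rather than a real evolution matrix.

```python
import numpy as np

def dominant_eigenvector(matrix):
    """Eigenvector of the largest eigenvalue, scaled so its entries sum to one."""
    eigenvalues, eigenvectors = np.linalg.eig(np.asarray(matrix, dtype=float))
    largest = np.argmax(eigenvalues.real)
    vector = eigenvectors[:, largest].real
    # Dividing by the sum fixes both the scale and the arbitrary sign.
    return vector / vector.sum()

# Toy column-stochastic matrix with eigenvalues 1 and 0.7.
toy = np.array([[0.9, 0.2],
                [0.1, 0.8]])
steady = dominant_eigenvector(toy)
print(steady)
```

Normalising to a sum of one makes the result directly comparable with a codon histogram, which is also a vector of relative frequencies.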

cobilib.hessian_em(hist, n_neighbors=10, n_components=2)[source]

docstring for hessian_em

cobilib.highly_expressed_genes_by_file(filename, hist)[source]

Loads a list of highly expressed genes (HEG) and returns an index that is 1 for genes in the list and 0 otherwise. The expected format is the one used by ecai/heg; each id is looked up in the description of the histogram keys.

cobilib.highly_expressed_genes_by_id(id_list, hist)[source]

Not correctly implemented yet

cobilib.init_fitnessfunctions(config=None, filenames=None)[source]

Loads fitness functions from file. Either config from file or list of filenames.

cobilib.init_fitnessmatrix_for_amino_acids(config=None, filenames=None, amino_acid_order=['A', 'R', 'N', 'D', 'C', 'E', 'Q', 'G', 'H', 'I', 'K', 'L', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V'])[source]

Loads the fitness matrix from file. Either config from file or list of filenames.

cobilib.is_ambig_codon(codon)[source]

Tests whether the string codon contains an ambiguous letter, i.e. one that is found in ambig_fasta_chars

codon:str
nucleotide sequence

Examples

The table of codons, codons should not contain any ambigous codons >>> map(is_ambig_codon,codons) [False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False]

However, the string ‘aay’ should >>> is_ambig_codon(‘aay’) True
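The check amounts to a membership test per letter. The sketch below assumes that ambig_fasta_chars holds the IUPAC ambiguity codes; the actual content of that variable is not shown on this page, so both the constant and the function name here are hypothetical.

```python
# Assumed stand-in for ambig_fasta_chars: the IUPAC nucleotide ambiguity codes.
AMBIGUOUS_LETTERS = frozenset("RYKMSWBDHVN")

def has_ambiguous_letter(codon):
    """True if any letter of codon is an ambiguity code (case-insensitive)."""
    return any(letter in AMBIGUOUS_LETTERS for letter in codon.upper())

print(has_ambiguous_letter("aay"))  # 'y' stands for C or T
print(has_ambiguous_letter("atg"))
```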

cobilib.isomap(hist, n_neighbors=10, n_components=2)[source]

docstring for isomap

cobilib.lle(hist, n_neighbors=10, n_components=2)[source]

docstring for lle

cobilib.load_fasta(filename)[source]

Loads the fasta file given by filename. Returns a generator object that yields the genes

Parameters :

filename : str

Path to the file to load

Returns :

fasta : Seq

Parser from Biopython

Examples

If everything is right, a generator object is returned

>>> load_fasta("testdata.ffn") 
<generator object parse at ...>

The file must exist

>>> load_fasta("yikes")
Open: yikes  failed
[Errno 2] No such file or directory: 'yikes'

However, we use Biopython and the fasta file’s syntax is not checked here!

>>> load_fasta("testdata_fail.ffn") 
<generator object parse at ...>
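Why a malformed file does not fail at call time follows from how generators work: nothing is read until the first record is requested. The sketch below illustrates this with a small stand-alone reader. It is not Biopython's parser and not the code of load_fasta; `iter_fasta` is a hypothetical name and the in-memory files are made up for the example.

```python
import io

def iter_fasta(handle):
    """Lazily yield (header, sequence) pairs from an open FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            if header is None:
                raise ValueError("sequence data before the first '>' header")
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

# A well-formed file with two records.
records = iter_fasta(io.StringIO(">gene1\nATGGCT\nGCTTAA\n>gene2\nATGTAA\n"))

# A malformed file: creating the generator raises nothing, because no line
# has been read yet. The ValueError only appears on the first next() call.
bad = iter_fasta(io.StringIO("ATGGCT\n"))
```

Callers that need early validation therefore have to consume at least the first record, for example with `next()` inside a `try` block.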
cobilib.load_fasta_from_url(url)[source]

Loads a fasta file from a URL

Parameters :

url : str

URL of the fasta file

Returns :

fasta : Bio.SeqIO

fasta file parser object from Biopython, or None if an error has occurred

cobilib.load_genbank(filename)[source]

Loads the GenBank file given by filename. Returns a generator object that yields the genes

Parameters :

filename : str

Path to the file to load

Returns :

genes : Seq

Parser from Biopython

cobilib.load_plaintext(filename)[source]

Loads the nucleotide file given by filename, which contains plaintext sequences. Returns a list of genes

cobilib.make_codon_histogram(gene)[source]

returns codon_hist,aa_hist for one gene

Parameters :

gene : SeqIO

after loading fasta

Returns :

codon_hist : dict with each gene.id as key and a num_codons x 1 np.array with one entry per codon

aa_hist : dict with each gene.id as key and a num_aa x 1 np.array with one entry per amino acid

Examples

The histogram should reproduce what www.kazusa.org.jp/codon/cgi-bin/countcodon.cgi computes for the gene in testdata.ffn:

UUU 12.6(     9)  UCU  9.8(     7)  UAU 15.4(    11)  UGU  4.2(     3)
UUC 21.1(    15)  UCC 16.9(    12)  UAC 11.2(     8)  UGC 12.6(     9)
UUA 15.4(    11)  UCA  8.4(     6)  UAA  0.0(     0)  UGA  1.4(     1)
UUG 12.6(     9)  UCG 12.6(     9)  UAG  0.0(     0)  UGG  5.6(     4)
CUU  8.4(     6)  CCU  4.2(     3)  CAU  9.8(     7)  CGU 22.5(    16)
CUC 15.4(    11)  CCC  5.6(     4)  CAC  7.0(     5)  CGC 25.3(    18)
CUA  4.2(     3)  CCA  1.4(     1)  CAA 11.2(     8)  CGA  2.8(     2)
CUG 53.4(    38)  CCG 23.9(    17)  CAG 19.7(    14)  CGG  5.6(     4)
AUU 36.5(    26)  ACU  5.6(     4)  AAU 28.1(    20)  AGU  7.0(     5)
AUC 16.9(    12)  ACC 23.9(    17)  AAC 19.7(    14)  AGC  8.4(     6)
AUA  0.0(     0)  ACA  0.0(     0)  AAA 26.7(    19)  AGA  0.0(     0)
AUG 29.5(    21)  ACG  9.8(     7)  AAG 14.0(    10)  AGG  1.4(     1)
GUU 25.3(    18)  GCU 25.3(    18)  GAU 37.9(    27)  GGU 28.1(    20)
GUC 19.7(    14)  GCC 32.3(    23)  GAC 18.3(    13)  GGC 28.1(    20)
GUA  8.4(     6)  GCA 12.6(     9)  GAA 47.8(    34)  GGA 12.6(     9)
GUG 32.3(    23)  GCG 35.1(    25)  GAG 18.3(    13)  GGG  9.8(     7)

in the first field. Remember, the order of codons in this package is given in the variable codons

>>> make_codon_histogram( load_fasta("testdata.ffn").next() )
([0.012640449438202247, 0.021067415730337078, 0.015449438202247191, 0.012640449438202247, 0.0098314606741573031, 0.016853932584269662, 0.0084269662921348312, 0.012640449438202247, 0.015449438202247191, 0.011235955056179775, 0.0, 0.0, 0.0042134831460674156, 0.012640449438202247, 0.0014044943820224719, 0.0056179775280898875, 0.0084269662921348312, 0.015449438202247191, 0.0042134831460674156, 0.053370786516853931, 0.0042134831460674156, 0.0056179775280898875, 0.0014044943820224719, 0.023876404494382022, 0.0098314606741573031, 0.0070224719101123594, 0.011235955056179775, 0.019662921348314606, 0.02247191011235955, 0.025280898876404494, 0.0028089887640449437, 0.0056179775280898875, 0.036516853932584269, 0.016853932584269662, 0.0, 0.029494382022471909, 0.0056179775280898875, 0.023876404494382022, 0.0, 0.0098314606741573031, 0.028089887640449437, 0.019662921348314606, 0.026685393258426966, 0.014044943820224719, 0.0070224719101123594, 0.0084269662921348312, 0.0, 0.0014044943820224719, 0.025280898876404494, 0.019662921348314606, 0.0084269662921348312, 0.032303370786516857, 0.025280898876404494, 0.032303370786516857, 0.012640449438202247, 0.0351123595505618, 0.037921348314606744, 0.018258426966292134, 0.047752808988764044, 0.018258426966292134, 0.028089887640449437, 0.028089887640449437, 0.012640449438202247, 0.0098314606741573031], [0.033707865168539325, 0.10955056179775281, 0.063202247191011238, 0.026685393258426966, 0.0014044943820224719, 0.016853932584269662, 0.0056179775280898875, 0.0351123595505618, 0.016853932584269662, 0.030898876404494381, 0.05758426966292135, 0.053370786516853931, 0.029494382022471909, 0.039325842696629212, 0.047752808988764044, 0.040730337078651688, 0.085674157303370788, 0.10533707865168539, 0.056179775280898875, 0.066011235955056174, 0.078651685393258425])
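The first list in the output holds relative codon frequencies: its first entry, 0.01264..., equals 9/712, the UUU count from the table divided by the total number of codons. A stand-alone sketch of such a histogram is shown below. It is not the code of make_codon_histogram; `codon_frequencies` is a hypothetical name, the T, C, A, G ordering of `CODONS` is an assumption, and the short sequence is made up.

```python
from collections import Counter
from itertools import product

# Assumed codon ordering: bases in the order T, C, A, G, first base slowest.
CODONS = ["".join(c) for c in product("TCAG", repeat=3)]
CODON_SET = frozenset(CODONS)

def codon_frequencies(sequence):
    """Relative frequency of each of the 64 codons in reading frame 0."""
    seq = sequence.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3          # drop an incomplete last codon
    triplets = (seq[i:i + 3] for i in range(0, usable, 3))
    # Triplets with ambiguous letters are skipped.
    counts = Counter(t for t in triplets if t in CODON_SET)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]

hist = codon_frequencies("ATGGCTGCTTAA")   # ATG GCT GCT TAA
print(hist[CODONS.index("GCT")])           # two of the four codons are GCT
```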
cobilib.make_codon_histogram_dic(f)[source]

returns codon_hist for all genes in a fasta file in form of a dictionary with fasta identifiers as keys

The histogram should reproduce what www.kazusa.org.jp/codon/cgi-bin/countcodon.cgi computes for the first gene in testdata2.ffn:

UUU 12.6(     9)  UCU  9.8(     7)  UAU 15.4(    11)  UGU  4.2(     3)
UUC 21.1(    15)  UCC 16.9(    12)  UAC 11.2(     8)  UGC 12.6(     9)
UUA 15.4(    11)  UCA  8.4(     6)  UAA  0.0(     0)  UGA  1.4(     1)
UUG 12.6(     9)  UCG 12.6(     9)  UAG  0.0(     0)  UGG  5.6(     4)
CUU  8.4(     6)  CCU  4.2(     3)  CAU  9.8(     7)  CGU 22.5(    16)
CUC 15.4(    11)  CCC  5.6(     4)  CAC  7.0(     5)  CGC 25.3(    18)
CUA  4.2(     3)  CCA  1.4(     1)  CAA 11.2(     8)  CGA  2.8(     2)
CUG 53.4(    38)  CCG 23.9(    17)  CAG 19.7(    14)  CGG  5.6(     4)
AUU 36.5(    26)  ACU  5.6(     4)  AAU 28.1(    20)  AGU  7.0(     5)
AUC 16.9(    12)  ACC 23.9(    17)  AAC 19.7(    14)  AGC  8.4(     6)
AUA  0.0(     0)  ACA  0.0(     0)  AAA 26.7(    19)  AGA  0.0(     0)
AUG 29.5(    21)  ACG  9.8(     7)  AAG 14.0(    10)  AGG  1.4(     1)
GUU 25.3(    18)  GCU 25.3(    18)  GAU 37.9(    27)  GGU 28.1(    20)
GUC 19.7(    14)  GCC 32.3(    23)  GAC 18.3(    13)  GGC 28.1(    20)
GUA  8.4(     6)  GCA 12.6(     9)  GAA 47.8(    34)  GGA 12.6(     9)
GUG 32.3(    23)  GCG 35.1(    25)  GAG 18.3(    13)  GGG  9.8(     7)

in the first field. Remember, the order of codons in this package is given in the variable codons

>>> make_codon_histogram_dic( load_fasta("testdata2.ffn") )[0]
{'fid|18348942|locus|VBIEscCol44059_0001|': [0.012640449438202247, 0.021067415730337078, 0.015449438202247191, 0.012640449438202247, 0.0098314606741573031, 0.016853932584269662, 0.0084269662921348312, 0.012640449438202247, 0.015449438202247191, 0.011235955056179775, 0.0, 0.0, 0.0042134831460674156, 0.012640449438202247, 0.0014044943820224719, 0.0056179775280898875, 0.0084269662921348312, 0.015449438202247191, 0.0042134831460674156, 0.053370786516853931, 0.0042134831460674156, 0.0056179775280898875, 0.0014044943820224719, 0.023876404494382022, 0.0098314606741573031, 0.0070224719101123594, 0.011235955056179775, 0.019662921348314606, 0.02247191011235955, 0.025280898876404494, 0.0028089887640449437, 0.0056179775280898875, 0.036516853932584269, 0.016853932584269662, 0.0, 0.029494382022471909, 0.0056179775280898875, 0.023876404494382022, 0.0, 0.0098314606741573031, 0.028089887640449437, 0.019662921348314606, 0.026685393258426966, 0.014044943820224719, 0.0070224719101123594, 0.0084269662921348312, 0.0, 0.0014044943820224719, 0.025280898876404494, 0.019662921348314606, 0.0084269662921348312, 0.032303370786516857, 0.025280898876404494, 0.032303370786516857, 0.012640449438202247, 0.0351123595505618, 0.037921348314606744, 0.018258426966292134, 0.047752808988764044, 0.018258426966292134, 0.028089887640449437, 0.028089887640449437, 0.012640449438202247, 0.0098314606741573031], 'fid|129049020348348942|locus|VBIEscCol44059_0001|': [0.012640449438202247, 0.021067415730337078, 0.015449438202247191, 0.012640449438202247, 0.0098314606741573031, 0.016853932584269662, 0.0084269662921348312, 0.012640449438202247, 0.015449438202247191, 0.011235955056179775, 0.0, 0.0, 0.0042134831460674156, 0.012640449438202247, 0.0014044943820224719, 0.0056179775280898875, 0.0084269662921348312, 0.015449438202247191, 0.0042134831460674156, 0.053370786516853931, 0.0042134831460674156, 0.0056179775280898875, 0.0014044943820224719, 0.023876404494382022, 0.0098314606741573031, 0.0070224719101123594, 
0.011235955056179775, 0.019662921348314606, 0.02247191011235955, 0.025280898876404494, 0.0028089887640449437, 0.0056179775280898875, 0.036516853932584269, 0.016853932584269662, 0.0, 0.029494382022471909, 0.0056179775280898875, 0.023876404494382022, 0.0, 0.0098314606741573031, 0.028089887640449437, 0.019662921348314606, 0.026685393258426966, 0.014044943820224719, 0.0070224719101123594, 0.0084269662921348312, 0.0, 0.0014044943820224719, 0.025280898876404494, 0.019662921348314606, 0.0084269662921348312, 0.032303370786516857, 0.025280898876404494, 0.032303370786516857, 0.012640449438202247, 0.0351123595505618, 0.037921348314606744, 0.018258426966292134, 0.047752808988764044, 0.018258426966292134, 0.028089887640449437, 0.028089887640449437, 0.012640449438202247, 0.0098314606741573031]}
cobilib.make_codon_histogram_dic_combined(f)[source]

Makes a codon and amino acid histogram from a file parser object, but combines all genes so that only one histogram is returned.

Parameters :

f : Bio.SeqIO

file parser object

Returns :

codon_hist : dict

codon histogram; codon_hist['combined genome'] is an np.array with 64 entries

cobilib.make_evolmatrix(mutationmatrix, fitnessfunctions, fitnessmatrices, selection, additive=False)[source]

Given the mutationmatrix, the fitnessfunctions, the fitnessmatrices (the amino-acid identity matrix) and the selection strength, this builds the evolutionmatrix

cobilib.make_jc69_mutationmatrix(mu=0, alpha=0, beta=0)[source]

Warning: this implementation is not correct yet. See the Wikipedia article "Models of DNA evolution".
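For orientation, the textbook Jukes-Cantor (JC69) model at the nucleotide level can be sketched as follows. This is not a fix for make_jc69_mutationmatrix, whose parameters (mu, alpha, beta) and intended output are not documented here. The sketch assumes the parameterisation in which every off-diagonal entry of the rate matrix is mu/4, and the function name is hypothetical.

```python
import math

def jc69_transition_probabilities(mu, t):
    """4 x 4 matrix P(t) of the JC69 model for substitution rate mu and time t."""
    decay = math.exp(-mu * t)
    same = 0.25 + 0.75 * decay        # same nucleotide observed after time t
    different = 0.25 - 0.25 * decay   # one specific other nucleotide observed
    return [[same if i == j else different for j in range(4)]
            for i in range(4)]

p = jc69_transition_probabilities(mu=1.0, t=0.1)
for row in p:
    print([round(x, 4) for x in row])
```

Every row sums to one, P(0) is the identity matrix, and for large t all entries approach 1/4.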

cobilib.optimize_dic(fasta_filename, fitfu, fitmat, minimal_number_of_nucleotides_per_gene=500)[source]

When the r codons of a gene are put into n bins of different codons, the probability of finding a given codon with k occurrences in the gene is

.. math::

    p_k = \binom{r}{k} \frac{(n-1)^{r-k}}{n^r}

Hence, with n = 64, the probability of finding at least one occurrence is

.. math::

    p_{k \geq 1} = 1 - p_0 = (63/64)^r \left( (64/63)^r - 1 \right)

To be reasonably sure that our gene contains at least one codon of every kind, we have to solve the inequality p_{k \geq 1} > p-value. For a p-value of 0.95 this calls for a gene on the order of 500 nucleotides, which is the default of minimal_number_of_nucleotides_per_gene
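The bound can be evaluated directly. The sketch below solves 1 - (63/64)^r > p-value for the smallest whole number of codons r; the function name `min_codons` is hypothetical and not part of cobilib.

```python
import math

def min_codons(p_value, n_codons=64):
    """Smallest r with 1 - ((n - 1) / n) ** r > p_value."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p_value must lie strictly between 0 and 1")
    miss = (n_codons - 1) / n_codons   # chance that one codon misses a given bin
    # (miss ** r) < 1 - p_value  <=>  r > log(1 - p_value) / log(miss)
    return math.floor(math.log(1.0 - p_value) / math.log(miss)) + 1

r = min_codons(0.95)
print(r, "codons =", 3 * r, "nucleotides")
```

Under these assumptions a p-value of 0.95 gives 191 codons, i.e. 573 nucleotides, which is of the same order as the 500-nucleotide default in the signature.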

cobilib.optimize_sequence(steady_state, target_sequence)[source]

given the steady state and the target sequence, let us optimize!

cobilib.relative_hellinger_distance(steady_state, codon_hist)[source]

TODO: not implemented

cobilib.relative_jeffrey_divergence(steady_state, codon_hist)[source]

TODO: not implemented

cobilib.relative_minkowski_distance(steady_state, codon_hist)[source]

TODO: not implemented

cobilib.remove_stopcodons_and_one_codon_amino_acids(codon_hist)[source]
cobilib.run_model(parameters, fitnessfunctions, fitnessmatrices, aa_hist)[source]

parameters = alpha,beta,selection,t0,t1,t2,t3...

cobilib.setup_parser()[source]

Sets up the parser for command-line usage and returns args

cobilib.spectral(hist, n_neighbors=10, n_components=2)[source]

docstring for spectral
