VirtualMicrobes.virtual_cell package

Submodules

VirtualMicrobes.virtual_cell.Cell module

class VirtualMicrobes.virtual_cell.Cell.Cell(params, environment, rand_gen=None, production_function=None, toxicity_function=None, toxicity_effect_function=None, time_birth=-1, **kwargs)[source]

Bases: VirtualMicrobes.virtual_cell.PhyloUnit.PhyloUnit

GRN(genes=None, prot_color_func=<function <lambda>>, with_gene_refs=False, with_self_marker=False)[source]
add_gene_copy(gene, concentration=0.0, diff_constant=None, degr_constant=None)[source]

Add gene product to the cell. If the ‘gene’ is already present as a key in the subdictionary, adding it again means we have to increase the multiplicity of the gene product, since there are multiple genes coding for this gene product. Otherwise, we add an entry for this gene product.

Parameters:
  • gene (virtual_cell:Gene:Gene) – The gene for which copy number is increased.
  • concentration (float) – Initial concentration of gene product.
  • diff_constant (float) – Diffusion constant of gene product over the cell membrane. (If None, no diffusion is modeled)
  • degr_constant (float) – Degradation constant of gene product.
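
The bookkeeping described for add_gene_copy can be sketched as follows. This is a minimal illustration, not the library's implementation: the `GeneProductEntry` record and the plain dictionary standing in for the cell's gene product subdictionary are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneProductEntry:
    """Hypothetical per-product record (assumption, not a VirtualMicrobes type)."""
    concentration: float = 0.0
    multiplicity: int = 0
    diff_constant: Optional[float] = None
    degr_constant: Optional[float] = None


def add_gene_copy(products, gene, concentration=0.0,
                  diff_constant=None, degr_constant=None):
    """Add one gene copy to `products`, a dict mapping gene -> GeneProductEntry."""
    entry = products.get(gene)
    if entry is None:
        # First gene coding for this product: create the entry.
        products[gene] = GeneProductEntry(concentration, 1,
                                          diff_constant, degr_constant)
    else:
        # Product already present: only the multiplicity increases.
        entry.multiplicity += 1
    return products[gene]
```

Under these assumptions, adding the same gene twice leaves the initial concentration untouched and raises the multiplicity to 2.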
add_small_molecule(mol, env, concentration, diff_const, degr_const)[source]
add_small_molecules(env, conc, degr_const, ene_diff_const, ene_degr_const, bb_degr_const)[source]
ancestral_mutations

Cumulative mutations in the line of descent.

apply_hgt(time, gp, environment, rand_gene_params=None, mutation_rates=None, global_mut_mod=None, rand_gen=None, verbose=False)[source]

Applies HGT to a cell, and updates the GRN.

If HGT is applied, the gp updated flag is set to True to inform the integrator that it needs to update in the next timestep.

Parameters:
  • time (int) – time point in the simulation
  • gp (environment:Grid:GridPoint) – location of cell on the grid
  • environment (environment:Environment) – simulation environment containing all possible reactions to draw a random gene for external HGT
  • rand_gene_params (dict) – parameter space for properties of externally HGTed genes
  • mutation_rates (dict) – contains rates for eHGT and iHGT
  • global_mut_mod (double) – scaling factor for all mutation rates
  • rand_gen (RNG) –
Returns:The applied mutations.

asexual(time)[source]

Asexual reproduction.

Returns:
Return type:virtual_cell.Cell
avrg_promoter_strengths
building_blocks
calculate_raw_death_rate(base_rate, toxicity_scaling, toxicity_effect_func=None, toxic_building_blocks=None)[source]

Raw death rate based on a base rate and the toxic effects of internal molecule concentrations.

Parameters:
  • base_rate
  • toxicity_effect_func
  • e

check_gene_multiplicities()[source]
check_sane_values(sane=(0, 500.0))[source]
chromosomal_mut
chromosomal_mut_count
chromosome_count
chromosome_del
chromosome_del_count
chromosome_dup
chromosome_dup_count
chromosome_fiss_count
chromosome_fission
chromosome_fuse_count
chromosome_fusion
chromosome_mutate_genome(time, mutation_rates, global_mut_mod, rand_gen, rand_gen_np, verbose)[source]
class_version = '2.5'
clear_mol_time_courses()[source]

Set time course arrays to None.

Reduces memory footprint of cell, for example before pickling cell objects.

clone(time)[source]

Make a clone of this cell.

Clones are identical, having no mutations, and maintain a reference to the parent.

Returns:
Return type:virtual_cell.Cell
compute_product_toxicity(toxicity_func=None)[source]
consumer_type

Return the set of all metabolic classes consumed in enzymatic reactions by this cell.

consumes

Set of consumed metabolic species.

consuming_count
conversions_type

The set of conversion reactions in the genome.

copy_numbers
copy_numbers_eff_pumps
copy_numbers_enzymes
copy_numbers_inf_pumps
copy_numbers_tfs
delete_chromosome(chrom, time)[source]

Chromosome deletion. See “duplicate_chromosome” for implementation details.

delete_genes(genes, time)[source]
delete_stretch(chrom, start_pos, end_pos, time, verbose=False)[source]
die(time, verbose=False, clear_time_courses=False, wiped=False)[source]

Cell death.

Informs genomic units of cell death.

Parameters:
  • time (float) – simulation time point
  • verbose (bool) – report on cell death
  • clear_time_courses (bool) – set time courses to None
  • wiped (bool) – set if cell death was due to wiping of (a fraction of) the population
divide_volume(factor=2.0)[source]

Divide the volume of the cell.

Parameters:factor (float) – division factor
duplicate_chromosome(chrom, time)[source]

Chromosome duplication. Creates a mutation object that is then responsible for calling the appropriate genomic operations (the duplication methods of the chromosome and the update functions of the genome). The Cell then updates its gene product counts accordingly.

Parameters:chrom – chromosome to be duplicated
eff_pump_count
enz_avrg_promoter_strengths
enz_promoter_strengths
enz_subs_differential_ks
enz_subs_ks
enz_sum_promoter_strengths
enz_vmaxs
enzyme_count
enzymes
exploiting

Return the set of metabolic classes that are imported and consumed.

exploiting_count
export_type

Return the set of all metabolic classes exported by this cell.

exporter_type

The set of export reactions in the genome.

exporting_count
external_hgt
external_hgt_count
fiss_chromosome(chrom, pos, time)[source]

Fission of a chromosome. Break up a chromosome at a given position and reinsert the parts into the genome.

fuse_chromosomes(chrom1, chrom2, end1, end2, time)[source]

Fusion of two chromosomes. Two chromosomes are fused end to end. The new product is then inserted back into the genome.

Parameters:
  • end1 – boolean to indicate whether the fusion site is the end (True) or start of the chromosome
  • end2 – analogous to “end1”

gene_substrate_differential_ks(ks_param, genes)[source]
gene_substrate_ks(ks_param, genes)[source]
gene_type_counts
generate_enzymes(nr_enzymes, env, rand_gen, metabolic_pathway_dict, prioritize_influxed=True)[source]

Generate the set of enzymes of the cell.

Uses heuristics for constructing ‘viable’ initial metabolic pathways for the production of building blocks and energy metabolites.

Parameters:
  • nr_enzymes (int) – number of enzymes to generate
  • env (environment.Environment) – environment provides the set of metabolites and possible metabolic reactions
  • rand_gen (RNG) –
  • metabolic_pathway_dict (dict) – dictionary to keep track of metabolites consumed and produced by the cell.
  • prioritize_influxed (bool) – first create enzymes that can utilize metabolites that are present (influxed) in environment
Returns:
Return type:list of virtual_cell.Gene.MetabolicGene objects

generate_pumps(nr_pumps, env, rand_gen, metabolic_pathway_dict, import_first=True, prioritize_influxed=True)[source]

Generate set of transporters of the cell.

Use heuristics for connecting to the metabolic pathways formed so far by the enzymes present in the cell.

Parameters:
  • nr_pumps (int) – number of transporters to generate
  • env (environment.Environment) – environment provides the set of metabolites and possible metabolic reactions
  • rand_gen (RNG) –
  • metabolic_pathway_dict (dict) – dictionary to keep track of metabolites consumed and produced by the cell.
  • import_first (bool) – first create import enzymes for all metabolites that can be consumed in metabolism
  • prioritize_influxed (bool) – first create transporters that can import metabolites that are present (influxed) in environment
Returns:
Return type:list of virtual_cell.Gene.Transporter objects

See also

generate_enzymes

generate_tfs(nr_tfs, env, rand_gen, metabolic_pathway_dict, bb_ene_only=False)[source]

Create TFs of the cell.

Prioritizes TFs that sense metabolites that are relevant for the cell, i.e. metabolites that are actively consumed or produced by the cell’s enzymes (or production function).

Parameters:
  • nr_tfs (int) – number of tfs to generate
  • env (environment.Environment) – environment provides the set of metabolites and possible metabolic reactions
  • rand_gen (RNG) –
  • metabolic_pathway_dict (dict) – dictionary to keep track of metabolites consumed and produced by the cell.
  • bb_ene_only (bool) – create TFs with either a building block or an energy carrier as ligand exclusively
Returns:
Return type:list of virtual_cell.Gene.TranscriptionFactor objects

genes_get_prop_vals(genes, get_prop)[source]

Construct array of values of a gene property for all duplicates of a list of genes.

Parameters:
  • genes (list of virtual_cell:Gene:Gene objects) – The genes for which to list the property
  • get_prop (func [virtual_cell:Gene:Gene] -> value) – A function that takes a gene as argument and returns a value
Returns:
Return type:numpy array

genome_size
genotype

Construct frozen set of the genotype classification of this cell.

The genotype represents the gene functionality that the cell is capable of. It is expressed as the total set of transport, enzymatic and transcription sensing capabilities of the cell.

Returns:
Return type:frozenset
genotype_vector(env)[source]

Construct boolean vector of gene-type presence-absence.

Parameters:env (environment.Environment) – environment relative to which the gene type presence absence is determined
Returns:
Return type:mapping of gene-function to event.Reaction.Reaction | event.Molecule.MoleculeClass presence/absence
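
A presence/absence mapping of this kind can be sketched as below. The function name, the ordered "universe" of gene functions, and the string labels are assumptions made for illustration; the real method derives the universe from the environment object.

```python
from collections import OrderedDict


def presence_absence_vector(universe, present):
    """Map every item of `universe` (in order) to True/False membership in `present`.

    `universe` stands in for all gene functions possible in an environment
    (hypothetical input); `present` is what one cell actually has.
    """
    present = set(present)
    return OrderedDict((item, item in present) for item in universe)
```

Because the vector is defined relative to a fixed universe, vectors of different cells in the same environment have the same length and order and can be compared position by position.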
get_ancestor(gen_back)[source]
get_ancestor_from_time(time)[source]
get_building_blocks()[source]
get_cell_size_time_course()[source]

Time course of cell size.

get_gene_concentration_dict()[source]

Mapping of gene products to current concentration.

get_gene_degradation_dict()[source]

Mapping of gene products to degradation rates.

get_gene_diffusion_dict()[source]

Mapping of gene products to diffusion rates.

get_gene_multiplicities_dict()[source]

Mapping of gene products to copy numbers in the genome.

get_gene_prod_conc(gene)[source]
get_gene_time_course_dict()[source]

Fetches time courses for gene products (proteins)

Returns:
Return type:dictionary of all time courses
get_gene_type_time_course_dict()[source]

Return a dictionary of concentration time course data for different gene types.

get_mol_concentration_dict()[source]

Mapping of internal molecule species to current concentrations.

get_mol_degradation_dict()[source]

Mapping of internal molecule species to degradation rates.

get_mol_diffusion_dict()[source]

Mapping of internal molecule species to diffusion rates.

get_mol_time_course_dict()[source]

Mapping of internal molecule species to concentration time course.

get_pos_prod_time_course()[source]

Time course of positive component of production rate.

get_raw_production_time_course()[source]

Time course of raw production value.

get_small_mol_conc(mol)[source]
get_time_points()[source]
get_total_reaction_type_time_course_dict()[source]

Return a dictionary of summed time courses per reaction type.

Gets the time courses per gene type and then sums concentrations of gene products with the same reaction (enzymes/pumps) or ligand (tfs).
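
The summing step can be illustrated with a short sketch. It assumes time courses are plain lists of equal length and that a caller-supplied `key_of` function returns the reaction (or ligand) of a gene product; both are assumptions for this example rather than the library's data layout.

```python
def sum_time_courses_by_key(time_courses, key_of):
    """Sum element-wise all time courses whose products share the same key.

    time_courses: dict mapping gene product -> list of concentrations
    key_of: function mapping a gene product to its reaction or ligand
    """
    totals = {}
    for product, course in time_courses.items():
        key = key_of(product)
        if key not in totals:
            totals[key] = list(course)  # copy, so inputs are not modified
        else:
            totals[key] = [a + b for a, b in zip(totals[key], course)]
    return totals
```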

get_toxicity_time_course()[source]

Time course of cell toxicity.

grow_time_course_arrays(factor=1.5)[source]

Grow time course arrays if they cannot hold enough new time points.

Parameters:factor (float) – if necessary, increase capacity with this factor
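
A growth policy of this kind might look like the sketch below. It uses a plain Python list as the buffer and NaN as the fill value for unused slots; the actual storage type and fill value used by the package are not stated here, so treat both as assumptions.

```python
import math


def grow_time_course(values, filled, needed, factor=1.5):
    """Return a buffer able to hold `filled + needed` time points.

    The buffer is only enlarged when required; the new capacity is the
    old capacity times `factor`, or the required size if that is larger.
    """
    capacity = len(values)
    required = filled + needed
    if required <= capacity:
        return values  # enough room, nothing to do
    new_capacity = max(required, int(math.ceil(capacity * factor)))
    return values + [float("nan")] * (new_capacity - capacity)
```

Growing by a multiplicative factor rather than by a fixed number of slots keeps the number of reallocations low over a long simulation.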
hgt_external(external_hgt, global_mut_mod, environment, time, rand_gene_params, rand_gen)[source]

Insert a randomly generated gene into the genome.

Gene is created with random parameters and inserted (as a stretch) into a randomly picked chromosome and position.

hgt_internal(internal_hgt, global_mut_mod, gp, time, rand_gene_params, rand_gen)[source]

Select and copy a gene from one potential donor cell and transfer this gene (as a stretch) into the genome of the acceptor cell.

hybridize(second_parent, time)[source]
import_type

Return the set of all metabolic classes imported by this cell.

importer_type

The set of import reactions in the genome.

importing_count
inf_pump_count
init_building_blocks_dict(nr_blocks, rand_gen, stois)[source]

Initialize the dictionary of cellular building blocks and their stoichiometry.

Parameters:
  • nr_blocks (int) – number of different metabolic building blocks of the cell.
  • rand_gen (RNG) –
  • stois (tuple of int) – (lower, higher) range of stoichiometric constants from which to randomly draw stoichiometries.
init_cell_params()[source]
init_cell_time_courses(length=None)[source]

Initialize arrays to hold time course data.

Parameters:length (int) – initial length of array
init_energy_mols(environment)[source]

Store the energy molecules of the cell.

energy_mols used in odes.pyx.

init_gene_products(concentration=None)[source]
init_genome(environment, chrom_compositions, min_bind_score, prioritize_influxed, rand_gene_params, circular_chromosomes, rand_gen, randomize=True)[source]

Initialize the genome of the cell.

The genome is constructed according to a list of chromosome composition descriptions. These descriptions specify the number of genes of each type for a set of chromosomes. Sets of metabolic, transport and transcription factor genes are initialized using heuristic functions that guarantee basic viability of the cell and a basic logic in transported and sensed metabolites, by taking into account the set of indispensable metabolites (building blocks and energy carriers). Finally, kinetic and other parameters of all genes are randomized and the genes are distributed over the chromosomes.

Parameters:
  • environment (environment.Environment) – simulation environment that determines metabolic universe
  • chrom_compositions (list of my_tools.utility.GeneTypeNumbers) – per chromosome composition of different gene types
  • min_bind_score (float) – parameter for minimal binding score for transcription regulation
  • prioritize_influxed (bool) – first choose enzymatic reactions that use substrates that are influxed in environment
  • rand_gene_params (my_tools.utility.ParamSpace) – range from which to draw randomized gene parameters
  • circular_chromosomes (bool) – make the chromosomes circular
  • rand_gen (RNG) –
  • randomize (bool) – whether gene parameters should be randomized after creation
init_mol_time_course(mol_struct, length=None)[source]

Initialize an array for time course data in a molecule structure SmallMol or GeneProduct.

Parameters:
  • mol_struct (my_tools.utility.SmallMol) – simple c-struct like object that holds data on small molecules or gene products
  • length (int) – initial length of time course array
init_mol_views()[source]

Initialize ‘aliases’ for the set of small molecules and that of gene products in the cell.

init_mutations_dict()[source]

Initialize a dictionary for storing the mutations in the life time of this cell.

init_time_courses()[source]

Initialize arrays that hold time course data for molecules and cell variables

insert_stretch(chrom, insert_pos, stretch, time, is_external, verbose=False)[source]

Insert a stretch of exogenous genomic material. Also adds the products to the dict of proteins made by the cell.

internal_hgt
internal_hgt_count
invert_stretch(chrom, start_pos, end_pos, time, verbose=False)[source]
is_autotroph(env)[source]

Determine if cell is autotrophic within an environment.

Autotrophy is defined as the ability to produce building blocks from precursors that are present ‘natively’ in the environment. This may be done in a multistep pathway in which the cell produces intermediates with its own metabolic enzymes.

Parameters:env (environment.Environment) – the environment relative to which autotrophy is tested

See also

is_heterotroph()

is_heterotroph(env)[source]

Determine if cell is heterotrophic within an environment.

Heterotrophy is defined as the ability to produce the building blocks from precursors that could only be present as (by)products from metabolism of other individuals in the environment, but not natively present (through influx).

Parameters:env (environment.Environment) – the environment relative to which heterotrophy is tested

See also

is_autotroph()
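
The two definitions can be made concrete with a small reachability sketch. Everything here is an assumption for illustration: reactions are modeled as (substrates, products) pairs of sets, and the heterotrophy test follows one possible reading of the description (building blocks are producible from all available molecules, but not from the influxed ones alone). The library's own tests may differ.

```python
def reachable(start, reactions):
    """Return all molecules producible from `start` via repeated reactions."""
    known = set(start)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= known and not set(products) <= known:
                known |= set(products)
                changed = True
    return known


def is_autotroph(influxed, reactions, building_blocks):
    """True if all building blocks are reachable from influxed molecules."""
    return set(building_blocks) <= reachable(influxed, reactions)


def needs_foreign_precursors(influxed, available, reactions, building_blocks):
    """One reading of heterotrophy: building blocks are only reachable when
    non-influxed molecules (e.g. byproducts of other cells) are available."""
    can_with_all = set(building_blocks) <= reachable(available, reactions)
    return can_with_all and not is_autotroph(influxed, reactions, building_blocks)
```

The fixed-point loop in `reachable` covers the multistep case, in which intermediates made by the cell's own enzymes serve as substrates for later reactions.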

mean_life_time_cell_size
mean_life_time_pos_production
mean_life_time_production
mean_life_time_toxicity
metabolic_type

Return a set of sets that uniquely defines the metabolic functions of this cell.

metabolic_type_vector(env)[source]

Construct boolean vector of metabolic capacity of the cell.

Based on the complete set of environmental molecule classes, write out the cell's metabolism in terms of the molecule classes that it produces, consumes, imports and exports.

Parameters:env (environment.Environment) – environment relative to which the metabolic capacity is determined
Returns:
Return type:mapping of metabolic-function to event.Molecule.MoleculeClass presence/absence
mutate(time, environment, rand_gene_params=None, mutation_rates=None, mutation_param_space=None, global_mut_mod=None, point_mutation_dict=None, point_mutation_ratios=None, regulatory_mutation_dict=None, regulatory_mutation_ratios=None, rand_gen=None, rand_gen_np=None, verbose=False)[source]
nodes_edges(genes=None)[source]

Returns a list of nodes and edges of the cell's gene regulatory network.

Parameters:genes (list of virtual_cell.Gene.Gene) – genes for which to find nodes and edges. If None (default), all gene products in the cell that have a copy number > 0 are used.
point_mut
point_mut_count
point_mutate_gene(chrom, pos, mut_dict, point_mut_ratios, environment, time, rand_gen)[source]
point_mutate_genome(p_mutate, mut_dict, global_mut_mod, point_mut_ratios, environment, time, rand_gen)[source]
pos_production
producer_type

Return the set of all metabolic classes produced in enzymatic reactions by this cell.

produces

Set of produced metabolic species.

producing_count
promoter_strengths
providing

Return the set of metabolic classes that are produced AND exported.

providing_count
prune_dead_phylo_branches()[source]
pump_avrg_promoter_strengths
pump_count
pump_ene_differential_ks
pump_ene_ks
pump_promoter_strengths
pump_subs_differential_ks
pump_subs_ks
pump_sum_promoter_strengths
pump_vmaxs
pumps
raw_production
raw_production_change_rate
reaction_genotype

Construct frozen set of the reaction genotype classification of this cell.

The genotype represents the enzyme functionality that the cell is capable of. It is expressed as the total set of transport and enzymatic capabilities of the cell, excluding the TF sensing capabilities.

Returns:
Return type:frozenset

See also

genotype
reaction_set_dict

Dictionary of enzymatic reaction types to reactions.

Reactions of genes are mapped to their reaction types (Conversion, Transport).

reaction_set_dict2
reduce_gene_copies(gene)[source]

Reduce the copy number for a gene.

Parameters:gene (virtual_cell:Gene:Gene) – The gene for which copy number is reduced.
regulatory_region_mutate(chrom, genome, pos, mut_dict, mut_ratios, stretch_exp_lambda, environment, time, rand_gen, rand_gen_np)[source]
regulatory_region_mutate_genome(mutation_rates, mut_dict, global_mut_mod, reg_mut_ratios, environment, time, rand_gen, rand_gen_np)[source]
rejuvenate(reference_cell, verbose=False)[source]

Reset cell properties to the state of the young cell

remove_unproduced_gene_products(conc_cutoff=None)[source]

Remove gene products when they are no longer produced and their concentration is below a threshold.

Parameters:conc_cutoff (float) – threshold concentration below which gene product is removed
Returns:True if any product was removed
Return type:bool
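
A sketch of this cleanup, under stated assumptions: gene products are held in a dictionary of small records, and "no longer produced" is interpreted as a multiplicity of zero. Neither detail is confirmed by the text above.

```python
def remove_unproduced(products, conc_cutoff):
    """Drop products with no coding gene left and concentration below cutoff.

    products: dict mapping name -> {"multiplicity": int, "concentration": float}
    Returns True if any product was removed.
    """
    to_remove = [name for name, p in products.items()
                 if p["multiplicity"] == 0 and p["concentration"] < conc_cutoff]
    for name in to_remove:
        del products[name]
    return bool(to_remove)
```

Collecting the names first avoids modifying the dictionary while iterating over it.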
reproduce(spent_production, time, second_parent=None)[source]

Create a new child of this cell.

Copies all relevant properties, including the genome. Divides the original cell volume between the parent and child.

Returns:
Return type:virtual_cell.Cell
reset_grn(min_bind_score=None)[source]

Recalculate and reset all binding interactions in the genome.

Parameters:min_bind_score (float) – minimum identity score to set a regulatory interaction.
resize_time_courses(new_max_time_points)[source]

Set a new size for arrays that hold time course data.

Parameters:new_max_time_points – max number of time points
sequence_mut
sequence_mut_count
set_gene_prod_conc(gene, conc)[source]
set_mol_concentrations_from_time_point()[source]

Record cellular concentrations and values from time course arrays.

During the integration step, time course data for each molecule or other variable is stored in an array. The position that was last filled holds the new concentration. The value stored under index pos will be copied to a dedicated concentration or ‘value’ member.

Parameters:pos – position in array to
set_small_mol_conc(mol, conc)[source]
stretch_del
stretch_del_count
stretch_invert
stretch_invert_count
stretch_mut
stretch_mut_count
stretch_mutate_genome(time, mutation_rates, global_mut_mod, rand_gen, rand_gen_np, mut_types=['tandem_dup', 'stretch_del', 'stretch_invert', 'stretch_translocate'], verbose=False)[source]

Iterate over chromosomes and positions to select stretches of genes for mutational events.

For every chromosome, iterate over positions and select front and end positions of stretch mutations. The direction of iteration is randomly chosen (back-to-front or front-to-back). Multiple mutations per chromosome may occur. Every position may be independently selected as the front site of a stretch mutation. The length of the stretch is a geometrically distributed random variable, using a lambda parameter. The end position is the minimum of the remaining chromosome length and the randomly drawn stretch length. If any positions remain in the chromosome after the stretch mutation, these positions are then iterated over in a random direction (direction of iteration is reversed at random after the application of the stretch mutation). Reversing direction is significant, because it randomizes the locations of ‘truncated’ stretches when reaching the end of a chromosome.
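
The selection procedure can be sketched for a single chromosome as follows. This is an illustration under assumptions, not the package's code: positions are plain indices, `p_start` is the per-position probability of starting a stretch, and the lambda parameter of the length distribution is represented by a success probability `p_len` (which must be greater than 0).

```python
import random


def geometric(rand_gen, p):
    """Draw a length >= 1 from a geometric distribution with success probability p."""
    length = 1
    while rand_gen.random() >= p:
        length += 1
    return length


def select_stretches(n_positions, p_start, p_len, rand_gen):
    """Return non-overlapping (start, end) index pairs, end exclusive."""
    remaining = list(range(n_positions))
    if rand_gen.random() < 0.5:
        remaining.reverse()  # random initial direction of iteration
    stretches = []
    i = 0
    while i < len(remaining):
        if rand_gen.random() < p_start:
            # Stretch is truncated if it would run past the chromosome end.
            length = min(geometric(rand_gen, p_len), len(remaining) - i)
            chunk = remaining[i:i + length]
            stretches.append((min(chunk), max(chunk) + 1))
            # Continue only with positions not yet visited.
            remaining = remaining[i + length:]
            if rand_gen.random() < 0.5:
                remaining.reverse()  # direction reversed at random
            i = 0
        else:
            i += 1
    return stretches
```

Because the direction is re-drawn after each stretch, truncated stretches are not concentrated at one end of the chromosome.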

strict_exploiting

Return exploited resource classes that are not produced by self.

See also

exploiting()

strict_exploiting_count
strict_providing

Return provided resource classes that are not imported by self.

See also

providing()
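
The relations between exploiting, providing and their strict variants reduce to set operations. The sketch below follows the descriptions given for these attributes; the function name and the use of plain sets of class labels are assumptions for illustration.

```python
def metabolic_roles(imported, exported, consumed, produced):
    """Derive exploiting/providing sets from four sets of metabolic classes."""
    exploiting = imported & consumed   # imported and consumed
    providing = produced & exported    # produced and exported
    return {
        "exploiting": exploiting,
        "providing": providing,
        # exploited classes that the cell does not produce itself
        "strict_exploiting": exploiting - produced,
        # provided classes that the cell does not import itself
        "strict_providing": providing - imported,
    }
```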

strict_providing_count
sum_promoter_strengths
tandem_dup
tandem_dup_count
tandem_duplicate_stretch(chrom, start_pos, end_pos, time, verbose=False)[source]
tf_avrg_promoter_strengths
tf_count
tf_differential_reg
tf_k_bind_ops
tf_ligand_differential_ks
tf_ligand_ks
tf_promoter_strengths
tf_sensed

The set of molecule classes that are sensed by TFs.

tf_sum_promoter_strengths
tfs
toxicity
toxicity_change_rate
tp_index
translocate
translocate_count
translocate_stretch(chrom, start_pos, end_pos, target_chrom, insert_pos, invert, time, verbose=False)[source]
trophic_type(env)[source]
truncate_time_courses(max_tp=None)[source]

Truncate the time course to maximum length of stored data.

Intended to be called when a cell dies and no additional data points are expected to be stored.

uid = 0
update(state)[source]
update_grn(min_bind_score=None)[source]

Update the regulatory network, by finding (new) matches between TF binding sites and gene operators.

Parameters:min_bind_score (float) – minimum identity score to set a regulatory interaction.
update_mutated_gene_product(old, new)[source]

Decrease the copy number of the old gene and increase/initialize the new gene.

Parameters:
  • old (virtual_cell:Gene:Gene) – pre mutation gene
  • new (virtual_cell:Gene:Gene) – post mutation gene
update_small_molecules_diff(env)[source]
upgrade()[source]

Upgrading from older pickled version of class to latest version. Version information is saved as class variable and should be updated when class invariants (e.g. fields) are added.
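
A common way to implement this kind of version upgrade is sketched below. The class, its fields, and the use of `__setstate__` as the trigger are assumptions for illustration; the package may hook the upgrade in differently.

```python
class Versioned:
    """Hypothetical class illustrating upgrade-on-unpickle."""
    class_version = "2.0"

    def __init__(self):
        self.version = self.class_version
        self.volume = 1.0
        self.toxicity = 0.0  # field assumed to be new in version 2.0

    def __setstate__(self, state):
        # Called by pickle when restoring an object.
        self.__dict__.update(state)
        if getattr(self, "version", "1.0") != self.class_version:
            self.upgrade()

    def upgrade(self):
        """Add fields missing from older pickles and bump the version."""
        if not hasattr(self, "toxicity"):
            self.toxicity = 0.0
        self.version = self.class_version
```

Keeping the version as a class variable means that adding a field only requires bumping `class_version` and extending `upgrade()`.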

volume
VirtualMicrobes.virtual_cell.Cell.make_inherited_hgt_dict()[source]
VirtualMicrobes.virtual_cell.Cell.make_mutations_dict()[source]

VirtualMicrobes.virtual_cell.Chromosome module

class VirtualMicrobes.virtual_cell.Chromosome.Chromosome(gene_list=None, positions=None, time_birth=0, circular=False, **kwargs)[source]

Bases: VirtualMicrobes.virtual_cell.PhyloUnit.PhyloBase

A list with all the positions on the chromosome. So far it is not apparent why there should be an additional encapsulation of GenomicElements in Position wrappers. If they do not provide a sufficient benefit in a real use case, this layer may be taken out.

positions

append_gene(g)[source]
delete_stretch(start_pos, end_pos)[source]

Deletes a stretch of genes ‘in place’, i.e. applied to this chromosome.

Parameters:
  • start_pos
  • end_pos

duplicate(time, verbose=False)[source]
fiss(pos, time, verbose=False)[source]
classmethod fuse(chrom1, chrom2, time, end1=True, end2=True)[source]
init_positions(circular=False)[source]
insert_stretch(stretch, pos, verbose=False)[source]

Insert a list of nodes at (=after) pos.

Parameters:
  • stretch
  • pos
  • verbose

invert(start_pos, end_pos)[source]
positions
tandem_duplicate(start_pos, end_pos)[source]

Duplicates a stretch of genes ‘in place’, right after end_pos, i.e. applied to this chromosome.

Parameters:
  • start_pos
  • end_pos
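
The in-place stretch operations can be illustrated on a plain Python list. The index conventions here are assumptions: `end_pos` is treated as exclusive, and "at (=after) pos" is read as inserting behind the element at index `pos`. The Chromosome class may use different conventions.

```python
def delete_stretch(positions, start_pos, end_pos):
    """Remove positions[start_pos:end_pos] in place and return the stretch."""
    stretch = positions[start_pos:end_pos]
    del positions[start_pos:end_pos]
    return stretch


def insert_stretch(positions, stretch, pos):
    """Insert `stretch` in place, directly after index `pos`."""
    positions[pos + 1:pos + 1] = stretch


def tandem_duplicate(positions, start_pos, end_pos):
    """Copy positions[start_pos:end_pos] and insert the copy right after it."""
    stretch = positions[start_pos:end_pos]
    positions[end_pos:end_pos] = stretch
    return stretch
```

All three functions mutate the list that is passed in, mirroring the ‘in place’ wording of the method descriptions.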

toJSON(index, *args, **kwargs)[source]
translocate_stretch(start_pos, end_pos, target_pos, target_chrom)[source]
uid = 0

VirtualMicrobes.virtual_cell.Gene module

class VirtualMicrobes.virtual_cell.Gene.Gene(type_, pr_str=1.0, operator_seq_len=10, fixed_length=None, is_enzyme=False, promoter_phylo_type='base', operator_phylo_type='base', **kwargs)[source]

Bases: VirtualMicrobes.virtual_cell.GenomicElement.GenomicElement

is_enzyme
mutated(param, new_val, time, verbose=False)[source]

Mutates a parameter of the gene. To maintain a full ancestry, the mutation should be applied to a (shallow) copy of the gene and this copy reinserted in the original ancestral position. The shallow copy will however have a new deepcopied version of the parameter dictionary so that the mutation will not affect the ancestral gene state.

Parameters:
  • param – parameter to mutate, where param is a dictionary key
  • new_val
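
The copy semantics described for mutated can be sketched with the standard copy module. `ToyGene` and its `parent` attribute are hypothetical stand-ins used only to show the pattern of a shallow copy that carries its own deep-copied parameter dictionary.

```python
import copy


class ToyGene:
    """Hypothetical gene with a parameter dictionary (not the library class)."""

    def __init__(self, params):
        self.params = params
        self.parent = None

    def mutated(self, param, new_val):
        """Return a mutated copy; the ancestral gene is left unchanged."""
        mutant = copy.copy(self)                    # shallow copy of the gene
        mutant.params = copy.deepcopy(self.params)  # private parameter dict
        mutant.params[param] = new_val
        mutant.parent = self                        # keep the ancestry link
        return mutant
```

Without the deep copy, ancestor and mutant would share one dictionary and the mutation would silently rewrite the ancestral state.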
operator
params
promoter
randomize(rand_gene_params, rand_gen, **kwargs)[source]

Randomize the gene.

randomize_params(rand_gene_params, rand_generator)[source]
toJSON(attr_mapper, index, d=None, *args, **kwargs)[source]
class VirtualMicrobes.virtual_cell.Gene.MetabolicGene(reaction, substrates_ks=None, v_max=1.0, forward=True, **kwargs)[source]

Bases: VirtualMicrobes.virtual_cell.Gene.Gene

init_substrates_ks(reaction)[source]
ode_params()[source]

Returns a list of dictionaries of parameters necessary and sufficient to parameterize an ODE for all the sub-reactions associated with this Gene.

randomize_params(rand_gene_params, rand_generator)[source]
reaction
simple_str()[source]
toJSON(*args, **kwargs)[source]
class VirtualMicrobes.virtual_cell.Gene.Promoter(pr_str, time_birth=0, **kwargs)[source]

Bases: VirtualMicrobes.virtual_cell.PhyloUnit.PhyloBase

“Private” class of genes. At the moment it just encodes the basal promoter strength of genes.

mutate(mut_modifier, rand_gen)[source]
randomize(rand_gene_params, rand_generator)[source]
strength
toJSON(attr_mapper, *args, **kwargs)[source]
uid = 0
class VirtualMicrobes.virtual_cell.Gene.TranscriptionFactor(ligand_mol_class, ligand_ks=None, ligand_cooperativity=1.0, binding_seq_len=10, eff_apo=1, eff_bound=1, k_bind_op=1.0, binding_cooperativity=2, sense_external=False, **kwargs)[source]

Bases: VirtualMicrobes.virtual_cell.Gene.Gene

binding_sequence
ligand_class
randomize(rand_gene_params, rand_gen)[source]
randomize_params(rand_gene_params, rand_generator)[source]
simple_str()[source]
toJSON(*args, **kwargs)[source]
class VirtualMicrobes.virtual_cell.Gene.Transporter(reaction, ene_ks=None, substrates_ks=None, v_max=1.0, exporting=False, **kwargs)[source]

Bases: VirtualMicrobes.virtual_cell.Gene.Gene

ode_params()[source]

Returns a list of dictionaries of parameters necessary and sufficient to parameterize an ODE for all the sub-reactions associated with this Gene.

randomize_params(rand_gene_params, rand_generator, rand_direction=False)[source]
reaction
simple_str()[source]
toJSON(*args, **kwargs)[source]
VirtualMicrobes.virtual_cell.Gene.convert_rates(enzyme, enzyme_conc, metabolite_conc_dict)[source]

Estimate of conversion rates per substrate

Parameters:
  • enzyme – enzyme gene
  • enzyme_conc – internal enzyme concentrations
  • metabolite_conc_dict – concentrations of metabolites
VirtualMicrobes.virtual_cell.Gene.pump_rates(pump, pump_conc, metabolite_conc_dict)[source]

Estimate of pumping rates for each substrate

Parameters:
  • pump – pump gene
  • pump_conc – internal pump concentration
  • metabolite_conc_dict – concentrations of metabolites
VirtualMicrobes.virtual_cell.Gene.random_gene(environment, rand_gene_params, rand_gen, params, keep_transport_direction=True)[source]
VirtualMicrobes.virtual_cell.Gene.randomized_param(rand_gene_params, rand_generator)[source]

VirtualMicrobes.virtual_cell.Genome module

class VirtualMicrobes.virtual_cell.Genome.Genome(chromosomes, min_bind_score)[source]

Bases: object

add_chromosome(chrom, verbose=False)[source]

Add a chromosome to the list of chromosomes.

binding_sequences
binding_tfs_scores(op)[source]

Return tfs that bind this operator and their scores.

Parameters:op (virtual_cell.Sequence.Operator) – operator sequence
Returns:
Return type:list of (virtual_cell.Gene.TranscriptionFactor, float) tuples
bs_to_tfs_dict()[source]

Create mapping from binding sequences to tfs.

For each binding sequence in the genome, map to the set of tfs that contain this binding sequence.

Returns:mapping from virtual_cell.Sequence.BindingSequence to set of virtual_cell.Gene.TranscriptionFactor
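
Building such a mapping is a standard grouping step. In the sketch below, `binding_sequence_of` is a hypothetical accessor supplied by the caller, and TFs are represented by any hashable objects.

```python
from collections import defaultdict


def bs_to_tfs(tfs, binding_sequence_of):
    """Group TFs by binding sequence: sequence -> set of TFs containing it."""
    mapping = defaultdict(set)
    for tf in tfs:
        mapping[binding_sequence_of(tf)].add(tf)
    return dict(mapping)
```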
class_version = '1.0'
copy_number_dist
copy_numbers
copy_numbers_eff_pumps
copy_numbers_enzymes
copy_numbers_inf_pumps
copy_numbers_tfs
del_chromosome(chrom, remove_genes=True, verbose=False)[source]

Delete a chromosome.

Remove a chromosome from the list of chromosomes. If remove_genes is True the genome will be further updated to reflect deletion of genes. E.g. the sequence bindings should be updated when genes are removed from the genome. It may be useful to defer updating if it is already known that the genes will be readded immediately. This may be the case when a chromosome is split (fission) or fused and no genes will be actually lost from the genome.

Parameters:
  • chrom (virtual_cell.Chromosome.Chromosome) – chromosome to be removed
  • remove_genes (bool) – if True update the genome
  • verbose (bool) – be verbose
die(time)[source]

Record death of phylogenetic units in the genome.

Typically called from the cell when it dies. All phylogenetic units in the genome are instructed to record their death. When phylogenetic units are no longer alive, they may be pruned from their respective phylogenetic trees if there are no more living descendants of the phylogenetic unit.

Parameters:time (float) – simulation time
eff_pumps
enzymes
inf_pumps
init_chromosomes(chromosomes)[source]

Initialize chromosomes.

Add preinitialized chromosomes to the genome.

Parameters:chromosomes (iterable of virtual_cell.Chromosome.Chromosome) – chromosomes to add
init_regulatory_network(min_bind_score)[source]

Initialize the binding state of the regulatory network.

Iterate over all virtual_cell.Sequence.Operator objects in the genome and match them against all virtual_cell.Sequence.BindingSequence objects.

Parameters:min_bind_score (float) – minimum binding score for sequence matching
op_to_tfs_scores_dict()[source]

Create mapping from operators to the tfs that bind them, with their scores.

For each operator in the genome map the set of tfs, together with their binding scores.

Returns:mapping from virtual_cell.Sequence.Operator to set of (virtual_cell.Gene.TranscriptionFactor, binding-score (float)) tuples
operators
prune_genomic_ancestries()[source]

Prune the phylogenetic trees of phylogenetic units in the genome.

Returns:tuple (set of class:virtual_cell.Chromosome.Chromosome, set of class:virtual_cell.GenomicElement.GenomicElement)
pumps
reset_regulatory_network(min_bind_score)[source]

Reset the binding state of the regulatory network.

Iterate over all Sequences in the genome and clear all bindings. Then re-initialize the regulatory network.

Parameters:min_bind_score (float) – minimum binding score for sequence matching
size
tf_connections_dict()[source]

A dictionary mapping TFs to sets of downstream bound genes.

tfs
toJSON(*args, **kwargs)[source]
update(state)[source]
update_genome_removed_gene(gene)[source]

Remove a gene from the genome if no more copies exist in the genome.

Updates the genome

Parameters:gene (class:virtual_cell.GenomicElement.GenomicElement) – gene to be removed
update_genome_removed_genes(genes)[source]

Update the genome to reflect gene deletions.

After the deletion of (part of) a chromosome, the genome has to be updated to reflect the change. Because exact copies of deleted genes may still be present in another part of the genome a check has to be performed before definitive removal.

Parameters:genes (iterable of class:virtual_cell.GenomicElement.GenomicElement) – genes that were targeted by a deletion operation.
update_regulatory_network(min_bind_score)[source]

Update the binding state of the regulatory network.

Iterate over all Sequences in the genome and if their check_binding flag is set, match the sequence against all potential binders in the genome.

Parameters:min_bind_score (float) – minimum binding score for sequence matching
upgrade()[source]

VirtualMicrobes.virtual_cell.GenomicElement module

class VirtualMicrobes.virtual_cell.GenomicElement.GenomicElement(time_birth=0, **kwargs)[source]

Bases: VirtualMicrobes.virtual_cell.PhyloUnit.PhyloUnit

Version:
Author:
types = ['tf', 'pump', 'enz']
uid = 0

VirtualMicrobes.virtual_cell.Identifier module

Created on Nov 14, 2013

@author: thocu

class VirtualMicrobes.virtual_cell.Identifier.Identifier(obj, versioned_id=1, increment_func=None)[source]

Bases: object

clear_offspring()[source]
classmethod count_class_types(obj)[source]
from_parent(parent, flat=True, pos=-1)[source]

Set an Identifier from a parent id.

If id is incremented in a ‘flat’ way, the new id is the unique count of objects of the (parent) type that this id belongs to. Else, the id increment is done in a ‘versioned’ manner. If parent has 0 offspring ids so far, the parent id is simply copied and no increment is done. If parent already has > 0 offspring ids, then a version element is added that indicates this id as the “n’th” offspring of the parent id. E.g. if parent id is 2.3 and it has 2 offspring already: from_parent(parent, flat=False) -> 2.3.2

Parameters:
  • parent (object with an Identifier attribute) – The parent of this Identifier.
  • flat (bool) – If True, set versioned id position as the total number of counted objects of the type of parent. Else, add versioned id information.
  • pos (int (index)) – index of the version bit to update
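
The 'versioned' branch described above can be illustrated with a minimal sketch. The helper name versioned_child_id and the use of dotted strings are assumptions made for illustration; this is not the Identifier implementation, only the rule stated in the text (copy the parent id for the first offspring, otherwise append the current offspring count).

```python
def versioned_child_id(parent_id, offspring_count):
    """Hypothetical helper: derive a child's versioned id from its parent's id."""
    if offspring_count == 0:
        # First offspring: the parent id is copied and no increment is done.
        return parent_id
    # Later offspring: add a version element marking this id as the
    # n'th offspring of the parent id.
    return "{}.{}".format(parent_id, offspring_count)


print(versioned_child_id("2.3", 0))  # 2.3
print(versioned_child_id("2.3", 2))  # 2.3.2
```

The second call reproduces the example from the description: a parent id of 2.3 with 2 offspring already yields 2.3.2.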
increment_func
increment_offspring()[source]
is_copy(identifier)[source]

Test if identifier and self are different copies of the same major_id.

Parameters:identifier (Identifier) – Identifier to compare to.
Return type:bool
major_id

Return the first part (highest order) of identifier.

Return type:int (or other id format)
minor_id

Return version part of the identifier.

Return type:list of int (or other id format)
offspring_count
parse(versioned_id)[source]
unique_unit_dict = Counter()
versioned_id
VirtualMicrobes.virtual_cell.Identifier.increment(x)[source]

VirtualMicrobes.virtual_cell.PhyloUnit module

Created on Nov 17, 2014

@author: thocu

class VirtualMicrobes.virtual_cell.PhyloUnit.AddInheritanceType[source]

Bases: type

A metaclass that can set a class instance's base type to support phylogenetic linking.

The base type of the instantiated class can be either a PhyloBase or a PhyloUnit, depending on its _phylo_type class attribute. This enables a run-time decision (via a program option) to fix the ancestry structure the class supports. PhyloBase instances keep references to neither parents nor children and hence do not need to use a linker dict or unique_key generator. PhyloUnit does support ancestry. Phylogenetic linking is delegated to a global linker dict.

class VirtualMicrobes.virtual_cell.PhyloUnit.PhyloBase(time_birth)[source]

Bases: object

Base class for all classes that can behave as phylogenetic units of inheritance.

Phylogenetic Base units record their time of birth and death and have an identifier field that can indicate a relation to parents and offspring. PhyloBase objects may come into existence when a mother cell gives rise to a daughter cell and all units of inheritance that it contains (i.e. when a genome gets copied), but also when a phylogenetic unit (such as a gene or chromosome) mutates and the ancestral version is kept intact for analysis purposes.

alive
die(time)[source]

Death of a phylo unit happens either when a cell dies and all its genetic material with it, or when a mutation gives rise to a new variant of the unit.

Parameters:time – Time of death in simulation time units.
id
living_offspring()[source]
mark(marker, mark)[source]

Set a marker on the phylo unit.

Parameters:
  • marker – marker type
  • mark – value
marker_dict
prune_dead_branch()[source]

Return self to be removed from the global phylo linker dict if not alive.

This is the degenerate base version of pruning. See the version in Phylo Unit for the case when units keep track of parent-child relationships.

time_birth
time_death
class VirtualMicrobes.virtual_cell.PhyloUnit.PhyloUnit(time_birth)[source]

Bases: VirtualMicrobes.virtual_cell.PhyloUnit.PhyloBase

Extended Base class for all classes that can be represented in phylogenies. These classes should support ancestor and child retrieval and setting time of birth and death. This is implemented by appointing

child_of(phylo_unit)[source]

Return whether this PhyloUnit is the child of another PhyloUnit.

Parameters:phylo_unit (PhyloUnit) –

children
common_ancestors(phylo_unit)[source]
die(*args, **kwargs)[source]
has_living_offspring(exclude_set=set([]))[source]

Returns True if any of the phylo unit's descendants are alive.

init_phylo_dicts()[source]
living_offspring()[source]

Returns a list of all offspring of this phylo unit that are currently alive.

lod_down_single()[source]

Proceed down a single branch on the line of descent until there is a branch point or terminal node.

lod_up_single()[source]

Proceed up a single branch on the line of descent until there is a branch point or terminal node.

lods_down()[source]

Composes all the lines of descent leading down from phylo unit in a non-recursive way (compare lods_up).

lods_up()[source]

Composes all the lines of descent leading up from phylo unit (compare lods_down)

parent_of(phylo_unit)[source]

Return whether this PhyloUnit is the parent of another PhyloUnit.

Parameters:phylo_unit (PhyloUnit) –

parents
prune_dead_branch(exclude_offspring_check_set=set([]))[source]

Return a set of phylogenetically related units that represent a dead phylogenetic branch.

Recursively checks for parent nodes whether the node's descendants are all dead. In that case, the node can be pruned and its parents may additionally be checked for being part of the extended dead branch. The exclude set is used to prevent superfluous checks of living offspring when it is already known that the current phylo_unit has no living offspring.

Parameters:exclude_offspring_check_set (set of class:virtual_cell.PhyloUnit.PhyloUnit) –
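
The upward recursion described above can be sketched on a toy tree. The Node class below is a hypothetical stand-in, not the PhyloUnit class, and the way the exclude set is threaded through the calls is an assumption based only on the description.

```python
class Node:
    """Hypothetical stand-in for a phylogenetic unit."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.parents = []
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        child.parents.append(self)

    def has_living_offspring(self, exclude=frozenset()):
        # Depth-first search over descendants, skipping excluded units
        # that are already known to lead only to dead descendants.
        stack = [c for c in self.children if c not in exclude]
        while stack:
            node = stack.pop()
            if node.alive:
                return True
            stack.extend(c for c in node.children if c not in exclude)
        return False

    def prune_dead_branch(self, exclude=frozenset()):
        # A unit that is alive, or that still has living descendants, stays.
        if self.alive or self.has_living_offspring(exclude):
            return set()
        pruned = {self}
        for parent in self.parents:
            # self is known to be dead, so the parent need not re-check it.
            pruned |= parent.prune_dead_branch(exclude | {self})
        return pruned


root, a, b = Node("root", alive=False), Node("a", alive=False), Node("b", alive=False)
c = Node("c", alive=True)
root.add_child(a)
a.add_child(b)
root.add_child(c)
print(sorted(n.name for n in b.prune_dead_branch()))  # ['a', 'b']
```

Here root is not pruned because it still has the living descendant c, so the dead branch ends at a.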
set_ancestor(ge)[source]
set_unique_key()[source]

Generate a unique key that can be used for mapping in a global linker dict.

VirtualMicrobes.virtual_cell.Population module

class VirtualMicrobes.virtual_cell.Population.Population(params, environment)[source]

Bases: object

add_cell(cell)[source]
ancestors_all_phylo_units()[source]
annotate_phylo_tree(ete_tree_struct, features=[], func_features={}, max_tree_depth=None, prune_internal=False, cummulative=True, to_rate=False, ete_root=None)[source]

Annotate the phylogenetic tree with cell data for tree plotting.

Assumes that the ete_tree has been constructed/updated. Creates a dictionary of feature dictionaries, keyed by the cells in the tree. Attaches the feature dictionaries to nodes in the ete_tree (annotate_ete_tree). Transforms some data to cumulative data along the branches of the tree. Optionally, prunes internal tree nodes (this will greatly simplify the tree drawing algorithm). Finally, transforms some data to rate of change data, using branch length for rate calculation.

Parameters:prune_internal – if True, nodes with 1 offspring only will be removed/collapsed and their branch length added to the preceding node on the branch.

average_death_rate(cells=None)[source]
average_production(cells=None)[source]
average_promoter_strengths(cells=None)[source]
best_producer()[source]
calculate_cells_production(production_scaling_funct=None)[source]
calculate_death_rates(base_death_rate=None, max_die_off_fract=None, toxicity_scaling=None, cells=None)[source]
calculate_reference_production(pressure=None, historic_production_weight=None)[source]

Calculates a reference production value used to scale the reproductive potential of cells during competition to reproduce.

Parameters:pressure – type of selection pressure scaling; can be based on current or historical production values.

cap_death_rates(_max)[source]
cell_death(cell, time, wiped=False)[source]
cell_markers(marker, cells=None)[source]
cell_sizes(cells=None)[source]
check_sane_concentrations(sane_value=(0.0, 500), cells=None)[source]
chromosomal_mut_counts(cells=None)[source]
chromosome_counts(cells=None)[source]
class_version = '0.0'
clear_mol_time_courses()[source]
clear_pop_changes()[source]
cloned_pop(pop_size, environment, params_dict, time=0)[source]
consumer_type_counts(cells=None)[source]
death_rates(cells=None)[source]
die_off(time, max_die_off_frac=None)[source]

Assumes death rates have been calculated and set.

Parameters:max_die_off_frac –

differential_regulation(cells=None)[source]
enz_average_promoter_strengths(cells=None)[source]
enzyme_average_vmaxs(cells=None)[source]
enzyme_counts(cells=None)[source]
enzyme_substrate_ks(cells=None)[source]
export_type_counts(cells=None)[source]
exporter_counts(cells=None)[source]
find_the_one(cells_competition_value, rand_nr, non=0.0, competition_scaling_fact=None)[source]
genome_sizes(cells=None)[source]
genotype_counts(cells=None)[source]

Frequencies of cell sets with equal genotypes

get_cell_death_rate(cell)[source]
get_cell_death_rate_dict(cells=None)[source]
get_cell_pos_production_dict(cells=None)[source]
get_cell_production(cell)[source]
get_cell_production_dict(cells=None, life_time_prod=None)[source]

Map cells to their last/ life time production value.

Parameters:
  • cells (sequence of Cell objects) – individuals to map, default is the current population
  • life_time_prod (bool) – take life time mean production instead of current
get_cell_production_rate_dict(cells=None)[source]
get_cell_reproduction_dict(cells=None)[source]

Counts of the number of reproduction events for each cell (living and dead direct children)

get_cell_size_dict(cells=None)[source]
get_cell_toxicity_rate_dict(cells=None)[source]
grow_time_course_arrays()[source]
horizontal_transfer(time, grid, environment, rand_gen=None, rand_gen_np=None)[source]

Applies HGT to all cells in the grid

Parameters:
  • grid – needed for internal HGT and setting the update-flags
  • environment – contains all possible reactions to draw a random gene for external HGT
  • rand_gen (RNG) –

import_type_counts(cells=None)[source]
importer_counts(cells=None)[source]
init_cells_view()[source]
init_current_ancestors()[source]
init_evo_rand_gens(evo_rand_seed=None)[source]
init_phylo_tree(supertree=None)[source]
init_pop(environment, pop_size=None, params_dict=None)[source]
init_pop_rand_gen(pop_rand_seed=None)[source]
init_range_dicts()[source]
init_roots(roots=None)[source]
iterages(cells=None)[source]
mark_cells_lineage(cells=None)[source]
mark_cells_metabolic_type(cells=None)[source]
marker_counts(marker, cells=None)[source]
classmethod metabolic_complementarity(cells, strict_providing=False, strict_exploiting=False)[source]

Determine for the list of cells what the overlap is in metabolites provided and exploited. To provide a metabolite, a cell should simultaneously produce and export the metabolite. To exploit a metabolite, a cell should import it and consume it in a reaction.
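
As a rough illustration of the provide/exploit definitions above, the following sketch uses plain sets. The dictionary layout and the helper names provided and exploited are hypothetical; the real method works on Cell objects and its exact return value is not documented here.

```python
# Hypothetical per-cell summaries; the field names are illustrative only.
cells = [
    {"produces": {"B", "C"}, "exports": {"B"}, "imports": {"A"}, "consumes": {"A"}},
    {"produces": {"D"}, "exports": {"D"}, "imports": {"B", "C"}, "consumes": {"B"}},
]


def provided(cell):
    # Provided: metabolites that are both produced and exported.
    return cell["produces"] & cell["exports"]


def exploited(cell):
    # Exploited: metabolites that are both imported and consumed.
    return cell["imports"] & cell["consumes"]


all_provided = set().union(*(provided(c) for c in cells))
all_exploited = set().union(*(exploited(c) for c in cells))
overlap = all_provided & all_exploited
print(sorted(overlap))  # ['B']
```

Metabolite C is produced by the first cell but not exported, and imported by the second but not consumed, so it counts as neither provided nor exploited.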

metabolic_complementarity_pop(strict=False)[source]
metabolic_type_color(cell)[source]
metabolic_type_counts(cells=None)[source]

Frequencies of cell sets with equal metabolic capabilities

A metabolic type is defined on the basis of the full set of metabolic reactions that an individual can perform using its metabolic gene set. A frequency spectrum of these types is then produced in the form of a collections.Counter object. This object can be queried for, e.g., the N most frequent types with most_common(N).
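
A short sketch of how such a collections.Counter can be queried. The reaction labels and the use of frozenset as the type key are assumptions for illustration; only Counter and most_common come from the description.

```python
from collections import Counter

# Hypothetical data: each cell is summarised by the set of reactions it can perform.
cell_reactions = [
    frozenset({"A->B", "B->C"}),
    frozenset({"A->B", "B->C"}),
    frozenset({"A->B"}),
    frozenset({"B->C", "A->B"}),
]

# Cells with equal reaction sets fall into the same metabolic type.
type_counts = Counter(cell_reactions)
most_common_type, count = type_counts.most_common(1)[0]
print(sorted(most_common_type), count)  # ['A->B', 'B->C'] 3
```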

most_abundant_marker(marker_name, cells=None)[source]
most_offspring()[source]
mutate_new_offspring(time, environment, rand_gen=None, rand_gen_np=None)[source]
offspring_counts(cells=None)[source]
oldest_cell()[source]
pan_metabolome_dict(cells=None)[source]
point_mut_counts(cells=None)[source]
pop_marker_counts(marker_name, cells=None)[source]
pos_production(cells=None)[source]
print_state()[source]
producer_type_counts(cells=None)[source]
production_rates(cells=None)[source]
production_values(cells=None)[source]
prune_metabolic_types(cells=None)[source]
pump_average_promoter_strengths(cells=None)[source]
pump_average_vmaxs(cells=None)[source]
pump_energy_ks(cells=None)[source]
pump_substrate_ks(cells=None)[source]
reaction_counts(cells=None)[source]
reaction_counts_split(cells=None)[source]
reaction_genotype_counts(cells=None)[source]

Frequencies of cell sets with equal reaction genotypes

remove_unproduced_gene_products(grid, cutoff=None)[source]
reproduce_at_minimum_production(time, competitors=None, max_reproduce=None, reproduction_cost=None)[source]
reproduce_cell(cell, time, spent_production=0.0, report=False)[source]
reproduce_on_grid(grid, max_pop_per_gp, time, neighborhood='competition', non=None, selection_pressure=None)[source]
reproduce_production_proportional(time, competitors, max_reproduce=None, production_spending_fract=None, non=0.0)[source]
reproduce_size_proportional(time, competitors, max_reproduce=None, non=0.0)[source]
reset_divided()[source]
reset_production_toxicity_volume(cells=None)[source]
resize_time_courses(new_max_time_points)[source]

Resize the arrays that hold time course information of cellular concentrations etc.

Parameters:new_max_time_points – new length of time course array
scale_death_rates(max_die_off_fract, cells=None)[source]
set_mol_concentrations_from_time_point(pos=None)[source]
store_pop_characters()[source]
tf_average_promoter_strengths(cells=None)[source]
tf_counts(cells=None)[source]
tf_k_bind_operators(cells=None)[source]
tf_ligand_ks(cells=None)[source]
toxicity_rates(cells=None)[source]
trophic_type_counts(env, cells=None)[source]
unique_pop(pop_size, environment, params_dict)[source]
update_cell_params(cells=None)[source]
update_ete_tree()[source]

Update the ete tree representation of the phylo_tree.

update_lineage_markers(cells=None, min_nr_marks=1)[source]
update_offspring_regulatory_network(min_bind_score=None)[source]
update_phylogeny(new_roots=None, reconstruct_tree=None)[source]
update_prod_val_hist(hist_prod_func=<function median>, historic_production_window=None, pop_size_scaling=None)[source]

Keep a sliding window view on historic production values.

Parameters:
  • hist_prod_func – calculates the population production value
  • historic_production_window – length of the sliding window
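
A sliding window of this kind can be sketched with a bounded deque. The class name ProductionHistory and its update method are hypothetical, and statistics.median stands in for the default hist_prod_func; the population scaling option (pop_size_scaling) is not modelled.

```python
from collections import deque
from statistics import median


class ProductionHistory:
    """Hypothetical sliding window over population production values."""

    def __init__(self, historic_production_window):
        # A deque with maxlen drops the oldest value once the window is full.
        self.values = deque(maxlen=historic_production_window)

    def update(self, cell_productions, hist_prod_func=median):
        # Summarise the current population and append to the window.
        self.values.append(hist_prod_func(cell_productions))
        return list(self.values)


hist = ProductionHistory(historic_production_window=2)
hist.update([1, 2, 3])
hist.update([4, 6])
print(hist.update([10]))  # [5.0, 10]
```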
upgrade()[source]

Upgrade from an older pickled version of the class to the latest version. Version information is saved as a class variable and should be updated when class invariants (e.g. fields) are added.

wipe_pop(fract, time, min_surv=None, cells=None)[source]

VirtualMicrobes.virtual_cell.Sequence module

class VirtualMicrobes.virtual_cell.Sequence.BindingSequence(sequence=None, length=None, elements=['0', '1'], flip_dict={'1': '0', '0': '1'}, **kwargs)[source]

Bases: VirtualMicrobes.virtual_cell.Sequence.Sequence

Binding sequence of a Transcription Factor

bound_operators
clear_bound_operators()[source]
inform_operators()[source]
init_bound_operators()[source]
match_operators(operators, minimum_score)[source]
remove_bound_operator(op)[source]
toJSON(*args, **kwargs)[source]
class VirtualMicrobes.virtual_cell.Sequence.Operator(sequence=None, length=None, elements=['0', '1'], flip_dict={'1': '0', '0': '1'}, **kwargs)[source]

Bases: VirtualMicrobes.virtual_cell.Sequence.Sequence

Version:
Author:
bind_to_bs(bs, minimum_score, report=False)[source]
binding_sequences
calc_score_for_bs(binding_site, minimum_score=1.0, report=False)[source]
calculate_regulation()[source]
clear_binding_sequences()[source]
inform_bss()[source]
init_binding_sequences()[source]

A dictionary from binding sequences that bind to this operator to binding scores

remove_binding_sequence(bs)[source]
toJSON(*args, **kwargs)[source]
update_binding_sequences(all_binding_sequences, minimum_score)[source]

Find the Binding Sequences that match this Operator and update dictionaries accordingly. Matching depends on a threshold “minimum_score”.

Parameters:
  • all_binding_sequences – set of BS
  • minimum_score – threshold for calling a match between this Operator and a BS

class VirtualMicrobes.virtual_cell.Sequence.Sequence(sequence, elements, length, flip_dict, time_birth=0, **kwargs)[source]

Bases: VirtualMicrobes.virtual_cell.PhyloUnit.PhyloBase

Version:
Author:
best_match(s2, at_least=0, penalty=0.5, report=False)[source]

Finds the best match possible between self and s2 anywhere in both sequences, but only if the match score reaches at least a minimum threshold. If the returned score is below this threshold it is not guaranteed to be the best possible match. No gaps are allowed.

Parameters:
  • s2 – sequence to compare to
  • at_least – the minimum score that should still be attainable by the scoring algorithm for it to proceed computing the scoring matrix
  • penalty – mismatch penalty

bit_flip(bit, flip_dict=None)[source]
check_binding
elements
flip_dict
insert_mutate(pos, sequence_stretch, constant_length=True)[source]

Insert a sequence stretch into the current sequence.

Sets the check_binding flag to indicate that the sequence should be checked for changed binding status.

Parameters:
  • pos (int) – position in current sequence to insert
  • sequence_stretch (str) – stretch to be inserted
  • constant_length (bool) – whether to truncate after insertion to maintain the original sequence length
Returns:newly mutated sequences
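
The effect of constant_length can be sketched on a plain string. The function insert_stretch is a hypothetical helper, and truncating at the end of the sequence is an assumption; the description only says that the sequence is truncated to its original length.

```python
def insert_stretch(sequence, pos, sequence_stretch, constant_length=True):
    """Hypothetical helper: insert a stretch into a sequence string."""
    mutated = sequence[:pos] + sequence_stretch + sequence[pos:]
    if constant_length:
        # Assumption: the surplus is cut off at the end of the sequence.
        mutated = mutated[:len(sequence)]
    return mutated


print(insert_stretch("000000", 2, "11"))                         # 001100
print(insert_stretch("000000", 2, "11", constant_length=False))  # 00110000
```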

match(sequence, func)[source]
mutate(rand_gen=None, change_magnitude=None)[source]

Mutates the sequence.

Number of bits to mutate is either 1 or an amount of bits determined by the probability per bit to be changed or a given number of bits, depending on the value of “change_magnitude”. Sets the check_binding flag to indicate that the sequence should be checked for changed binding status.

Parameters:
  • rand_gen (RNG) –
  • change_magnitude (float or int) – when < 1, it is a probability, otherwise it is assumed to be the number of bits that should be changed (rounded up to nearest integer).
Returns:newly mutated sequences
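
One way to read the change_magnitude rule is sketched below. The helper bits_to_mutate is hypothetical, and treating None as "mutate exactly 1 bit" is an assumption drawn from the default value in the signature.

```python
import math
import random


def bits_to_mutate(length, change_magnitude=None, rand_gen=None):
    """Hypothetical helper: how many bits a mutation should change."""
    rand_gen = rand_gen or random.Random()
    if change_magnitude is None:
        # Assumption: without a magnitude, a single bit is mutated.
        return 1
    if change_magnitude < 1:
        # Interpreted as a per-bit probability of being changed.
        return sum(1 for _ in range(length) if rand_gen.random() < change_magnitude)
    # Otherwise a number of bits, rounded up to the nearest integer.
    return int(math.ceil(change_magnitude))


print(bits_to_mutate(10))       # 1
print(bits_to_mutate(10, 2.3))  # 3
```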

mutate_bits(nr=1, rand_gen=None)[source]

Mutates a given number of random bits in the sequence to new values, chosen from the set of elements (possible values) that a bit can take, randomly.

Parameters:
  • nr – number of bits to mutate
  • rand_gen – random generator
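
A minimal sketch of this kind of bit mutation on a string. The standalone function below is hypothetical, and two details are assumptions: positions are drawn without replacement, and the new value always differs from the old one.

```python
import random


def mutate_bits(sequence, nr=1, elements=("0", "1"), rand_gen=None):
    """Hypothetical standalone version: mutate nr random positions."""
    rand_gen = rand_gen or random.Random()
    bits = list(sequence)
    # Assumption: nr distinct positions are chosen.
    for pos in rand_gen.sample(range(len(bits)), nr):
        # Assumption: the replacement is drawn from the other elements only.
        alternatives = [e for e in elements if e != bits[pos]]
        bits[pos] = rand_gen.choice(alternatives)
    return "".join(bits)


mutated = mutate_bits("0000000000", nr=3, rand_gen=random.Random(42))
print(mutated.count("1"))  # 3
```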
random_sequence(n, rand_gen)[source]

Create random sequence of length n from elements.

Parameters:
  • n (int) – length of sequence
  • rand_gen (RNG) –
Returns:sequence string

randomize(rand_gen=None)[source]
sequence
classmethod substring_scoring_matrix(s1, s2, at_least=0, penalty=0.5)[source]

Computes a scoring matrix for matching between 2 sequences. Starts with a matrix filled in with all -1 . Comparisons between the strings continue as long as it is still possible to obtain the ‘at_least’ score when comparing the remainder of the strings. When the score is too low and the remaining substring length that can still be matched too short, the algorithm will stop, leaving the rest of the scores uncomputed. In that case, the _max is not guaranteed to be the maximum attainable matching score.

Example matrix when matching sequences 0000000000 and 0000010100 with the default mismatch penalty of 0.5 (penalty subtracted from score attained up to that point).

        0    0    0    0    0    0    0    0    0    0    <- sequence 1
   0    0    0    0    0    0    0    0    0    0    0
0  0    1    1    1    1    1    1    1    1    1    1
0  0    1    2    2    2    2    2    2    2    2    2
0  0    1    2    3    3    3    3    3    3    3    3
0  0    1    2    3    4    4    4    4    4    4    4
0  0    1    2    3    4    5    5    5    5    5    5
1  0    0    0.5  1.5  2.5  3.5  4.5  4.5  4.5  4.5  4.5
0  0    1    1    1.5  2.5  3.5  4.5  5.5  5.5  5.5  5.5
1  0    0    0.5  0.5  1.0  2.0  3.0  4.0  5.0  5.0  5.0
0  0    1    1    1.5  1.5  2.0  3.0  4.0  5.0  6.0  6.0
0  0    1    2    2    2.5  2.5  3.0  4.0  5.0  6.0  7.0
^ sequence 2

sequence 1: 0000000000
sequence 2: 0000010100
match: 0.7
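
The example matrix can be reproduced with a simple diagonal recurrence. The sketch below is inferred from the example values only and is not the library code: a match adds 1 to the diagonal predecessor, a mismatch subtracts the penalty (never going below 0), and the early-stopping logic around 'at_least' is left out. The function name full_scoring_matrix is hypothetical.

```python
def full_scoring_matrix(s1, s2, penalty=0.5):
    """Hypothetical sketch: gapless match scores, without early stopping."""
    rows, cols = len(s2) + 1, len(s1) + 1
    # Row 0 and column 0 are the all-zero borders of the matrix.
    matrix = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = matrix[i - 1][j - 1]
            if s2[i - 1] == s1[j - 1]:
                matrix[i][j] = diagonal + 1.0
            else:
                # A mismatch costs the penalty; scores are floored at 0.
                matrix[i][j] = max(0.0, diagonal - penalty)
    return matrix


s1, s2 = "0000000000", "0000010100"
matrix = full_scoring_matrix(s1, s2)
best = max(max(row) for row in matrix)
print(best, best / len(s1))  # 7.0 0.7
```

Dividing the best score by the sequence length gives the 0.7 reported as the match value in the example; whether the library normalises in exactly this way is an assumption.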

uid = 0
VirtualMicrobes.virtual_cell.Sequence.pretty_scoring_matrix(seq1, seq2, scoring_mat, width=4)[source]

Module contents