cogent3.core.moltype.MolType

class MolType(motifset, gap='-', missing='?', gaps=None, seq_constructor=None, ambiguities=None, label=None, complements=None, pairs=None, mw_calculator=None, add_lower=False, preserve_existing_moltypes=False, make_alphabet_group=False, array_seq_constructor=None, colors=None, coerce_string=None)

MolType: Handles operations that depend on the sequence type (e.g. DNA).

The MolType knows how to connect alphabets, sequences, alignments, and so forth, and how to disambiguate ambiguous symbols and perform base pairing (where appropriate).

WARNING: Objects passed to a MolType become associated with that MolType. For example, if you pass ProteinSequence to a newly defined MolType, all ProteinSequences will from then on be associated with the new MolType, which may not be what you expect. Use preserve_existing_moltypes=True if you don't want to reset the moltype.

Methods

can_match(first, second)

Returns True if every position in first could match the same position in second.

can_mismatch(first, second)

Returns True if any position in first could cause a mismatch with second.
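The matching rules described above can be illustrated in plain Python. This is a minimal sketch and not cogent3's implementation: the table `IUPAC_DNA` (the standard IUPAC nucleotide codes, typed out by hand), the function names `could_match` and `could_mismatch`, and the handling of unequal lengths are all assumptions made for illustration.

```python
# Standard IUPAC nucleotide ambiguity codes, written out by hand for this sketch.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def could_match(first, second):
    """True if, at every position, the two symbols share at least one base."""
    if len(first) != len(second):
        # Hypothetical choice: sequences of unequal length never match.
        return False
    return all(
        set(IUPAC_DNA[a]) & set(IUPAC_DNA[b]) for a, b in zip(first, second)
    )


def could_mismatch(first, second):
    """True if any position is not forced to be identical."""
    if len(first) != len(second):
        # Hypothetical choice: unequal lengths count as a possible mismatch.
        return True
    for a, b in zip(first, second):
        bases_a, bases_b = set(IUPAC_DNA[a]), set(IUPAC_DNA[b])
        # A position is forced to match only when both symbols denote
        # the same single base.
        if len(bases_a) != 1 or bases_a != bases_b:
            return True
    return False
```

Under these assumptions a degenerate symbol such as N can match any base, but it can also always mismatch, so both functions may return True for the same pair of sequences.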

can_mispair(first, second)

Returns True if any position in first could mispair with second.

can_pair(first, second)

Returns True if first and second could pair.

complement(item)

Returns complement of item, using data from self.complements.

count_degenerate(sequence)

Counts the degenerate bases in the specified sequence.

count_gaps(sequence)

Counts the gaps in the specified sequence.

degap(sequence)

Deletes all gap characters from sequence.

degenerate_from_seq(sequence)

Returns least degenerate symbol corresponding to chars in sequence.

disambiguate(sequence[, method])

Returns a non-degenerate sequence from a degenerate one.

first_degenerate(sequence)

Returns the index of first degenerate symbol in sequence, or None.

first_gap(sequence)

Returns the index of the first gap in the sequence, or None.

first_invalid(sequence)

Returns the index of first invalid symbol in sequence, or None.

first_non_strict(sequence)

Returns the index of first non-strict symbol in sequence, or None.

first_not_in_alphabet(sequence[, alphabet])

Returns index of first item not in alphabet, or None.
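The "index of the first offending symbol, or None" pattern shared by the first_* methods can be sketched as a standalone function. This is an illustration only; the free function below and its required `alphabet` argument are assumptions, not the method's actual signature.

```python
def first_not_in_alphabet(sequence, alphabet):
    """Return the index of the first symbol not in alphabet, or None."""
    # next() with a default avoids raising StopIteration when every
    # symbol is valid.
    return next(
        (i for i, symbol in enumerate(sequence) if symbol not in alphabet),
        None,
    )
```

Returning None rather than -1 keeps a valid index 0 distinguishable from "not found" when the result is compared with `is None`.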

gap_indices(sequence)

Returns list of indices of all gaps in the sequence, or [].

gap_maps(sequence)

Returns tuple containing dicts mapping between gapped and ungapped.

gap_vector(sequence)

Returns list of bool indicating gap or non-gap in sequence.
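The three gap helpers above can be sketched in plain Python. This is a minimal illustration, not cogent3's code. It assumes that both the default gap ('-') and missing ('?') characters from the class signature count as gaps, and that the tuple from gap_maps is ordered as (ungapped to gapped, gapped to ungapped); the constant `GAP_CHARS` is hypothetical.

```python
GAP_CHARS = frozenset("-?")  # assumption: default gap and missing characters


def gap_vector(sequence):
    """List of bool, True where the symbol is a gap."""
    return [symbol in GAP_CHARS for symbol in sequence]


def gap_indices(sequence):
    """Indices of all gaps, or [] if there are none."""
    return [i for i, symbol in enumerate(sequence) if symbol in GAP_CHARS]


def gap_maps(sequence):
    """Return (ungapped_to_gapped, gapped_to_ungapped) coordinate dicts."""
    ungapped_to_gapped = {}
    gapped_to_ungapped = {}
    ungapped_pos = 0
    for gapped_pos, symbol in enumerate(sequence):
        if symbol in GAP_CHARS:
            continue  # gap positions have no ungapped coordinate
        ungapped_to_gapped[ungapped_pos] = gapped_pos
        gapped_to_ungapped[gapped_pos] = ungapped_pos
        ungapped_pos += 1
    return ungapped_to_gapped, gapped_to_ungapped
```

For example, in "AC--G" the G sits at gapped index 4 and ungapped index 2, so the first dict maps 2 to 4 and the second maps 4 to 2.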

get_css_style([colors, font_size, font_family])

Returns string of CSS classes and {character: <CSS class name>, ...}.

get_degenerate_positions(sequence[, include_gap])

Returns indices matching degenerate characters.

get_type()

Returns the moltype label.

is_ambiguity(querymotif)

Returns True if querymotif is an ambiguity character in alphabet.

is_degenerate(sequence)

Returns True if sequence contains degenerate characters.

is_gap(char)

Returns True if char is a gap.

is_gapped(sequence)

Returns True if sequence contains gaps.

is_strict(sequence)

Returns True if sequence contains only items in self.alphabet.

is_valid(sequence)

Returns True if sequence contains no items that are not in self.

make_array_seq(seq[, name])

Creates an array sequence.

make_seq(seq[, name])

Returns sequence of correct type.

must_match(first, second)

Returns True if all positions in first must match positions in second.

must_pair(first, second)

Returns True if all positions in first must pair with second.

mw(sequence[, method, delta])

Returns the molecular weight of the sequence.

possibilities(sequence)

Counts number of possible sequences matching the sequence.
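Counting the sequences that a degenerate sequence could stand for amounts to multiplying the number of bases each symbol allows. The sketch below shows that arithmetic in plain Python; the hand-written `IUPAC_DNA` table and the name `count_possibilities` are illustrative assumptions, not cogent3's internals.

```python
from math import prod

# Standard IUPAC nucleotide ambiguity codes, written out by hand for this sketch.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_possibilities(sequence):
    """Number of non-degenerate sequences compatible with sequence."""
    # Each position contributes independently, so the counts multiply.
    return prod(len(IUPAC_DNA[symbol]) for symbol in sequence)
```

The count grows exponentially with the number of degenerate positions: ten N symbols already allow 4**10 = 1,048,576 sequences.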

rc(item)

Returns reverse complement of item, using data from self.complements.
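Complement and reverse complement can be sketched with a translation table. This is a minimal illustration for DNA, assuming the standard IUPAC complement pairs; the table `_DNA_COMPLEMENT` and the free functions are hypothetical stand-ins for the data held in self.complements.

```python
# Standard IUPAC DNA complements (assumed here; a MolType would take these
# from its own complements mapping). The gap character maps to itself.
_DNA_COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVN-",
    "TGCAYRSWMKVHDBN-",
)


def complement(seq):
    """Complement each symbol, keeping the original order."""
    return seq.translate(_DNA_COMPLEMENT)


def rc(seq):
    """Reverse complement: complement every symbol, then reverse."""
    return complement(seq)[::-1]
```

Symbols missing from the table pass through `str.translate` unchanged, which is a choice of this sketch rather than documented library behaviour.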

resolve_ambiguity(ambig_motif[, alphabet, ...])

Returns tuple of all possible canonical characters corresponding to ambig_motif.

strand_symmetric_motifs([motif_length])

Returns ordered pairs of strand complementary motifs.

to_json()

Returns the result as a JSON-formatted string.

to_regex(seq)

Returns a regex pattern with ambiguities expanded to a character set.
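Expanding ambiguity codes into regex character sets can be sketched as follows. This is an illustration in plain Python, not cogent3's implementation: the `IUPAC_DNA` table is written out by hand, and escaping unknown symbols with `re.escape` is an assumption of the sketch.

```python
import re

# Standard IUPAC nucleotide ambiguity codes, written out by hand for this sketch.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def to_regex(seq):
    """Build a regex in which each ambiguity code becomes a character set."""
    parts = []
    for symbol in seq:
        if symbol not in IUPAC_DNA:
            # Assumption: symbols outside the table are matched literally.
            parts.append(re.escape(symbol))
            continue
        bases = IUPAC_DNA[symbol]
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)
```

The resulting pattern can be passed directly to `re.fullmatch` or `re.search` to test concrete sequences against a degenerate one.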

valid_on_alphabet(sequence[, alphabet])

Returns True if sequence contains only items in alphabet.

verify_sequence(seq[, gaps_allowed, ...])

Checks whether sequence is valid on the default alphabet.

what_ambiguity(motifs)

Returns the code that represents all of 'motifs' and as few others as possible.
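Finding the least degenerate code that covers a set of motifs can be sketched as a search over an ambiguity table. This is a plain-Python illustration under stated assumptions: the `IUPAC_DNA` table is written out by hand, the free function `what_ambiguity` only mirrors the method's name, and raising ValueError for uncoverable input is a hypothetical choice.

```python
# Standard IUPAC nucleotide ambiguity codes, written out by hand for this sketch.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def what_ambiguity(motifs):
    """Return the code covering all motifs while allowing the fewest bases."""
    wanted = set(motifs)
    # Keep every code whose base set is a superset of the wanted bases,
    # paired with its size so min() picks the least degenerate one.
    candidates = [
        (len(bases), code)
        for code, bases in IUPAC_DNA.items()
        if wanted <= set(bases)
    ]
    if not candidates:
        raise ValueError(f"no code covers {sorted(wanted)}")
    return min(candidates)[1]
```

Because the IUPAC table assigns exactly one code to each non-empty subset of A, C, G and T, the minimum is unique for valid DNA input.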

coerce_str

to_rich_dict