cogent3.core.alphabet.Alphabet

class Alphabet(motifset, gap='-', moltype=None)

An ordered set of fixed-length strings, e.g. the 61 sense codons.

Ambiguities (e.g. N for any base in DNA) are not considered part of the alphabet itself, although a sequence that contains ambiguities known to the alphabet is still valid on it. A gap is considered a separate motif and is likewise not part of the alphabet itself.

The typical use is for the Alphabet to hold nucleic acid bases, amino acids, or codons.

The moltype, if supplied, handles ambiguities, coercion of the sequence to the correct data type, and complementation (if appropriate).
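The distinction drawn above between motifs, ambiguities, and the gap can be pictured with a small plain-Python sketch. It does not import or reproduce cogent3; the motif ordering, the ambiguity table, and both helper functions are assumptions made for illustration only.

```python
# Plain-Python illustration only; not cogent3's implementation.
motifs = ("T", "C", "A", "G")       # assumed ordering of the four DNA bases
gap = "-"                           # the gap motif is held apart from the motifs
ambiguities = {"N": set(motifs)}    # hypothetical table; in cogent3 the moltype handles this


def contains_only_motifs(seq):
    """True if every character is a motif of the alphabet itself."""
    return all(ch in motifs for ch in seq)


def accepts_known_ambiguities(seq):
    """True if every character is a motif or an ambiguity known to the alphabet."""
    allowed = set(motifs) | set(ambiguities)
    return all(ch in allowed for ch in seq)


strict = contains_only_motifs("ACGN")        # False: N is not a motif
lenient = accepts_known_ambiguities("ACGN")  # True: N is a known ambiguity
```

In this sketch the gap character is neither a motif nor an ambiguity, which mirrors the statement that the gap is a separate motif outside the alphabet.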

Attributes:
moltype

Methods

count(value, /)

Return number of occurrences of value.

from_indices(data)

Returns sequence of elements from sequence of indices.

get_gap_motif()

Returns the motif that self is using as a gap.

get_matched_array(motifs[, dtype])

Deprecated; use the function in evolve.likelihood_tree instead.

get_motif_len()

Returns the length of the items in self, or None if they differ.

get_subset(motif_subset[, excluded])

Returns a new Alphabet object containing a subset of motifs in self.

get_word_alphabet(word_length)

Returns a new Alphabet object with items as word_length strings.

includes_gap_motif()

Returns True if self includes the gap motif, False otherwise.

index(item)

Returns the index of a specified item.

is_valid(seq)

Returns True if seq contains only items in self.

resolve_ambiguity(ambig_motif)

Deprecated; use the method on MolType instead.

to_indices(data)

Returns sequence of indices from sequence of elements.

to_json()

Returns the result as a JSON-formatted string.

with_gap_motif()

Returns an Alphabet object resembling self but including the gap.

AlphabetError

to_rich_dict
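The index conversion and word-alphabet methods listed above can be sketched in plain Python. This is a stand-in written for illustration under stated assumptions, not the cogent3 code: the three functions below are hypothetical helpers that borrow the method names, and the motif ordering is assumed.

```python
from itertools import product

# Plain-Python stand-ins; these are NOT the cogent3 methods themselves.
motifs = ("T", "C", "A", "G")  # assumed ordering, for illustration


def to_indices(data, alphabet=motifs):
    """Map each element to its position in the ordered alphabet."""
    lookup = {motif: i for i, motif in enumerate(alphabet)}
    return [lookup[item] for item in data]


def from_indices(indices, alphabet=motifs):
    """Map each index back to the element at that position."""
    return [alphabet[i] for i in indices]


def word_alphabet(word_length, alphabet=motifs):
    """Build every fixed-length word over the alphabet, in alphabet order."""
    return tuple("".join(p) for p in product(alphabet, repeat=word_length))


indices = to_indices("GATC")        # [3, 2, 0, 1]
round_trip = from_indices(indices)  # ['G', 'A', 'T', 'C']
triplets = word_alphabet(3)         # 64 words, from 'TTT' to 'GGG'
```

Because the alphabet is ordered, the two conversions are inverses of each other, and a word alphabet of length 3 over four bases has 4**3 = 64 entries (all triplets, including the stop codons that a sense-codon alphabet would exclude).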