cogent3.core.alignment.Alignment

class Alignment(*args, **kwargs)

An annotatable alignment class

Attributes
annotation_db
num_seqs

Returns the number of sequences in the alignment.

positions

Iterates over positions in the alignment, in order.

seqs

Methods

add_feature(*, biotype, name, spans[, ...])

add feature on named sequence, or on the alignment itself

add_from_ref_aln(ref_aln[, before_name, ...])

Insert sequence(s) to self based on their alignment to a reference sequence.

add_seqs(other[, before_name, after_name])

Returns new object of class self with sequences from other added.

alignment_quality([app_name])

Computes the alignment quality using the indicated app

annotate_from_gff(f[, seq_ids])

copies annotations from a gff file to a sequence in self

apply_pssm([pssm, path, background, ...])

scores sequences using the specified pssm

coevolution([method, segments, drawable, ...])

performs pairwise coevolution measurement

copy()

Returns deep copy of self.

copy_annotations(seq_db)

copy annotations into attached annotation db

count_gaps_per_pos([include_ambiguity])

return counts of gaps per position as a DictArray
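
To illustrate what a per-position gap count means, here is a minimal sketch that works on a plain dict of equal-length strings rather than on an Alignment. The helper name gap_counts_per_pos and the choice of "-" as the only gap character are assumptions made for this example; the cogent3 method returns a DictArray, not a list.

```python
def gap_counts_per_pos(seqs, gap_chars="-"):
    """Count gap characters in each column of equal-length sequences.

    Hypothetical helper for illustration; not part of cogent3.
    """
    columns = zip(*seqs.values())  # one tuple of characters per column
    return [sum(ch in gap_chars for ch in col) for col in columns]


aln = {"s1": "AC-GT", "s2": "A--GT", "s3": "ACTG-"}
print(gap_counts_per_pos(aln))  # [0, 1, 2, 0, 1]
```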

count_gaps_per_seq([induced_by, unique, ...])

return counts of gaps per sequence as a DictArray

counts([motif_length, include_ambiguity, ...])

counts of motifs

counts_per_pos([motif_length, ...])

return DictArray of counts per position

counts_per_seq([motif_length, ...])

counts of non-overlapping motifs per sequence

deepcopy([sliced])

returns deep copy of self.

degap(**kwargs)

Returns copy in which sequences have no gaps.

distance_matrix([calc, show_progress, ...])

Returns pairwise distances between sequences.

dotplot([name1, name2, window, threshold, ...])

make a dotplot between specified sequences.

entropy_per_pos([motif_length, ...])

returns shannon entropy per position
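
As a rough illustration of the quantity involved, the sketch below computes Shannon entropy for each column of a plain dict of strings. It assumes base-2 logarithms and counts every character as a state; the helper name column_entropy is hypothetical, and the cogent3 method may treat gaps, ambiguity codes and motif_length differently.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the characters in one column.

    Hypothetical helper for illustration; not part of cogent3.
    """
    counts = Counter(column)
    total = sum(counts.values())
    probs = [n / total for n in counts.values()]
    return sum(-p * log2(p) for p in probs)


aln = {"s1": "AAC", "s2": "AGC", "s3": "ATG", "s4": "ACG"}
entropies = [column_entropy(col) for col in zip(*aln.values())]
print(entropies)  # [0.0, 2.0, 1.0]
```

An invariant column scores 0 bits, and a column with four equally frequent states scores 2 bits.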

entropy_per_seq([motif_length, ...])

returns the Shannon entropy per sequence

filtered(predicate[, motif_length, ...])

The alignment positions where predicate(column) is true.

get_ambiguous_positions()

Returns dict of seq:{position:char} for ambiguous chars.

get_degapped_relative_to(name)

Remove all columns with gaps in sequence with given name.
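
The effect can be sketched on a plain dict of strings: columns where the named sequence has a gap are dropped from every sequence. The helper name degapped_relative_to and the use of "-" as the only gap character are assumptions for this example, not cogent3's implementation.

```python
def degapped_relative_to(seqs, name, gap_chars="-"):
    """Drop every column in which seqs[name] has a gap.

    Hypothetical helper for illustration; not part of cogent3.
    """
    keep = [i for i, ch in enumerate(seqs[name]) if ch not in gap_chars]
    return {n: "".join(s[i] for i in keep) for n, s in seqs.items()}


aln = {"ref": "AC--GT", "other": "ACTTG-"}
result = degapped_relative_to(aln, "ref")
print(result)  # {'ref': 'ACGT', 'other': 'ACG-'}
```

Gaps in the other sequences are retained, so the result is still an alignment in the reference's coordinates.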

get_drawable(*[, biotype, width, vertical])

make a figure from sequence features

get_drawables(*[, biotype])

returns a dict of drawables, keyed by type

get_features(*[, seqid, biotype, name, ...])

yields Feature instances

get_gap_array([include_ambiguity])

returns bool array with gap state True, False otherwise

get_gapped_seq(seq_name[, recode_gaps, moltype])

Return a gapped Sequence object for the specified seqname.

get_identical_sets([mask_degen])

returns sets of names for sequences that are identical
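
A minimal sketch of the grouping idea, using a plain dict of strings: names are collected under their exact sequence string, and only groups with more than one member are reported. The helper name identical_sets is hypothetical, and this sketch ignores the mask_degen option entirely.

```python
from collections import defaultdict


def identical_sets(seqs):
    """Group sequence names whose strings are exactly equal.

    Hypothetical helper for illustration; not part of cogent3.
    """
    groups = defaultdict(set)
    for name, seq in seqs.items():
        groups[seq].add(name)
    # Keep only groups that contain at least two names.
    return [names for names in groups.values() if len(names) > 1]


aln = {"a": "ACGT", "b": "ACGT", "c": "AC-T", "d": "AC-T", "e": "TTTT"}
sets = identical_sets(aln)
print(sets)
```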

get_lengths([include_ambiguity, allow_gap])

returns {name: seq length, ...}

get_motif_probs([alphabet, ...])

Return a dictionary of motif probs, calculated as the averaged frequency across sequences.

get_position_indices(f[, native, negate])

Returns list of column indices for which f(col) is True.

get_projected_feature(*, seqid, feature)

returns an alignment feature projected onto the seqid sequence

get_projected_features(*, seqid, **kwargs)

projects all features from other sequences onto seqid

get_seq(seqname)

Return an ungapped Sequence object for the specified seqname.

get_seq_indices(f[, negate])

Returns list of keys of seqs where f(row) is True.

get_similar(target[, min_similarity, ...])

Returns new Alignment containing sequences similar to target.

get_translation([gc, incomplete_ok, ...])

translate from nucleic acid to protein

has_terminal_stop([gc, strict, allow_partial])

Returns True if any sequence has a terminal stop codon.

has_terminal_stops(**kwargs)

deprecated

information_plot([width, height, window, ...])

plot information per position

is_ragged()

Returns True if alignment has sequences of different lengths.

iter_positions([pos_order])

Iterates over positions in the alignment, in order.

iter_selected([seq_order, pos_order])

Iterates over elements in the alignment.

iter_seqs([seq_order])

Iterates over values (sequences) in the alignment, in order.

iupac_consensus([alphabet, allow_gaps])

Returns string containing IUPAC consensus sequence of the alignment.

majority_consensus()

Returns list containing most frequent item at each position.
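
The idea can be sketched with collections.Counter on a plain dict of strings. The helper name majority_consensus_sketch is hypothetical; how ties are broken here (first character seen wins) is a property of this sketch and is not claimed for cogent3.

```python
from collections import Counter


def majority_consensus_sketch(seqs):
    """Return the most frequent character in each column as a list.

    Hypothetical helper for illustration; not part of cogent3.
    """
    return [Counter(col).most_common(1)[0][0] for col in zip(*seqs.values())]


aln = {"s1": "ACGT", "s2": "ACGA", "s3": "TCGA"}
consensus = majority_consensus_sketch(aln)
print(consensus)  # ['A', 'C', 'G', 'A']
```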

make_feature(*, feature[, on_alignment])

create a feature on named sequence, or on the alignment itself

matching_ref(ref_name, gap_fraction, gap_run)

Returns new alignment with seqs well aligned with a reference.

no_degenerates([motif_length, allow_gap])

returns new alignment without degenerate characters

omit_bad_seqs([quantile])

Returns new alignment without sequences with a number of uniquely introduced gaps exceeding quantile

omit_gap_pos([allowed_gap_frac, motif_length])

Returns new alignment where all cols (motifs) have <= allowed_gap_frac gaps.
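
A minimal sketch of the filtering rule on a plain dict of strings: a column is kept when its fraction of gap characters is at most allowed_gap_frac. The helper name omit_gap_columns and the "-" gap character are assumptions; the sketch handles single columns only and does not model motif_length.

```python
def omit_gap_columns(seqs, allowed_gap_frac, gap_chars="-"):
    """Keep columns whose gap fraction is <= allowed_gap_frac.

    Hypothetical helper for illustration; not part of cogent3.
    """
    num_seqs = len(seqs)
    columns = list(zip(*seqs.values()))
    keep = [
        i
        for i, col in enumerate(columns)
        if sum(ch in gap_chars for ch in col) / num_seqs <= allowed_gap_frac
    ]
    return {n: "".join(s[i] for i in keep) for n, s in seqs.items()}


aln = {"s1": "AC-GT", "s2": "A--GT", "s3": "ACTG-"}
print(omit_gap_columns(aln, 0.5))  # {'s1': 'ACGT', 's2': 'A-GT', 's3': 'ACG-'}
print(omit_gap_columns(aln, 0.0))  # {'s1': 'AG', 's2': 'AG', 's3': 'AG'}
```

With a threshold of 0.5 only the third column (two gaps out of three) is removed; a threshold of 0.0 keeps only gap-free columns.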

omit_gap_runs([allowed_run])

Returns new alignment where all seqs have runs of gaps <= allowed_run.

omit_gap_seqs([allowed_gap_frac])

Returns new alignment with seqs that have a gap fraction <= allowed_gap_frac.

pad_seqs([pad_length])

Returns copy in which sequences are padded to same length.

probs_per_pos([motif_length, ...])

returns MotifFreqsArray per position

probs_per_seq([motif_length, ...])

return MotifFreqsArray per sequence

project_annotation(seqid, annot)

projects the alignment coordinate annotation onto seq

quick_tree([calc, bootstrap, drop_invalid, ...])

Returns a phylogenetic tree estimated from pairwise distances between sequences.

rc()

Returns the reverse complement alignment

rename_seqs(renamer)

returns new instance with sequences renamed

replace_seqs(seqs[, aa_to_codon])

Returns new alignment with same shape but with data taken from seqs.

reverse_complement()

Returns the reverse complement alignment.
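
For DNA, reverse complementing an alignment means complementing every base and reversing the column order, with gaps staying in their (reversed) columns. The sketch below shows this on a plain dict of strings; the helper name reverse_complement_sketch is hypothetical and the translation table covers only A, C, G and T, not ambiguity codes.

```python
# Translation table for unambiguous DNA bases; other characters pass through.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement_sketch(seqs):
    """Complement each base, then reverse each sequence.

    Hypothetical helper for illustration; not part of cogent3.
    """
    return {n: s.translate(COMPLEMENT)[::-1] for n, s in seqs.items()}


aln = {"s1": "AAC-G", "s2": "ATCGG"}
rc = reverse_complement_sketch(aln)
print(rc)  # {'s1': 'C-GTT', 's2': 'CCGAT'}
```

Applying the operation twice returns the original sequences.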

sample([n, with_replacement, motif_length, ...])

Returns random sample of positions from self.

seqlogo([width, height, wrap, vspace, colours])

returns Drawable sequence logo using mutual information

set_repr_policy([num_seqs, num_pos, ...])

specify policy for repr(self)

sliding_windows(window, step[, start, end])

Generator yielding new alignments of given length and interval.
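
A sketch of the windowing pattern on a plain dict of strings. It assumes that start and end bound the window start positions and that, when end is not given, the last window is the final one that fits completely; both are assumptions of this example rather than confirmed cogent3 behaviour. The helper name sliding_windows_sketch is hypothetical.

```python
def sliding_windows_sketch(seqs, window, step, start=0, end=None):
    """Yield sub-alignments of `window` columns, advancing by `step`.

    Hypothetical helper for illustration; not part of cogent3.
    """
    length = len(next(iter(seqs.values())))
    if end is None:
        end = length - window + 1  # last start that leaves a full window
    for pos in range(start, end, step):
        yield {n: s[pos : pos + window] for n, s in seqs.items()}


aln = {"s1": "ACGTAC", "s2": "TTGGCC"}
windows = list(sliding_windows_sketch(aln, window=3, step=2))
print(windows)
# [{'s1': 'ACG', 's2': 'TTG'}, {'s1': 'GTA', 's2': 'GGC'}]
```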

strand_symmetry([motif_length])

returns dict of strand symmetry test results per seq

take_positions(cols[, negate])

Returns new Alignment containing only specified positions.
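
The selection, including the effect of negate, can be sketched on a plain dict of strings. The helper name take_positions_sketch is hypothetical; with negate=True this sketch keeps every column that is not listed, which is an assumption about the flag's meaning based on its name.

```python
def take_positions_sketch(seqs, cols, negate=False):
    """Keep only the listed columns, or all other columns if negate is True.

    Hypothetical helper for illustration; not part of cogent3.
    """
    if negate:
        excluded = set(cols)
        length = len(next(iter(seqs.values())))
        cols = [i for i in range(length) if i not in excluded]
    return {n: "".join(s[i] for i in cols) for n, s in seqs.items()}


aln = {"s1": "ACGT", "s2": "TGCA"}
print(take_positions_sketch(aln, [0, 2]))               # {'s1': 'AG', 's2': 'TC'}
print(take_positions_sketch(aln, [0, 2], negate=True))  # {'s1': 'CT', 's2': 'GA'}
```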

take_positions_if(f[, negate])

Returns new Alignment containing cols where f(col) is True.

take_seqs(seqs[, negate])

Returns new Alignment containing only specified seqs.

take_seqs_if(f[, negate])

Returns new Alignment containing seqs where f(row) is True.

to_dict()

Returns the alignment as dict of names -> strings.

to_dna()

returns copy of self as an alignment of DNA moltype seqs

to_fasta()

Return alignment in Fasta format

to_html([name_order, wrap, limit, ref_name, ...])

returns html with embedded styles for sequence colouring

to_json()

returns json formatted string

to_moltype(moltype)

returns copy of self with moltype seqs

to_nexus(seq_type[, wrap])

Return alignment in NEXUS format and mapping to sequence ids

to_phylip()

Return alignment in PHYLIP format and mapping to sequence ids

to_pretty([name_order, wrap])

returns a string representation of the alignment in pretty print format

to_protein()

returns copy of self as an alignment of PROTEIN moltype seqs

to_rich_dict()

returns detailed content including info and moltype attributes

to_rna()

returns copy of self as an alignment of RNA moltype seqs

to_type([array_align, moltype, alphabet])

returns alignment of type indicated by array_align

trim_stop_codons([gc, strict, allow_partial])

Removes any terminal stop codons from the sequences

variable_positions([include_gap_motif])

Return a list of variable position indexes.
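
A column is variable when it contains more than one distinct state. The sketch below applies that test to a plain dict of strings; the helper name variable_positions_sketch is hypothetical, and it treats every character, gaps included, as a state, so it does not model the include_gap_motif option.

```python
def variable_positions_sketch(seqs):
    """Return indices of columns holding more than one distinct character.

    Hypothetical helper for illustration; not part of cogent3.
    """
    columns = zip(*seqs.values())
    return [i for i, col in enumerate(columns) if len(set(col)) > 1]


aln = {"s1": "ACGT", "s2": "ACGA", "s3": "TCGA"}
print(variable_positions_sketch(aln))  # [0, 3]
```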

with_gaps_from(template)

Same alignment but overwritten with the gaps from 'template'

with_masked_annotations(biotypes[, ...])

returns an alignment with regions of the given biotypes replaced by mask_char if shadow is False; otherwise all other regions are masked.

with_modified_termini()

Changes the termini to include termini char instead of gapmotif.

write([filename, format])

Write the alignment to a file, preserving order of sequences.

gapped_by_map

get_annotations_from_any_seq

get_annotations_from_seq

get_by_seq_annotation

get_projected_annotations