cogent3.core.alignment.Alignment
class Alignment(*args, **kwargs)
An annotatable alignment class
Attributes:
- annotation_db
- named_seqs
- num_seqs: Returns the number of sequences in the alignment.
- positions: Iterates over positions in the alignment, in order.
- seqs
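To make the attribute descriptions concrete, here is a minimal plain-Python sketch of the data model they suggest: an alignment as equal-length gapped strings keyed by name. It does not use cogent3. The `aln` dict and the `iter_columns` helper are hypothetical stand-ins for `named_seqs`, `num_seqs` and `positions`, not the library's implementation.

```python
# Hypothetical stand-in for an alignment: name -> gapped sequence string.
aln = {"s1": "ACG-T", "s2": "ACGAT", "s3": "A-GAT"}

# Analogous to Alignment.num_seqs: the number of sequences (rows).
num_seqs = len(aln)


def iter_columns(seqs):
    """Yield each alignment column as a list, in left-to-right order."""
    for col in zip(*seqs.values()):
        yield list(col)


# Analogous to iterating Alignment.positions: one item per column.
columns = list(iter_columns(aln))
print(num_seqs)
print(columns[1])
```

Each column holds one character per sequence, in the order the sequences were added to the dict.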
Methods
- add_feature(*, biotype, name, spans[, ...]): add feature on named sequence, or on the alignment itself
- add_from_ref_aln(ref_aln[, before_name, ...]): Insert sequence(s) to self based on their alignment to a reference sequence.
- add_seqs(other[, before_name, after_name]): Returns new object of class self with sequences from other added.
- alignment_quality([app_name]): Computes the alignment quality using the indicated app
- annotate_from_gff(f[, seq_ids]): copies annotations from a gff file to a sequence in self
- apply_pssm([pssm, path, background, ...]): scores sequences using the specified pssm
- coevolution([method, segments, drawable, ...]): performs pairwise coevolution measurement
- copy(): Returns deep copy of self.
- copy_annotations(seq_db): copy annotations into attached annotation db
- count_gaps_per_pos([include_ambiguity]): return counts of gaps per position as a DictArray
- count_gaps_per_seq([induced_by, unique, ...]): return counts of gaps per sequence as a DictArray
- counts([motif_length, include_ambiguity, ...]): counts of motifs
- counts_per_pos([motif_length, ...]): return DictArray of counts per position
- counts_per_seq([motif_length, ...]): counts of non-overlapping motifs per sequence
- deepcopy([sliced]): returns deep copy of self.
- degap(**kwargs): Returns copy in which sequences have no gaps.
- distance_matrix([calc, show_progress, ...]): Returns pairwise distances between sequences.
- dotplot([name1, name2, window, threshold, ...]): make a dotplot between specified sequences.
- entropy_per_pos([motif_length, ...]): returns Shannon entropy per position
- entropy_per_seq([motif_length, ...]): returns the Shannon entropy per sequence
- filtered(predicate[, motif_length, ...]): The alignment positions where predicate(column) is true.
- get_ambiguous_positions(): Returns dict of seq:{position:char} for ambiguous chars.
- get_degapped_relative_to(name): Remove all columns with gaps in sequence with given name.
- get_drawable(*[, biotype, width, vertical]): make a figure from sequence features
- get_drawables(*[, biotype]): returns a dict of drawables, keyed by type
- get_features(*[, seqid, biotype, name, ...]): yields Feature instances
- get_gap_array([include_ambiguity]): returns bool array with gap state True, False otherwise
- get_gapped_seq(seq_name[, recode_gaps]): Return a gapped Sequence object for the specified seqname.
- get_identical_sets([mask_degen]): returns sets of names for sequences that are identical
- get_lengths([include_ambiguity, allow_gap]): returns {name: seq length, ...}
- get_motif_probs([alphabet, ...]): Return a dictionary of motif probs, calculated as the averaged frequency across sequences.
- get_position_indices(f[, native, negate]): Returns list of column indices for which f(col) is True.
- get_projected_feature(*, seqid, feature): returns an alignment feature projected onto the seqid sequence
- get_projected_features(*, seqid, **kwargs): projects all features from other sequences onto seqid
- get_seq(seqname): Return an ungapped Sequence object for the specified seqname.
- get_seq_indices(f[, negate]): Returns list of keys of seqs where f(row) is True.
- get_similar(target[, min_similarity, ...]): Returns new Alignment containing sequences similar to target.
- get_translation([gc, incomplete_ok, ...]): translate from nucleic acid to protein
- has_terminal_stop([gc, strict]): Returns True if any sequence has a terminal stop codon.
- information_plot([width, height, window, ...]): plot information per position
- is_ragged(): Returns True if alignment has sequences of different lengths.
- iter_positions([pos_order]): Iterates over positions in the alignment, in order.
- iter_selected([seq_order, pos_order]): Iterates over elements in the alignment.
- iter_seqs([seq_order]): Iterates over values (sequences) in the alignment, in order.
- iupac_consensus([alphabet, allow_gaps]): Returns string containing IUPAC consensus sequence of the alignment.
- majority_consensus(): Returns list containing most frequent item at each position.
- make_feature(*, feature[, on_alignment]): create a feature on named sequence, or on the alignment itself
- matching_ref(ref_name, gap_fraction, gap_run): Returns new alignment with seqs well aligned with a reference.
- no_degenerates([motif_length, allow_gap]): returns new alignment without degenerate characters
- omit_bad_seqs([quantile]): Returns new alignment without sequences with a number of uniquely introduced gaps exceeding quantile
- omit_gap_pos([allowed_gap_frac, motif_length]): Returns new alignment where all cols (motifs) have <= allowed_gap_frac gaps.
- omit_gap_runs([allowed_run]): Returns new alignment where all seqs have runs of gaps <= allowed_run.
- omit_gap_seqs([allowed_gap_frac]): Returns new alignment with seqs that have <= allowed_gap_frac.
- pad_seqs([pad_length]): Returns copy in which sequences are padded to same length.
- probs_per_pos([motif_length, ...]): returns MotifFreqsArray per position
- probs_per_seq([motif_length, ...]): return MotifFreqsArray per sequence
- quick_tree([calc, bootstrap, drop_invalid, ...]): Returns a phylogenetic tree built from pairwise distances between sequences.
- rc(): Returns the reverse complement alignment
- rename_seqs(renamer): returns new instance with sequences renamed
- replace_seqs(seqs[, aa_to_codon]): Returns new alignment with same shape but with data taken from seqs.
- reverse_complement(): Returns the reverse complement alignment.
- sample([n, with_replacement, motif_length, ...]): Returns random sample of positions from self.
- seqlogo([width, height, wrap, vspace, colours]): returns Drawable sequence logo using mutual information
- set_repr_policy([num_seqs, num_pos, ...]): specify policy for repr(self)
- sliding_windows(window, step[, start, end]): Generator yielding new alignments of given length and interval.
- strand_symmetry([motif_length]): returns dict of strand symmetry test results per seq
- take_positions(cols[, negate]): Returns new Alignment containing only specified positions.
- take_positions_if(f[, negate]): Returns new Alignment containing cols where f(col) is True.
- take_seqs(seqs[, negate]): Returns new Alignment containing only specified seqs.
- take_seqs_if(f[, negate]): Returns new Alignment containing seqs where f(row) is True.
- to_dict(): Returns the alignment as dict of names -> strings.
- to_dna(): returns copy of self as an alignment of DNA moltype seqs
- to_fasta(): Return alignment in Fasta format
- to_html([name_order, wrap, limit, ref_name, ...]): returns html with embedded styles for sequence colouring
- to_json(): returns json formatted string
- to_moltype(moltype): returns copy of self with moltype seqs
- to_nexus(seq_type[, wrap]): Return alignment in NEXUS format and mapping to sequence ids
- to_phylip(): Return alignment in PHYLIP format and mapping to sequence ids
- to_pretty([name_order, wrap]): returns a string representation of the alignment in pretty print format
- to_protein(): returns copy of self as an alignment of PROTEIN moltype seqs
- to_rich_dict(): returns detailed content including info and moltype attributes
- to_rna(): returns copy of self as an alignment of RNA moltype seqs
- to_type([array_align, moltype, alphabet]): returns alignment of type indicated by array_align
- trim_stop_codons([gc, strict]): Removes any terminal stop codons from the sequences
- variable_positions([include_gap_motif]): Return a list of variable position indexes.
- with_gaps_from(template): Same alignment but overwritten with the gaps from 'template'
- with_masked_annotations(biotypes[, ...]): returns an alignment in which regions annotated with the given biotypes are replaced by mask_char if shadow is False; otherwise all other regions are masked.
- with_modified_termini(): Changes the termini to include termini char instead of gapmotif.
- write([filename, format]): Write the alignment to a file, preserving order of sequences.
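Several of the one-line summaries above describe column-wise operations. The following plain-Python sketch illustrates what three of them compute, based only on those summaries: filtering columns by gap fraction (as in `omit_gap_pos`), finding variable columns (as in `variable_positions`), and taking the most frequent character per column (as in `majority_consensus`). It does not call cogent3. The function bodies, the dict-of-strings alignment, the treatment of `-` as the gap character, and the choice to count a gap as a distinct state are all assumptions made for illustration, and the library's defaults and return types may differ.

```python
from collections import Counter

GAP = "-"  # assumed gap character for this sketch


def columns(seqs):
    """Return the alignment columns as a list of tuples."""
    return list(zip(*seqs.values()))


def omit_gap_pos(seqs, allowed_gap_frac):
    """Keep only columns whose gap fraction is <= allowed_gap_frac."""
    n = len(seqs)
    keep = [
        i
        for i, col in enumerate(columns(seqs))
        if col.count(GAP) / n <= allowed_gap_frac
    ]
    return {name: "".join(s[i] for i in keep) for name, s in seqs.items()}


def variable_positions(seqs):
    """Indices of columns with more than one distinct character.

    Assumption: a gap counts as a distinct state.
    """
    return [i for i, col in enumerate(columns(seqs)) if len(set(col)) > 1]


def majority_consensus(seqs):
    """Most frequent character in each column, as a list."""
    return [Counter(col).most_common(1)[0][0] for col in columns(seqs)]


aln = {"s1": "ACG-T", "s2": "ACGAT", "s3": "A-GAT"}

no_gaps = omit_gap_pos(aln, allowed_gap_frac=0.0)
variable = variable_positions(aln)
consensus = majority_consensus(aln)
print(no_gaps)
print(variable)
print(consensus)
```

With `allowed_gap_frac=0.0`, the two columns containing a gap are dropped. Raising the threshold to `0.5` would keep every column in this example, because no column is more than one-third gaps.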
gapped_by_map