4.1. Topology readers — MDAnalysis.topology

This submodule contains the topology readers. A topology file supplies the list of atoms in the system, their connectivity and possibly additional information such as B-factors, partial charges, etc. The details depend on the file format and not every topology file provides all (or even any) additional data. As a minimum, a topology file has to contain the names of atoms in the order of the coordinate file and their residue names and numbers.

The following table lists the currently supported topology formats.

Table of Supported topology formats
| Name | Extension | Remarks |
| --- | --- | --- |
| CHARMM/XPLOR | psf | reads either format; atom, bond, angle, and torsion/dihedral information is all used; MDAnalysis.topology.PSFParser |
| CHARMM [1] | crd | “CARD” coordinate output from CHARMM; deals with either standard or EXTended format; MDAnalysis.topology.CRDParser |
| Brookhaven [1] | pdb | a simplified PDB format (as used in MD simulations) is read by default; the full format can be read by supplying the permissive=False flag to MDAnalysis.Universe; MDAnalysis.topology.PrimitivePDBParser and MDAnalysis.topology.PDBParser |
| XPDB | pdb | Extended PDB format (can use 5-digit residue numbers). To use, specify the format “XPDB” explicitly: Universe(..., topology_format="XPDB"). Module: MDAnalysis.coordinates.PDB |
| PQR [1] | pqr | PDB-like but whitespace-separated files with charge and radius information; MDAnalysis.topology.PQRParser |
| PDBQT [1] | pdbqt | file format used by AutoDock with atom types t and partial charges q; MDAnalysis.topology.PDBQTParser |
| GROMOS96 [1] | gro | GROMOS96 coordinate file; MDAnalysis.topology.GROParser |
| AMBER | top, prmtop | simple AMBER format reader (only supports a subset of flags); MDAnalysis.topology.TOPParser |
| DESRES [1] | dms | DESRES molecular structure reader (only supports the atom and bond records); MDAnalysis.topology.DMSParser |
| TPR | tpr | Gromacs portable run input reader (limited experimental support for some of the more recent versions of the file format); MDAnalysis.topology.TPRParser |
| MOL2 | mol2 | Tripos MOL2 molecular structure format; MDAnalysis.topology.MOL2Parser |
| LAMMPS | data | LAMMPS data file parser; MDAnalysis.topology.LAMMPSParser |
| XYZ [1] | xyz | XYZ file parser; reads only the atom labels and constructs minimal topology data; MDAnalysis.topology.XYZParser |
| GMS | gms, log | GAMESS output parser; reads only the atoms of the assembly section (atom, elems and coords) and constructs the topology; MDAnalysis.topology.GMSParser |
[1] This format can also be used to provide coordinates, so a full Universe can be created by providing a file of this format as the sole argument to Universe: u = Universe(filename)
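The extension column of the table can be read as a lookup from file suffix to parser module. The sketch below illustrates that mapping only: `PARSER_BY_EXTENSION` and `guess_parser_name` are hypothetical names made up for this example and are not part of MDAnalysis, which performs its own format detection.

```python
import os

# Hypothetical lookup built from the table of supported formats;
# this is an illustration, not an MDAnalysis API.
PARSER_BY_EXTENSION = {
    "psf": "PSFParser",
    "crd": "CRDParser",
    "pdb": "PrimitivePDBParser",  # simplified format is the default per the table
    "pqr": "PQRParser",
    "pdbqt": "PDBQTParser",
    "gro": "GROParser",
    "top": "TOPParser",
    "prmtop": "TOPParser",
    "dms": "DMSParser",
    "tpr": "TPRParser",
    "mol2": "MOL2Parser",
    "data": "LAMMPSParser",
    "xyz": "XYZParser",
    "gms": "GMSParser",
    "log": "GMSParser",
}


def guess_parser_name(filename):
    """Return the parser name suggested by the file extension, or None."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return PARSER_BY_EXTENSION.get(ext)
```

The XPDB format shares the pdb extension, which is why the table asks for it to be selected explicitly rather than guessed.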

4.1.1. Developer Notes

New in version 0.8.

Topology information consists of data that do not change over time, i.e. information that is the same for all time steps of a trajectory. This includes

  • identity of atoms (name, type, number, partial charge, ...) and to which residue and segment they belong; atoms are identified in MDAnalysis by their number, an integer number starting at 0 and incremented in the order of atoms found in the topology.
  • bonds (pairs of atoms)
  • angles (triplets of atoms)
  • dihedral angles (quadruplets of atoms) — proper and improper dihedrals should be treated separately

At the moment, only the identity of atoms is mandatory; at its most basic, the topology is simply a list of atoms to be associated with a list of coordinates.

The current implementation contains submodules for different topology file types. Each submodule must contain a function parse().

The function returns the basic MDAnalysis representation of the topology. At the moment, this is simply a dictionary with keys _atoms, _bonds, _angles, _dihe, _impr. The dictionary is stored as MDAnalysis.AtomGroup.Universe._psf.
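As a rough illustration of that return value, the sketch below builds an empty structure dictionary with the five keys named above. `empty_structure` is a hypothetical helper, not an MDAnalysis function, and the container types follow the per-key descriptions in this section.

```python
def empty_structure():
    """Return a structure dict with the documented keys and empty containers."""
    return {
        "_atoms": [],   # list of Atom instances, in topology-file order
        "_bonds": (),   # tuple of 2-tuples of atom numbers
        "_angles": [],  # list of 3-tuples of atom numbers
        "_dihe": [],    # list of 4-tuples (proper dihedrals)
        "_impr": [],    # list of 4-tuples (improper dihedrals)
    }


structure = empty_structure()
```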

Warning

The internal dictionary representation is subject to change. User code should not access this dictionary directly. The information provided here is solely for developers who need to work with the existing parsers.

The format of the individual keys is the following (see PSFParser for a reference implementation):

4.1.1.1. _atoms

The atoms are represented as a list of Atom instances. The parser needs to initialize the Atom objects with the data read from the topology file.

The order of atoms in the list must correspond to the sequence of atoms in the topology file. The atom’s number corresponds to its index in this list.
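The index rule can be illustrated with a stand-in record. `ToyAtom` below is a hypothetical placeholder, not the MDAnalysis Atom class (whose constructor arguments are not described here), and the residue data are made up; the example only shows that an atom's number equals its position in the list.

```python
from collections import namedtuple

# Hypothetical stand-in for illustration; real parsers create Atom instances.
ToyAtom = namedtuple("ToyAtom", ["number", "name", "resname", "resid"])

# Records in the order they would appear in a topology file (made-up data).
records = [("N", "ALA", 1), ("CA", "ALA", 1), ("C", "ALA", 1), ("N", "GLY", 2)]

# Atom numbers start at 0 and follow the file order.
atoms = [ToyAtom(i, name, resname, resid)
         for i, (name, resname, resid) in enumerate(records)]
```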

4.1.1.2. _bonds

Bonds are represented as a tuple of tuples. Each inner tuple contains two atom numbers, which indicate the atoms between which the bond is formed. Only one of the two permutations is stored, typically the one with the lower atom number first.
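A minimal sketch of that convention, assuming the lower-number-first ordering is wanted: `normalize_bonds` is a hypothetical helper (not part of MDAnalysis) that collapses both permutations of a pair into one entry. Sorting the outer tuple is an extra choice made here only to keep the result deterministic.

```python
def normalize_bonds(pairs):
    """Return a tuple of unique (low, high) atom-number pairs."""
    # Sorting each pair makes (1, 0) and (0, 1) identical, so the set
    # keeps only one permutation per bond.
    unique = {tuple(sorted(pair)) for pair in pairs}
    return tuple(sorted(unique))


bonds = normalize_bonds([(1, 0), (0, 1), (2, 1)])
```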

4.1.1.3. _bondorder

Some bonds carry additional information called the bond order. When available, it is stored in a dictionary of the form {bondtuple:order}. This extra information is then passed to Bond initialisation in u._init_bonds().
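The dictionary form might look like the following sketch. The bond orders are made-up values, their type is an assumption (it may differ between parsers), and `order_of` is a hypothetical lookup helper that assumes bonds are keyed with the lower atom number first.

```python
bonds = ((0, 1), (1, 2))

# Made-up bond orders for illustration; keys are the tuples stored in _bonds.
bondorder = {(0, 1): 1, (1, 2): 2}


def order_of(bond, bondorder):
    """Look up the order of a bond, or None when no order was recorded."""
    # Assumes keys are stored with the lower atom number first.
    return bondorder.get(tuple(sorted(bond)))
```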

4.1.1.4. _angles

Angles are represented by a list of tuples. Each tuple contains three atom numbers.

Note

At the moment, the order is not defined and depends on how the topology file defines angles.

4.1.1.5. _dihe

Proper dihedral angles are represented by a list of tuples. Each tuple contains four atom numbers.

Note

At the moment, the order is not defined and depends on how the topology file defines proper dihedrals.

4.1.1.6. _impr

Improper dihedral angles are represented by a list of tuples. Each tuple contains four atom numbers.

Note

At the moment, the order is not defined and depends on how the topology file defines improper dihedrals.
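The per-key descriptions above can be turned into a simple consistency check. The sketch below is a hypothetical helper for parser developers, not part of MDAnalysis: it verifies that each connectivity entry has the expected number of atom numbers and that every number is a valid index into _atoms. The toy data use placeholder strings instead of Atom instances.

```python
# Number of atom numbers expected per entry, from the key descriptions.
EXPECTED_SIZE = {"_bonds": 2, "_angles": 3, "_dihe": 4, "_impr": 4}


def check_structure(structure):
    """Return a list of problems found in a structure dict (empty if none)."""
    problems = []
    n_atoms = len(structure.get("_atoms", []))
    for key, size in EXPECTED_SIZE.items():
        for entry in structure.get(key, ()):
            if len(entry) != size:
                problems.append(
                    "%s: %r should have %d atom numbers" % (key, entry, size))
            elif not all(0 <= n < n_atoms for n in entry):
                problems.append(
                    "%s: %r refers to an unknown atom" % (key, entry))
    return problems


toy = {
    "_atoms": ["a0", "a1", "a2", "a3"],  # placeholders, not Atom instances
    "_bonds": ((0, 1), (1, 2), (2, 3)),
    "_angles": [(0, 1, 2), (1, 2, 3)],
    "_dihe": [(0, 1, 2, 3)],
    "_impr": [],
}
```

Calling `check_structure(toy)` returns an empty list for this data; the check deliberately ignores the order of atoms within a tuple, since that order is not defined.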