PB assignation

Note

This page was originally a Jupyter notebook. You can view an HTML rendering of the notebook or download the notebook itself.

This page demonstrates how to use the API to assign Protein Block (PB) sequences.

from __future__ import print_function, division
from pprint import pprint
import os
import pbxplore as pbx

Use the built-in structure parser

Assign PB for a single structure

The pbxplore.chains_from_files() function is the preferred way to read PDB and PDBx/mmCIF files with PBxplore. It takes a list of file paths as argument and yields each chain it can read from these files. It provides a single interface for PDB and PDBx/mmCIF files, for single-model and multi-model files, and for a single file or a collection of files.

Here we want to read a single file with a single model and a single chain. Therefore, we need the first and only record yielded by pbxplore.chains_from_files(). This record contains a name for the chain and the chain itself as a pbxplore.structure.structure.Chain object. Note that even to read a single file, we must pass it to pbxplore.chains_from_files() as a list.
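As a rough illustration of why the path is wrapped in a list, the sketch below uses a hypothetical stand-in generator, chains_from_paths(), which is not part of PBxplore. It only mimics the pattern described above (iterate over the argument, yield one record per item); it makes no claim about how PBxplore itself reacts to a bare string.

```python
def chains_from_paths(paths):
    """Hypothetical stand-in: yield one (name, path) record per input path."""
    for path in paths:
        # A real reader would parse the file here; this sketch only echoes the path.
        yield 'chain from {}'.format(path), path


reader = chains_from_paths(['1BTA.pdb'])  # a list holding a single path
first_name, first_item = next(reader)     # first (and only) record
print(first_name)

# A bare string is itself iterable, so it would be walked character by character.
wrong = [item for _, item in chains_from_paths('1BTA.pdb')]
print(wrong[:3])
```

With a generator of this kind, next() pulls exactly one record, and a for loop would pull all of them.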

pdb_path = os.path.join(pbx.DEMO_DATA_PATH, '1BTA.pdb')
structure_reader = pbx.chains_from_files([pdb_path])
chain_name, chain = next(structure_reader)
print(chain_name)
print(chain)

Protein Blocks are assigned based on the dihedral angles of the backbone, so we need to calculate them. The pbxplore.structure.structure.Chain.get_phi_psi_angles() method calculates these angles and returns them in a form that can be passed directly to the assignment function.

The dihedral angles are returned as a dictionary. Each key of this dictionary is a residue number, and each value is a dictionary with the phi and psi angles.
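For readers unfamiliar with dihedral angles, here is a minimal sketch of how one such angle can be computed from four points in space. It is not PBxplore's implementation; the function name dihedral() and the test coordinates are made up for illustration. In the usual backbone definition, phi is taken over the atoms C(i-1), N(i), CA(i), C(i) and psi over N(i), CA(i), C(i), N(i+1).

```python
import math


def dihedral(p1, p2, p3, p4):
    """Return the signed dihedral angle, in degrees, defined by four 3D points."""
    def sub(a, b):
        return [a[i] - b[i] for i in range(3)]

    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Bond vectors along the chain of four points.
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    # Normals to the two planes that share the central bond b2.
    n1, n2 = cross(b1, b2), cross(b2, b3)
    norm_b2 = math.sqrt(dot(b2, b2))
    # atan2 keeps the sign of the angle and is stable near 0 and 180 degrees.
    y = dot(cross(n1, n2), b2) / norm_b2
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: the fourth point is rotated a quarter turn around the z axis.
quarter_turn = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(quarter_turn)
```

The first residue of a chain has no preceding C atom and the last one has no following N atom, which is why some angles at the chain ends cannot be computed.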

dihedrals = chain.get_phi_psi_angles()
pprint(dihedrals)

The dihedral angles can be passed to the pbxplore.assign() function, which assigns a Protein Block to each residue and returns the PB sequence as a string. Note that the first two and last two residues are assigned to the Z joker block because some of the dihedral angles they require cannot be calculated.
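The sketch below illustrates where the Z blocks at the chain ends come from. It assumes the usual description of a Protein Block as a window of five residues centered on the residue of interest and described by eight dihedral angles (from psi of residue i-2 to phi of residue i+2). The function assignable_mask() is hypothetical and does not assign real Protein Blocks; it only marks which residues have a complete window.

```python
def assignable_mask(dihedrals):
    """Return 'Z' where the assumed 8-angle window is incomplete, '?' elsewhere."""
    # (offset, angle name) pairs that make up the assumed 5-residue window.
    window = [(-2, 'psi'), (-1, 'phi'), (-1, 'psi'), (0, 'phi'),
              (0, 'psi'), (1, 'phi'), (1, 'psi'), (2, 'phi')]
    mask = []
    for res in sorted(dihedrals):
        # A missing residue or a missing angle both show up as None.
        angles = [dihedrals.get(res + off, {}).get(name) for off, name in window]
        mask.append('Z' if None in angles else '?')
    return ''.join(mask)


# Hypothetical 7-residue chain; the angle values are placeholders, not real data.
toy = {i: {'phi': -60.0, 'psi': -45.0} for i in range(1, 8)}
toy[1]['phi'] = None  # no preceding residue
toy[7]['psi'] = None  # no following residue
mask = assignable_mask(toy)
print(mask)
```

In this toy chain only residues 3 to 5 have all eight angles available, so the two residues at each end would fall back to Z.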

pb_seq = pbx.assign(dihedrals)
print(pb_seq)

Assign PB for several models of a single file

A single PDB file can contain several models. In that case, we do not want to read only the first chain; instead, we iterate over all the chains.

pdb_path = os.path.join(pbx.DEMO_DATA_PATH, '2LFU.pdb')
for chain_name, chain in pbx.chains_from_files([pdb_path]):
    dihedrals = chain.get_phi_psi_angles()
    pb_seq = pbx.assign(dihedrals)
    print('* {}'.format(chain_name))
    print('  {}'.format(pb_seq))

Assign PB for a set of structures

The pbxplore.chains_from_files() function can also handle several chains from several files.

files = [os.path.join(pbx.DEMO_DATA_PATH, pdb_name)
         for pdb_name in ('1BTA.pdb', '2LFU.pdb', '3ICH.pdb')]
print('The following files will be used:')
pprint(files)
for chain_name, chain in pbx.chains_from_files(files):
    dihedrals = chain.get_phi_psi_angles()
    pb_seq = pbx.assign(dihedrals)
    print('* {}'.format(chain_name))
    print('  {}'.format(pb_seq))

Assign PB for frames in a trajectory

PB sequences can be assigned from a trajectory. To do so, we use the pbxplore.chains_from_trajectory() function, which takes the path to a trajectory and the path to the corresponding topology as arguments. Any file format readable by MDAnalysis can be used. Except for its arguments, pbxplore.chains_from_trajectory() works the same way as pbxplore.chains_from_files().

Note: MDAnalysis is required to use this feature.
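Since MDAnalysis is an optional dependency, it can be convenient to check for it before reading a trajectory. The following is a minimal sketch using only the standard library; the helper name has_module() is made up here and is not part of PBxplore.

```python
import importlib.util


def has_module(name):
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


if has_module('MDAnalysis'):
    print('MDAnalysis found: trajectory reading should be available.')
else:
    print('MDAnalysis not found: install it before reading trajectories.')
```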

trajectory = os.path.join(pbx.DEMO_DATA_PATH, 'barstar_md_traj.xtc')
topology = os.path.join(pbx.DEMO_DATA_PATH, 'barstar_md_traj.gro')
for chain_name, chain in pbx.chains_from_trajectory(trajectory, topology):
    dihedrals = chain.get_phi_psi_angles()
    pb_seq = pbx.assign(dihedrals)
    print('* {}'.format(chain_name))
    print('  {}'.format(pb_seq))

Use a different structure parser

Provided the dihedral angles are formatted as expected by pbxplore.assign(), the source of these angles does not matter. For instance, other PDB parsers can be used with PBxplore.
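As a sketch of what this formatting step can look like, the hypothetical helper below (to_pbx_dihedrals() is not a PBxplore function) converts a list of (phi, psi) pairs in radians into a dictionary keyed by residue number, with angles in degrees. It assumes that undefined angles are represented by None and that residues are numbered consecutively from first_resid.

```python
import math


def to_pbx_dihedrals(phi_psi_radians, first_resid=1):
    """Convert [(phi, psi), ...] in radians to {resid: {'phi': deg, 'psi': deg}}."""
    dihedrals = {}
    for resid, (phi, psi) in enumerate(phi_psi_radians, start=first_resid):
        dihedrals[resid] = {
            # Undefined angles stay None; defined ones are converted to degrees.
            'phi': None if phi is None else math.degrees(phi),
            'psi': None if psi is None else math.degrees(psi),
        }
    return dihedrals


# Made-up input for two residues; the values are not taken from a real structure.
example = [(None, math.pi / 2), (-math.pi / 3, None)]
converted = to_pbx_dihedrals(example)
print(converted)
```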

BioPython

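import math

import Bio.PDB

pdb_path = os.path.join(pbx.DEMO_DATA_PATH, "2LFU.pdb")
# Build the parser and the polypeptide builder once, outside the loops.
parser = Bio.PDB.PDBParser()
builder = Bio.PDB.PPBuilder()
for model in parser.get_structure("2LFU", pdb_path):
    for chain in model:
        # A chain with breaks is split into several polypeptides;
        # assign a PB sequence to each of them rather than only to the last one.
        for poly_index, poly in enumerate(builder.build_peptides(chain)):
            dihedrals = {}
            # Bio.PDB returns angles in radians (None when undefined);
            # convert them to degrees before the assignment.
            for resid, (phi, psi) in enumerate(poly.get_phi_psi_list(), start=1):
                if phi is not None:
                    phi = math.degrees(phi)
                if psi is not None:
                    psi = math.degrees(psi)
                dihedrals[resid] = {'phi': phi, 'psi': psi}
            print(model, chain, poly_index)
            pb_seq = pbx.assign(dihedrals)
            print(pb_seq)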