PB assignation
Note
This page was initially a Jupyter notebook. You can see a notebook HTML render of it or download the notebook itself.
This page demonstrates how to use the API to assign Protein Block (PB) sequences.
from __future__ import print_function, division
from pprint import pprint
import os
import pbxplore as pbx
Use the built-in structure parser
Assign PB for a single structure
The pbxplore.chains_from_files() function is the preferred way to
read PDB and PDBx/mmCIF files using PBxplore. This function takes a list
of file paths as argument, and yields each chain it can read from these
files. It provides a single interface to read PDB and PDBx/mmCIF files,
to read single-model and multimodel files, and to read a single file or
a collection of files.
Here we want to read a single file with a single model and a single
chain. Therefore, we need the first and only record that is yielded by
pbxplore.chains_from_files(). This record contains a name for
the chain, and the chain itself as a
pbxplore.structure.structure.Chain object. Note that, even if
we want to read a single file, we need to provide it as a list to
pbxplore.chains_from_files().
pdb_path = os.path.join(pbx.DEMO_DATA_PATH, '1BTA.pdb')
structure_reader = pbx.chains_from_files([pdb_path])
chain_name, chain = next(structure_reader)
print(chain_name)
print(chain)
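Since the reader hands out records one at a time, calling next() on it raises StopIteration when no chain could be read. A minimal sketch of a guard is given here, assuming only that the reader is an iterable of (name, chain) records. The first_record helper and the stand-in reader are hypothetical and not part of PBxplore.

```python
def first_record(records):
    """Return the first (name, chain) record, or fail with a clear message."""
    sentinel = object()
    record = next(iter(records), sentinel)
    if record is sentinel:
        raise ValueError("no chain could be read from the input")
    return record


# Stand-in reader so the sketch runs without PBxplore installed.
def fake_reader():
    yield ("demo chain", "chain-object-placeholder")


demo_name, demo_chain = first_record(fake_reader())
print(demo_name)
```

With PBxplore installed, the same helper could presumably be called as first_record(pbx.chains_from_files([pdb_path])).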
Protein Blocks are assigned based on the dihedral angles of the
backbone, so we need to calculate them. The
pbxplore.structure.structure.Chain.get_phi_psi_angles() method
calculates these angles and returns them in a form that can be directly
provided to the assignment function.
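To give an idea of what computing a dihedral angle involves, here is a minimal pure-Python sketch of the usual four-point dihedral formula. It is an illustration only and does not claim to reproduce PBxplore's own implementation; the function names and the coordinates are made up. For phi, the four points would be the atoms C(i-1), N(i), CA(i), C(i); for psi, N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle, in degrees, defined by four 3D points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane (p0, p1, p2)
    n2 = _cross(b2, b3)  # normal of the plane (p1, p2, p3)
    b2_norm = math.sqrt(_dot(b2, b2))
    y = b2_norm * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: the last point is a quarter turn away from the first
# one when looking along the central bond.
print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```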
The dihedral angles are returned as a dictionary. Each key of this dictionary is a residue number, and each value is a dictionary with the phi and psi angles.
dihedrals = chain.get_phi_psi_angles()
pprint(dihedrals)
The dihedral angles can be provided to the pbxplore.assign()
function, which assigns a Protein Block to each residue and returns
the PB sequence as a string. Note that the first two and the last two
residues are assigned to the Z joker block, as some of the dihedral
angles they require cannot be calculated.
pb_seq = pbx.assign(dihedrals)
print(pb_seq)
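The reason for the Z blocks at both ends can be sketched as follows. A Protein Block describes a window of five residues centered on residue n through eight dihedral angles: psi(n-2), phi(n-1), psi(n-1), phi(n), psi(n), phi(n+1), psi(n+1), phi(n+2). Near the ends of a chain, some of these angles do not exist. The helper below is hypothetical (it is not part of the PBxplore API) and the angle values are made up; it only checks whether the eight angles are available in a dictionary shaped like the one returned by get_phi_psi_angles().

```python
def pb_window_angles(dihedrals, resid):
    """Return the 8 angles of the window centered on resid, or None if one is missing."""
    wanted = [(resid - 2, 'psi'),
              (resid - 1, 'phi'), (resid - 1, 'psi'),
              (resid, 'phi'), (resid, 'psi'),
              (resid + 1, 'phi'), (resid + 1, 'psi'),
              (resid + 2, 'phi')]
    angles = []
    for res, name in wanted:
        value = dihedrals.get(res, {}).get(name)
        if value is None:
            return None
        angles.append(value)
    return angles


# Made-up chain of six residues: phi is undefined for the first residue,
# psi is undefined for the last one.
toy_dihedrals = {resid: {'phi': -60.0, 'psi': -45.0} for resid in range(1, 7)}
toy_dihedrals[1]['phi'] = None
toy_dihedrals[6]['psi'] = None

# Residues for which a full window of angles exists.
assignable = [resid for resid in sorted(toy_dihedrals)
              if pb_window_angles(toy_dihedrals, resid) is not None]
print(assignable)
```

In this toy chain only the two central residues have a complete window, which matches the two Z blocks at each end of a PB sequence.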
Assign PB for several models of a single file
A single PDB file can contain several models. In that case, reading only the first chain is not enough: we want to iterate over all the chains that the reader yields.
pdb_path = os.path.join(pbx.DEMO_DATA_PATH, '2LFU.pdb')
for chain_name, chain in pbx.chains_from_files([pdb_path]):
    dihedrals = chain.get_phi_psi_angles()
    pb_seq = pbx.assign(dihedrals)
    print('* {}'.format(chain_name))
    print(' {}'.format(pb_seq))
Assign PB for a set of structures
The pbxplore.chains_from_files() function can also handle
several chains from several files.
# Build the list of demo files to read.
files = [os.path.join(pbx.DEMO_DATA_PATH, pdb_name)
         for pdb_name in ('1BTA.pdb', '2LFU.pdb', '3ICH.pdb')]
print('The following files will be used:')
pprint(files)
for chain_name, chain in pbx.chains_from_files(files):
    dihedrals = chain.get_phi_psi_angles()
    pb_seq = pbx.assign(dihedrals)
    print('* {}'.format(chain_name))
    print(' {}'.format(pb_seq))
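When many chains are processed, it can be convenient to keep the PB sequences instead of printing them. The following is a minimal sketch, assuming only that the reader yields (name, chain) pairs and that each chain has a get_phi_psi_angles() method, as described above. The collect_pb_sequences helper and the stand-in objects are hypothetical; they exist only so the example runs without PBxplore.

```python
def collect_pb_sequences(records, assign):
    """Map each chain name to its PB sequence."""
    sequences = {}
    for chain_name, chain in records:
        sequences[chain_name] = assign(chain.get_phi_psi_angles())
    return sequences


# Stand-ins that mimic only the interface used above.
class FakeChain(object):
    def __init__(self, n_residues):
        self.n_residues = n_residues

    def get_phi_psi_angles(self):
        return {resid: {'phi': None, 'psi': None}
                for resid in range(1, self.n_residues + 1)}


def fake_assign(dihedrals):
    # Not a real assignment: every residue gets the Z joker block.
    return 'Z' * len(dihedrals)


fake_records = [('first', FakeChain(3)), ('second', FakeChain(5))]
print(collect_pb_sequences(fake_records, fake_assign))
```

With PBxplore installed, the call would presumably be collect_pb_sequences(pbx.chains_from_files(files), pbx.assign). Note that two chains sharing the same name would overwrite each other in the dictionary.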
Assign PB for frames in a trajectory
PB sequences can be assigned from a trajectory. To do so, we use the
pbxplore.chains_from_trajectory() function, which takes the path
to a trajectory and the path to the corresponding topology as arguments.
Any file format readable by MDAnalysis can be used. Except for its
arguments, pbxplore.chains_from_trajectory() works the same as
pbxplore.chains_from_files().
Note that MDAnalysis is required to use this feature.
trajectory = os.path.join(pbx.DEMO_DATA_PATH, 'barstar_md_traj.xtc')
topology = os.path.join(pbx.DEMO_DATA_PATH, 'barstar_md_traj.gro')
for chain_name, chain in pbx.chains_from_trajectory(trajectory, topology):
    dihedrals = chain.get_phi_psi_angles()
    pb_seq = pbx.assign(dihedrals)
    print('* {}'.format(chain_name))
    print(' {}'.format(pb_seq))
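Because this feature depends on MDAnalysis, it may be worth checking that the package is installed before reading a trajectory. One possible way to do so, assuming Python 3 and using only the standard library, is sketched below. The module_available helper is hypothetical and not part of PBxplore.

```python
import importlib.util


def module_available(name):
    """Return True if the named top-level module can be found."""
    return importlib.util.find_spec(name) is not None


if module_available('MDAnalysis'):
    print('MDAnalysis found: trajectories can be read.')
else:
    print('MDAnalysis is missing: install it to read trajectories.')
```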
Use a different structure parser
Provided the dihedral angles are formatted as expected by
pbxplore.assign(), the source of these angles does not matter.
For instance, other PDB parsers can be used with PBxplore.
BioPython
import Bio.PDB
import math

pdb_path = os.path.join(pbx.DEMO_DATA_PATH, "2LFU.pdb")
for model in Bio.PDB.PDBParser().get_structure("2LFU", pdb_path):
    for chain in model:
        polypeptides = Bio.PDB.PPBuilder().build_peptides(chain)
        for poly_index, poly in enumerate(polypeptides):
            # Angles are given in radians, with None for undefined angles.
            dihedral_list = poly.get_phi_psi_list()
            dihedrals = {}
            for resid, (phi, psi) in enumerate(dihedral_list, start=1):
                # pbxplore.assign() expects degrees.
                if phi is not None:
                    phi = math.degrees(phi)
                if psi is not None:
                    psi = math.degrees(psi)
                dihedrals[resid] = {'phi': phi, 'psi': psi}
            # Assign one PB sequence per polypeptide.
            print(model, chain)
            pb_seq = pbx.assign(dihedrals)
            print(pb_seq)
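The conversion at the heart of this loop can be isolated in a small function. This is a sketch, assuming the input is a list of (phi, psi) tuples in radians with None for undefined angles, which is the shape that get_phi_psi_list() returns in Biopython. The function name and the sample values are made up.

```python
import math


def phi_psi_list_to_dihedrals(phi_psi_list, first_resid=1):
    """Build a dictionary of angles in degrees from (phi, psi) tuples in radians."""
    dihedrals = {}
    for resid, (phi, psi) in enumerate(phi_psi_list, start=first_resid):
        dihedrals[resid] = {
            'phi': None if phi is None else math.degrees(phi),
            'psi': None if psi is None else math.degrees(psi),
        }
    return dihedrals


# Made-up angles for a two-residue fragment.
sample = [(None, math.pi / 2), (-math.pi / 3, None)]
print(phi_psi_list_to_dihedrals(sample))
```

As in the loop above, residues are numbered sequentially within each polypeptide, so the keys do not necessarily match the residue numbers of the PDB file.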