homelette.pdb_io
The homelette.pdb_io submodule contains an object for parsing and manipulating PDB files. There are several constructor functions that can read PDB files or download them from the internet.
Functions and classes
Functions and classes present in homelette.pdb_io are listed below:
-
homelette.pdb_io.
read_pdb
(file_name: str) → homelette.pdb_io.PdbObject Reads PDB from file.
- Parameters
file_name (str) – PDB file name
- Returns
- Return type
homelette.pdb_io.PdbObject
Notes
If a PDB file with multiple MODELs is read, only the first model will be conserved.
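The filtering described in these notes can be pictured with a small standalone sketch. It is not homelette's implementation; `first_model_atom_lines` is a hypothetical helper, and it assumes that the first model ends at the first ENDMDL record and that only ATOM and HETATM records are kept.

```python
def first_model_atom_lines(lines):
    """Keep ATOM/HETATM records of the first MODEL only (hypothetical helper)."""
    kept = []
    for line in lines:
        record = line[:6].strip()  # the record name occupies columns 1-6
        if record == "ENDMDL":
            break  # everything after the first ENDMDL belongs to later models
        if record in ("ATOM", "HETATM"):
            kept.append(line)
    return kept
```

A file without MODEL/ENDMDL records passes through unchanged apart from the record filter.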
-
homelette.pdb_io.
download_pdb
(pdbid: str) → homelette.pdb_io.PdbObject Download PDB from the RCSB.
- Parameters
pdbid (str) – PDB identifier
- Returns
- Return type
homelette.pdb_io.PdbObject
Notes
If a PDB file with multiple MODELs is read, only the first model will be conserved.
-
class
homelette.pdb_io.
PdbObject
(lines: Iterable) Object encapsulating functionality regarding the processing of PDB files.
- Parameters
lines (Iterable) – The lines of the PDB
- Variables
lines – The lines of the PDB, filtered for ATOM and HETATM records
- Returns
- Return type
None
See also
read_pdb, download_pdb
Notes
Please construct instances of PdbObject using the constructor functions.
If a PDB file with multiple MODELs is read, only the first model will be conserved.
-
write_pdb
(file_name) → None Write PDB to file.
- Parameters
file_name (str) – The name of the file to write the PDB to.
- Returns
- Return type
None
-
parse_to_pd
() → pandas.DataFrame Parses PDB to pandas dataframe.
- Returns
- Return type
pd.DataFrame
Notes
Information is extracted according to the PDB file specification (version 3.30) and columns are named accordingly. See https://www.wwpdb.org/documentation/file-format for more information.
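To illustrate what extraction "according to the PDB file specification" involves, here is a minimal sketch that slices one ATOM record by the fixed column ranges of the format. `parse_atom_record` is a hypothetical helper, not part of homelette, and the dictionary keys follow the field names in the specification; the actual DataFrame column names may differ.

```python
def parse_atom_record(line):
    """Split one 80-column ATOM/HETATM record into its fields (hypothetical helper)."""
    return {
        "record": line[0:6].strip(),       # columns 1-6
        "serial": int(line[6:11]),         # columns 7-11
        "name": line[12:16].strip(),       # columns 13-16
        "altLoc": line[16].strip(),        # column 17
        "resName": line[17:20].strip(),    # columns 18-20
        "chainID": line[21].strip(),       # column 22
        "resSeq": int(line[22:26]),        # columns 23-26
        "iCode": line[26].strip(),         # column 27
        "x": float(line[30:38]),           # columns 31-38
        "y": float(line[38:46]),           # columns 39-46
        "z": float(line[46:54]),           # columns 47-54
        "occupancy": float(line[54:60]),   # columns 55-60
        "tempFactor": float(line[60:66]),  # columns 61-66
        "element": line[76:78].strip(),    # columns 77-78
        "charge": line[78:80].strip(),     # columns 79-80
    }


# An example record, assembled piece by piece so the column layout is visible.
example = (
    "ATOM  "                    # record name (1-6)
    "    1 "                    # serial (7-11) and blank column 12
    " N   "                     # atom name (13-16) and altLoc (17)
    "MET A"                     # resName (18-20), blank (21), chainID (22)
    "   1    "                  # resSeq (23-26), iCode (27), blanks (28-30)
    "  27.340  24.430   2.614"  # x, y, z (31-54)
    "  1.00  9.67"              # occupancy and tempFactor (55-66)
    "           N  "            # blanks (67-76), element (77-78), charge (79-80)
)
atom = parse_atom_record(example)
```

Because the format is column-based rather than whitespace-delimited, slicing by position is safer than `str.split`, which breaks when adjacent fields touch.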
-
get_sequence
(ignore_missing: bool = True) → str Retrieve the 1-letter amino acid sequence of the PDB, grouped by chain.
- Parameters
ignore_missing (bool) – Changes behaviour with regard to unmodelled residues. If True, they will be ignored when generating the sequence (default). If False, they will be represented in the sequence with the character X.
- Returns
Amino acid sequence
- Return type
str
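The effect of ignore_missing can be sketched as follows. This is an illustration under stated assumptions, not homelette's code: `sequence_by_chain` is a hypothetical function, unmodelled residues are assumed to show up as gaps in the residue numbering, unknown residue names are mapped to X, and the result is returned as a dict per chain rather than in the library's own string format.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_by_chain(residues, ignore_missing=True):
    """residues: (chain_id, res_seq, res_name) tuples in file order (hypothetical input)."""
    sequences = {}
    previous = {}  # last residue number seen per chain
    for chain, res_seq, res_name in residues:
        seq = sequences.setdefault(chain, [])
        if not ignore_missing and chain in previous:
            # Assumption: a jump in numbering marks unmodelled residues.
            gap = res_seq - previous[chain] - 1
            seq.extend("X" * max(gap, 0))
        seq.append(THREE_TO_ONE.get(res_name, "X"))
        previous[chain] = res_seq
    return {chain: "".join(seq) for chain, seq in sequences.items()}


residues = [("A", 1, "MET"), ("A", 2, "ALA"), ("A", 5, "GLY"), ("B", 1, "SER")]
compact = sequence_by_chain(residues)                         # {'A': 'MAG', 'B': 'S'}
padded = sequence_by_chain(residues, ignore_missing=False)    # {'A': 'MAXXG', 'B': 'S'}
```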
-
get_chains
() → list Extract all chains present in the PDB.
- Returns
- Return type
list
-
transform_extract_chain
(chain) → homelette.pdb_io.PdbObject Extract chain from PDB.
- Parameters
chain (str) – The chain ID to be extracted.
- Returns
- Return type
homelette.pdb_io.PdbObject
-
transform_renumber_residues
(starting_res: int = 1) → homelette.pdb_io.PdbObject Renumber residues in PDB.
- Parameters
starting_res (int) – Residue number to start renumbering at (default 1)
- Returns
- Return type
homelette.pdb_io.PdbObject
Notes
Missing residues in the PDB (i.e. unmodelled) will not be considered in the renumbering. If multiple chains are present in the PDB, numbering will be continued from one chain to the next one.
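The two rules in this note (gaps are closed, and numbering continues across chains) can be shown with a short sketch. `renumber` is a hypothetical function operating on (chain ID, residue number) pairs; it ignores insertion codes and is not taken from homelette.

```python
def renumber(residue_ids, starting_res=1):
    """Map ordered (chain_id, res_seq) pairs to consecutive numbers (hypothetical helper)."""
    mapping = {}
    next_number = starting_res
    for key in residue_ids:
        if key not in mapping:  # first atom of a new residue
            mapping[key] = next_number
            next_number += 1    # the counter is never reset between chains
    return [mapping[key] for key in residue_ids]


# Two atoms of residue A5, one of A8 (A6 and A7 unmodelled), one of B1.
new_numbers = renumber([("A", 5), ("A", 5), ("A", 8), ("B", 1)])  # [1, 1, 2, 3]
```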
-
transform_change_chain_id
(new_chain_id) → homelette.pdb_io.PdbObject Replace chain ID for every entry in PDB.
- Parameters
new_chain_id (str) – New chain ID.
- Returns
- Return type
homelette.pdb_io.PdbObject
-
transform_remove_hetatm
() → homelette.pdb_io.PdbObject Remove all HETATM entries from PDB.
- Returns
- Return type
homelette.pdb_io.PdbObject
-
transform_filter_res_name
(selection: Iterable, mode: str = 'out') → homelette.pdb_io.PdbObject Filter PDB by residue name.
- Parameters
selection (Iterable) – For which residue names to filter
mode (str) – Filtering mode. If mode = 'out', the selection will be filtered out (default). If mode = 'in', everything except the selection will be filtered out.
- Returns
- Return type
homelette.pdb_io.PdbObject
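The two filtering modes can be illustrated with a standalone sketch. `filter_by_res_name` is a hypothetical function working on raw PDB lines, with the residue name read from columns 18-20; the ValueError for an unknown mode is an assumption, and the source does not say how homelette handles that case.

```python
def filter_by_res_name(lines, selection, mode="out"):
    """Drop ('out') or keep only ('in') lines whose residue name is in selection."""
    if mode not in ("in", "out"):
        raise ValueError("mode must be 'in' or 'out'")  # assumed behaviour
    selection = set(selection)
    keep_selected = mode == "in"
    # The residue name occupies columns 18-20 of ATOM/HETATM records.
    return [
        line for line in lines
        if (line[17:20].strip() in selection) == keep_selected
    ]
```

For example, `filter_by_res_name(lines, ["HOH"])` removes water records, while `filter_by_res_name(lines, ["HOH"], mode="in")` keeps only them.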
-
transform_filter_res_seq
(lower: int, upper: int) → homelette.pdb_io.PdbObject Filter PDB by residue number.
- Parameters
lower (int) – Lower bound of range to filter with.
upper (int) – Upper bound of range to filter with, inclusive.
- Returns
- Return type
homelette.pdb_io.PdbObject
-
transform_concat
(*others: homelette.pdb_io.PdbObject) → homelette.pdb_io.PdbObject Concatenate PDB with other PDBs.
- Parameters
*others (PdbObject) – Any number of PDBs.
- Returns
- Return type
homelette.pdb_io.PdbObject
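Since every transform_* method is documented as returning a PdbObject, the methods can presumably be chained. The sketch below uses only functions and methods listed on this page; the wrapper `extract_clean_chain` and its file arguments are hypothetical, and homelette is imported inside the function so that defining it does not require the package to be installed.

```python
def extract_clean_chain(in_file, out_file, chain="A"):
    """Read a PDB, keep one chain without HETATM records, renumber it, and save it."""
    from homelette import pdb_io  # lazy import; needs homelette installed at call time

    pdb = pdb_io.read_pdb(in_file)
    cleaned = (
        pdb.transform_extract_chain(chain)
        .transform_remove_hetatm()
        .transform_renumber_residues(starting_res=1)
    )
    cleaned.write_pdb(out_file)
    return cleaned.get_sequence()
```

Replacing `pdb_io.read_pdb(in_file)` with `pdb_io.download_pdb(pdbid)` would start from an RCSB entry instead of a local file.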