Metadata-Version: 2.1
Name: hmmstuff
Version: 1.0.0
Summary: A tool to get structural information about light chain amyloids
Home-page: https://github.com/grogdrinker/hmmstuff
Author: Gabriele Orlando
Author-email: gabriele.orlando@kuleuven.be
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: <3.8
Description-Content-Type: text/markdown
Requires-Dist: pomegranate==0.14.0
Requires-Dist: numpy
Requires-Dist: scikit-learn
Requires-Dist: Bio

![logo](HMMSTUFF/logo.jpg)

---

## About HMMSTUFF

HMMSTUFF helps researchers make the best use of the limited data available on light chain amyloids.
Given a light chain amyloid sequence, it tells you whether a similar chain with an experimentally solved structure exists.

If you use HMMSTUFF in your research, please consider citing:


## Installation

Package installation should only take a few minutes with any of these methods (pip, source).

A FoldX binary, which can be downloaded from https://foldxsuite.crg.eu/, is required.

### Installing HMMSTUFF:

We suggest creating a local conda environment in which to install HMMSTUFF. It can be created with:

```sh
conda create -n hmmstuff python=3.7
```
and activated with

```sh
conda activate hmmstuff
```

or

```sh
source activate hmmstuff
```

Finally, install hmmstuff using

```sh
pip install hmmstuff
```

This will install hmmstuff on your computer.

## Using HMMSTUFF in a Python script

The pip installation provides a Python library that is directly usable on Linux and macOS, and most probably on Windows as well if you use a conda environment.

HMMSTUFF can be imported as a Python module:

```python
from HMMSTUFF.HMMSTUFF import HMMSTUFF # import the library
# put your input sequences in a dictionary
sequences = {"seq1":"AVSVALGQTVRITCQGDSLRSYSASWYEEKPGQAPVLVIFRAAAARFSGSSSGNTASLTITGAQAEDEADYYCNSRDSSANHQAAAAVFGGGTKLTV",
             "seq2":"AVSVALGQTVRITCQGDSLRSYSASWYQQKPGQAPVLVIFRAAAARFSGSSSGNTASLTITGAQAEDEADYYCNSRDSSANHVFGGGTKLTV",
             "seq3":"SELTQDPAVSVALGQTVRITCQGDSLRSYYASWYQQKSGQAPVLVIYSYNNRPSGIPDRFSGSNSGNTASLTITGAQAEDEADYYCNSRDSSGHHLVFGGGTKLTVLGQPKAAPS",
             "seq4":"MKYLLPTAAAGLLL"} 

hmmstuff = HMMSTUFF() # create the main HMMSTUFF object

# to get a fast evaluation of the sequences, run:
results = hmmstuff.evaluate_sequences(sequences)
```
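If your sequences are stored in a FASTA file rather than typed by hand, the input dictionary can be built with a few lines of plain Python. The following is only a minimal sketch: `read_fasta` is a hypothetical helper written for this example and is not part of HMMSTUFF, and it assumes that the first word of each FASTA header is a suitable sequence name.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a {name: sequence} dictionary.

    Hypothetical helper for illustration; it is not part of HMMSTUFF.
    """
    sequences = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # assumption: the first word of the header is the sequence name
            header = line[1:].split()
            name = header[0] if header else "unnamed"
            sequences[name] = ""
        elif name is not None:
            # sequence lines may be wrapped, so concatenate them
            sequences[name] += line.upper()
    return sequences


# toy FASTA content, reusing "seq4" from the example above
example = """>seq4 toy example
MKYLL
PTAAAGLLL
"""
sequences = read_fasta(example.splitlines())
print(sequences)  # {'seq4': 'MKYLLPTAAAGLLL'}
```

Since an open file can be iterated line by line, the same helper should also accept a file handle, for example `read_fasta(open("my_sequences.fasta"))`, where the file name is a placeholder.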

```results``` is a dictionary whose keys are the names of the input sequences ("seq1", "seq2", ... in the previous example).
Each entry is itself a dictionary with the predicted information:
- ```best_template_name```: the best template for the input sequence (even if it is not good enough to build a structure)
- ```doable```: if True, the template is similar enough to build a model of the amyloid by homology.
- ```substrings```: the portion of the input sequence that can be aligned with a template without gaps, with the exception of the parts missing in the PDB
- ```score```: the score of the template-input sequence alignment generated by HMMSTUFF. It goes from 0 to 1 and tells how similar the input sequence is to the template.
- ```alignment```: a tuple containing the alignment between the input sequence and the template sequence, generated with the HMM of HMMSTUFF.
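As an illustration of how the fields listed above might be used, the sketch below separates the sequences that can be modelled from those that cannot. `split_by_doable` is a hypothetical helper, and `mock_results` contains made-up values (including the template names) that only mimic the structure described above; it is not real HMMSTUFF output.

```python
def split_by_doable(results):
    """Return (modelable, not_modelable) lists of sequence names.

    Hypothetical helper for illustration; it is not part of HMMSTUFF.
    """
    modelable, not_modelable = [], []
    for name, info in results.items():
        # "doable" is the flag described above; a missing flag counts as False
        if info.get("doable", False):
            modelable.append(name)
        else:
            not_modelable.append(name)
    return modelable, not_modelable


# made-up values that mimic the documented structure, NOT real output
mock_results = {
    "seq1": {"best_template_name": "templateA", "doable": True, "score": 0.9},
    "seq4": {"best_template_name": "templateB", "doable": False, "score": 0.1},
}

modelable, not_modelable = split_by_doable(mock_results)
for name in modelable:
    info = mock_results[name]
    print(name, info["best_template_name"], info["score"])
```

With real data, `mock_results` would be replaced by the dictionary returned by `hmmstuff.evaluate_sequences(sequences)`.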


To evaluate the sequences and, where a suitable template exists, also run the structure prediction, run:
```python
results = hmmstuff.predict_structures(sequences,foldx_bin="Your/FoldX/Bin/path",folder_out_pdbs="Your/output/path/")
```
where ```foldx_bin``` is the path to the FoldX 4 binary file and ```folder_out_pdbs``` is the path of the folder where the homology models will be stored.

In this case, the result for each input sequence contains two additional fields:

- ```energy```: the energy of the model calculated by FoldX

- ```pdb_file```: the path of the homology model's PDB file
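One possible way to use these two fields is to list the generated models from the lowest to the highest FoldX energy. This is a sketch under assumptions: `rank_models_by_energy` is a hypothetical helper, the values and file paths in `mock_results` are made up, and the code assumes that entries for which no model was built either lack the two keys or hold `None`.

```python
def rank_models_by_energy(results):
    """Return (name, energy, pdb_file) tuples sorted by increasing energy.

    Hypothetical helper for illustration; it is not part of HMMSTUFF.
    """
    models = []
    for name, info in results.items():
        energy = info.get("energy")
        pdb_file = info.get("pdb_file")
        # assumption: sequences without a model have no energy or PDB path
        if energy is None or pdb_file is None:
            continue
        models.append((name, energy, pdb_file))
    return sorted(models, key=lambda model: model[1])


# made-up energies and paths, NOT real HMMSTUFF or FoldX output
mock_results = {
    "seq1": {"doable": True, "energy": 3.0, "pdb_file": "Your/output/path/seq1.pdb"},
    "seq2": {"doable": True, "energy": -12.5, "pdb_file": "Your/output/path/seq2.pdb"},
    "seq4": {"doable": False},
}

ranked = rank_models_by_energy(mock_results)
for name, energy, pdb_file in ranked:
    print(name, energy, pdb_file)
```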

Note that FoldX 4 was used for this study, and a Rotabase.txt file must be present in the same folder as the FoldX binary. The code might work with FoldX 5 as well, but this has not been tested.
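Before starting a long run, it can be useful to check that the FoldX binary and the Rotabase.txt file are where they are expected to be. The sketch below does this with the standard library only. `check_foldx_setup` is a hypothetical helper that is not part of HMMSTUFF, and it looks for the file name exactly as written above (`Rotabase.txt`), which may matter on case-sensitive file systems.

```python
from pathlib import Path


def check_foldx_setup(foldx_bin):
    """Return a list of problems found with the FoldX setup (empty if none).

    Hypothetical helper for illustration; it is not part of HMMSTUFF.
    """
    binary = Path(foldx_bin)
    problems = []
    if not binary.is_file():
        problems.append(f"FoldX binary not found: {binary}")
    # Rotabase.txt is expected in the same folder as the binary
    rotabase = binary.parent / "Rotabase.txt"
    if not rotabase.is_file():
        problems.append(f"Rotabase.txt not found next to the binary: {rotabase}")
    return problems


# "Your/FoldX/Bin/path" is the placeholder used in the example above
for problem in check_foldx_setup("Your/FoldX/Bin/path"):
    print("WARNING:", problem)
```

If the returned list is empty, the path can be passed as `foldx_bin` to `predict_structures`.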

## Help

For bug reports, feature requests and technical questions, please contact orlando.gabriele89@gmail.be
