Metadata-Version: 2.4
Name: pisa-analysis
Version: 3.1.1
Summary: This python package works with PISA to analyse data for macromolecular interfaces and interactions in assemblies.
Author-email: Grisell Diaz Leines <gdiazleines@ebi.ac.uk>
License: Apache 2.0
Project-URL: Homepage, https://github.com/PDBe-KB/pisa-analysis
Project-URL: Repository, https://github.com/PDBe-KB/pisa-analysis
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: gemmi>=0.7.3
Requires-Dist: jsonschema>=4.25.1
Requires-Dist: lxml>=6.0.2
Requires-Dist: pandas>=2.3.3
Requires-Dist: pydantic>=2.12.4
Requires-Dist: xmlschema>=4.2.0
Requires-Dist: xmltodict>=1.0.2

# Assembly interfaces analysis

## Basic information

This Python package works with PISA to analyse data for macromolecular interfaces and interactions in assemblies.

The code consists of the module `pisa_analysis`, which will:

- Analyse macromolecular interfaces with PISA
- Create a JSON dictionary with assembly interactions/interfaces information

```shell
git clone https://github.com/PDBe-KB/pisa-analysis

cd pisa-analysis
```

## Dependencies

The `pisa_analysis` process runs PISA as a subprocess, so PISA must be compiled beforehand.

To make running the process easier, you can set two environment variables for PISA:

Add the directory containing the `pisa` binary to `PATH`:

```shell
export PATH="$PATH:your_path_to_pisa/pisa/build"
```

Set the path to the PISA setup directory:

```shell
export PISA_SETUP_DIR="/your_path_to_pisa/pisa/setup"
```

Additionally, the PISA setup directory must contain a PISA configuration template named [pisa_cfg_tmp](https://github.com/PDBe-KB/pisa/tree/main/setup/pisa_cfg_tmp).
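
Before running the process, it can help to confirm that both settings are in place. The following is a minimal sketch, not part of the package: `check_pisa_environment` is a hypothetical helper that looks for a `pisa` executable on `PATH` and for `pisa_cfg_tmp` inside the directory named by `PISA_SETUP_DIR`.

```python
import os
import shutil
from pathlib import Path


def check_pisa_environment(env=None):
    """Return a list of problems with the PISA setup (empty if none found).

    Hypothetical helper for illustration; it only inspects the environment
    and the file system, it does not run PISA.
    """
    env = os.environ if env is None else env
    problems = []

    # The `pisa` binary should be discoverable through PATH.
    if shutil.which("pisa", path=env.get("PATH", "")) is None:
        problems.append("`pisa` binary not found on PATH")

    # PISA_SETUP_DIR should point at the setup directory holding pisa_cfg_tmp.
    setup_dir = env.get("PISA_SETUP_DIR")
    if not setup_dir:
        problems.append("PISA_SETUP_DIR is not set")
    elif not (Path(setup_dir) / "pisa_cfg_tmp").exists():
        problems.append(f"pisa_cfg_tmp not found in {setup_dir}")

    return problems


if __name__ == "__main__":
    for problem in check_pisa_environment():
        print("WARNING:", problem)
```

An empty list means both settings look usable; the same values can alternatively be passed on the command line with `--pisa_binary` and `--pisa_setup_dir`.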

## Usage

Follow the steps below to install the **pisa_analysis** module and its required dependencies:

```shell
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt
```

To run the module from the command line:

**pisa_analysis**:

```shell
pisa_analysis [-h] \
  -i <INPUT_CIF_FILE> \
  --pdb_id <PDB_ID> \
  --assembly_id <ASSEMBLY_CODE> \
  -o <OUTPUT_JSON> \
  --output_xml <OUTPUT_XML>
```

Required arguments are:

```
--input_cif (-i)          :  Assembly CIF file (a PDB file is also accepted). Optional if --gen_full_results is used and --assembly_id is not specified.
--pdb_id                  :  Entry ID
--assembly_id             :  Assembly code
--output_json (-o)        :  Output directory for JSON files
--output_xml              :  Output directory for XML files
```

Other optional arguments are:

```
--input_updated_cif       : Updated CIF file for the PDB entry
--force                   : Always runs PISA calculation
--pisa_setup_dir          : Path to the 'setup' directory in PISA
--pisa_binary             : Binary file for PISA
-h, --help                : Show help message
```
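
When the tool is driven from a Python script rather than a shell, the same invocation can be assembled as an argument list. This is a sketch under stated assumptions: `build_pisa_analysis_command` and `run_pisa_analysis` are hypothetical helpers, only flags documented above are used, and `--force` is assumed to be a switch that takes no value.

```python
import subprocess


def build_pisa_analysis_command(
    input_cif, pdb_id, assembly_id, output_json, output_xml, force=False
):
    """Build the argument list for the documented `pisa_analysis` CLI."""
    cmd = [
        "pisa_analysis",
        "--input_cif", str(input_cif),
        "--pdb_id", str(pdb_id),
        "--assembly_id", str(assembly_id),
        "--output_json", str(output_json),
        "--output_xml", str(output_xml),
    ]
    if force:
        # Assumption: --force is a boolean switch without an argument.
        cmd.append("--force")
    return cmd


def run_pisa_analysis(**kwargs):
    """Run the CLI; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_pisa_analysis_command(**kwargs), check=True)


if __name__ == "__main__":
    # File and directory names here are placeholders.
    print(build_pisa_analysis_command("1d2s.cif", "1d2s", "1", "out_json", "out_xml"))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.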

The **pisa_analysis** module proceeds as follows:

1. The process first runs PISA in a subprocess and generates two XML files:
   - interfaces.xml
   - assembly.xml

   The XML files are saved in the output directory defined by the `--output_xml` argument. If the XML files exist and are valid, the process skips running PISA unless `--force` is passed.

2. Next, the process parses the XML files generated by PISA and creates a dictionary that contains all assembly interface and interaction information.

3. While creating the interfaces dictionary for the entry, the process reads UniProt accessions and sequence numbers from an updated CIF file using Gemmi.

4. The process also parses the `assembly.xml` file generated by PISA and creates a simplified dictionary with some assembly information.

5. Finally, the process dumps the dictionaries into JSON files. The JSON files are saved in the output directory defined by the `-o` or `--output_json` argument. The output JSON files are:

     *xxxx-assemX_interfaces.json*  and  *xxxx-assemblyX.json*

     where xxxx is the PDB entry ID and X is the assembly code.
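
The skip logic of the first step and the file naming of the last step can be sketched as follows. This is an illustration rather than the package's implementation: the function names are hypothetical, and "valid" is assumed here to mean only that both files parse as well-formed XML, whereas the package may apply stricter checks.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# The two XML files that PISA is documented to produce.
PISA_XML_FILES = ("interfaces.xml", "assembly.xml")


def is_parseable_xml(path):
    """Return True if the file exists and is well-formed XML."""
    try:
        ET.parse(path)
    except (OSError, ET.ParseError):
        return False
    return True


def should_run_pisa(output_xml_dir, force=False):
    """Run PISA when forced, or when either XML file is missing or broken."""
    if force:
        return True
    xml_dir = Path(output_xml_dir)
    return not all(is_parseable_xml(xml_dir / name) for name in PISA_XML_FILES)


def expected_json_paths(pdb_id, assembly_id, output_json_dir):
    """Return the interfaces and assembly JSON paths for an entry."""
    out = Path(output_json_dir)
    return (
        out / f"{pdb_id}-assem{assembly_id}_interfaces.json",
        out / f"{pdb_id}-assembly{assembly_id}.json",
    )
```

For example, `expected_json_paths("1d2s", "1", "out")` yields paths ending in `1d2s-assem1_interfaces.json` and `1d2s-assembly1.json`.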

## Expected JSON files

Documentation on the assembly interfaces JSON file and its schema can be found here:

https://pisalite.docs.apiary.io/#reference/0/pisaqualifierjson/interaction-interface-data-per-pdb-assembly-entry

The simplified assembly JSON output looks as follows:

```json
{
   "PISA": {
      "pdb_id": "1d2s",
      "assembly_id": "1",
      "pisa_version": "2.0",
      "assembly": {
         "id": "1",
         "size": "8",
         "macromolecular_size": "2",
         "dissociation_energy": -3.96,
         "accessible_surface_area": 15146.45,
         "buried_surface_area": 3156.79,
         "entropy": 12.09,
         "dissociation_area": 733.07,
         "solvation_energy_gain": -41.09,
         "number_of_uc": "0",
         "number_of_dissociated_elements": "2",
         "symmetry_number": "2",
         "formula": "A(2)a(4)b(2)",
         "composition": "A-2A[CA](4)[DHT](2)"
      }
   }
}
```
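
Note that in this example the counts (`size`, `macromolecular_size` and similar) are strings, while the energies and areas are numbers. A minimal sketch of reading such a document in Python follows; `summarise_assembly` is a hypothetical helper, and the embedded JSON is a trimmed copy of the example above.

```python
import json

# Trimmed copy of the example output; only the fields used below are kept.
EXAMPLE = """
{
   "PISA": {
      "pdb_id": "1d2s",
      "assembly_id": "1",
      "assembly": {
         "size": "8",
         "macromolecular_size": "2",
         "dissociation_energy": -3.96,
         "buried_surface_area": 3156.79
      }
   }
}
"""


def summarise_assembly(document):
    """Extract a few typed fields from the simplified assembly JSON."""
    pisa = document["PISA"]
    assembly = pisa["assembly"]
    return {
        "pdb_id": pisa["pdb_id"],
        "assembly_id": pisa["assembly_id"],
        # Counts are stored as strings in the example, so convert them.
        "size": int(assembly["size"]),
        "macromolecular_size": int(assembly["macromolecular_size"]),
        "dissociation_energy": float(assembly["dissociation_energy"]),
        "buried_surface_area": float(assembly["buried_surface_area"]),
    }


summary = summarise_assembly(json.loads(EXAMPLE))
print(summary)
```

For a real run, the document would be loaded with `json.load` from the `xxxx-assemblyX.json` file in the output directory instead of from the embedded string.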

## Run with Docker

```shell
docker run -v <HOST_DIR>:/data_dir \
   pdbegroup/pisa-analysis \
   pisa_analysis \
   --input_cif /data_dir/<INPUT_CIF> \
   --pdb_id <PDB_ID> \
   --assembly_id <ASSEMBLY_CODE> \
   --output_json /data_dir/<OUTPUT_JSON> \
   --output_xml /data_dir/<OUTPUT_XML>
```

## Development

We use Astral's [`uv` tool](https://docs.astral.sh/uv/) for setting up the project and
managing dependencies:

```shell
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
source .venv/bin/activate
```

We also use pre-commit checks to keep `requirements.txt` and `requirements-dev.txt` up to date and to
lint the code with [Ruff](https://docs.astral.sh/ruff/).

```shell
pre-commit install
pre-commit run --all-files
```

You can also build the Docker image locally and then run it as described above:

```shell
docker build . -t pdbegroup/pisa-analysis
```

## Versioning

We use [SemVer](https://semver.org) for versioning.

## Authors
* [Grisell Diaz Leines](https://github.com/grisell) - Lead developer
* [Stephen Anyango](https://github.com/otienoanyango) - Review and productionising
* [Mihaly Varadi](https://github.com/mvaradi) - Review and management

See all contributors [here](https://github.com/PDBe-KB/pisa-analysis/graphs/contributors).

## License

See [LICENSE](https://github.com/PDBe-KB/pisa-analysis/blob/main/LICENSE).
