Metadata-Version: 2.4
Name: RNADiscrepancy
Version: 0.3.3
Summary: Compute discrepancy between 3D RNA (sub)structures
Project-URL: Homepage, https://codeberg.org/vreinharz/RNADiscrepancy
Project-URL: Issues, https://codeberg.org/vreinharz/RNADiscrepancy/issues
Author-email: Vladimir Reinharz <rna@vreinharz.com>
License-Expression: GPL-3.0
License-File: LICENSE
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.10
Requires-Dist: numpy>=2.2.6
Requires-Dist: scipy>=1.15.3
Provides-Extra: all
Requires-Dist: matplotlib; extra == 'all'
Requires-Dist: pytest; extra == 'all'
Requires-Dist: pytest-cov; extra == 'all'
Requires-Dist: sphinx-autodoc-typehints; extra == 'all'
Requires-Dist: sphinx-autodoc2; extra == 'all'
Requires-Dist: sphinx-rtd-theme; extra == 'all'
Requires-Dist: sphinx<9; extra == 'all'
Provides-Extra: docs
Requires-Dist: sphinx-autodoc-typehints; extra == 'docs'
Requires-Dist: sphinx-autodoc2; extra == 'docs'
Requires-Dist: sphinx-rtd-theme; extra == 'docs'
Requires-Dist: sphinx<9; extra == 'docs'
Provides-Extra: tests
Requires-Dist: matplotlib; extra == 'tests'
Requires-Dist: pytest; extra == 'tests'
Requires-Dist: pytest-cov; extra == 'tests'
Provides-Extra: viz
Requires-Dist: matplotlib; extra == 'viz'
Description-Content-Type: text/markdown

# RNADiscrepancy


[![PyPI version](https://img.shields.io/pypi/v/RNADiscrepancy?color=blue&label=PyPI)](https://pypi.org/project/RNADiscrepancy/)
[![Forgejo Release](https://img.shields.io/gitea/v/release/vreinharz/RNADiscrepancy?gitea_url=https://codeberg.org&label=Forgejo&color=blue)](https://codeberg.org/vreinharz/RNADiscrepancy/releases)
[![Tests](https://codeberg.org/vreinharz/RNADiscrepancy/actions/workflows/main.yml/badge.svg)](https://codeberg.org/vreinharz/RNADiscrepancy/actions)
[![Documentation Status](https://img.shields.io/readthedocs/rnadiscrepancy)](https://rnadiscrepancy.readthedocs.io)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)



## Isostericity and Discrepancy

[Geometric discrepancy](https://doi.org/10.1007/s00285-007-0110-x) is a similarity measure between RNA 3D structures.  
[Isodiscrepancy](https://doi.org/10.1093/nar/gkp011) is a similarity measure between two pairs of interacting RNA nucleotides

This package aims to provide easy interfaces to compute these values and visualize their alignments.


## Requirements
    - numpy
    - scipy

## Installation
```bash
pip install RNADiscrepancy
```

## Usage exemple

### Loading from cif

Assuming that the file `2gdi.cif` is downloaded (e.g. from [here](https://www.ebi.ac.uk/pdbe/entry/pdb/2gdi))

```python
import RNADiscrepancy

path_cif = '2gdi.cif'
to_find = [('B', '4'), ('B', '78'), #a CWW
           ('B', '51'),('B', '69')  #a TWH
          ]
positions = RNADiscrepancy.cif2nucleotides(path_cif, to_find,
                                             author_chain=False,
                                             author_position=False)
first_base_pair = (positions['B', '4'], positions['B', '78'])
second_base_pair = (positions['B', '51'], positions['B', '69'])
print(RNADiscrepancy.isodiscrepancy(first_base_pair, second_base_pair))
#15.130358196427407
print(RNADiscrepancy.discrepancy(first_base_pair, second_base_pair))
#4.833960422452675
```

We can also compute the discrepancy between modules of arbitrary size

```python
import RNADiscrepancy

path_cif = '2gdi.cif'
to_find = [('B', '4'), ('B', '78'), ('B', '5'), #a CWW + neighbor
           ('B', '51'),('B', '69'), ('B', '52')  #a TWH + neighbor
          ]
positions = RNADiscrepancy.cif2nucleotides(path_cif, to_find,
                                             author_chain=False,
                                             author_position=False)
first_module = (positions['B', '4'], positions['B', '78'], positions['B', '5'], positions['B', '51'])
second_module = (positions['B', '51'], positions['B', '69'], positions['B', '52'], positions['B', '4'])
print(RNADiscrepancy.discrepancy(first_module, second_module))
#1.7206629733676666
```



### Loading raw data 

For each nucleotide we need
1. list of atom names
2. list of atom elements
3. list of atom positions
4. `RNADiscrepancy.rawdata2nucleotide` to format raw data
5. `RNADiscrepancy.isodiscrepancy` to compute isodiscrepancy

```python
import RNADiscrepancy
cwwAU_A_list_atoms = ["P","OP1","OP2","O5'","C5'","C4'","O4'","C3'","O3'","C2'","O2'","C1'","N9","C8","N7","C5","C6","N6","N1","C2","N3","C4"]
cwwAU_A_list_atoms_elements = ["P","O","O","O","C","C","O","C","O","C","O","C","N","C","N","C","C","N","N","C","N","C"]
cwwAU_A_list_positions = [(-46.6870002746582, -80.67400360107422, -42.457000732421875),(-45.249000549316406, -80.74099731445312, -42.09299850463867),(-47.303001403808594, -79.36100006103516, -42.77299880981445),(-46.92900085449219, -81.6449966430664, -43.694000244140625),(-47.27000045776367, -83.01000213623047, -43.49399948120117),(-47.70500183105469, -83.62999725341797, -44.79600143432617),(-49.13199996948242, -83.92400360107422, -44.75299835205078),(-47.60100173950195, -82.73999786376953, -46.02299880981445),(-46.290000915527344, -82.58599853515625, -46.52899932861328),(-48.512001037597656, -83.47599792480469, -46.97999954223633),(-47.96799850463867, -84.72100067138672, -47.375),(-49.6870002746582, -83.72000122070312, -46.04600143432617),(-50.590999603271484, -82.5719985961914, -46.012001037597656),(-50.654998779296875, -81.52100372314453, -45.125),(-51.59299850463867, -80.6449966430664, -45.40800094604492),(-52.19300079345703, -81.16400146484375, -46.55400085449219),(-53.26100158691406, -80.71800231933594, -47.35499954223633),(-53.959999084472656, -79.60900115966797, -47.10100173950195),(-53.59199905395508, -81.46099853515625, -48.4379997253418),(-52.902000427246094, -82.58399963378906, -48.68299865722656),(-51.887001037597656, -83.11399841308594, -48.000999450683594),(-51.58000183105469, -82.34600067138672, -46.9370002746582)]
cwwAU_U_list_atoms = ["P","OP1","OP2","O5'","C5'","C4'","O4'","C3'","O3'","C2'","O2'","C1'","N1","C2","O2","N3","C4","O4","C5","C6"]
cwwAU_U_list_atoms_elements = ["P","O","O","O","C","C","O","C","O","C","O","C","N","C","O","N","C","O","C","C"]
cwwAU_U_list_positions = [(-62.04399871826172, -78.197998046875, -52.75299835205078),(-63.35599899291992, -78.33399963378906, -53.4370002746582),(-61.986000061035156, -77.62999725341797, -51.382999420166016),(-61.347999572753906, -79.62899780273438, -52.72700119018555),(-61.56800079345703, -80.56199645996094, -53.777000427246094),(-60.268001556396484, -81.21900177001953, -54.17399978637695),(-59.19599914550781, -80.2249984741211, -54.268001556396484),(-59.689998626708984, -82.20500183105469, -53.178001403808594),(-60.340999603271484, -83.45899963378906, -53.125),(-58.26900100708008, -82.29499816894531, -53.6879997253418),(-58.19900131225586, -82.8759994506836, -54.974998474121094),(-57.983001708984375, -80.80400085449219, -53.79199981689453),(-57.66400146484375, -80.25, -52.470001220703125),(-56.564998626708984, -80.7750015258789, -51.81100082397461),(-55.832000732421875, -81.61399841308594, -52.30500030517578),(-56.35599899291992, -80.28099822998047, -50.54899978637695),(-57.10900115966797, -79.3290023803711, -49.89699935913086),(-56.82600021362305, -79.02300262451172, -48.737998962402344),(-58.20600128173828, -78.81199645996094, -50.659000396728516),(-58.439998626708984, -79.2750015258789, -51.88999938964844)]
twwUC_U_list_atoms = ["P","OP1","OP2","O5'","C5'","C4'","O4'","C3'","O3'","C2'","O2'","C1'","N1","C2","O2","N3","C4","O4","C5","C6"]
twwUC_U_list_atoms_elements = ["P","O","O","O","C","C","O","C","O","C","O","C","N","C","O","N","C","O","C","C"]
twwUC_U_list_positions = [(52.62300109863281, 163.0240020751953, 82.34400177001953),(51.7239990234375, 162.08799743652344, 81.61499786376953),(53.42900085449219, 164.01199340820312, 81.5770034790039),(51.78300094604492, 163.79200744628906, 83.45999908447266),(50.75199890136719, 163.1179962158203, 84.20899963378906),(50.0260009765625, 164.10000610351562, 85.0989990234375),(50.957000732421875, 164.63400268554688, 86.0719985961914),(49.45600128173828, 165.31300354003906, 84.38600158691406),(48.15599822998047, 165.03599548339844, 83.86499786376953),(49.45199966430664, 166.3730010986328, 85.48100280761719),(48.358001708984375, 166.2570037841797, 86.36199951171875),(50.731998443603516, 166.02000427246094, 86.24299621582031),(51.94499969482422, 166.73500061035156, 85.80999755859375),(51.97999954223633, 168.10499572753906, 85.95700073242188),(51.03900146484375, 168.75, 86.38800048828125),(53.1619987487793, 168.69700622558594, 85.58200073242188),(54.284000396728516, 168.07400512695312, 85.08399963378906),(55.29899978637695, 168.73300170898438, 84.87999725341797),(54.16299819946289, 166.66700744628906, 84.93399810791016),(53.02899932861328, 166.0590057373047, 85.28900146484375)]
twwUC_C_list_atoms = ["P","OP1","OP2","O5'","C5'","C4'","O4'","C3'","O3'","C2'","O2'","C1'","N1","C2","O2","N3","C4","N4","C5","C6"]
twwUC_C_list_atoms_elements = ["P","O","O","O","C","C","O","C","O","C","O","C","N","C","O","N","C","N","C","C"]
twwUC_C_list_positions = [(51.45399856567383, 178.2270050048828, 85.64199829101562),(51.52799987792969, 179.64500427246094, 85.23200225830078),(50.196998596191406, 177.45799255371094, 85.41799926757812),(52.66400146484375, 177.4499969482422, 84.95800018310547),(54.012001037597656, 177.9250030517578, 85.10099792480469),(54.97700119018555, 176.7689971923828, 85.08399963378906),(54.652000427246094, 175.8520050048828, 86.15599822998047),(54.93899917602539, 175.89599609375, 83.8489990234375),(55.71099853515625, 176.46200561523438, 82.81400299072266),(55.54800033569336, 174.60000610351562, 84.3550033569336),(56.957000732421875, 174.67799377441406, 84.4469985961914),(54.93899917602539, 174.52000427246094, 85.75399780273438),(53.6879997253418, 173.73199462890625, 85.80500030517578),(53.76300048828125, 172.32899475097656, 85.83999633789062),(54.87699890136719, 171.78500366210938, 85.81199645996094),(52.61399841308594, 171.60699462890625, 85.89900207519531),(51.43299865722656, 172.22999572753906, 85.91899871826172),(50.332000732421875, 171.48800659179688, 85.98400115966797),(51.32899856567383, 173.6479949951172, 85.87699890136719),(52.46799850463867, 174.35299682617188, 85.82099914550781)]
cww_AU_A = RNADiscrepancy.rawdata2nucleotide('A', cwwAU_A_list_atoms, cwwAU_A_list_atoms_elements, cwwAU_A_list_positions)
cww_AU_U = RNADiscrepancy.rawdata2nucleotide('U', cwwAU_U_list_atoms, cwwAU_U_list_atoms_elements, cwwAU_U_list_positions)
tww_UC_U = RNADiscrepancy.rawdata2nucleotide('U', twwUC_U_list_atoms, twwUC_U_list_atoms_elements, twwUC_U_list_positions)
tww_UC_C = RNADiscrepancy.rawdata2nucleotide('C', twwUC_C_list_atoms, twwUC_C_list_atoms_elements, twwUC_C_list_positions)
cwwAU_A_list_atoms = ["P","OP1","OP2","O5'","C5'","C4'","O4'","C3'","O3'","C2'","O2'","C1'","N9","C8","N7","C5","C6","N6","N1","C2","N3","C4"]
cwwAU_A_list_atoms_elements = ["P","O","O","O","C","C","O","C","O","C","O","C","N","C","N","C","C","N","N","C","N","C"]
cwwAU_A_list_positions = [(-46.6870002746582, -80.67400360107422, -42.457000732421875),(-45.249000549316406, -80.74099731445312, -42.09299850463867),(-47.303001403808594, -79.36100006103516, -42.77299880981445),(-46.92900085449219, -81.6449966430664, -43.694000244140625),(-47.27000045776367, -83.01000213623047, -43.49399948120117),(-47.70500183105469, -83.62999725341797, -44.79600143432617),(-49.13199996948242, -83.92400360107422, -44.75299835205078),(-47.60100173950195, -82.73999786376953, -46.02299880981445),(-46.290000915527344, -82.58599853515625, -46.52899932861328),(-48.512001037597656, -83.47599792480469, -46.97999954223633),(-47.96799850463867, -84.72100067138672, -47.375),(-49.6870002746582, -83.72000122070312, -46.04600143432617),(-50.590999603271484, -82.5719985961914, -46.012001037597656),(-50.654998779296875, -81.52100372314453, -45.125),(-51.59299850463867, -80.6449966430664, -45.40800094604492),(-52.19300079345703, -81.16400146484375, -46.55400085449219),(-53.26100158691406, -80.71800231933594, -47.35499954223633),(-53.959999084472656, -79.60900115966797, -47.10100173950195),(-53.59199905395508, -81.46099853515625, -48.4379997253418),(-52.902000427246094, -82.58399963378906, -48.68299865722656),(-51.887001037597656, -83.11399841308594, -48.000999450683594),(-51.58000183105469, -82.34600067138672, -46.9370002746582)]
cwwAU_U_list_atoms = ["P","OP1","OP2","O5'","C5'","C4'","O4'","C3'","O3'","C2'","O2'","C1'","N1","C2","O2","N3","C4","O4","C5","C6"]
cwwAU_U_list_atoms_elements = ["P","O","O","O","C","C","O","C","O","C","O","C","N","C","O","N","C","O","C","C"]
cwwAU_U_list_positions = [(-62.04399871826172, -78.197998046875, -52.75299835205078),(-63.35599899291992, -78.33399963378906, -53.4370002746582),(-61.986000061035156, -77.62999725341797, -51.382999420166016),(-61.347999572753906, -79.62899780273438, -52.72700119018555),(-61.56800079345703, -80.56199645996094, -53.777000427246094),(-60.268001556396484, -81.21900177001953, -54.17399978637695),(-59.19599914550781, -80.2249984741211, -54.268001556396484),(-59.689998626708984, -82.20500183105469, -53.178001403808594),(-60.340999603271484, -83.45899963378906, -53.125),(-58.26900100708008, -82.29499816894531, -53.6879997253418),(-58.19900131225586, -82.8759994506836, -54.974998474121094),(-57.983001708984375, -80.80400085449219, -53.79199981689453),(-57.66400146484375, -80.25, -52.470001220703125),(-56.564998626708984, -80.7750015258789, -51.81100082397461),(-55.832000732421875, -81.61399841308594, -52.30500030517578),(-56.35599899291992, -80.28099822998047, -50.54899978637695),(-57.10900115966797, -79.3290023803711, -49.89699935913086),(-56.82600021362305, -79.02300262451172, -48.737998962402344),(-58.20600128173828, -78.81199645996094, -50.659000396728516),(-58.439998626708984, -79.2750015258789, -51.88999938964844)]
twwUC_U_list_atoms = ["P","OP1","OP2","O5'","C5'","C4'","O4'","C3'","O3'","C2'","O2'","C1'","N1","C2","O2","N3","C4","O4","C5","C6"]
twwUC_U_list_atoms_elements = ["P","O","O","O","C","C","O","C","O","C","O","C","N","C","O","N","C","O","C","C"]
twwUC_U_list_positions = [(52.62300109863281, 163.0240020751953, 82.34400177001953),(51.7239990234375, 162.08799743652344, 81.61499786376953),(53.42900085449219, 164.01199340820312, 81.5770034790039),(51.78300094604492, 163.79200744628906, 83.45999908447266),(50.75199890136719, 163.1179962158203, 84.20899963378906),(50.0260009765625, 164.10000610351562, 85.0989990234375),(50.957000732421875, 164.63400268554688, 86.0719985961914),(49.45600128173828, 165.31300354003906, 84.38600158691406),(48.15599822998047, 165.03599548339844, 83.86499786376953),(49.45199966430664, 166.3730010986328, 85.48100280761719),(48.358001708984375, 166.2570037841797, 86.36199951171875),(50.731998443603516, 166.02000427246094, 86.24299621582031),(51.94499969482422, 166.73500061035156, 85.80999755859375),(51.97999954223633, 168.10499572753906, 85.95700073242188),(51.03900146484375, 168.75, 86.38800048828125),(53.1619987487793, 168.69700622558594, 85.58200073242188),(54.284000396728516, 168.07400512695312, 85.08399963378906),(55.29899978637695, 168.73300170898438, 84.87999725341797),(54.16299819946289, 166.66700744628906, 84.93399810791016),(53.02899932861328, 166.0590057373047, 85.28900146484375)]
twwUC_C_list_atoms = ["P","OP1","OP2","O5'","C5'","C4'","O4'","C3'","O3'","C2'","O2'","C1'","N1","C2","O2","N3","C4","N4","C5","C6"]
twwUC_C_list_atoms_elements = ["P","O","O","O","C","C","O","C","O","C","O","C","N","C","O","N","C","N","C","C"]
twwUC_C_list_positions = [(51.45399856567383, 178.2270050048828, 85.64199829101562),(51.52799987792969, 179.64500427246094, 85.23200225830078),(50.196998596191406, 177.45799255371094, 85.41799926757812),(52.66400146484375, 177.4499969482422, 84.95800018310547),(54.012001037597656, 177.9250030517578, 85.10099792480469),(54.97700119018555, 176.7689971923828, 85.08399963378906),(54.652000427246094, 175.8520050048828, 86.15599822998047),(54.93899917602539, 175.89599609375, 83.8489990234375),(55.71099853515625, 176.46200561523438, 82.81400299072266),(55.54800033569336, 174.60000610351562, 84.3550033569336),(56.957000732421875, 174.67799377441406, 84.4469985961914),(54.93899917602539, 174.52000427246094, 85.75399780273438),(53.6879997253418, 173.73199462890625, 85.80500030517578),(53.76300048828125, 172.32899475097656, 85.83999633789062),(54.87699890136719, 171.78500366210938, 85.81199645996094),(52.61399841308594, 171.60699462890625, 85.89900207519531),(51.43299865722656, 172.22999572753906, 85.91899871826172),(50.332000732421875, 171.48800659179688, 85.98400115966797),(51.32899856567383, 173.6479949951172, 85.87699890136719),(52.46799850463867, 174.35299682617188, 85.82099914550781)]
cww_AU_A = RNADiscrepancy.rawdata2nucleotide('A', cwwAU_A_list_atoms, cwwAU_A_list_atoms_elements, cwwAU_A_list_positions)
cww_AU_U = RNADiscrepancy.rawdata2nucleotide('U', cwwAU_U_list_atoms, cwwAU_U_list_atoms_elements, cwwAU_U_list_positions)
tww_UC_U = RNADiscrepancy.rawdata2nucleotide('U', twwUC_U_list_atoms, twwUC_U_list_atoms_elements, twwUC_U_list_positions)
tww_UC_C = RNADiscrepancy.rawdata2nucleotide('C', twwUC_C_list_atoms, twwUC_C_list_atoms_elements, twwUC_C_list_positions)
cww_AU = (cww_AU_A, cww_AU_U)
tww_UC = (tww_UC_U, tww_UC_C)
print(RNADiscrepancy.isodiscrepancy(cww_AU, tww_UC))
#10.189161019195952
```
