Metadata-Version: 2.1
Name: NoMoCLIP
Version: 1.0.0
Summary: Interpretable Modeling of RNA–Protein Interactions from eCLIP-Seq Profiles for Motif-Free RBPs
Author-email: "Yuning Yang, Qiongyao Yu, Xinyuan Zhao and Xiangtao Li" <yangyn533@nenu.edu.cn>
License: MIT License
        
        Copyright (c) 2025 yangyn533
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Keywords: Deep learning,RNA-binding protein,Post-transcriptional regulation
Classifier: Development Status :: 5 - Production/Stable
Classifier: Operating System :: OS Independent
Classifier: License :: OSI Approved :: MIT License
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Requires-Python: >=3.7.16
Description-Content-Type: text/markdown
License-File: LICENSE

# NoMoCLIP
Interpretable Modeling of RNA–Protein Interactions from eCLIP‑Seq Profiles for Motif‑Free RBPs

## 1. Data availability
[NoMoCLIP_dataset](https://doi.org/10.6084/m9.figshare.30178051)

## 2. Environment Setup
#### 2.1 Create and activate a new virtual environment
```
conda create -n NoMoCLIP python=3.7.16 
conda activate NoMoCLIP
```
#### 2.2 Install the package and other requirements
```
pip install NoMoCLIP
nomoclip install
```
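Before moving on to data processing, it can help to confirm that the interpreter meets the package's `Requires-Python` bound (`>=3.7.16`) and that the `nomoclip` command is on `PATH`. The following is a minimal sketch using only the standard library; `check_setup` is a hypothetical helper and not part of NoMoCLIP.

```python
import shutil
import sys

# Lower bound taken from the package metadata (Requires-Python: >=3.7.16).
MIN_PYTHON = (3, 7, 16)


def check_setup(executable="nomoclip"):
    """Report whether the interpreter is new enough and the CLI is on PATH."""
    return {
        "python_ok": sys.version_info[:3] >= MIN_PYTHON,
        # shutil.which returns None when the executable cannot be found.
        "cli_path": shutil.which(executable),
    }


if __name__ == "__main__":
    status = check_setup()
    print("Python >= 3.7.16:", status["python_ok"])
    print("nomoclip found at:", status["cli_path"])
```

If `cli_path` is `None`, the most likely cause is that the `NoMoCLIP` environment is not activated in the current shell.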
## 3. Process data

#### 3.1 Sequential encoding
```
nomoclip run position_inf  --set_path <PATH_TO_YOUR_DATA>  --out_path <PATH_TO_YOUR_OUTPUT_DIRECTORY>
```

#### 3.2 Structural encoding
This step requires the **RNAplfold** tool, which runs in a **Python 2.7 environment**. Set the `--env` parameter to the name of the local environment where RNAplfold is installed.
```
nomoclip run structure_inf  --env <NAME_OF_YOUR_ENV>  --set_path <PATH_TO_YOUR_DATA>  --out_path <PATH_TO_YOUR_OUTPUT_DIRECTORY>
```

#### 3.3 Semantic encoding
```
nomoclip run attention_graph \
  --kmer 1 \
  --set_path <PATH_TO_YOUR_DATA> \
  --out_path <PATH_TO_YOUR_OUTPUT_DIRECTORY> \
  --model_type <PATH_TO_YOUR_NLP_MODEL> \
  --maxlen 101 \
  --device cuda:1 \
  --device1 cuda:1 \
  --device2 cuda:1 
```
#### 3.4 Functional properties

This step requires [corain](https://github.com/idrblab/corain?tab=readme-ov-file#requirements-and-installment). Set the `--env` parameter to the name of the local corain environment.

```
nomoclip run instinct_inf \
  --env <NAME_OF_YOUR_ENV> \
  --base_path <PATH_TO_YOUR_DATA> \
  --set_path <PATH_TO_YOUR_INTERMEDIATE_OUTPUT_DIRECTORY> \
  --out_path <PATH_TO_YOUR_OUTPUT_DIRECTORY> \
  --method_path <PATH_TO_YOUR_CORAIN_DIRECTORY> \
  --num 2
```
**Note:** Run this command once for each `--num` value in `[2, 3, 5, 7, 10]`.
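One way to cover every `--num` value from the note above is a small driver script. The sketch below only assembles the `instinct_inf` command shown earlier and prints it. It executes the command only when `dry_run=False`. The helper names and the placeholder paths are hypothetical. Whether each run needs its own `--out_path` is not stated here, so the sketch reuses a single one.

```python
import shlex
import subprocess

# Values listed in the note above.
NUM_VALUES = [2, 3, 5, 7, 10]


def instinct_commands(env, base_path, set_path, out_path, method_path):
    """Build one `nomoclip run instinct_inf` argument list per --num value."""
    return [
        [
            "nomoclip", "run", "instinct_inf",
            "--env", env,
            "--base_path", base_path,
            "--set_path", set_path,
            "--out_path", out_path,
            "--method_path", method_path,
            "--num", str(num),
        ]
        for num in NUM_VALUES
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(shlex.quote(part) for part in cmd))
        if not dry_run:
            # check=True stops the loop at the first failing run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder values; replace them with your own environment and paths.
    commands = instinct_commands(
        env="corain_env",
        base_path="data",
        set_path="intermediate",
        out_path="features",
        method_path="corain",
    )
    run_all(commands, dry_run=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.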

## 4. Training Process
```
nomoclip run model_train \
  --base_path <PATH_TO_YOUR_DATA_DIRECTORY> \
  --set_path <PATH_TO_YOUR_FEATURE_DIRECTORY> \
  --out_path <PATH_TO_YOUR_OUTPUT_DIRECTORY> \
  --fold 5 \
  --gpu_id 1
```

## 5. Prediction
```
nomoclip run model_predict \
  --set_path <PATH_TO_YOUR_FEATURE_DIRECTORY> \
  --out_path <PATH_TO_YOUR_OUTPUT_DIRECTORY> \
  --model_path <PATH_TO_YOUR_MODEL> \
  --gpu_id 1
```
## 6. Motif analysis

Motif extraction requires the installation of the **[MEME Suite](https://meme-suite.org/meme/doc/download.html)** package.

#### 6.1 Sequential motifs

```
nomoclip run seq_motifs \
  --layer <THE_LAYER_OF_MODEL_YOU_SELECTED> \
  --set_path <PATH_TO_YOUR_FEATURE_DIRECTORY> \
  --out_path <PATH_TO_YOUR_OUTPUT_DIRECTORY> \
  --model_path <PATH_TO_YOUR_MODEL> \
  --pwm_path <PATH_TO_YOUR_PWM_FILE> \
  --motif_size 7 \
  --gpu_id 1
```

#### 6.2 Structural motifs

```
nomoclip run structure_motifs \
  --layer <THE_LAYER_OF_MODEL_YOU_SELECTED> \
  --set_path <PATH_TO_YOUR_FEATURE_DIRECTORY> \
  --out_path <PATH_TO_YOUR_OUTPUT_DIRECTORY> \
  --model_path <PATH_TO_YOUR_MODEL> \
  --motif_size 7 \
  --gpu_id 1
```
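The two motif commands above share every flag except `--pwm_path`, which appears only for `seq_motifs`. The sketch below captures that difference in one hypothetical helper, `motif_command`, and adds a simple check for the MEME Suite. It assumes that a MEME Suite installation places a `meme` executable on `PATH`. Which MEME tools NoMoCLIP actually invokes is not stated here.

```python
import shutil

MOTIF_KINDS = ("seq_motifs", "structure_motifs")


def motif_command(kind, layer, set_path, out_path, model_path,
                  pwm_path=None, motif_size=7, gpu_id=1):
    """Build the argument list for `seq_motifs` or `structure_motifs`."""
    if kind not in MOTIF_KINDS:
        raise ValueError("kind must be one of %s" % (MOTIF_KINDS,))
    cmd = [
        "nomoclip", "run", kind,
        "--layer", str(layer),
        "--set_path", set_path,
        "--out_path", out_path,
        "--model_path", model_path,
    ]
    # Only the sequential variant is shown with --pwm_path in the README.
    if kind == "seq_motifs" and pwm_path is not None:
        cmd += ["--pwm_path", pwm_path]
    cmd += ["--motif_size", str(motif_size), "--gpu_id", str(gpu_id)]
    return cmd


def meme_available(tool="meme"):
    """Assumption: the MEME Suite exposes `meme` on PATH once installed."""
    return shutil.which(tool) is not None


if __name__ == "__main__":
    print("MEME Suite found:", meme_available())
    # Placeholder layer and paths; replace them with your own.
    print(motif_command("seq_motifs", 1, "features", "motifs", "model.pth",
                        pwm_path="pwm.txt"))
    print(motif_command("structure_motifs", 1, "features", "motifs", "model.pth"))
```

The resulting lists can be passed to `subprocess.run(cmd, check=True)` once the paths are real.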

## 7. High attention regions

```
nomoclip run high_attention_region \
  --set_path <PATH_TO_YOUR_FEATURE_DIRECTORY> \
  --out_path <PATH_TO_YOUR_OUTPUT_DIRECTORY> \
  --model_path <PATH_TO_YOUR_MODEL> \
  --gpu_id 1
```

