Metadata-Version: 2.4
Name: gpu-coloc
Version: 0.1.2
Summary: Ultra-fast GPU-enabled Bayesian colocalisation
Home-page: https://github.com/mjesse-github/gpu-coloc
Author: Mihkel Jesse
License: MIT
Requires-Python: >=3.12
Description-Content-Type: text/markdown
Requires-Dist: filelock>=3.17.0
Requires-Dist: fsspec>=2025.2.0
Requires-Dist: Jinja2>=3.1.5
Requires-Dist: MarkupSafe>=3.0.2
Requires-Dist: mpmath>=1.3.0
Requires-Dist: networkx>=3.4.2
Requires-Dist: numpy>=2.2.3
Requires-Dist: pandas>=2.2.3
Requires-Dist: pyarrow>=19.0.0
Requires-Dist: python-dateutil>=2.9.0.post0
Requires-Dist: pytz>=2025.1
Requires-Dist: six>=1.17.0
Requires-Dist: sympy>=1.13.1
Requires-Dist: torch>=2.0
Requires-Dist: tqdm>=4.67.1
Requires-Dist: typing_extensions>=4.12.2
Requires-Dist: tzdata>=2025.1
Dynamic: author
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# gpu-coloc

**gpu-coloc** is a GPU-accelerated implementation of Bayesian colocalization (COLOC). It produces results identical to R's `coloc.bf_bf` while running approximately 1000 times faster.

## Citation

If you use **gpu-coloc**, please cite: *(citation placeholder)*

## Installation

Install via pip (Python ≥3.12):

```bash
pip install gpu-coloc
```

## Verify Installation

To confirm that the installation works, clone this repository and run the test script:

```bash
git clone https://github.com/mjesse-github/gpu-coloc
cd gpu-coloc
bash test.sh
```

This creates an `example/` directory containing an `example_results.tsv` file.

## Workflow

**Note:** Paths assume gpu-coloc is in the working directory; adjust paths if necessary.

### Variant Naming Convention

Variants must be named in the format `chr[chromosome]_[position]_[ref]_[alt]`, for example `chrX_153412224_C_A`. Rename all variants before Step 1, and label the X chromosome as `X`, not `23`.
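
As an illustration of the convention above, the following minimal sketch builds a variant name from its parts. The helper `to_variant_id` is hypothetical and not part of gpu-coloc; upper-casing the alleles is an assumption based on the examples in this README.

```python
def to_variant_id(chromosome, position, ref, alt):
    """Build a name of the form chr[chromosome]_[position]_[ref]_[alt].

    Hypothetical helper for illustration; not part of the gpu-coloc API.
    """
    chrom = str(chromosome).strip()
    # Accept inputs with or without a leading "chr".
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    # The README asks for chromosome X rather than 23.
    if chrom == "23":
        chrom = "X"
    # Assumption: alleles are written in upper case, as in the examples.
    return f"chr{chrom.upper()}_{int(position)}_{ref.upper()}_{alt.upper()}"


if __name__ == "__main__":
    print(to_variant_id(23, 153412224, "C", "A"))
```

Applying such a function to every variant before writing the signal files keeps the names consistent across the datasets being compared.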

### 1. Prepare Signals and Summary Files

* **Signal Files:** Save each signal as `[signal].pickle`, containing its variants and their log Bayes factors (lbf).

Example format:

```
variant	chrX_153412224_C_A	chrX_153412528_C_T	...
lbf	-0.060991	-1.508802	...
```

* **Summary File:** Tab-separated, structured as:

```
signal	chromosome	location_min	location_max	signal_strength	lead_variant
QTD000141_ENSG00000013563_L1	X	153412224	155341332	12.1069377174147	chrX_154403855_T_G
...
```

Naming examples:

* Summary: `gwas_summary.tsv`
* Signals directory: `gwas_signals/[signal].pickle`

See the scripts in `summary_and_signals_examples/` for reference; they may need modification for your data.
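
The summary columns can be derived from a signal's variants, as in the rough sketch below. It is not taken from the gpu-coloc scripts: the helpers `summary_row` and `write_summary` are hypothetical, and treating `signal_strength` as the largest lbf and `lead_variant` as the variant that attains it is an assumption to be replaced with whatever your pipeline defines.

```python
import csv
import io

SUMMARY_COLUMNS = [
    "signal", "chromosome", "location_min",
    "location_max", "signal_strength", "lead_variant",
]


def summary_row(signal, lbf_by_variant):
    """Derive one summary row from a {variant_name: lbf} mapping.

    Hypothetical helper; variant names follow chr[chromosome]_[position]_[ref]_[alt].
    """
    positions = [int(name.split("_")[1]) for name in lbf_by_variant]
    # Assumption: the lead variant is the one with the largest lbf.
    lead = max(lbf_by_variant, key=lbf_by_variant.get)
    return {
        "signal": signal,
        "chromosome": lead.split("_")[0].removeprefix("chr"),
        "location_min": min(positions),
        "location_max": max(positions),
        # Assumption: signal strength is taken as the largest lbf.
        "signal_strength": lbf_by_variant[lead],
        "lead_variant": lead,
    }


def write_summary(rows, handle):
    """Write rows as a tab-separated table with the header shown above."""
    writer = csv.DictWriter(
        handle, fieldnames=SUMMARY_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    example = {"chrX_153412224_C_A": -0.060991, "chrX_153412528_C_T": -1.508802}
    buffer = io.StringIO()
    write_summary([summary_row("example_signal", example)], buffer)
    print(buffer.getvalue())
```

In practice, `write_summary` would be given an open file such as `gwas_summary.tsv` instead of an in-memory buffer.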

### 2. Format Data

```bash
gpu-coloc --format --input [path_to_signals] --input_summary [summary_file] --output [output_folder]
```

### 3. Run Colocalization

```bash
gpu-coloc --run --dir1 [formatted_dataset_1] --dir2 [formatted_dataset_2] --results [results_output] --p12 1e-6 --H4 0.8
```
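
For orientation, the sketch below shows how the standard COLOC method combines two signals' log Bayes factors into posterior probabilities for hypotheses H0 to H4, and where the prior `p12` enters. It is a plain-Python illustration of the published method, not gpu-coloc's GPU code. The function name `coloc_posteriors` and the `p1`/`p2` defaults of 1e-4 are assumptions, as is the reading of `--H4` as a minimum posterior probability of H4 for reporting a pair.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def logdiffexp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if b >= a:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(lbf1, lbf2, p1=1e-4, p2=1e-4, p12=1e-6):
    """Posterior probabilities [H0, H1, H2, H3, H4] for two aligned lbf lists.

    lbf1[i] and lbf2[i] must refer to the same variant.
    """
    s1 = logsumexp(lbf1)
    s2 = logsumexp(lbf2)
    s12 = logsumexp([a + b for a, b in zip(lbf1, lbf2)])
    log_h = [
        0.0,                                                     # H0: no association
        math.log(p1) + s1,                                       # H1: trait 1 only
        math.log(p2) + s2,                                       # H2: trait 2 only
        math.log(p1) + math.log(p2) + logdiffexp(s1 + s2, s12),  # H3: distinct variants
        math.log(p12) + s12,                                     # H4: shared variant
    ]
    total = logsumexp(log_h)
    return [math.exp(h - total) for h in log_h]


if __name__ == "__main__":
    pp = coloc_posteriors([10.0, 0.0, 0.0], [10.0, 0.0, 0.0])
    # Assumed reading of --H4 0.8: keep pairs whose posterior for H4 is at least 0.8.
    print(pp[4] >= 0.8)
```

A smaller `p12` makes H4 harder to reach, so fewer pairs pass a given threshold.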
