Metadata-Version: 2.4
Name: graphem-rapids
Version: 0.2.1
Summary: A graph embedding library with PyTorch and RAPIDS acceleration
Home-page: https://github.com/sashakolpakov/graphem-rapids
Author: Alexander Kolpakov (UATX), Igor Rivin (Temple University)
Author-email: Alexander Kolpakov <akolpakov@uaustin.org>, Igor Rivin <rivin@temple.edu>
Maintainer-email: Alexander Kolpakov <akolpakov@uaustin.org>
License: MIT
Project-URL: Homepage, https://github.com/sashakolpakov/graphem-rapids
Project-URL: Documentation, https://sashakolpakov.github.io/graphem-rapids/
Project-URL: Repository, https://github.com/sashakolpakov/graphem-rapids
Project-URL: Bug Reports, https://github.com/sashakolpakov/graphem-rapids/issues
Project-URL: Paper, https://arxiv.org/abs/2506.07435
Project-URL: Original GraphEm, https://github.com/sashakolpakov/graphem
Keywords: graph embedding,node influence,centrality measures,network analysis,force layout,PyTorch,CUDA,RAPIDS
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.0.0
Requires-Dist: numpy>=1.21.0
Requires-Dist: matplotlib>=3.5.0
Requires-Dist: networkx>=2.6.0
Requires-Dist: pandas>=1.3.0
Requires-Dist: plotly>=5.5.0
Requires-Dist: scipy>=1.7.0
Requires-Dist: ndlib>=5.1.0
Requires-Dist: loguru>=0.6.0
Requires-Dist: requests>=2.25.0
Requires-Dist: line_profiler>=4.0.0
Requires-Dist: snakeviz>=2.2.0
Requires-Dist: tensorboard>=2.10.0
Requires-Dist: tqdm>=4.66.0
Requires-Dist: pyinstrument>=5.0.0
Requires-Dist: tabulate>=0.9.0
Provides-Extra: cuda
Requires-Dist: cupy-cuda12x>=10.0.0; extra == "cuda"
Requires-Dist: pykeops>=2.1.0; extra == "cuda"
Provides-Extra: rapids
Requires-Dist: cudf-cu12; extra == "rapids"
Requires-Dist: cuml-cu12; extra == "rapids"
Requires-Dist: cuvs-cu12; extra == "rapids"
Requires-Dist: cupy-cuda12x>=10.0.0; extra == "rapids"
Requires-Dist: pykeops>=2.1.0; extra == "rapids"
Provides-Extra: docs
Requires-Dist: sphinx>=4.0.0; extra == "docs"
Requires-Dist: sphinx_rtd_theme>=1.0.0; extra == "docs"
Requires-Dist: sphinx-autodoc-typehints>=1.12.0; extra == "docs"
Requires-Dist: myst-parser>=0.17.0; extra == "docs"
Provides-Extra: test
Requires-Dist: pytest>=6.0.0; extra == "test"
Requires-Dist: pytest-cov>=2.12.0; extra == "test"
Requires-Dist: pytest-xdist>=2.3.0; extra == "test"
Provides-Extra: dev
Requires-Dist: pytest>=6.0.0; extra == "dev"
Requires-Dist: pytest-cov>=2.12.0; extra == "dev"
Requires-Dist: pytest-xdist>=2.3.0; extra == "dev"
Requires-Dist: sphinx>=4.0.0; extra == "dev"
Requires-Dist: sphinx_rtd_theme>=1.0.0; extra == "dev"
Requires-Dist: sphinx-autodoc-typehints>=1.12.0; extra == "dev"
Requires-Dist: myst-parser>=0.17.0; extra == "dev"
Requires-Dist: black>=22.0.0; extra == "dev"
Requires-Dist: isort>=5.10.0; extra == "dev"
Requires-Dist: flake8>=4.0.0; extra == "dev"
Requires-Dist: mypy>=0.950; extra == "dev"
Provides-Extra: all
Requires-Dist: cudf-cu12; extra == "all"
Requires-Dist: cuml-cu12; extra == "all"
Requires-Dist: cuvs-cu12; extra == "all"
Requires-Dist: cupy-cuda12x>=10.0.0; extra == "all"
Requires-Dist: pykeops>=2.1.0; extra == "all"
Requires-Dist: sphinx>=4.0.0; extra == "all"
Requires-Dist: sphinx_rtd_theme>=1.0.0; extra == "all"
Requires-Dist: sphinx-autodoc-typehints>=1.12.0; extra == "all"
Requires-Dist: myst-parser>=0.17.0; extra == "all"
Requires-Dist: pytest>=6.0.0; extra == "all"
Requires-Dist: pytest-cov>=2.12.0; extra == "all"
Requires-Dist: pytest-xdist>=2.3.0; extra == "all"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

<p align="center">
  <img src="images/logo.png" alt="graphem rapids logo" height="120"/>
</p>

<h1 align="center">GraphEm Rapids: High-Performance Graph Embedding</h1>

<p align="center">
  <a href="https://opensource.org/licenses/MIT">
    <img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License: MIT"/>
  </a>
  <a href="https://www.python.org/downloads/">
    <img src="https://img.shields.io/badge/python-3.8+-blue.svg" alt="Python 3.8+"/>
  </a>
  <a href="https://pytorch.org/">
    <img src="https://img.shields.io/badge/PyTorch-2.0+-red.svg" alt="PyTorch 2.0+"/>
  </a>
  <a href="https://rapids.ai/">
    <img src="https://img.shields.io/badge/RAPIDS-cuVS-76B900.svg" alt="RAPIDS cuVS"/>
  </a>
  <a href="https://pepy.tech/projects/graphem-rapids">
    <img src="https://img.shields.io/pepy/dt/graphem-rapids" alt="Pepy Total Downloads"/>
  </a>
</p>

GraphEm Rapids is a high-performance implementation of [GraphEm](https://github.com/sashakolpakov/graphem) built on PyTorch and RAPIDS cuVS. It combines a force-directed layout with geometric intersection detection to produce embeddings that correlate strongly with centrality measures.
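
As a rough illustration of what "force-directed" means, the sketch below applies one generic spring-attraction step along each edge. It is not the library's update rule: the function name `spring_step` and the parameters `rest_length`, `stiffness`, and `step` are hypothetical, and intersection detection is left out entirely.

```python
import math


def spring_step(positions, edges, rest_length=1.0, stiffness=0.2, step=0.1):
    """Return new positions after one Hookean attraction step along each edge.

    Generic force-directed sketch; all parameter names are illustrative.
    """
    new_positions = [list(p) for p in positions]
    for u, v in edges:
        delta = [b - a for a, b in zip(positions[u], positions[v])]
        dist = math.hypot(*delta)
        if dist == 0.0:
            continue  # coincident endpoints: direction is undefined, skip
        # Positive when the edge is stretched beyond its rest length.
        magnitude = stiffness * (dist - rest_length)
        for axis, component in enumerate(delta):
            shift = step * magnitude * component / dist
            new_positions[u][axis] += shift  # u moves toward v
            new_positions[v][axis] -= shift  # v moves toward u
    return new_positions


points = [[0.0, 0.0], [3.0, 0.0]]
print(spring_step(points, edges=[(0, 1)]))
```

Repeating such steps pulls stretched edges toward their rest length; a full layout would also need a repulsive or collision term to keep vertices apart.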

## Features

- **Unified API**: SciPy sparse adjacency matrices, sklearn-style parameters (`n_components`, `n_neighbors`)
- **Multiple Backends**: PyTorch (1K-100K vertices), RAPIDS cuVS (100K+ vertices), automatic selection
- **GPU Acceleration**: CUDA support, memory-efficient chunking, automatic CPU fallback
- **Graph Generators**: Erdős-Rényi, scale-free, SBM, bipartite, Delaunay, and more
- **Influence Maximization**: Fast embedding-based seed selection

## Installation

```bash
pip install graphem-rapids                # PyTorch backend
pip install "graphem-rapids[cuda]"        # + CUDA support
pip install "graphem-rapids[rapids]"      # + RAPIDS cuVS
pip install "graphem-rapids[all]"         # Everything (quotes keep zsh from expanding the brackets)
```
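
After installing, it can help to confirm which optional dependencies are actually present. The sketch below is a small standalone check, not part of the package: `extras_status` is a hypothetical helper, and the module lists are taken from the `cuda` and `rapids` extras declared in the package metadata.

```python
from importlib.util import find_spec

# Import names of the packages pulled in by each extra.
OPTIONAL_MODULES = {
    "cuda": ("cupy", "pykeops"),
    "rapids": ("cudf", "cuml", "cuvs", "cupy", "pykeops"),
}


def extras_status(optional=OPTIONAL_MODULES):
    """Map each extra to True when all of its modules can be located."""
    return {
        extra: all(find_spec(name) is not None for name in modules)
        for extra, modules in optional.items()
    }


print(extras_status())
```

`find_spec` only locates a module without importing it, so a `True` here does not guarantee that a working GPU driver is available at run time.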

## Quick Start

```python
import graphem_rapids as gr

# Generate graph (returns sparse adjacency matrix)
adjacency = gr.generate_er(n=1000, p=0.01)

# Create embedder (automatic backend selection)
embedder = gr.create_graphem(adjacency, n_components=3)

# Run layout
embedder.run_layout(num_iterations=50)

# Get positions and visualize
positions = embedder.get_positions()  # numpy array (n, d)
embedder.display_layout()             # 2D or 3D plot
```

## Backend Selection

### Automatic (Recommended)
```python
embedder = gr.create_graphem(adjacency, n_components=3)
```

### Explicit PyTorch
```python
embedder = gr.GraphEmbedderPyTorch(
    adjacency, n_components=3, device='cuda',
    L_min=1.0, k_attr=0.2, k_inter=0.5, n_neighbors=10,
    batch_size=None  # Automatic (or manual: 1024)
)
```

### Explicit RAPIDS cuVS
```python
embedder = gr.GraphEmbedderCuVS(
    adjacency, n_components=3,
    index_type='auto',  # or 'brute_force', 'ivf_flat', 'ivf_pq'
    sample_size=1024, batch_size=None
)
```

**Index Types**: `brute_force` (<100K), `ivf_flat` (100K-1M), `ivf_pq` (>1M vertices)
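
The ranges above can be written down as a small lookup. This is only a sketch of the stated guideline: `suggest_index_type` is a hypothetical helper, the handling of the exact boundaries (100K and 1M) is an assumption, and what `index_type='auto'` does internally may differ.

```python
def suggest_index_type(n_vertices):
    """Pick an index type from the vertex-count ranges listed above."""
    if n_vertices < 0:
        raise ValueError("n_vertices must be non-negative")
    if n_vertices < 100_000:
        return "brute_force"
    if n_vertices <= 1_000_000:  # assumption: 1M itself still counts as ivf_flat
        return "ivf_flat"
    return "ivf_pq"


for size in (5_000, 250_000, 5_000_000):
    print(size, suggest_index_type(size))
```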

### Check Backends
```python
info = gr.get_backend_info()
print(f"CUDA: {info['cuda_available']}, Recommended: {info['recommended_backend']}")
```

## Configuration

**Environment Variables:**
```bash
export GRAPHEM_BACKEND=pytorch        # Force backend
export GRAPHEM_PREFER_GPU=true        # Prefer GPU
export GRAPHEM_MEMORY_LIMIT=8         # GB
export GRAPHEM_VERBOSE=true
```
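
To see what these settings amount to from Python, the sketch below reads the same four variables into a dictionary. How the library itself parses them is not shown here, so the accepted truthy spellings and the defaults for unset variables are assumptions, and `read_graphem_env` is a hypothetical helper.

```python
import os

# Assumption: these spellings count as "true"; the library may accept others.
TRUTHY = {"1", "true", "yes", "on"}


def read_graphem_env(environ=None):
    """Collect the GRAPHEM_* settings from a mapping (default: os.environ)."""
    env = os.environ if environ is None else environ
    limit = env.get("GRAPHEM_MEMORY_LIMIT")
    return {
        "backend": env.get("GRAPHEM_BACKEND"),  # None: no backend forced
        "prefer_gpu": env.get("GRAPHEM_PREFER_GPU", "").strip().lower() in TRUTHY,
        "memory_limit_gb": float(limit) if limit else None,
        "verbose": env.get("GRAPHEM_VERBOSE", "").strip().lower() in TRUTHY,
    }


print(read_graphem_env())
```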

**Programmatic:**
```python
from graphem_rapids.utils.backend_selection import BackendConfig, get_optimal_backend

config = BackendConfig(n_vertices=50000, force_backend='cuvs', memory_limit=16.0)
backend = get_optimal_backend(config)
embedder = gr.create_graphem(adjacency, backend=backend)
```

## Graph Generators

All generators return SciPy sparse adjacency matrices:

```python
# Random
gr.generate_er(n=1000, p=0.01, seed=42)
gr.generate_random_regular(n=100, d=3, seed=42)

# Scale-free & small-world
gr.generate_ba(n=300, m=3, seed=42)             # Barabási-Albert
gr.generate_ws(n=1000, k=6, p=0.3, seed=42)     # Watts-Strogatz
gr.generate_scale_free(n=100, seed=42)

# Community structures
gr.generate_sbm(n_per_block=75, num_blocks=4, p_in=0.15, p_out=0.01, seed=42)
gr.generate_caveman(l=10, k=10)
gr.generate_relaxed_caveman(l=10, k=10, p=0.1, seed=42)

# Bipartite
gr.generate_bipartite_graph(n_top=50, n_bottom=100, p=0.2, seed=42)
gr.generate_complete_bipartite_graph(n_top=50, n_bottom=100)

# Geometric
gr.generate_geometric(n=100, radius=0.2, dim=2, seed=42)
gr.generate_delaunay_triangulation(n=100, seed=42)
gr.generate_road_network(width=30, height=30)   # 2D grid

# Trees
gr.generate_balanced_tree(r=2, h=10)
```
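
For readers unfamiliar with the Erdős-Rényi model behind `generate_er`, the sketch below samples a G(n, p) edge list with the standard library only. It illustrates the model, not the library's generator: `sample_er_edges` is a hypothetical name, and it returns an edge list where `generate_er` returns a sparse matrix.

```python
import random
from itertools import combinations


def sample_er_edges(n, p, seed=None):
    """Sample an undirected G(n, p) graph as a list of (u, v) pairs with u < v."""
    rng = random.Random(seed)
    # Each of the n*(n-1)/2 possible edges is kept independently with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


edges = sample_er_edges(n=100, p=0.05, seed=42)
print(len(edges), "edges sampled")
```

The loop visits every vertex pair, so it is quadratic in `n` and only suitable for small illustrative graphs.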

## Influence Maximization

```python
import networkx as nx
import graphem_rapids as gr

adjacency = gr.generate_er(n=1000, p=0.01)
embedder = gr.create_graphem(adjacency, n_components=3)
embedder.run_layout(num_iterations=50)

# Fast: embedding-based selection
seeds = gr.graphem_seed_selection(embedder, k=10)

# Evaluate with the Independent Cascade model
G = nx.from_scipy_sparse_array(adjacency)
influence, _ = gr.ndlib_estimated_influence(G, seeds, p=0.1, iterations_count=100)

# Compare with the greedy baseline (slow; a strong approximation, not guaranteed optimal)
greedy_seeds, _ = gr.greedy_seed_selection(G, k=10, p=0.1)
```
```
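
The Independent Cascade model used for evaluation can be sketched in a few lines of plain Python. This is a minimal illustration of the model under simple assumptions (undirected adjacency lists, one shared activation probability `p`), not the NDlib implementation; `independent_cascade` and `estimate_influence` are hypothetical names.

```python
import random


def independent_cascade(neighbors, seeds, p, rng):
    """Run one cascade and return the set of activated nodes."""
    active = set(seeds)
    frontier = sorted(active)
    while frontier:
        next_frontier = []
        for node in frontier:
            for nbr in neighbors[node]:
                # A newly active node gets one chance to activate each inactive neighbor.
                if nbr not in active and rng.random() < p:
                    active.add(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return active


def estimate_influence(neighbors, seeds, p, runs=100, seed=0):
    """Average the cascade size over several Monte Carlo runs."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(neighbors, seeds, p, rng)) for _ in range(runs))
    return total / runs


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # a 4-node path graph
print(estimate_influence(path, seeds=[0], p=0.5))
```

Averaging over many runs is what makes greedy selection expensive, since it re-estimates influence for every candidate seed.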

## Advanced

### Memory Management
```python
import graphem_rapids as gr
from graphem_rapids.utils.memory_management import MemoryManager, get_gpu_memory_info

mem_info = get_gpu_memory_info()
print(f"GPU: {mem_info['free']:.1f}GB free / {mem_info['total']:.1f}GB total")

adjacency = gr.generate_er(n=1000, p=0.01)
with MemoryManager(cleanup_on_exit=True):
    embedder = gr.create_graphem(adjacency)
    embedder.run_layout(num_iterations=50)
```

### Batch Size Tuning
```python
import graphem_rapids as gr
from graphem_rapids.utils.memory_management import get_optimal_chunk_size

adjacency = gr.generate_er(n=1000, p=0.01)

# Automatic (recommended)
embedder = gr.GraphEmbedderPyTorch(adjacency, batch_size=None)

# Manual
embedder = gr.GraphEmbedderPyTorch(adjacency, batch_size=1024)

# Programmatic: size the batch for the graph actually being embedded
optimal = get_optimal_chunk_size(n_vertices=adjacency.shape[0], n_components=3, backend='pytorch')
embedder = gr.GraphEmbedderPyTorch(adjacency, n_components=3, batch_size=optimal)
```
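
To make the role of `batch_size` concrete, the sketch below splits a range of vertex indices into consecutive chunks, which is the usual way to bound peak memory when processing vertices in batches. It illustrates the general idea only; `iter_batches` is a hypothetical helper and the library's internal chunking may work differently.

```python
def iter_batches(n_vertices, batch_size):
    """Yield (start, stop) index pairs covering range(n_vertices) in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_vertices, batch_size):
        # The last chunk may be shorter than batch_size.
        yield start, min(start + batch_size, n_vertices)


for start, stop in iter_batches(n_vertices=10, batch_size=4):
    print(f"process vertices [{start}, {stop})")
```

A smaller batch lowers peak memory at the cost of more iterations, which is the trade-off the automatic setting is meant to handle.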

## Testing & Benchmarking

```bash
pytest                                          # Run all tests
pytest tests/test_pytorch_backend.py            # Specific backend
python benchmarks/run_benchmarks.py             # Performance tests
python benchmarks/compare_backends.py --sizes 1000,10000,100000
```

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, testing, and contribution guidelines.

## Citation

[![arXiv](https://img.shields.io/badge/arXiv-2506.07435-b31b1b.svg)](https://arxiv.org/abs/2506.07435)

```bibtex
@misc{kolpakov-rivin-2025fast,
  title={Fast Geometric Embedding for Node Influence Maximization},
  author={Kolpakov, Alexander and Rivin, Igor},
  year={2025},
  eprint={2506.07435},
  archivePrefix={arXiv},
  primaryClass={cs.SI},
  url={https://arxiv.org/abs/2506.07435}
}
```

## License

MIT License. See the [LICENSE](LICENSE) file.
