Metadata-Version: 2.4
Name: r-bridge
Version: 0.8.0
Summary: Call a persistent R interpreter from Python
Project-URL: Homepage, https://gitlab.com/emzed3/r_bridge
Project-URL: Documentation, https://r-bridge.readthedocs.io
Project-URL: Repository, https://gitlab.com/emzed3/r_bridge
Project-URL: Issues, https://gitlab.com/emzed3/r_bridge/-/issues
Project-URL: Changelog, https://gitlab.com/emzed3/r_bridge/-/releases
Author-email: Uwe Schmitt <uwe.schmitt@id.ethz.ch>
License: MIT License
        
        Copyright (c) 2026 Uwe Schmitt
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: R,bridge,data science,interoperability,numpy,pandas,statistics,subprocess
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Provides-Extra: dev
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest-timeout>=2.3; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: zensical; extra == 'dev'
Description-Content-Type: text/markdown

# 🌉 r_bridge

> Call a persistent R interpreter from Python — with a Pythonic API, full numpy/pandas support, and zero ceremony.

[![Pipeline](https://gitlab.com/emzed3/r_bridge/badges/main/pipeline.svg)](https://gitlab.com/emzed3/r_bridge/-/pipelines)
[![Release](https://gitlab.com/emzed3/r_bridge/-/badges/release.svg)](https://gitlab.com/emzed3/r_bridge/-/releases)
[![PyPI](https://img.shields.io/pypi/v/r-bridge)](https://pypi.org/project/r-bridge/)
[![Python](https://img.shields.io/pypi/pyversions/r-bridge)](https://pypi.org/project/r-bridge/)
[![License](https://img.shields.io/pypi/l/r-bridge)](LICENSE)
[![Coverage](https://gitlab.com/emzed3/r_bridge/badges/main/coverage.svg)](https://gitlab.com/emzed3/r_bridge/-/pipelines)

📚 [Documentation](https://r-bridge.readthedocs.io) · 🔍 [Example script](example.py)

```python
from r_bridge import RBridge

with RBridge() as r:
    r.x = [1, 2, 3, 4, 5]
    print(r.mean(r.x))   # 3.0
```

---

## ✨ Why r_bridge?

Python and R each have unique strengths. Rather than rewriting R code or
shelling out to `Rscript` for every call, r_bridge keeps a **single R process
alive** for the lifetime of your session. Calls are fast, state is shared, and
the API feels natural on both sides.

- 🚀 **No subprocess-per-call overhead** — R starts once, stays running
- 🔄 **Automatic type conversion** — Python lists, numpy arrays, pandas DataFrames, datetimes all round-trip transparently
- 🔒 **Thread-safe** — a single lock serialises concurrent calls; safe to use from threads
- 🛡️ **Robust** — a sentinel-prefixed protocol means stray `cat()` output never corrupts the stream

---

## 📦 Installation

Requires Python ≥ 3.11 and an R installation whose `Rscript` executable is on your `PATH`.

```bash
pip install r-bridge
```

> **🪟 Windows note:** R 4.2+ (UCRT build) is required for correct UTF-8 handling.

---

## 🚀 Quick start

```python
from r_bridge import RBridge
import pandas as pd
import numpy as np

with RBridge() as r:
    # Set and get variables
    r.name = "world"
    print(r.paste("hello", r.name))   # "hello world"

    # Vectors and numpy arrays
    r.v = np.linspace(0, 1, 100)
    print(r.mean(r.v))                # 0.5

    # DataFrames
    r.df = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})
    result = r.df                     # returns a pandas DataFrame

    # Arbitrary R expressions
    r.eval("model <- lm(y ~ x, data=df)")
    coefs = r.coef(r.model)

    # Keyword arguments become named R arguments
    r.seq(1, 10, by=2)               # [1, 3, 5, 7, 9]
```

---

## 📖 API reference

### `RBridge(...)` — constructor

| Parameter | Default | Description |
|---|---|---|
| `r_executable` | `None` | Path to `Rscript`; auto-detected from `PATH` if omitted |
| `startup_timeout` | `15.0` | Seconds to wait for R's ready signal |
| `call_timeout` | `None` | Per-call timeout in seconds (`None` = no timeout) |
| `env` | `{}` | Extra environment variables for the R subprocess |
| `r_libs` | `[]` | Paths prepended to `R_LIBS_USER` |
| `log_level` | `"WARNING"` | Python logging level for the `r_bridge` logger |
| `verbose` | `False` | Print all R calls and R output to stderr in real time |
| `capture_output` | `False` | Accumulate non-protocol R stdout (from `cat()` etc.) and print to stdout; access via `last_stdout` |

### Attribute access (primary interface)

```python
r.x = value          # set a variable in R's global environment
value = r.x          # get a variable from R
r.mean([1, 2, 3])    # call an R function; returns a Python value
```

### Explicit methods

```python
r.set("x", value)                          # set with optional type_hint
r.get("x", result_type_hint="scalar")      # get with type conversion hint
r.call("mean", [1,2,3], timeout=5.0)       # call with per-call timeout
r.eval("x <- 1 + 1", result_type_hint=None)  # evaluate an arbitrary R expression
r.ls()                                     # list names in R global env
r.ping()                                   # round-trip latency in seconds
r.last_stderr                              # accumulated R stderr as string
r.last_stdout                              # accumulated non-protocol R stdout as string
```

### `result_type_hint` values

| Hint | Effect |
|---|---|
| `"scalar"` | Return first element as a Python scalar |
| `"numpy"` | Return as `numpy.ndarray` |
| `"pandas"` | Return as `pandas.Series` or `DataFrame` |
| `"list"` | Return as Python `list` |
| `"raw"` | Return the raw tagged JSON dict, no conversion |
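
To make the table concrete, here is a minimal sketch of how three of the hints could act on a tagged value. The function `apply_hint` is hypothetical and is not r_bridge's implementation; the tagged dict shape is borrowed from the verbose-mode trace, and the `"numpy"` and `"pandas"` hints are left out to keep the sketch dependency-free.

```python
def apply_hint(tagged, hint=None):
    """Hypothetical sketch of three hints; not r_bridge's real code."""
    if hint == "raw":
        return tagged  # hand back the tagged dict untouched
    value = tagged["value"]
    if hint == "scalar":
        # First element of a vector, or the value itself if already scalar.
        return value[0] if isinstance(value, list) else value
    if hint == "list":
        return list(value) if isinstance(value, list) else [value]
    return value  # no hint: assume the payload is returned as-is


tagged = {"__type__": "integer_vector", "value": [1, 2, 3]}
for hint in ("scalar", "list", "raw"):
    print(hint, "->", apply_hint(tagged, hint))
```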

---

## 🔀 Type conversion

| Python | R |
|---|---|
| `int`, `float`, `bool`, `str`, `None` | scalar atomic / `NULL` |
| `list` (homogeneous) | atomic vector |
| `dict` | named `list()` |
| `numpy.ndarray` 1-D | atomic vector |
| `numpy.ndarray` 2-D | `matrix` |
| `pandas.DataFrame` | `data.frame` |
| `pandas.Categorical` | `factor` |
| `datetime.datetime` | `POSIXct` |
| `datetime.date` | `Date` |

↕️ All conversions are **bidirectional**. Special R values (`NA`, `NaN`, `Inf`) and structured types (`factor`, `matrix`, …) round-trip correctly.

---

## 🔍 Verbose mode

Set `verbose=True` to trace every R call and see R's output in real time — great for debugging.

```python
with RBridge(verbose=True) as r:
    r.x = [1, 2, 3]
    r.mean(r.x)
```

```
[R set]    x = {'__type__': 'integer_vector', 'value': [1, 2, 3]}
[R call]   mean([...])
[R stdout] ...    ← from cat() in R
[R stderr] ...    ← from message() or warnings in R
```

---

## ⚠️ Error handling

```python
from r_bridge import RBridge
from r_bridge.exceptions import RError, RTimeoutError

with RBridge() as r:
    try:
        r.eval("stop('something went wrong')")
    except RError as e:
        print(e.message)    # "something went wrong"
        print(e.warnings)   # list of captured R warnings
        print(e.traceback)  # R traceback as list of strings

    try:
        r.eval("Sys.sleep(99)", timeout=1.0)
    except RTimeoutError:
        print("R took too long")

### Exception hierarchy

```
RBridgeError          base class
├── RStartupError     R failed to start
├── RTimeoutError     call exceeded timeout
├── RError            R-side error (carries .message, .call, .traceback, .warnings)
└── RProtocolError    malformed response envelope
```
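
Because every exception derives from `RBridgeError`, callers can handle specific failures first and fall back to the base class. The sketch below uses local stand-in classes that only mirror the tree above, so it runs without R; the helper `classify` and its suggested recovery actions are hypothetical, not part of r_bridge.

```python
# Stand-in classes mirroring the tree above so the sketch runs without R.
# In real code, import the exceptions from r_bridge.exceptions instead.
class RBridgeError(Exception): ...
class RStartupError(RBridgeError): ...
class RTimeoutError(RBridgeError): ...
class RError(RBridgeError): ...
class RProtocolError(RBridgeError): ...


def classify(exc):
    """Hypothetical helper: map a failure to a coarse recovery action."""
    if isinstance(exc, RError):
        return "fix the R code"               # R itself reported an error
    if isinstance(exc, RTimeoutError):
        return "retry with a longer timeout"
    if isinstance(exc, RBridgeError):
        return "restart the bridge"           # startup or protocol failure
    return "re-raise"                         # not an r_bridge failure


failures = (RError(), RTimeoutError(), RStartupError(), ValueError())
actions = [classify(e) for e in failures]
print(actions)
```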

---

## 🔌 Custom type converters

Register serializers and deserializers without modifying the library:

```python
from r_bridge import register_serializer, register_deserializer

# MyType is a placeholder for your own class with to_dict() / from_dict().
register_serializer(MyType, lambda obj: {"__type__": "mytype", "value": obj.to_dict()})
register_deserializer("mytype", lambda d: MyType.from_dict(d["value"]))
```
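
A converter pair can be checked on its own before it is registered. The following sketch assumes a hypothetical `Interval` class in place of `MyType` and only exercises the Python side; how the R side should treat the `"interval"` tag is not covered here.

```python
from dataclasses import asdict, dataclass


@dataclass
class Interval:
    """Hypothetical user type standing in for MyType."""
    low: float
    high: float

    def to_dict(self):
        return asdict(self)

    @classmethod
    def from_dict(cls, d):
        return cls(**d)


def serialize_interval(obj):
    # Tag the payload so the matching deserializer can be found by name.
    return {"__type__": "interval", "value": obj.to_dict()}


def deserialize_interval(d):
    return Interval.from_dict(d["value"])


original = Interval(0.5, 2.0)
tagged = serialize_interval(original)
restored = deserialize_interval(tagged)
print(tagged, restored == original)

# With r_bridge installed, the pair would then be registered like this:
# register_serializer(Interval, serialize_interval)
# register_deserializer("interval", deserialize_interval)
```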

---

## ⚙️ How it works

r_bridge spawns `Rscript` once and communicates over **stdin/stdout** using
line-delimited JSON with a `__RBRIDGE__:` sentinel prefix on every protocol
line. Stray `cat()` or `print()` output is captured and logged rather than
corrupting the stream.

```
🐍 Python ──► __RBRIDGE__:{"id":"…","op":"call_func","payload":{…}}
📊 R      ──► __RBRIDGE__:{"id":"…","status":"ok","payload":{…},"warnings":[]}
```
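
As a rough illustration of the sentinel idea (not r_bridge's actual reader), incoming lines can be split into protocol envelopes and stray output by checking the prefix. The helper name `split_stream` and the sample lines, including the payload contents, are made up for this sketch; only the `__RBRIDGE__:` prefix and the envelope fields come from the description above.

```python
import json

SENTINEL = "__RBRIDGE__:"  # prefix taken from the protocol description


def split_stream(lines):
    """Separate protocol envelopes from stray output (hypothetical helper)."""
    messages, stray = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(SENTINEL):
            # Everything after the sentinel is one JSON document per line.
            messages.append(json.loads(line[len(SENTINEL):]))
        else:
            # For example output from cat() or print(); keep it for logging.
            stray.append(line)
    return messages, stray


sample = [
    "fitting model...\n",
    '__RBRIDGE__:{"id":"1","status":"ok","payload":{"value":3.0},"warnings":[]}\n',
]
messages, stray = split_stream(sample)
print(messages[0]["status"], stray)
```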

A dedicated daemon thread drains R's stderr continuously to prevent
pipe-buffer deadlocks. A single `threading.Lock` serialises concurrent Python
calls.
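
The two mechanisms can be sketched with the standard library alone. In this illustration a Python child process stands in for R (an assumption made so the example runs anywhere): it writes a large burst to stderr and then echoes stdin lines in upper case. The names `drain_stderr` and `request` are hypothetical and do not reflect r_bridge's internals.

```python
import subprocess
import sys
import threading

# Stand-in for R: write a lot to stderr, then echo stdin lines upper-cased.
CHILD = (
    "import sys\n"
    "sys.stderr.write('x' * 200000 + '\\n')\n"
    "sys.stderr.flush()\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    text=True,
)

stderr_chunks = []


def drain_stderr():
    # Read until EOF so the child never blocks on a full stderr pipe.
    for line in proc.stderr:
        stderr_chunks.append(line)


drainer = threading.Thread(target=drain_stderr, daemon=True)
drainer.start()

lock = threading.Lock()
replies = []


def request(text):
    # The lock keeps each write/read exchange atomic across threads.
    with lock:
        proc.stdin.write(text + "\n")
        proc.stdin.flush()
        replies.append(proc.stdout.readline().strip())


workers = [threading.Thread(target=request, args=(f"msg{i}",)) for i in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

proc.stdin.close()  # EOF on stdin lets the child exit
proc.wait()
drainer.join()
print(sorted(replies), sum(len(chunk) for chunk in stderr_chunks))
```

Without the drain thread, the child could block while writing to a full stderr pipe and never reach its stdin loop, which is the deadlock the paragraph above describes.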

---

## 🧪 Development

```bash
uv run pytest                          # all tests
uv run pytest tests/test_bridge.py -v  # integration tests (spawns real R)
uv run pytest tests/test_serializer.py # unit tests (no R needed)
```

---

## 📄 License

MIT
