Metadata-Version: 2.4
Name: safepyrun
Version: 0.1.4
Summary: Safe(ish) running of python code
Author-email: Jeremy Howard <github@jhoward.fastmail.fm>
License: Apache-2.0
Project-URL: Repository, https://github.com/AnswerDotAI/safepyrun
Project-URL: Documentation, https://AnswerDotAI.github.io/safepyrun/
Keywords: nbdev
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: restrictedpython-async
Requires-Dist: restrictedpython
Requires-Dist: fastaudit>=0.2.1
Requires-Dist: pyskills>=0.0.9
Requires-Dist: fastcore>=1.12.27
Requires-Dist: httpx
Requires-Dist: matplotlib>=3.10.8
Provides-Extra: dev
Requires-Dist: numpy; extra == "dev"
Requires-Dist: matplotlib; extra == "dev"
Dynamic: license-file

# safepyrun



*safepyrun* is an allowlist-based Python sandbox that lets LLMs execute
code safely(ish) in your real environment. Instead of isolating code in
a container (which cuts it off from the libraries, data, and tools it
actually needs), safepyrun runs in-process with controlled access to a
curated subset of Python’s stdlib, plus any functions you explicitly opt
in.

It’s the Python counterpart to
[safecmd](https://github.com/AnswerDotAI/safecmd), which does much the
same thing for bash.

## Installation

Install from [PyPI](https://pypi.org/project/safepyrun/):

``` sh
$ pip install safepyrun
```

## Background

When an LLM needs to run code on your behalf, the standard advice is to
sandbox it in a container. The problem is that the whole reason you want
the LLM running code is so it can interact with your environment – your
files, your libraries, your running processes, your data. A
containerised sandbox either can’t access any of that, or it requires
complex volume mounts and dependency mirroring that recreate your
environment inside the container.

You could just `exec` the LLM’s code directly in your process, which
would give full access to everything… but “everything” includes
[`shutil.rmtree`](https://docs.python.org/3/library/shutil.html#shutil.rmtree),
[`os.remove`](https://docs.python.org/3/library/os.html#os.remove),
`subprocess.run("rm -rf /")`, etc!

safepyrun takes a middle path. It runs the LLM’s code in your real
Python process, with access to your real objects, but interposes an
allowlist that controls which callables are accessible. The curated
default list covers a large and useful subset of the standard library
(string manipulation, math, JSON parsing, path inspection, data
structures, and so on) while excluding anything that spawns processes or
modifies system state, and restricting filesystem writes to a small set
of permitted directories (by default the current working directory and
`/tmp`). You can extend the list for your own functions.

The mechanism behind safepyrun is
[RestrictedPython](https://restrictedpython.readthedocs.io/), a
long-standing project that compiles Python source code into a modified
AST (Abstract Syntax Tree) where every attribute access, item access,
and iteration is routed through hook functions. This means that when the
LLM’s code does `obj.method()`, it doesn’t go directly to `method` – it
goes through a gatekeeper that checks whether that callable is on the
allowlist. The same applies to `getattr`, `getitem`, and `iter`, so
there’s no easy way to accidentally reach a dangerous function through
indirect access. safepyrun supplies these hook functions, wiring them up
to an allowlist of permitted callables.
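
To make the rewriting step concrete, here is a minimal stdlib-only sketch of the idea. It is not how RestrictedPython or safepyrun is actually implemented: `ALLOWED`, `guarded_getattr`, `RouteAttrs`, and `run_guarded` are hypothetical names, and the sketch only handles attribute reads inside a single expression. It borrows the `_getattr_` hook name that RestrictedPython uses.

```python
import ast

ALLOWED = {"upper", "split"}  # hypothetical allowlist of attribute names

def guarded_getattr(obj, name):
    "Hook: every attribute read in the rewritten code passes through here"
    if name not in ALLOWED: raise PermissionError(f"{name!r} is not on the allowlist")
    return getattr(obj, name)

class RouteAttrs(ast.NodeTransformer):
    "Rewrite `obj.attr` (load context only) into `_getattr_(obj, 'attr')`"
    def visit_Attribute(self, node):
        self.generic_visit(node)  # rewrite nested attribute accesses first
        if not isinstance(node.ctx, ast.Load): return node
        call = ast.Call(func=ast.Name(id="_getattr_", ctx=ast.Load()),
                        args=[node.value, ast.Constant(value=node.attr)], keywords=[])
        return ast.copy_location(call, node)

def run_guarded(src):
    "Parse one expression, route its attribute reads through the hook, evaluate it"
    tree = ast.fix_missing_locations(RouteAttrs().visit(ast.parse(src, mode="eval")))
    return eval(compile(tree, "<guarded>", "eval"), {"_getattr_": guarded_getattr})
```

With this sketch, `run_guarded('"a b".upper()')` evaluates normally, while `run_guarded('"a b".encode()')` raises `PermissionError` because `encode` is not in the hypothetical allowlist.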

Because a lot of modern Python code (and many LLM tool-calling
frameworks) is async, safepyrun also depends on
[restrictedpython-async](https://github.com/AnswerDotAI/restrictedpython-async),
which extends RestrictedPython to handle `await`, `async for`, and
`async with` expressions.

A lot of the online discussion around RestrictedPython suggests it’s not
really useful for sandboxing, and that’s true if you’re trying to block
a determined adversary. But an LLM is not a determined adversary. It’s a
well-meaning but occasionally clumsy collaborator. The threat model is
completely different: you don’t need to prevent deliberate escape
attempts, you need to make it very unlikely that a hallucinated cleanup
step or a misunderstood request causes damage. This is the same
“safe-ish” philosophy used in
[safecmd](https://github.com/AnswerDotAI/safecmd) for bash.

Once you internalise this, the design space opens up. It’s actually fine
for the LLM to read files, access the internet via `httpx`, parse data,
and call into your libraries. The things you want to prevent are writes
outside the directories you have permitted, spawning processes, and
overwriting important state. RestrictedPython gives us the mechanism to
enforce this: because attribute access, iteration, and item access are
all intercepted, every callable goes through an allowlist check.

The allowlist has two tiers. First, a curated subset of the standard
library that has been audited once so every user doesn’t have to repeat
the work: things like `re`, `json`, `itertools`, `math`, `collections`,
`pathlib` (read-only methods), and many more. Second, user-extended
functions registered via `allow()`, so you can opt in your own project’s
functions and methods. Symbols the LLM creates are exported back to the
caller’s namespace by default, unless they would shadow an existing
callable or module. Names ending with `_` (like `result_`) are always
exported, even if they shadow. Exported callables must still be
registered with `allow()` to be callable in subsequent sandbox calls.

## Usage

``` python
from safepyrun import *
from pyskills import *
```

The main entry point is `RunPython()`. Calling it returns an async
function, named `pyrun` in the examples below, that takes a string of
Python code and executes it in the sandbox. The last expression in the
code is returned as the result, and any `print()` output is captured
separately. Errors are caught and reported rather than crashing the
caller.

``` python
pyrun = RunPython()
```

``` python
await pyrun('1+1')
```

    2

You can mix `print()` output with a return value. The printed output
goes to the `stdout` key, and the last expression becomes `result`:

``` python
await pyrun('print("hello"); 1+1')
```

    hello

    2
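
The "last expression is the result, `print()` output is captured" behaviour can be approximated with the standard library alone. The sketch below illustrates the general technique and is not safepyrun's code: `run_last_expr` is a hypothetical helper, it is synchronous, and it applies no sandboxing at all.

```python
import ast, io
from contextlib import redirect_stdout

def run_last_expr(src, ns=None):
    "Exec `src`; return (value of a trailing expression or None, captured stdout)"
    ns = {} if ns is None else ns
    tree = ast.parse(src)
    # Split off the final statement if it is a bare expression
    last = tree.body.pop() if tree.body and isinstance(tree.body[-1], ast.Expr) else None
    buf = io.StringIO()
    with redirect_stdout(buf):
        exec(compile(tree, "<sketch>", "exec"), ns)  # everything before the last expression
        res = eval(compile(ast.Expression(last.value), "<sketch>", "eval"), ns) if last else None
    return res, buf.getvalue()

print(run_last_expr('print("hello"); 1+1'))  # (2, 'hello\n')
```

Splitting at the AST level, rather than on text, means multi-line final expressions and semicolon-separated statements are handled the same way.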

Modules can be imported, and `stderr` is also captured:

``` python
await pyrun('''
import warnings
warnings.warn('a warning')
"ok"
''')
```

    <pyrun_3>:2: UserWarning: a warning
      warnings.warn('a warning')

    'ok'

A large subset of the standard library is available out of the box –
things like `re`, `json`, `math`, `itertools`, `collections`, `pathlib`
(read-only methods), and many more. These have been audited once so that
every user doesn’t have to repeat the work:

``` python
await pyrun('import re; re.findall(r"\\d+", "there are 3 cats and 10 dogs")')
```

    ['3', '10']

The default allowlist covers text and data processing (`re`, `json`,
`csv`, `html`, `textwrap`, `string`, `difflib`, `unicodedata`), math and
numerics (`math`, `cmath`, `statistics`, `decimal`, `fractions`,
`random`, `operator`), data structures (`collections`, `heapq`,
`bisect`, plus methods on all the built-in types), iteration and
functional tools (`itertools`, `functools`), read-only filesystem access
(`pathlib`,
[`os.path`](https://docs.python.org/3/library/os.path.html#module-os.path),
`fnmatch`), date and time (`datetime`, `time`), URL handling and
read-only HTTP
([`urllib.parse`](https://docs.python.org/3/library/urllib.parse.html#module-urllib.parse),
`httpx.get`, `ipaddress`), encoding and serialization (`base64`,
`binascii`, `hashlib`, `zlib`, `pickle`, `struct`), introspection
(`inspect`, `ast`, `keyword`,
[`sys.getsizeof`](https://docs.python.org/3/library/sys.html#sys.getsizeof)),
XML parsing
([`xml.etree.ElementTree`](https://docs.python.org/3/library/xml.etree.elementtree.html#module-xml.etree.ElementTree)),
and various utilities (`contextlib`, `copy`, `dataclasses`, `enum`,
`secrets`, `uuid`, `pprint`, `shlex`, `colorsys`, `traceback`).

### The `allow()` function

Functions you define yourself or import from third-party packages are
not automatically available. If the sandbox encounters an unregistered
callable, it raises an error.

To make a function available, register it with `allow()`:

``` python
def greet(name): return f"Hello, {name}!"
```

``` python
allow(greet) # Or use @allow decorator
await pyrun('greet("World")')
```

    'Hello, World!'

The same applies to anything you import from PyPI. For instance, if you
wanted the LLM to be able to call
[`numpy.array`](https://numpy.org/doc/stable/reference/generated/numpy.array.html#numpy.array),
you would register it with `allow('numpy.array')`.

`allow()` accepts callables, strings, and dicts. The simplest form is a
single function object or a bare string naming it, which registers one
name. `allow` also works as a decorator, which suits standalone
functions in the caller’s namespace:

``` python
@allow
def double(x): return x * 2
await pyrun('double(21)')
```

    42

For methods on modules or classes, pass the callable itself, as below,
or use dotted string syntax. A string should match how the sandbox will
look up the callable, which is `ClassName.method` or `module.function`:

``` python
import numpy as np
```

``` python
allow(np.array, np.ndarray.sum)
await pyrun('np.array([1,2,3]).sum()')
```

    np.int64(6)

Note that a string entry must use the actual class or module name as it
appears in Python, not the alias. In the example above, even though the
sandbox code uses `np`, the equivalent string entry is `'numpy.array'`
because `numpy` is the module’s real name.

The dict form is a convenient shorthand for registering multiple methods
on the same module or class at once. The key is the actual module or
class object, and the value is a list of method name strings:

``` python
allow({np.ndarray: ['mean', 'reshape', 'tolist']})
await pyrun('np.array([1,2,3,4]).reshape(2,2).mean()')
```

    np.float64(2.5)

The dict form does two things: it registers the class/module name itself
(so it can be called as a constructor or accessed as a namespace), and
it registers each `ClassName.method` pair. You can mix strings and dicts
in a single `allow()` call:

``` python
allow('my_func', {np.linalg: ['norm', 'det']})
```

### The `_` suffix export convention

All symbols created in the sandbox are exported back to the caller’s
namespace by default — unless the name already exists and the new value
is callable or a module (to prevent accidental shadowing). Names ending
with `_` (but not starting with `_`) are always exported regardless,
even if they shadow. Note that exported callables are **not**
automatically available to call in subsequent sandbox runs — they must
still be registered with `allow()` to be callable. Non-callable exports
(variables, data structures) are available immediately:

``` python
await pyrun('result_ = [x**2 for x in range(5)]')
```

``` python
result_
```

    [0, 1, 4, 9, 16]

The exported symbols are real objects in your namespace:

``` python
await pyrun('counts_ = {"a": 1, "b": 2}')
counts_
```

    {'a': 1, 'b': 2}

This is particularly useful in LLM tool loops where the model might need
to accumulate results across steps. The `_` suffix is only needed when
you want to force-export a name that would otherwise be blocked (because
it shadows an existing callable or module).
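
The export rule can be summarised as a small predicate. This is a sketch of the rule as described in this section, not safepyrun's implementation: `should_export` is a hypothetical helper, and it assumes that names starting with `_` are treated like any other name apart from losing the force-export behaviour.

```python
import types

def should_export(name, value, caller_ns):
    "Hypothetical helper: decide whether a sandbox symbol is copied to the caller"
    # Force-export: the name ends with `_` but does not start with `_`
    if name.endswith("_") and not name.startswith("_"): return True
    # Otherwise block only when an existing name would be replaced by a callable or module
    shadows = name in caller_ns
    return not (shadows and (callable(value) or isinstance(value, types.ModuleType)))

caller = {"len": len, "data": [1, 2]}
print(should_export("total", 10, caller))          # True: new name
print(should_export("data", [3], caller))          # True: existing name, plain value
print(should_export("len", lambda x: 0, caller))   # False: callable would shadow `len`
print(should_export("len_", lambda x: 0, caller))  # True: `_` suffix forces export
```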

### Async support

The sandbox is async-native. If the code being executed contains
`await`, `async for`, or `async with` expressions, they work as
expected. Many modern Python libraries and LLM tool-calling frameworks
are async, and you want the sandbox to be able to call into them without
workarounds.

``` python
await pyrun('''
import asyncio
async def fetch(n): return n * 10
await asyncio.gather(fetch(1), fetch(2), fetch(3))
''')
```

    [10, 20, 30]
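
One common stdlib way to run source that contains top-level `await` is to compile it with `ast.PyCF_ALLOW_TOP_LEVEL_AWAIT`. The sketch below shows that technique in isolation. Whether safepyrun or restrictedpython-async works this way internally is an assumption, `run_async` is a hypothetical helper, and no sandboxing is applied.

```python
import ast, asyncio, builtins, inspect, types

async def run_async(src, ns=None):
    "Hypothetical helper: run `src`, awaiting it if it uses top-level `await`"
    ns = {} if ns is None else ns
    ns.setdefault("__builtins__", builtins)
    # The flag lets `await` appear outside an `async def`
    code = compile(src, "<sketch>", "exec", flags=ast.PyCF_ALLOW_TOP_LEVEL_AWAIT)
    out = types.FunctionType(code, ns)()  # a coroutine only if `src` awaits at top level
    if inspect.iscoroutine(out): await out
    return ns

src = '''
import asyncio
async def fetch(n): return n * 10
res = await asyncio.gather(fetch(1), fetch(2), fetch(3))
'''
ns = asyncio.run(run_async(src))
print(ns["res"])  # [10, 20, 30]
```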

## Writable path permissions

By default,
[`RunPython()`](https://AnswerDotAI.github.io/safepyrun/core.html#runpython)
allows writes to the current working directory (`.`) and `/tmp`, and
blocks writes elsewhere. You can pass `ok_dests` to restrict writes to a
different set of directory prefixes:

``` python
pyrun2 = RunPython(ok_dests=['/tmp'])
```

``` python
from pathlib import Path
```

``` python
await pyrun2("Path('/tmp/test_write.txt').write_text('hello')")
```

    5

``` python
try: await pyrun2("Path('/etc/evil.txt').write_text('bad')")
except PermissionError as e: print(f'Blocked: {e}')
```

    Blocked: Dest '/etc/evil.txt' not allowed; permitted: ('/tmp',)

The same permission checking applies to `open()` in write mode, not just
`Path` methods:

``` python
await pyrun2("open('/tmp/test_open.txt', 'w').write('hi')")
```

    2

``` python
try: await pyrun2("open('/root/bad.txt', 'w')")
except PermissionError as e: print(f'Blocked: {e}')
```

    Blocked: Dest '/root/bad.txt' not allowed; permitted: ('/tmp',)
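
The distinction between read and write modes can be sketched with a small predicate. This is a hypothetical helper for illustration (`is_write_mode` is not part of safepyrun), and it assumes that any mode containing `w`, `a`, `x`, or `+` counts as writable:

```python
def is_write_mode(mode="r"):
    "Hypothetical check: True if an `open()` mode string can modify the file"
    return any(c in mode for c in "wax+")

for m in ["r", "rb", "w", "a", "x", "r+"]:
    print(m, is_write_mode(m))
```

Only the modes for which such a check returns `True` would need their destination validated.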

Read access is unaffected — only writes are gated:

``` python
await pyrun2("open('/etc/passwd', 'r').read(10)")
```

    '##\n# User '

Higher-level file operations like
[`shutil.copy`](https://docs.python.org/3/library/shutil.html#shutil.copy)
are also intercepted. The destination is checked against `ok_dests`:

``` python
await pyrun2("import shutil; shutil.copy('/tmp/test_write.txt', '/tmp/test_copy.txt')")
```

    '/tmp/test_copy.txt'

``` python
try: await pyrun2("import shutil; shutil.copy('/tmp/test_write.txt', '/root/bad.txt')")
except PermissionError as e: print(f'Blocked: {e}')
```

    Blocked: Dest '/root/bad.txt' not allowed; permitted: ('/tmp',)

The `pyrun` instance created earlier was built with no arguments, so
[`RunPython()`](https://AnswerDotAI.github.io/safepyrun/core.html#runpython)
used `default_ok_dests`: writes to `.` and `/tmp` succeed, and writes
elsewhere are blocked.

``` python
await pyrun("Path('test_default_ok.txt').write_text('ok')")
await pyrun("Path('/tmp/test_default_tmp.txt').write_text('tmp')")

try: await pyrun("Path('/etc/nope.txt').write_text('bad')")
except PermissionError as e: print(f'Default blocked: {e}')
```

    Default blocked: Dest '/etc/nope.txt' not allowed; permitted: ('.', '/tmp')

If you want to disable write protection entirely, pass `ok_dests=None`:

``` python
pyrun_unrestricted = RunPython(ok_dests=None)
unrestricted_path = Path.home()/'safepyrun-unrestricted.txt'
await pyrun_unrestricted(f"Path({str(unrestricted_path)!r}).write_text('ok')")
```

    2

You can use `'.'` to allow writes relative to the current working
directory. Path traversal attempts (`../`, `subdir/../../`) are detected
and blocked, so the sandbox can’t escape the permitted directory:

``` python
pyrun_cwd = RunPython(ok_dests=['.'])

# Writing to cwd should work
await pyrun_cwd("Path('test_cwd_ok.txt').write_text('hello')")
```

    5

``` python
Path('test_cwd_ok.txt').unlink(missing_ok=True)
```

Writing to `/tmp` is blocked here since it’s not in `ok_dests`:

``` python
try: await pyrun_cwd("Path('/tmp/nope.txt').write_text('bad')")
except PermissionError: print("Blocked /tmp as expected")
```

    Blocked /tmp as expected

Parent traversal is blocked if it resolves to a location outside
`ok_dests`:

``` python
try: await pyrun_cwd("Path('../escape.txt').write_text('bad')")
except PermissionError: print("Blocked ../ as expected")
```

    Blocked ../ as expected
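
A traversal-safe prefix check of this kind can be sketched with `pathlib`. The helper below is hypothetical and is not safepyrun's `chk_dest`. It assumes the check works by resolving both the destination and each permitted prefix to absolute paths before comparing them.

```python
from pathlib import Path

def is_within(dest, ok_dests):
    "Hypothetical check: True if `dest` resolves inside one of `ok_dests`"
    # resolve() makes the path absolute and collapses `..` segments and symlinks
    target = Path(dest).resolve()
    return any(target.is_relative_to(Path(d).resolve()) for d in ok_dests)

print(is_within("/tmp/a.txt", ["/tmp"]))                # True
print(is_within("/tmp/sub/../../etc/x.txt", ["/tmp"]))  # False
```

Comparing resolved paths rather than raw strings is what stops `subdir/../../` style paths from passing a simple prefix test.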

### Write policies

When `ok_dests` is set, safepyrun uses write policies to determine how
to validate each callable’s destination arguments. Three built-in policy
classes cover common patterns: checking a positional or keyword argument
(`PosAllowPolicy`), checking the `Path` object itself
(`PathWritePolicy`), and checking `open()` calls only when the mode is
writable (`OpenWritePolicy`). You can also subclass `AllowPolicy` to
create custom checks.

The simplest, `PosAllowPolicy`, checks a specific positional or keyword
argument against the allowed destinations. Here, position 1 (or keyword
`dst`) is validated — writing to `/tmp` is allowed, but `/root` is
blocked:

``` python
pp = PosAllowPolicy(1, 'dst')
pp(None, ['src', '/tmp/ok'], {}, ['/tmp'])
try: pp(None, ['src', '/root/bad'], {}, ['/tmp'])
except PermissionError: print("PosAllowPolicy correctly blocked /root/bad")
```

    PosAllowPolicy correctly blocked /root/bad

You can create custom write policies by subclassing `AllowPolicy` and
implementing `__call__`. For example, here we show a policy that only
allows writes to files with specific extensions — useful if you want the
LLM to create `.csv` or `.json` files but not arbitrary scripts.

The `__call__` signature receives `(obj, args, kwargs, ok_dests)` where
`obj` is the object the method is called on (e.g. a `Path` instance),
`args`/`kwargs` are the method’s arguments, and `ok_dests` is the list
of permitted directory prefixes. Calling `chk_dest` first handles the
directory check, then the custom logic adds the extension constraint on
top.

``` python
class ExtWritePolicy(AllowPolicy):
    "Only allow writes to paths with specified extensions"
    def __init__(self, exts): self.exts = set(exts)
    def __call__(self, obj, args, kwargs, ok_dests):
        chk_dest(obj, ok_dests)  # directory check first
        suffix = Path(str(obj)).suffix
        if suffix not in self.exts: raise PermissionError(f"{suffix!r} not allowed")

``` python
ep = ExtWritePolicy(['.csv', '.json'])
ep(Path('/tmp/data.csv'), [], {}, ['/tmp'])
try: ep(Path('/tmp/script.sh'), [], {}, ['/tmp'])
except PermissionError: print("ExtWritePolicy correctly blocked .sh")
```

    ExtWritePolicy correctly blocked .sh

You can register it with `allow` just like the built-in policies:

``` python
allow({Path: [('write_text', ExtWritePolicy(['.csv', '.json', '.txt']))]})
```

## Configuration

`safepyrun` loads an optional user config from
`{xdg_config_home}/safepyrun/config.py` at import time, after all
defaults are registered. This lets you permanently extend the sandbox
allowlists without modifying the package. The config file is executed
with all `safepyrun.core` globals already available, so nothing from
safepyrun needs to be imported (third-party packages such as `pandas`
still do). The available names include `allow`,
[`allow_write_types`](https://AnswerDotAI.github.io/safepyrun/core.html#allow_write_types),
`AllowPolicy`, `PathWritePolicy`, `PosAllowPolicy`, `OpenWritePolicy`,
and all standard library modules already imported by the module.

Example `~/.config/safepyrun/config.py` (Linux) or
`~/Library/Application Support/safepyrun/config.py` (macOS):

``` python
import pandas

# Add pandas tools
allow({pandas.DataFrame: ['head', 'describe', 'info', 'shape']})

# Allow DataFrame.to_csv; its destination (position 0 or `path_or_buf`) is checked against ok_dests
allow({pandas.DataFrame: [('to_csv', PosAllowPolicy(0, 'path_or_buf'))]})
```

If the config file has errors, a warning is emitted and the defaults
remain intact.
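
The load-and-warn behaviour can be sketched as follows. This illustrates the pattern and is not safepyrun's loader: `load_user_config` is a hypothetical name, and in this sketch a config file that fails halfway through may already have changed `ns` before the error.

```python
import tempfile, warnings
from pathlib import Path

def load_user_config(path, ns):
    "Hypothetical loader: run `path` with `ns` as its globals; warn instead of raising"
    path = Path(path)
    if not path.exists(): return False  # the config file is optional
    try: exec(compile(path.read_text(), str(path), "exec"), ns)
    except Exception as e:
        warnings.warn(f"Error loading {path}: {e}")
        return False
    return True

# Usage: the config sees the names in `ns` without importing them
cfg = Path(tempfile.mkdtemp())/"config.py"
cfg.write_text("registered.append('my_func')")
ns = {"registered": []}
print(load_user_config(cfg, ns), ns["registered"])  # True ['my_func']
```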

## CLI

safepyrun ships with a command-line tool that runs a Python script file
in the sandbox. You can pass a file path, or pipe code in via stdin:

``` sh
# Run a script file
$ safepyrun myscript.py

# Pipe code via stdin
$ echo "1+1" | safepyrun
```

The result of the last expression is printed to stdout, matching the
behaviour of `pyrun` in Python. Errors are reported to stderr.
