Metadata-Version: 2.4
Name: nbpipes
Version: 0.0.4
Summary: Powerful pipeline syntax for IPython and Jupyter
Home-page: https://github.com/smacke/nbpipes
Author: Stephen Macke
Author-email: stephen.macke@gmail.com
License: BSD-3-Clause
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8
License-File: docs/LICENSE.txt
Requires-Dist: ipyflow-core>=0.0.221
Requires-Dist: pyccolo>=0.0.77
Provides-Extra: test
Requires-Dist: black; extra == "test"
Requires-Dist: hypothesis; extra == "test"
Requires-Dist: isort; extra == "test"
Requires-Dist: mypy; extra == "test"
Requires-Dist: pytest; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Requires-Dist: ruff; extra == "test"
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: pycln; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: versioneer; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: hypothesis; extra == "dev"
Requires-Dist: isort; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Dynamic: license-file

nbpipes
=======

[![CI Status](https://github.com/smacke/nbpipes/workflows/nbpipes/badge.svg)](https://github.com/smacke/nbpipes/actions)
[![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)
[![License: BSD3](https://img.shields.io/badge/License-BSD3-maroon.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![Python Versions](https://img.shields.io/pypi/pyversions/nbpipes.svg)](https://pypi.org/project/nbpipes)
[![PyPI Version](https://img.shields.io/pypi/v/nbpipes.svg)](https://pypi.org/project/nbpipes)

nbpipes is an IPython extension that brings a pipe operator `|>` and
powerful placeholder and macro expansion syntax extensions to IPython and Jupyter.

If you're familiar with the [magrittr](https://magrittr.tidyverse.org/) package
for R, then you'll be right at home with nbpipes.


## Getting Started

Run the following in IPython or Jupyter to install nbpipes and load
the extension:

```python
%pip install nbpipes
%load_ext nbpipes
```

The `%load_ext nbpipes` invocation is what enables the new pipe syntax
in your current session.

## Features by Example

Let's look at a few examples to give a flavor of what you can do with nbpipes:

```python
# Render a sorted version of a tuple
>>> tup = (3, 4, 1, 5, 6)
>>> tup |> sorted |> tuple
(1, 3, 4, 5, 6)
```
The above example showcases the `|>`, or "pipe", operator, which is a much-loved
feature of functional programming that has become increasingly mainstream. Its
primary benefit is that the flow of execution follows natural left-to-right
reading / writing order of the code. Whether or not such pipeline syntax is
available, it's not uncommon for programmers to execute pipelines like the above
multiple times during development to verify the computation at each step, particularly in
interactive programming environments like Jupyter. With `|>`, this type of
incremental verification becomes a breeze: first execute `tup |> sorted`, then
append ` |> tuple` to execute the full chain `tup |> sorted |> tuple`, each time
using the last-expression rendering capabilities of the notebook or REPL to
inspect and verify the result.

### Placeholders

The power of the `|>` operator is amplified via placeholder syntax for implicit
function construction: for nbpipes, we use `$` to stand in for function arguments
and induce function creation:

```python
# Sort a list in reverse order
>>> lst = [3, 4, 1, 5, 6]
>>> lst |> sorted($, reverse=True)
[6, 5, 4, 3, 1]
```

`$` is analogous to magrittr's `.` placeholder. It can also be used outside
of pipeline contexts:

```python
# Sort a list in reverse order and print the result
lst = [3, 4, 1, 5, 6]
reverse_sorter = sorted($, reverse=True)

# The following are equivalent:
print(reverse_sorter(lst))
lst |> reverse_sorter |> print
```

Each time `$` appears, it represents a new argument, so `sorted($, reverse=$)`
represents a function with two arguments:

```python
import random

# Sort a list in either ascending or descending order with probability 0.5:
lst = [3, 4, 1, 5, 6]
sorter = sorted($, reverse=$)
reverse = random.random() < 0.5

# The following are equivalent:
print(sorter(lst, reverse))
lst |> sorter($, reverse) |> print
```
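
For intuition, each placeholder expression above can be read as an ordinary Python lambda with one positional parameter per `$`. The sketch below is plain Python rather than anything nbpipes generates, and the parameter names (`seq`, `rev`) are illustrative assumptions:

```python
lst = [3, 4, 1, 5, 6]

# Roughly what `sorted($, reverse=True)` stands for: one `$`, one parameter.
reverse_sorter = lambda seq: sorted(seq, reverse=True)

# Roughly what `sorted($, reverse=$)` stands for: two `$`, two parameters,
# filled in the order the placeholders appear.
sorter = lambda seq, rev: sorted(seq, reverse=rev)

descending = reverse_sorter(lst)
ascending = sorter(lst, False)
print(descending)  # [6, 5, 4, 3, 1]
print(ascending)   # [1, 3, 4, 5, 6]
```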

Placeholders can appear anywhere -- not just as arguments to function calls:

```python
# Sort a list and find the position of element 3:
>>> lst = [3, 4, 1, 5, 6]
>>> lst |> sorted |> $.index(3)
1
```

### Named Placeholders

There are situations that would benefit from referencing the same placeholder multiple times, for which
nbpipes permits *named placeholders* by prefixing `$` to an identifier:

```python
# Pair even entries from a range with their adjacent odd entry
>>> range(6) |> list |> zip($v[::2], $v[1::2]) |> list
[(0, 1), (2, 3), (4, 5)]
```

In the above example, we could have used any name in place of `$v`; the important
thing is that the same name was used both times -- otherwise nbpipes would have
induced a function with two arguments instead of one.
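
As a plain-Python point of comparison (a sketch of the intended meaning, not nbpipes output), the step `zip($v[::2], $v[1::2])` behaves like a one-parameter lambda in which both occurrences of `$v` refer to the same argument. The name `pair_up` is made up for this illustration:

```python
# Both `$v` occurrences map to the single parameter `v`.
pair_up = lambda v: zip(v[::2], v[1::2])

pairs = list(pair_up(list(range(6))))
print(pairs)  # [(0, 1), (2, 3), (4, 5)]
```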

### Undetermined Pipelines

Similar to magrittr's behavior, if any number of placeholders appear in the first
step of an nbpipes pipeline, this *undetermined pipeline* will represent a function:

```python
>>> second_largest_value = $ |> sorted($, reverse=True) |> $[1]
>>> [3, 8, 6, 5, 1] |> second_largest_value
6
```
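
Read as plain Python, the undetermined pipeline above is roughly a single-argument function that threads its input through each step in turn. This is only a sketch of the behavior described here, with `xs` as an assumed parameter name:

```python
# `$ |> sorted($, reverse=True) |> $[1]` as an ordinary function:
# the leading `$` becomes the parameter, and each later step consumes
# the previous step's result.
def second_largest_value(xs):
    descending = sorted(xs, reverse=True)
    return descending[1]

result = second_largest_value([3, 8, 6, 5, 1])
print(result)  # 6
```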

### Macros and Partial Function Syntax

In some cases, it may be desirable to curry a function with parameters at its start,
akin to the typical usage of `functools.partial`. For example:

```python
>>> add_reducer = reduce(lambda x, y: x + y, $, $)
>>> add_reducer([1, 2, 3], 0)
6
>>> add_reducer([[1, 2, 3], [4, 5, 6]], [])
[1, 2, 3, 4, 5, 6]
```

To avoid writing out a `$` placeholder for each and every tail argument, you can
place a `$` between the function name and its argument list and omit the subsequent
arguments, just like in coconut:

```python
>>> add_reducer = reduce$(lambda x, y: x + y)
>>> add_reducer([1, 2, 3], 0)
6
>>> add_reducer([[1, 2, 3], [4, 5, 6]], [])
[1, 2, 3, 4, 5, 6]
```

Or even more simply, since the induced partial function retains all the same
argument defaults as the original `reduce`, we can omit the base case:

```python
>>> add_reducer = reduce$(lambda x, y: x + y)
>>> add_reducer([1, 2, 3])
6
>>> add_reducer([[1, 2, 3], [4, 5, 6]])
[1, 2, 3, 4, 5, 6]
```
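
For comparison, here is how the same partial application looks with the standard library's `functools.partial`, which the `reduce$(...)` form is described as being akin to. This is plain Python and makes no claim about how nbpipes implements the syntax:

```python
from functools import partial, reduce

# Fix the first argument of `reduce`; the iterable and the optional
# initial value are supplied later.
add_reducer = partial(reduce, lambda x, y: x + y)

total = add_reducer([1, 2, 3])
flattened = add_reducer([[1, 2, 3], [4, 5, 6]], [])
print(total)      # 6
print(flattened)  # [1, 2, 3, 4, 5, 6]
```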

For common functional programming tools like `map`, `reduce`, and `filter`, the above
pattern is so common that nbpipes provides corresponding macros, in which the function used
to curry each higher order function is specified between brackets:

```python
>>> add_reducer = reduce[lambda x, y: x + y]
>>> [1, 2, 3] |> add_reducer
6
>>> [[1, 2, 3], [4, 5, 6]] |> add_reducer
[1, 2, 3, 4, 5, 6]
```

We're still writing out `lambda x, y: x + y`, which is kind of tedious -- for these
kinds of simple lambda constructions, nbpipes provides a *quick lambda macro*, `f`:

```python
>>> add_reducer = reduce[f[$ + $]]
>>> [1, 2, 3] |> add_reducer
6
>>> [[1, 2, 3], [4, 5, 6]] |> add_reducer
[1, 2, 3, 4, 5, 6]
```

`f` can also be used on its own:

```python
>>> f[$ + $](2, 3)
5

>>> f[$a*$b + $b*$c + $a*$c](2, 3, 4)
26
```

Furthermore, nbpipes allows you to omit the `f` from higher order
functional macros, so that you can simply do `add_reducer = reduce[$ + $]` instead.
Here are a couple of nifty constructions utilizing this compact syntax:

```python
# factorial
>>> range(1, 5) |> reduce[$ * $]
24

# compute a number from decimal digits
>>> [2, 3, 4] |> reduce[10*$ + $]
234
```
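
If the compact forms are hard to parse at first, it may help to see the same two computations spelled out with `functools.reduce` and explicit lambdas. The variable names below are illustrative only:

```python
from functools import reduce

# `range(1, 5) |> reduce[$ * $]`: multiply the running product by each item.
factorial_4 = reduce(lambda acc, x: acc * x, range(1, 5))

# `[2, 3, 4] |> reduce[10*$ + $]`: shift the running number one decimal
# place to the left, then add the next digit.
number = reduce(lambda acc, digit: 10 * acc + digit, [2, 3, 4])

print(factorial_4)  # 24
print(number)       # 234
```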

### Additional Pipe Operators

There are a few other variants of the `|>` operator offered by
nbpipes, covered in this section.

#### Assignment Pipe

The *assignment pipe*, `|>>`, writes the left hand side value to the variable
whose name is specified on the right hand side. Furthermore, it evaluates to
the left hand side value. For example:

```python
>>> 2 |> $ + 2 |>> two_plus_two |> $ + 3 |>> two_plus_two_plus_three
7
>>> (two_plus_two, two_plus_two_plus_three)
(4, 7)
```
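
The closest plain-Python analogue is probably the assignment expression (`:=`), which also binds a name and evaluates to the bound value. The following sketch reproduces the example above without nbpipes; the outer `result` name is added only so the final value can be inspected:

```python
# Each walrus binds an intermediate value and passes it along.
result = (two_plus_two_plus_three := (two_plus_two := 2 + 2) + 3)

print(result)                                   # 7
print((two_plus_two, two_plus_two_plus_three))  # (4, 7)
```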

#### Varargs Pipe

The *varargs pipe*, `*|>`, unpacks the iterable on the left hand side before
passing its values as inputs to the function on the right hand side. For
example:

```python
# Add two numbers:
>>> (2, 3) *|> $ + $
5
```

A common pattern is using `*|>` to expand an undetermined pipeline
appearing inside of a `map[...]`:

```python
# Take the product of consecutive pairs of even-odd integers
>>> consecutive_pairs = range(10) |> list |> ($v[::2], $v[1::2]) *|> zip
>>> consecutive_pairs |> map[$ *|> $ * $] |> list
[0, 6, 20, 42, 72]
```

#### Function Pipe

The other commonly used pipe is the *function pipe*, `.>`, which is used to compose
the functions specified on the left hand side and right hand side together, with the
function on the left hand side being applied first in the composition (note that this
behavior is reversed from normal function composition, but follows the flow of data better).
For example:

```python
>>> reverse = reversed .> list
>>> [1, 2, 3] |> reverse
[3, 2, 1]
```

#### Other Pipes

Besides `|>>`, `*|>`, and `.>`, nbpipes offers a few less commonly used operators as well. The below
table describes the complete set of forward pipe operators available:

| Operator           | nbpipes Syntax                                     | Python Syntax                           |
|--------------------|----------------------------------------------------|-----------------------------------------|
| <code>\|></code>   | <code>y = x \|> f</code>                           | `y = f(x)`                              |
| <code>\|>></code>  | <code>x \|>> y</code>                              | `y = x; y`                              |
| <code>*\|></code>  | <code>y = x *\|> f</code> where `x` is an iterable | `y = f(*x)`                             |
| <code>**\|></code> | <code>y = x **\|> f</code> where `x` is a dict     | `y = f(**x)`                            |
| `.>`               | `h = f .> g`                                       | `h = lambda *a, **kw: g(f(*a, **kw))`   |
| `*.>`              | `h = f *.> g`                                      | `h = lambda *a, **kw: g(*f(*a, **kw))`  |
| `**.>`             | `h = f **.> g`                                     | `h = lambda *a, **kw: g(**f(*a, **kw))` |
| `?>`               | `y = x ?> f`                                       | `y = None if x is None else f(x)`       |
| `*?>`              | `y = x *?> f` where `x` is an iterable, or `None`  | `y = None if x is None else f(*x)`      |
| `**?>`             | `y = x **?> f` where `x` is a dict, or `None`      | `y = None if x is None else f(**x)`     |
| `$>`               | `g = x $> f`                                       | `g = functools.partial(f, x)`           |
| `*$>`              | `g = x *$> f` where `x` is an iterable             | `g = functools.partial(f, *x)`          |
| `**$>`             | `g = x **$> f` where `x` is a dict                 | `g = functools.partial(f, **x)`         |

Except for `|>>`, each and every operator has a corresponding *backward* variant; e.g. `<|` is the backward variant
of `|>` and is a low-precedence apply. For example:

```python
>>> reversed .> list <| [1, 2, 3]
[3, 2, 1]
```

All pipe operators are applied in order from left to right (including backward pipes).
Furthermore, all pipe operators are left associative and operate at the same precedence
as `|` (bitwise or), meaning that any pipeline steps that include a `|` binary operation
must be wrapped in parentheses.
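
To make the operator families concrete, here is a small plain-Python sketch of the function pipe, the nullable pipe, the partial pipe, and the varargs pipe, following the descriptions in this section. The helper names (`compose`, `nullable_pipe`, `partial_pipe`) are hypothetical and are not part of nbpipes:

```python
from functools import partial

def compose(first, second):
    """Sketch of `first .> second`: apply `first`, then `second`."""
    return lambda *a, **kw: second(first(*a, **kw))

def nullable_pipe(x, f):
    """Sketch of `x ?> f`: skip `f` entirely when `x` is None."""
    return None if x is None else f(x)

def partial_pipe(x, f):
    """Sketch of `x $> f`: bind `x` as the first argument of `f`."""
    return partial(f, x)

reverse = compose(reversed, list)
add = lambda a, b: a + b
add_two = partial_pipe(2, add)

print(reverse([1, 2, 3]))         # [3, 2, 1]
print(nullable_pipe(None, len))   # None
print(nullable_pipe("abc", len))  # 3
print(add_two(5))                 # 7
print(add(*(2, 3)))               # 5, i.e. `(2, 3) *|> $ + $`
```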

### Additional Macros and Helper Utilities

#### `do` macro

Similar to [toolz](https://github.com/pytoolz/toolz), nbpipes offers a `do` macro
implementing something similar to the following higher order function:

```python
def do(func, obj):
    func(obj)
    return obj
```

In the case of nbpipes, the input function `func` is specified inside of brackets,
just as with other functional macros:

```python
>>> 2 |> $ + 2 |> do[print] |> $ + 2 |>> result
4
6
```

While any function expression, including undetermined pipelines, can appear inside `do[...]` brackets,
`do[print]` is so common that nbpipes provides a `peek` utility that implements the very same:

```python
>>> 2 |> $ + 2 |> peek |> $ + 2 |>> result
4
6
```

To suppress the automatic expression rendering of a pipeline result, nbpipes also offers a `null` utility function
(as in `/dev/null`), which essentially swallows its input:

```python
>>> 2 |> $ + 2 |> peek |> $ + 2 |>> result |> null
4
```

#### `fork` and `parallel` macros

If you wish to move beyond linear chains and apply the same input to multiple pipelines,
nbpipes provides `fork` and `parallel` macros, which return the results of each function
as a tuple:

```python
>>> range(10) |> list |> fork[
    map[2 * $] .> filter[$ % 3 == 0] .> list,
    map[3 * $] .> filter[$ % 2 == 0] .> list,
]
([0, 6, 12, 18], [0, 6, 12, 18, 24])
```

`parallel` does the same thing as `fork` but executes each function passed to it concurrently.
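
A rough plain-Python picture of the two macros, assuming that "concurrently" means a thread pool (the source does not say which mechanism nbpipes uses). The `fork` and `parallel` functions below are stand-ins written for this sketch, not the nbpipes macros themselves:

```python
from concurrent.futures import ThreadPoolExecutor

def fork(*funcs):
    """Apply every function to the same input, one after another."""
    return lambda x: tuple(f(x) for f in funcs)

def parallel(*funcs):
    """Same result as `fork`, but each function runs in its own task.

    Assumption: a thread pool stands in for whatever nbpipes does.
    """
    def run(x):
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(f, x) for f in funcs]
            return tuple(fut.result() for fut in futures)
    return run

doubled_div_by_3 = lambda xs: [v for v in (2 * x for x in xs) if v % 3 == 0]
tripled_even = lambda xs: [v for v in (3 * x for x in xs) if v % 2 == 0]

data = list(range(10))
forked = fork(doubled_div_by_3, tripled_even)(data)
concurrent = parallel(doubled_div_by_3, tripled_even)(data)
print(forked)  # ([0, 6, 12, 18], [0, 6, 12, 18, 24])
```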

#### `when` macro

The `when` macro takes a conditional expression and applies it to the incoming value:
if the condition holds, the value is forwarded unchanged; otherwise, computation terminates
with `None`. It is particularly powerful when combined with `fork` and `collapse` (the latter
of which extracts the non-null value out of the tuple that results from the `fork`):

```python
# Define a `collatz` utility and run it up to 20 times on 42
>>> collatz = when[$ != 1] .> fork[
    when[$ % 2 == 0] .> $ // 2,
    when[$ % 2 == 1] .> $ * 3 + 1,
] .> collapse .> peek
>>> 42 |> collatz ** 20
21
64
32
16
8
4
2
1
```

Right, I forgot to mention that you can exponentiate single-argument functions in nbpipes,
so that we don't need to write out `42 |> collatz |> collatz |> ... |> collatz`.
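
The pieces of the `collatz` example can be approximated in plain Python as follows. Every helper here (`when`, `then`, `fork`, `collapse`, `power`) is a hypothetical stand-in, and the rule that composition stops as soon as a step yields `None` is an assumption drawn from the description of `when` above:

```python
def when(pred):
    """Forward the value if `pred` holds, else yield None."""
    return lambda x: x if pred(x) else None

def then(first, second):
    """Compose two steps, stopping with None once a step yields None."""
    def run(x):
        y = first(x)
        return None if y is None else second(y)
    return run

def fork(*funcs):
    return lambda x: tuple(f(x) for f in funcs)

def collapse(results):
    """Pick the first non-None entry of a fork result."""
    return next((r for r in results if r is not None), None)

def power(func, n):
    """Apply `func` up to `n` times, stopping early on None."""
    def run(x):
        for _ in range(n):
            if x is None:
                break
            x = func(x)
        return x
    return run

trace = []
def record(x):
    trace.append(x)  # stands in for `peek`, which prints instead
    return x

halve = then(when(lambda n: n % 2 == 0), lambda n: n // 2)
triple_plus_one = then(when(lambda n: n % 2 == 1), lambda n: n * 3 + 1)
collatz = then(when(lambda n: n != 1),
               then(fork(halve, triple_plus_one), then(collapse, record)))

final = power(collatz, 20)(42)
print(trace)  # [21, 64, 32, 16, 8, 4, 2, 1]
```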

#### `future` macro

Finally, to schedule a function to run in another thread and immediately
return a future to the eventual result, nbpipes provides a `future` macro:

```python
>>> 2 |> future[$ + 2] |> $.result()
4
>>> [1, 2, 3] |> future[sum] |> $.result()
6
```
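
In standard-library terms, this resembles submitting work to a `concurrent.futures` executor and later calling `.result()` on the returned future. The sketch below assumes a thread pool and defines its own `future` helper for illustration; it is not the nbpipes macro:

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)

def future(func):
    """Return a function that schedules `func(x)` and hands back a Future."""
    return lambda x: executor.submit(func, x)

plus_two = future(lambda x: x + 2)(2).result()
total = future(sum)([1, 2, 3]).result()
executor.shutdown()

print(plus_two)  # 4
print(total)     # 6
```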

## Placeholder Scope

A natural question is: how does nbpipes know what part of the code should
be included in the body of the function induced by placeholder use? The
rules are as follows:

1. If there is a macro or pipeline step enclosing the placeholder, the induced
   function body includes the "smallest" such enclosing macro or pipeline step.
2. Otherwise, the function body expands to include the nearest "chain"
   of function calls, attribute accesses, and / or subscript accesses.

An example of a "chain" would be something like `np.array($).T.astype(int)`,
which induces a lambda that converts its argument to a numpy array,
transposes it, and then converts the result to an integer dtype (`int64` on most platforms). That is,
the lambda body expands to include not just `np.array($)`, but the entire
"chain" in the expression.

To see a concrete example of where this matters, consider the following
two placeholder expressions:

```python
# The following sorters do different things!
sorter1 = sorted($, key=$[1])
sorter2 = sorted($, key=f[$[1]])
```

`sorter1` is a function that takes two arguments: a sequence to sort, and a second
sequence (for example, a list of functions) whose element at index 1 is used as the
sort key function.
`sorter2`, on the other hand, is a function that takes a single argument: a
sequence that it sorts using the second element of each item as the sort key.
In most cases, `sorter2` probably gives the desired
behavior.
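
Written out as plain lambdas (a sketch of the semantics just described, with made-up parameter names), the difference between the two sorters looks like this:

```python
# `sorted($, key=$[1])`: two parameters; the key function is looked up
# at index 1 of the second argument.
sorter1 = lambda seq, keys: sorted(seq, key=keys[1])

# `sorted($, key=f[$[1]])`: one parameter; the key is its own lambda
# that extracts the second element of each item.
sorter2 = lambda seq: sorted(seq, key=lambda item: item[1])

pairs = [("c", 3), ("a", 1), ("b", 2)]
by_second = sorter2(pairs)
by_first = sorter1(pairs, [None, lambda item: item[0]])
print(by_second)  # [('a', 1), ('b', 2), ('c', 3)]
print(by_first)   # [('a', 1), ('b', 2), ('c', 3)]
```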

## Performance Overhead

Because nbpipes is implemented using instrumentation (see [How it works](#how-it-works)),
it does incur overhead. For top-level code written in a Jupyter cell (i.e.,
code that doesn't have any indentation), the additional overhead generally doesn't matter,
as it tends to be insignificant when compared to data-intensive dataframe operations
and SQL queries common in data science workloads. For code invoked repeatedly in loop
bodies or function calls, however, this overhead can become noticeable; as such, nbpipes
syntax is not enabled by default in these contexts. To opt into nbpipes syntax in loops
and function bodies, use the `allow_pipelines_in_loops_and_calls` context manager / decorator.

Example of how to embed an nbpipes pipeline in a function body:

```python
@allow_pipelines_in_loops_and_calls
def compute_first_k_sums_of_squares(k):
    lst = []
    for i in range(1, k + 1):
        i |> $ ** 2 |> $ + sum(lst) |> do[lst.append($)]
    return lst
```

Such functions can be used as normal:

```python
>>> 10 |> compute_first_k_sums_of_squares
[1, 5, 15, 37, 83, 177, 367, 749, 1515, 3049]
```

Example of embedding a pipeline in a loop:

```python
lst = []
with allow_pipelines_in_loops_and_calls():
    for i in range(1, 101):
        i |> $ ** 2 |> $ + sum(lst) |> do[lst.append($)]
```

## More Examples

I developed nbpipes while working on
[Advent of Code 2025](https://adventofcode.com/2025) in parallel,
and used it for most of the input processing portions of my solutions.
You can find these solutions at https://github.com/smacke/aoc2025. In particular,
the [solution for day 6](https://github.com/smacke/aoc2025/blob/main/aoc6.ipynb)
showcases the upper limits of what is possible with nbpipes. Note however that it is
optimized for nbpipes usage and not readability, which I generally wouldn't recommend.

## What nbpipes is and is not

For now, nbpipes is not a general purpose functional programming language on top of
Python. It is very much not intended for production use cases, and instead
caters toward quick-and-dirty one-off / scratchpad type computations in IPython
and Jupyter specifically. In short, nbpipes aims to provide simple but powerful
pipeline and placeholder syntax to interactive Python programming environments.

In particular, nbpipes is:
- Currently only for interactive Python environments built on top of IPython, such as
  Jupyter, or IPython itself
- Just a library you can install from PyPI, compatible with a wide range of Python 3
  versions -- no fancy installation instructions, no complicated language distribution
  to install
- Fully compatible with all existing Python standard and third-party libraries that
  you already know and love, since it's just Python function calls under the hood

All the different pipeline operators like `|>`, `<|`, `*|>`, etc. essentially
transpile down to an instrumented variant of the bitwise-or (`|`) operator, and
therefore every new operator left-associates at the same level of precedence,
meaning that pipeline steps run from left to right in the order that they
appear. nbpipes aims to optimize for simplicity, readability / writability, and
predictability over feature completeness (though I'd like to think it strikes a
fairly good balance in this regard). nbpipes may be expanded beyond IPython / Jupyter
depending on traction.

## How it works

nbpipes works by transforming syntax in two stages. First, it rewrites token spans
like `|>` and `*|>` that are illegal in Python to legal ones -- for the previous
examples, both spans are rewritten to bitwise or, `|`. After these transformations,
the resulting code is valid (but likely not runnable) Python syntax. nbpipes uses
the [pyccolo](https://github.com/smacke/pyccolo) library to perform these rewrites;
pyccolo records the position of each rewrite, so that the eventual
`ast.BinOp` AST node can be associated with the `|>` operator.
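
The first stage can be pictured with a deliberately naive sketch: replace each pipe token with `|`, note where each replacement landed, and confirm that the result parses into `ast.BinOp` nodes. This uses a regular expression rather than pyccolo, handles only `|>` and `*|>`, and would wrongly rewrite tokens inside string literals; `rewrite_pipes` is a hypothetical helper:

```python
import ast
import re

PIPE_TOKENS = re.compile(r"\*?\|>")

def rewrite_pipes(source):
    """Replace `|>` / `*|>` with `|`, recording (new_offset, original_token)."""
    pieces, positions, last = [], [], 0
    for match in PIPE_TOKENS.finditer(source):
        pieces.append(source[last:match.start()])
        positions.append((sum(len(p) for p in pieces), match.group()))
        pieces.append("|")
        last = match.end()
    pieces.append(source[last:])
    return "".join(pieces), positions

rewritten, positions = rewrite_pipes("tup |> sorted |> tuple")
tree = ast.parse(rewritten, mode="eval")

print(rewritten)                 # tup | sorted | tuple
print(positions)                 # [(4, '|>'), (13, '|>')]
print(type(tree.body).__name__)  # BinOp
```

The recorded offsets are what would let a later stage tell a rewritten `|` apart from a genuine bitwise or.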

Pyccolo is an event-based AST transformation library I developed during my PhD
which allows you to layer multiple AST transformations on top of each other in a
composable fashion. In short, you specify handlers for different AST nodes such
as `ast.BinOp`, and pyccolo instruments these nodes by emitting events for them,
so that when the code runs, all the handlers for a particular event are run.
Such event handlers are what allow us to change the behavior of `ast.BinOp`
nodes that have been associated with various custom operators like `|>`.

Because the same event emission transformation can be leveraged by multiple
associated handlers, you generally don't need to worry about said
transformations rewriting the AST in ways that conflict with each other. This
composability lies in stark contrast with the challenges you would face if you
were to just create a bunch of `ast.NodeTransformer` instances to perform
transformations. The strategy employed by pyccolo therefore allows for
incremental and iterative feature development without requiring large rewrites
as new features are introduced.
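
The following toy sketch illustrates the general idea of one event-emitting transformation shared by several handlers. It is not pyccolo's API: the `emit` function, the `handlers` registry, and the `EmitBitOr` transformer are all invented for this illustration, and every `|` is treated as a potential pipe because the sketch does not track rewrite positions:

```python
import ast

handlers = {"bitor": []}

def emit(event, left, right):
    """Offer the operands to each handler; fall back to a real `|`."""
    for handler in handlers[event]:
        result = handler(left, right)
        if result is not NotImplemented:
            return result
    return left | right

class EmitBitOr(ast.NodeTransformer):
    """Rewrite `a | b` into `emit("bitor", a, b)` exactly once."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if not isinstance(node.op, ast.BitOr):
            return node
        call = ast.Call(
            func=ast.Name(id="emit", ctx=ast.Load()),
            args=[ast.Constant(value="bitor"), node.left, node.right],
            keywords=[],
        )
        return ast.copy_location(call, node)

seen = []
def logging_handler(left, right):
    seen.append(left)      # observes the event without claiming it
    return NotImplemented

def pipe_handler(left, right):
    return right(left) if callable(right) else NotImplemented

# Two independent handlers share the same instrumentation.
handlers["bitor"] += [logging_handler, pipe_handler]

tree = ast.parse("tup | sorted | tuple", mode="eval")
tree = ast.fix_missing_locations(EmitBitOr().visit(tree))
env = {"emit": emit, "tup": (3, 4, 1, 5, 6)}
result = eval(compile(tree, "<sketch>", "eval"), env)
print(result)  # (1, 3, 4, 5, 6)
```

Adding a third behavior here means registering another handler rather than writing another tree rewrite, which is the composability property described above.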

To summarize, nbpipes rewrites its syntax to valid Python, and then runs this Python in
an instrumented fashion using pyccolo. Because everything is just running in
Python, nbpipes is effectively a Python superset, and because the transformed
Python that is instrumented is fairly similar visually to nbpipes syntax,
various Jupyter ergonomic features like readable stack traces and jedi-based
autocomplete can continue to function as normal (for the most part).

Implementation-wise, thanks to pyccolo's heavy lifting, I was able to implement
the initial release of nbpipes entirely over the course of time off during the
2025 holiday season. At the time of this writing, nbpipes occupies fewer than
2000 lines of code (excluding tests), each of which was produced *without* the
help of any AI agents.

## Inspiration

nbpipes draws inspiration largely from
[magrittr](https://magrittr.tidyverse.org/), but also from efforts like
[coconut](https://coconut-lang.org/) (a functional superset of Python),
as well as from libraries like [Pipe](https://github.com/JulienPalard/Pipe) and [toolz](https://github.com/pytoolz/toolz) which
fill some of Python's pipe and functional programming gaps with elegant APIs.

## License

Code in this project is licensed under the [BSD-3-Clause License](https://opensource.org/licenses/BSD-3-Clause).
