Metadata-Version: 2.4
Name: OrderPulse
Version: 0.2.13
Summary: High-performance exchange feed parser and orderflow analytics engine with Rust and Python bindings
Requires-Python: >=3.9
Description-Content-Type: text/markdown

# OrderPulse / `fastreader`

`OrderPulse` packages a Rust parser and order book engine as a Python extension module named `fastreader`.
The library is built for binary exchange feed files containing:

- order messages: `N` new, `M` modify, `X` cancel
- trade messages: `T`

The Python API exposed from [src/lib.rs](src/lib.rs) centers on three classes:

- `ReadMsgFromBinary`: load and query parsed messages
- `MessageBatch`: hold a selected subset of messages
- `OrderBookBuilder`: replay messages into a top-of-book snapshot stream

## Install and Import

Build and install the extension locally with `maturin`:

```bash
maturin develop
```

Then import it in Python:

```python
from fastreader import ReadMsgFromBinary, OrderBookBuilder
```

## Quick Start

```python
from fastreader import ReadMsgFromBinary, OrderBookBuilder

reader = ReadMsgFromBinary("/path/to/feed.bin")

print(reader.total_messages())
print(reader.total_orders())
print(reader.total_trades())

first_ten = reader.get_all_messages(limit=10)
for line in first_ten:
    print(line)

orders = reader.select_order_messages()
print(orders.len())
print(orders.to_list(limit=5))

builder = OrderBookBuilder()
rows_printed = builder.create_orderbook_all_messages(reader)
print(rows_printed)
```

To keep only the messages for a single instrument token, pass `token` when loading:

```python
reader = ReadMsgFromBinary("/path/to/feed.bin", token=26000)
```

## Public API

### `ReadMsgFromBinary(path, token=None)`

Loads the full binary file once, parses every recognized packet, and stores the result in memory.

Parameters:

- `path`: path to the binary feed file
- `token`: optional instrument token filter applied after parsing

Behavior:

- opens the file and memory-maps it for fast sequential access
- scans packet-by-packet and recognizes `T`, `N`, `M`, and `X`
- converts little-endian numeric fields into native Rust values
- stores parsed packets in an internal `Vec<Message>`
- if `token` is provided, keeps only messages whose token matches
- initializes an internal cursor used by incremental read helpers

#### `total_messages()`

Returns the total number of parsed messages currently stored.

#### `total_orders()`

Counts messages whose variant is `Message::Order`.

This includes all order-side packet types represented by the parser: new, modify, and cancel.

#### `total_trades()`

Counts messages whose variant is `Message::Trade`.

#### `summary()`

Prints three lines to standard output:

- total messages
- total order messages
- total trade messages

Use this when you want a simple console summary instead of a returned Python value.

#### `reset_cursor()`

Resets the internal cursor back to the first stored message.

This matters for the incremental helpers:

- `get_next_msg()`
- `select_next_messages(limit)`

#### `get_all_messages(limit=None)`

Returns formatted strings for all stored messages, optionally capped by `limit`.

Use this when you want a Python list immediately and do not need to keep a reusable batch object.

#### `get_order_messages(limit=None)`

Returns formatted strings for order messages only.

Internally it filters the in-memory message list for `Message::Order`, then formats each item.

#### `get_trade_messages(limit=None)`

Returns formatted strings for trade messages only.

Internally it filters for `Message::Trade`, then formats each item.

#### `get_next_msg()`

Returns the next formatted message string and advances the internal cursor by one.

When the cursor reaches the end, it returns the literal string `"END"`.

Use this for sequential consumption from Python without slicing the entire data set each time.
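
The sentinel-based loop can also be wrapped in an iterator. This is a minimal sketch that relies only on the documented `get_next_msg()` behavior; the helper name `iter_messages` is hypothetical and not part of `fastreader`.

```python
def iter_messages(reader):
    """Iterate over formatted message strings from the reader's cursor onward."""
    # iter(callable, sentinel) keeps calling get_next_msg() and stops
    # as soon as the call returns the literal string "END".
    return iter(reader.get_next_msg, "END")


# Intended usage with a real reader (not executed here):
#   for msg in iter_messages(reader):
#       print(msg)
#   reader.reset_cursor()  # rewind before iterating a second time
```

Because the cursor lives inside the reader, a second pass over the same data needs `reset_cursor()` first.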

#### `select_all_messages()`

Returns a `MessageBatch` containing every stored message.

Use this when you want to keep a reusable selection and convert it later with `to_list()`.

#### `select_order_messages()`

Returns a `MessageBatch` containing only order messages.

#### `select_trade_messages()`

Returns a `MessageBatch` containing only trade messages.

#### `select_next_messages(limit)`

Returns the next `limit` messages as a `MessageBatch` and advances the internal cursor.

This is the batch-oriented counterpart to `get_next_msg()`.
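
Paging through a file in fixed-size windows could look like the sketch below. It assumes that `select_next_messages()` returns an empty `MessageBatch` once the cursor is exhausted, which this document does not state explicitly; the helper name `iter_batches` is hypothetical.

```python
def iter_batches(reader, chunk_size):
    """Yield successive batches of up to chunk_size messages."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        batch = reader.select_next_messages(chunk_size)
        # Assumption: an exhausted cursor produces an empty batch.
        if batch.is_empty():
            return
        yield batch


# Intended usage with a real reader (not executed here):
#   for batch in iter_batches(reader, 10_000):
#       print(batch.len())
```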

### `MessageBatch`

`MessageBatch` is a lightweight container around a selected slice of parsed messages.

It is useful when you want to:

- keep a filtered subset
- page through data in chunks
- pass only a subset into `OrderBookBuilder`

#### `len()`

Returns the number of messages held by the batch.

#### `is_empty()`

Returns `True` when the batch contains no messages.

#### `to_list(limit=None)`

Formats the stored messages into Python strings, optionally capped by `limit`.

This does not re-read the file. It only formats the messages already stored in the batch.

### `OrderBookBuilder()`

Creates a builder that replays parsed messages through the in-memory order book engine.

The builder itself is stateless. Each build call creates a fresh `OrderBookManager` internally, so book state does not carry over from one call to the next.

#### `create_orderbook_all_messages(reader)`

Replays every message stored in a `ReadMsgFromBinary` instance.

Behavior:

- replays order messages into the order book
- applies trade messages as quantity reductions against buy and sell order ids
- after each message, asks for the top 5 bid and ask levels of the affected token
- prints one CSV row only when both sides have at least 5 populated levels
- returns the number of printed rows

#### `create_orderbook(batch)`

Same behavior as `create_orderbook_all_messages`, but the input is a `MessageBatch`.

Use this when you want to build the book from:

- only order messages
- only trade messages
- a cursor window from `select_next_messages()`
- any user-selected subset

## Formatted Message Output

Most reader methods return strings produced by the internal `format_message` function in [src/lib.rs](src/lib.rs).

Order messages are formatted like:

```text
Order Message: SeqNo..., msg_len..., Msg_Type'...', Exch_ts..., local_ts..., order_id..., Token..., order_Type'...', Price..., Quantity..., missed...
```

Trade messages are formatted like:

```text
Trade Message: SeqNo..., msg_len..., Msg_Type'...', Exch_ts..., local_ts..., order_id_buy..., order_id_sell..., Token..., Price..., Quantity..., missed...
```

Field meanings:

- `SeqNo`: stream sequence number from the packet header
- `msg_len`: encoded packet byte length
- `Msg_Type`: raw feed message code such as `N`, `M`, `X`, or `T`
- `Exch_ts`: exchange timestamp inside the payload
- `local_ts`: local timestamp attached to the packet
- `order_id`: order identifier for order messages
- `order_id_buy`: buy order identifier for trade messages
- `order_id_sell`: sell order identifier for trade messages
- `Token`: instrument token
- `order_Type`: side code for order messages, typically buy or sell
- `Price`: order price or trade price
- `Quantity`: order quantity or traded quantity
- `missed`: gap flag rendered as `0` or `1`
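
Since the reader returns strings rather than packet objects, downstream code may want to turn a line back into named fields. The sketch below is deliberately tolerant because the exact separator between a field name and its value is not shown in the templates above: it assumes a name, an optional `:` or `=`, then either a quoted code or a number. The sample line is made up for illustration, and `parse_formatted` is a hypothetical helper.

```python
import re

# Assumed field shape: name, optional ':' or '=', then 'code' or a number.
_FIELD = re.compile(r"([A-Za-z_]+)\s*[:=]?\s*('[^']*'|-?\d+(?:\.\d+)?)")


def parse_formatted(line):
    """Split one formatted message string into (kind, fields)."""
    kind, _, rest = line.partition(":")
    fields = {}
    for part in rest.split(","):
        match = _FIELD.fullmatch(part.strip())
        if match is None:
            continue  # skip fragments that do not fit the assumed shape
        name, raw = match.groups()
        if raw.startswith("'"):
            fields[name] = raw.strip("'")
        elif "." in raw:
            fields[name] = float(raw)
        else:
            fields[name] = int(raw)
    return kind.strip(), fields


# Made-up sample; real output from fastreader may be spaced differently.
sample = "Order Message: SeqNo1, msg_len38, Msg_Type'N', Token26000, missed0"
kind, fields = parse_formatted(sample)
```

If the real output uses a different layout, only the `_FIELD` pattern should need adjusting.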

## Architecture

### 1. Binary parser

The parser lives in [src/read_trd_ord_only.rs](src/read_trd_ord_only.rs).

Flow:

- the file is opened and memory-mapped using `memmap2`
- the buffer is scanned sequentially from the first byte to the last
- spaces are skipped
- a small `PeekStructure` is read first to inspect `msg_type`
- based on the type, the parser reads either an `OrderPacket` or `TradePacket`
- little-endian fields are converted to host-endian values
- the parsed packet is wrapped in the `Message` enum from [src/structure.rs](src/structure.rs)
- unknown bytes trigger one-byte resynchronization and the parser keeps scanning
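
The flow above can be illustrated in pure Python. The record layouts here are hypothetical toys: the real `OrderPacket` and `TradePacket` layouts are defined in the Rust sources and include a header that this sketch omits. Only the peek, dispatch, little-endian decode, and one-byte resynchronization steps are meant to carry over.

```python
import struct

# Hypothetical toy layouts, little-endian ("<") with no padding.
ORDER = struct.Struct("<cQIiI")   # msg_type, order_id, token, price, quantity
TRADE = struct.Struct("<cQQIiI")  # msg_type, buy_id, sell_id, token, price, quantity
ORDER_CODES = (b"N", b"M", b"X")


def scan(buf):
    """Scan a bytes-like buffer and return a list of (kind, fields) tuples."""
    messages = []
    pos = 0
    while pos < len(buf):
        code = buf[pos:pos + 1]  # peek at the type byte
        if code in ORDER_CODES and pos + ORDER.size <= len(buf):
            messages.append(("order", ORDER.unpack_from(buf, pos)))
            pos += ORDER.size
        elif code == b"T" and pos + TRADE.size <= len(buf):
            messages.append(("trade", TRADE.unpack_from(buf, pos)))
            pos += TRADE.size
        else:
            pos += 1  # space, unknown byte, or truncated record: resync by one
    return messages
```

The same function works on an `mmap.mmap` object, since slicing and `unpack_from` both accept it.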

Recognized message types:

- `T`: trade packet
- `N`: new order
- `M`: modify order
- `X`: cancel order

Debugging support:

- `ORDERPULSE_DEBUG=1` enables parser debug logs
- `ORDERPULSE_DEBUG_LIMIT=<n>` limits how many debug lines are emitted

### 2. In-memory message model

The packet structures are defined in [src/structure.rs](src/structure.rs).

Key types:

- `StreamHeader`: packet header containing length, stream id, and sequence number
- `OrderMessage`: payload for order events
- `TradeMessage`: payload for trade events
- `OrderPacket` and `TradePacket`: payload plus header, local timestamp, and flags
- `Message`: enum wrapping either packet type

`ReadMsgFromBinary` stores `Vec<Message>`, and every Python-facing query method works from this in-memory vector.

### 3. Order book engine

The order book engine lives in [src/orderbook.rs](src/orderbook.rs).

Core ideas:

- one `OrderBook` is maintained per token
- active orders are indexed by `order_id`
- price levels are aggregated into bid and ask arrays
- the dynamic price range expands when a new price falls outside the current window
- order messages add, modify, or cancel orders
- trade messages reduce quantity on both the buy and sell order ids
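
A simplified Python model of these ideas is sketched below. It is not the Rust engine: dictionaries stand in for the price-level arrays and the dynamic price window, the side codes `"B"` and `"S"` are assumptions, and the handling of a modify for an unknown order id (treated as an add) is a guess at reasonable behavior.

```python
from collections import defaultdict


class ToyBook:
    """Order book for one token; dicts stand in for the engine's price arrays."""

    def __init__(self):
        self.orders = {}                       # order_id -> [side, price, qty]
        self.levels = {"B": defaultdict(int),  # price -> aggregated quantity
                       "S": defaultdict(int)}

    def _adjust(self, side, price, delta):
        level = self.levels[side]
        level[price] += delta
        if level[price] <= 0:
            del level[price]  # drop emptied price levels

    def on_order(self, msg_type, order_id, side, price, qty):
        # Remove any previous state for this id, then re-add for N and M.
        if order_id in self.orders:
            old_side, old_price, old_qty = self.orders.pop(order_id)
            self._adjust(old_side, old_price, -old_qty)
        if msg_type in ("N", "M"):
            self.orders[order_id] = [side, price, qty]
            self._adjust(side, price, qty)

    def on_trade(self, buy_id, sell_id, qty):
        # Reduce the resting quantity on both sides of the trade.
        for order_id in (buy_id, sell_id):
            if order_id not in self.orders:
                continue
            side, price, remaining = self.orders[order_id]
            filled = min(qty, remaining)
            self._adjust(side, price, -filled)
            if filled == remaining:
                del self.orders[order_id]
            else:
                self.orders[order_id][2] = remaining - filled


books = defaultdict(ToyBook)  # one book per token
```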

Top-of-book extraction:

- bids are scanned from highest price downward
- asks are scanned from lowest price upward
- the midpoint is computed from the best bid and best ask
- the builder prints a row only if at least 5 bid levels and 5 ask levels exist
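
The extraction rule can be expressed compactly in Python. This sketch takes plain `price -> quantity` dictionaries as input, which is an assumption for illustration; the function name `top_levels` is hypothetical, and the layout of the CSV row printed by the builder is not reproduced here.

```python
def top_levels(bids, asks, depth=5):
    """Return (midpoint, bid_levels, ask_levels), or None if either side is thin."""
    bid_side = sorted(bids.items(), reverse=True)[:depth]  # highest price first
    ask_side = sorted(asks.items())[:depth]                # lowest price first
    if len(bid_side) < depth or len(ask_side) < depth:
        return None  # mirrors the "skip the row" rule
    midpoint = (bid_side[0][0] + ask_side[0][0]) / 2
    return midpoint, bid_side, ask_side
```

Returning `None` for a thin book keeps the caller's loop simple: a row is emitted only when the function returns a value.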

### 4. Python binding layer

The PyO3 bindings are declared in [src/lib.rs](src/lib.rs).

The exported Python module is named `fastreader`, not `OrderPulse`.

That means Python code should use:

```python
from fastreader import ReadMsgFromBinary, MessageBatch, OrderBookBuilder
```

## Typical Usage Patterns

### Inspect a file quickly

```python
from fastreader import ReadMsgFromBinary

reader = ReadMsgFromBinary("feed.bin")
reader.summary()
print(reader.get_all_messages(limit=3))
```

### Work on a single token

```python
reader = ReadMsgFromBinary("feed.bin", token=12345)
print(reader.total_orders())
print(reader.get_trade_messages(limit=10))
```

### Stream through messages incrementally

```python
reader = ReadMsgFromBinary("feed.bin")

while True:
    msg = reader.get_next_msg()
    if msg == "END":
        break
    print(msg)
```

### Build an order book from a subset

```python
from fastreader import ReadMsgFromBinary, OrderBookBuilder

reader = ReadMsgFromBinary("feed.bin")
batch = reader.select_next_messages(50_000)

builder = OrderBookBuilder()
rows = builder.create_orderbook(batch)
print(rows)
```

## Notes and Limitations

- `ReadMsgFromBinary` loads the full parsed result into memory, so memory use grows with the number of messages kept
- methods returning strings are convenience views, not raw packet objects
- `summary()` and order book builders print to standard output
- order book row generation is currently hard-coded to 5 bid and 5 ask levels
- rows are skipped until both sides have at least 5 populated levels
- trade token is stored as `i32` in the wire struct and converted to `u32` for filtering and book lookup

## Internal Modules

Besides the main parser and order book classes, [src/lib.rs](src/lib.rs) also exposes `orderbook_processing` as a Rust module. That module contains cycle-count benchmarking helpers in [src/orderbook_processing.rs](src/orderbook_processing.rs) and low-level timing utilities in [src/tsc.rs](src/tsc.rs).

Those helpers are separate from the main Python `fastreader` message-reading API.

