Metadata-Version: 2.4
Name: OrderPulse
Version: 0.2.26
Summary: High-performance exchange feed parser and orderflow analytics engine with Rust and Python bindings
Requires-Python: >=3.9
Description-Content-Type: text/markdown

# fastreader

`fastreader` is a high-performance Python extension written in Rust using PyO3.

It is designed to read binary market-data files containing order messages and trade messages, then build an in-memory order book snapshot for a selected token.

This library is mainly useful for:

- market-data research
- order book reconstruction
- backtesting
- binary feed debugging
- snapshot generation
- low-latency data processing pipelines

The library exposes three Python-facing classes:

1. `MessageCacheReader`
2. `StreamingBinaryLoader`
3. `OrderbookBuilder`

---

# 1. Why this library exists

Market-data files are usually stored in binary format.

Binary files are compact and fast for machines, but they are not easy to read directly from Python.

If you try to parse very large binary files in pure Python, you may face problems like:

- slow file reading
- high memory usage
- difficult byte-level parsing
- endian conversion issues
- complicated order book reconstruction logic

This library addresses these problems by moving the heavy work into Rust.

Rust handles:

- file reading
- binary packet parsing
- order/trade message detection
- endian conversion
- memory-safe processing
- order book state construction

Python users get a simple interface.

---

# 2. High-level architecture

```text
Binary Market Data File
        |
        | contains raw order/trade packets
        v
+-----------------------------+
| MessageCacheReader          |
| Loads full file into RAM    |
+-----------------------------+
        |
        | or
        v
+-----------------------------+
| StreamingBinaryLoader       |
| Reads one message at a time |
+-----------------------------+
        |
        v
+-----------------------------+
| OrderbookBuilder            |
| Applies business logic      |
| Builds order book state     |
+-----------------------------+
        |
        v
Python Dictionary / CSV Row
```

There are two reading strategies.

| Reader | Storage Strategy | Best Use Case |
| --- | --- | --- |
| `MessageCacheReader` | Loads all messages into RAM | Backtesting, debugging, repeated analysis |
| `StreamingBinaryLoader` | Reads from disk one message at a time | Huge files, memory-safe processing, production pipelines |

The `OrderbookBuilder` can consume data from both readers.

---

# 3. Installation

The package metadata names the distribution `OrderPulse`, while the importable module is `fastreader`. Once the package is published to PyPI, install it with:

```bash
pip install OrderPulse
```

Then import it in Python:

```python
import fastreader
```

Or import classes directly:

```python
from fastreader import MessageCacheReader, StreamingBinaryLoader, OrderbookBuilder
```

---

# 4. Exposed Python classes

The Rust module exposes these classes to Python:

```python
fastreader.MessageCacheReader
fastreader.StreamingBinaryLoader
fastreader.OrderbookBuilder
```

Each class has a different responsibility.

| Class | Main Responsibility |
| --- | --- |
| `MessageCacheReader` | Load full binary file into memory |
| `StreamingBinaryLoader` | Read binary file one message at a time |
| `OrderbookBuilder` | Build order book from parsed messages |

---

# 5. Supported message types

From the current `lib.rs`, the reader understands the following message type bytes:

| Message Type | Meaning | Packet Type |
| --- | --- | --- |
| `T` | Trade message | `TradePacket` |
| `N` | Order message | `OrderPacket` |
| `M` | Order modification message | `OrderPacket` |
| `X` | Order cancel/delete message | `OrderPacket` |

The reader also skips leading space bytes:

```rust
if first[0] == b' ' {
    continue;
}
```

This means if the file contains padding spaces before a message, the reader ignores them and continues.

---

# 6. Class: `MessageCacheReader`

## Purpose

`MessageCacheReader` is a RAM-based reader.

It loads the complete binary file into memory once and stores all parsed messages.

Use this when:

- the file fits in RAM
- you want to inspect all messages
- you want repeated access to the same data
- you are debugging binary parsing
- you are doing backtesting

---

## 6.1 Create a reader

```python
from fastreader import MessageCacheReader

reader = MessageCacheReader()
```

### What happens internally

The Rust object is created with:

```rust
Self {
    file_path: None,
    messages: Arc::new(Vec::new()),
}
```

Meaning:

- no file is loaded yet
- the message vector is empty
- `Arc` is used for shared ownership of messages

### Expected output

Creating the object does not print anything.

```python
print(reader)
```

Example output:

```text
<builtins.MessageCacheReader object at 0x...>
```

---

## 6.2 `load_to_cache(file_path)`

```python
count = reader.load_to_cache("data/orders_trades.bin")
print(count)
```

### Purpose

Loads the complete binary file into RAM.

### Internal logic

```rust
let messages = read_trd_ord_only::read_trd_ord_only(&file_path)?;
let count = messages.len();

self.file_path = Some(file_path);
self.messages = Arc::new(messages);

Ok(count)
```

### Step-by-step explanation

1. The function receives a file path from Python.
2. It calls `read_trd_ord_only::read_trd_ord_only()`.
3. That function reads only valid order and trade messages.
4. All parsed messages are stored in `self.messages`.
5. The file path is saved in `self.file_path`.
6. The total number of loaded messages is returned.

### Example

```python
from fastreader import MessageCacheReader

reader = MessageCacheReader()

count = reader.load_to_cache("data/orders_trades.bin")

print("Loaded messages:", count)
```

### Expected output

```text
Loaded messages: 152340
```

The exact count depends on your binary file.

---

## 6.3 `get_all_messages()`

```python
messages = reader.get_all_messages()
```

### Purpose

Returns all cached messages as readable Python strings.

This is useful for:

- debugging
- checking parsed values
- validating order IDs
- validating token values
- checking prices and quantities
- confirming trade messages

### Internal logic

```rust
self.messages.iter().map(format_message).collect()
```

For every cached message, Rust calls `format_message()` and returns a list of strings to Python.

### Example

```python
from fastreader import MessageCacheReader

reader = MessageCacheReader()
reader.load_to_cache("data/orders_trades.bin")

messages = reader.get_all_messages()

for msg in messages[:5]:
    print(msg)
```

### Expected output

```text
Order Message: SeqNo 1, MsgLen 40, MsgType 'N', ExchTs 1710000000000000000, LocalTs 1710000000000000100, OrderId 10001, Token 26000, Side 'B', Price 2250000, Quantity 75, Missed 0
Trade Message: SeqNo 2, MsgLen 48, MsgType 'T', ExchTs 1710000000000000500, LocalTs 1710000000000000600, BuyOrderId 10001, SellOrderId 10002, Token 26000, Price 2250100, Quantity 50, Missed 0
Order Message: SeqNo 3, MsgLen 40, MsgType 'M', ExchTs 1710000000000000800, LocalTs 1710000000000000900, OrderId 10001, Token 26000, Side 'B', Price 2250050, Quantity 100, Missed 0
```
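Because `get_all_messages()` returns plain strings, downstream analysis usually needs the fields back as typed values. The following is a minimal sketch, assuming the strings follow exactly the `Key Value, Key Value` layout shown in the expected output above. `parse_formatted_message` is a hypothetical helper and not part of `fastreader`.

```python
def parse_formatted_message(line: str) -> dict:
    """Split one formatted message string into a dict of typed fields."""
    # "Order Message: SeqNo 1, ..." -> kind="Order Message", body="SeqNo 1, ..."
    kind, _, body = line.partition(": ")
    fields = {"Kind": kind}
    for part in body.split(", "):
        key, _, raw = part.partition(" ")
        if raw.startswith("'") and raw.endswith("'"):
            fields[key] = raw[1:-1]   # quoted single-character fields such as MsgType, Side
        else:
            fields[key] = int(raw)    # everything else is an integer in the sample output
    return fields


sample = (
    "Order Message: SeqNo 1, MsgLen 40, MsgType 'N', "
    "ExchTs 1710000000000000000, LocalTs 1710000000000000100, "
    "OrderId 10001, Token 26000, Side 'B', Price 2250000, "
    "Quantity 75, Missed 0"
)
parsed = parse_formatted_message(sample)
print(parsed["MsgType"], parsed["Token"], parsed["Price"])
```

If the real format differs from the sample (for example, extra fields or different separators), the splitting rules here would need to be adjusted.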

---

## 6.4 `get_cache_summary()`

```python
summary = reader.get_cache_summary()
print(summary)
```

### Purpose

Returns metadata about the cached messages.

### Internal logic

The Rust code calculates:

```rust
total_messages = self.messages.len()
total_orders = count of Message::Order
total_trades = count of Message::Trade
memory_usage_bytes = total_messages * size_of::<Message>()
```

Then it returns a Python dictionary.

### Example

```python
from fastreader import MessageCacheReader

reader = MessageCacheReader()
reader.load_to_cache("data/orders_trades.bin")

summary = reader.get_cache_summary()

print(summary)
```

### Expected output

```python
{
    "file_source": "data/orders_trades.bin",
    "total_messages": 152340,
    "total_orders": 120000,
    "total_trades": 32340,
    "memory_usage_bytes": 7312320
}
```
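The summary fields are related by simple arithmetic, which makes a quick sanity check possible. The sketch below uses the sample dictionary shown above rather than a real file, and assumes the cache holds only order and trade messages.

```python
summary = {
    "file_source": "data/orders_trades.bin",
    "total_messages": 152340,
    "total_orders": 120000,
    "total_trades": 32340,
    "memory_usage_bytes": 7312320,
}

# Orders plus trades should account for every cached message.
counts_match = (
    summary["total_orders"] + summary["total_trades"] == summary["total_messages"]
)

# memory_usage_bytes = total_messages * size_of::<Message>(), so dividing
# recovers the per-message size used on the Rust side.
bytes_per_message = summary["memory_usage_bytes"] // summary["total_messages"]

print(counts_match)        # True
print(bytes_per_message)   # 48
```

Since the figure is computed as `len * size_of::<Message>()`, it describes the message array only and not any other memory held by the process.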

---

# 7. Class: `StreamingBinaryLoader`

## Purpose

`StreamingBinaryLoader` is a disk-based lazy reader.

It does not load the full file into RAM.

Instead, it opens the binary file and reads one message at a time.

Use this when:

- the file is very large
- you want low memory usage
- you want to process only part of a file
- you want production-style streaming
- you do not want to keep all messages in RAM

---

## 7.1 Create a streaming loader

```python
from fastreader import StreamingBinaryLoader

stream = StreamingBinaryLoader()
```

### What happens internally

Rust creates:

```rust
Self {
    file_path: None,
    file: None,
}
```

Meaning:

- no file path is stored yet
- no file handle is opened yet
- memory usage is very small

---

## 7.2 `open_stream(file_path)`

```python
count = stream.open_stream("data/orders_trades.bin")
print(count)
```

### Purpose

Opens a binary file for streaming and returns the number of valid messages.

### Internal logic

```rust
let count = count_messages_in_file(&file_path)?;
let file = File::open(&file_path)?;

self.file_path = Some(file_path);
self.file = Some(file);

Ok(count)
```

### Important point

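`open_stream()` first scans the whole file to count messages, then opens the file a second time for message-by-message reading.

Opening a stream therefore costs one full pass over the file before the first message is returned.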

### Example

```python
from fastreader import StreamingBinaryLoader

stream = StreamingBinaryLoader()

count = stream.open_stream("data/orders_trades.bin")

print("Total messages:", count)
```

### Expected output

```text
Total messages: 152340
```

---

## 7.3 `get_next_message()`

```python
msg = stream.get_next_message()
```

### Purpose

Reads the next available message and returns it as a formatted string.

When the file ends, it returns:

```text
END
```

### Internal logic

```rust
match self.get_next_message_raw()? {
    Some(message) => Ok(format_message(&message)),
    None => Ok("END".to_string()),
}
```

So:

- if a valid message exists, it returns a formatted string
- if file reading is finished, it returns `"END"`

### Example

```python
from fastreader import StreamingBinaryLoader

stream = StreamingBinaryLoader()
stream.open_stream("data/orders_trades.bin")

while True:
    msg = stream.get_next_message()

    if msg == "END":
        break

    print(msg)
```

### Expected output

```text
Order Message: SeqNo 1, MsgLen 40, MsgType 'N', ExchTs 1710000000000000000, LocalTs 1710000000000000100, OrderId 10001, Token 26000, Side 'B', Price 2250000, Quantity 75, Missed 0
Trade Message: SeqNo 2, MsgLen 48, MsgType 'T', ExchTs 1710000000000000500, LocalTs 1710000000000000600, BuyOrderId 10001, SellOrderId 10002, Token 26000, Price 2250100, Quantity 50, Missed 0
END
```
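The `"END"` sentinel loop can be wrapped in a generator so that callers can use a plain `for` loop. This is a sketch only: `iter_messages` and `FakeStream` are hypothetical names, and `FakeStream` stands in for `StreamingBinaryLoader` so the example runs without a data file.

```python
from typing import Iterator


def iter_messages(stream, sentinel: str = "END") -> Iterator[str]:
    """Yield formatted messages until the stream returns the sentinel."""
    while True:
        msg = stream.get_next_message()
        if msg == sentinel:
            return
        yield msg


class FakeStream:
    """Stand-in exposing the same get_next_message() method as the loader."""

    def __init__(self, messages):
        self._messages = iter(messages)

    def get_next_message(self) -> str:
        # Mirror the documented behavior: return "END" once exhausted.
        return next(self._messages, "END")


fake = FakeStream(["Order Message: SeqNo 1, ...", "Trade Message: SeqNo 2, ..."])
collected = list(iter_messages(fake))
print(len(collected))  # 2
```

With a real loader, `for msg in iter_messages(stream): ...` would replace the `while True` loop shown above.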

---

## 7.4 `reset_cursor()`

```python
stream.reset_cursor()
```

### Purpose

Moves the file cursor back to the start of the file.

After calling this method, `get_next_message()` starts reading again from the first message.

### Internal logic

```rust
file.seek(SeekFrom::Start(0))?;
```

### Example

```python
from fastreader import StreamingBinaryLoader

stream = StreamingBinaryLoader()
stream.open_stream("data/orders_trades.bin")

print(stream.get_next_message())
print(stream.get_next_message())

stream.reset_cursor()

print(stream.get_next_message())
```

### Expected output

```text
Order Message: SeqNo 1, ...
Trade Message: SeqNo 2, ...
Order Message: SeqNo 1, ...
```

---

# 8. Class: `OrderbookBuilder`

## Purpose

`OrderbookBuilder` is the engine of the library.

It takes parsed messages from:

- `MessageCacheReader`
- `StreamingBinaryLoader`

Then it applies order book business logic.

It updates the internal order book and can return:

- best bid
- best ask
- spread
- mid price
- bid levels
- ask levels
- CSV snapshot row

---

## 8.1 Create an order book builder

```python
from fastreader import OrderbookBuilder

builder = OrderbookBuilder()
```

### Internal logic

Rust creates:

```rust
Self {
    manager: OrderBookManager::new(),
    allowed_message_types: None,
}
```

Meaning:

- a new empty `OrderBookManager` is created
- no message filter is active
- all recognized messages are allowed

---

## 8.2 `apply_filter(logic_criteria=None)`

```python
builder.apply_filter(["N", "M", "X"])
```

### Purpose

Filters which message types should be processed by the builder.

### Examples

Process only new order messages:

```python
builder.apply_filter(["N"])
```

Process order-related messages:

```python
builder.apply_filter(["N", "M", "X"])
```

Process only trade messages:

```python
builder.apply_filter(["T"])
```

Clear filter and process all messages:

```python
builder.apply_filter(None)
```

### Internal logic

```rust
self.allowed_message_types = logic_criteria.map(|items| {
    items
        .into_iter()
        .filter_map(|item| item.as_bytes().first().copied())
        .collect()
});
```

### Important behavior

Only the first byte of each string is used, and empty strings are dropped (`first()` returns `None` for them).

So this:

```python
builder.apply_filter(["New", "Trade"])
```

is interpreted as:

```python
builder.apply_filter(["N", "T"])
```
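The normalization performed by `apply_filter()` can be mimicked in pure Python to preview which type bytes a given argument would produce. This is an illustrative sketch: `normalize_filter` is a hypothetical helper, and using a `set` for the result is an assumption, since the Rust snippet does not show the collection type.

```python
def normalize_filter(criteria):
    """Mimic apply_filter(): keep the first byte of each non-empty string."""
    if criteria is None:
        return None  # no filter: every recognized message type is allowed
    # encode() gives the UTF-8 bytes; index 0 is the first byte as an int
    return {item.encode("utf-8")[0] for item in criteria if item}


allowed = normalize_filter(["New", "Trade", ""])
print(sorted(chr(b) for b in allowed))  # ['N', 'T']
```

Note that the bytes are taken as-is, so a lowercase `"new"` yields `n` rather than `N`. Whether that matches any message depends on how the builder compares the bytes.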

---

## 8.3 `build_from_list(reader)`

```python
processed = builder.build_from_list(reader)
```

### Purpose

Builds the order book from a `MessageCacheReader`.

This means the messages are already loaded in RAM.

### Internal logic

For every cached message:

1. Check whether it is an order message or a trade message.
2. Read the message type.
3. Apply the filter, if one is set.
4. Send order messages to:

```rust
self.manager.process_order_message(order_packet);
```

5. Send trade messages to:

```rust
self.manager.process_trade_message(trade_packet);
```

6. Increase processed count.

### Example

```python
from fastreader import MessageCacheReader, OrderbookBuilder

reader = MessageCacheReader()
reader.load_to_cache("data/orders_trades.bin")

builder = OrderbookBuilder()

processed = builder.build_from_list(reader)

print("Processed:", processed)
```

### Expected output

```text
Processed: 152340
```

---

## 8.4 `build_from_source(source, limit=None)`

```python
processed = builder.build_from_source(source)
```

### Purpose

`build_from_source()` is the generic build method.

It accepts either:

- `MessageCacheReader`
- `StreamingBinaryLoader`

This is the most flexible method.

---

## Example with `MessageCacheReader`

```python
from fastreader import MessageCacheReader, OrderbookBuilder

reader = MessageCacheReader()
reader.load_to_cache("data/orders_trades.bin")

builder = OrderbookBuilder()

processed = builder.build_from_source(reader)

print("Processed:", processed)
```

### Expected output

```text
Processed: 152340
```

---

## Example with `StreamingBinaryLoader`

```python
from fastreader import StreamingBinaryLoader, OrderbookBuilder

stream = StreamingBinaryLoader()
stream.open_stream("data/orders_trades.bin")

builder = OrderbookBuilder()

processed = builder.build_from_source(stream)

print("Processed:", processed)
```

### Expected output

```text
Processed: 152340
```

---

## Example with limit

```python
from fastreader import StreamingBinaryLoader, OrderbookBuilder

stream = StreamingBinaryLoader()
stream.open_stream("data/orders_trades.bin")

builder = OrderbookBuilder()

processed = builder.build_from_source(stream, limit=10000)

print("Processed:", processed)
```

### Expected output

```text
Processed: 10000
```

---

## 8.5 `get_snapshot(token, levels=None)`

```python
snapshot = builder.get_snapshot(token=26000, levels=5)
```

### Purpose

Returns the current order book state for one token.

If `levels` is not provided, the default depth is 5.

### Example

```python
from fastreader import MessageCacheReader, OrderbookBuilder

reader = MessageCacheReader()
reader.load_to_cache("data/orders_trades.bin")

builder = OrderbookBuilder()
builder.build_from_list(reader)

snapshot = builder.get_snapshot(token=26000, levels=5)

print(snapshot)
```

### Expected output when token exists

```python
{
    "token": 26000,
    "found": True,
    "mid_price": 2250050,
    "best_bid": (2250000, 150),
    "best_ask": (2250100, 75),
    "spread": 100,
    "bids": [
        (2250000, 150),
        (2249900, 200),
        (2249800, 100),
        (2249700, 50),
        (2249600, 25)
    ],
    "asks": [
        (2250100, 75),
        (2250200, 125),
        (2250300, 100),
        (2250400, 80),
        (2250500, 60)
    ]
}
```

### Expected output when token does not exist

```python
{
    "token": 999999,
    "found": False,
    "mid_price": 0,
    "best_bid": None,
    "best_ask": None,
    "spread": None,
    "bids": [],
    "asks": []
}
```
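The `spread` and `mid_price` fields in the sample output are consistent with the top bid and ask levels. The sketch below recomputes them from `bids` and `asks`. It assumes both lists are sorted best-first, as in the sample, and that the mid price is an integer average. `top_of_book` is a hypothetical helper, not a library function.

```python
def top_of_book(snapshot: dict):
    """Return (spread, mid) from a snapshot dict, or None if a side is empty."""
    if not snapshot["bids"] or not snapshot["asks"]:
        return None
    bid_price, _bid_qty = snapshot["bids"][0]
    ask_price, _ask_qty = snapshot["asks"][0]
    spread = ask_price - bid_price
    mid = (bid_price + ask_price) // 2  # integer mid, matching the sample output
    return spread, mid


sample_snapshot = {
    "token": 26000,
    "found": True,
    "bids": [(2250000, 150), (2249900, 200)],
    "asks": [(2250100, 75), (2250200, 125)],
}
empty_snapshot = {"token": 999999, "found": False, "bids": [], "asks": []}

print(top_of_book(sample_snapshot))  # (100, 2250050)
print(top_of_book(empty_snapshot))   # None
```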

---

## 8.6 `snapshot_header()`

```python
header = builder.snapshot_header()
print(header)
```

### Purpose

Returns the CSV header for a 5-level snapshot row.

### Example output

```text
local_ts,exch_ts,mid_price,bid_price_0,bid_qty_0,ask_price_0,ask_qty_0,bid_price_1,bid_qty_1,ask_price_1,ask_qty_1,bid_price_2,bid_qty_2,ask_price_2,ask_qty_2,bid_price_3,bid_qty_3,ask_price_3,ask_qty_3,bid_price_4,bid_qty_4,ask_price_4,ask_qty_4
```

### Important note

In the current code, `snapshot_header()` is fixed for 5 levels.

---

## 8.7 `get_snapshot_row(token, levels=None)`

```python
row = builder.get_snapshot_row(token=26000, levels=5)
print(row)
```

### Purpose

Returns a CSV-style row for the selected token.

### Important internal behavior

The current Rust implementation sets:

```rust
let local_ts = 0u64;
let exch_ts = 0u64;
```

So the snapshot row currently starts with:

```text
0,0
```

### Example output

```text
0,0,2250050,2250000,150,2250100,75,2249900,200,2250200,125,2249800,100,2250300,100,2249700,50,2250400,80,2249600,25,2250500,60
```
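A snapshot row is easier to work with once it is paired with its column names. The following sketch uses the standard `csv` module on the header and row strings shown in this document. It assumes every field is an integer, as in the sample.

```python
import csv
import io

header = (
    "local_ts,exch_ts,mid_price,"
    "bid_price_0,bid_qty_0,ask_price_0,ask_qty_0,"
    "bid_price_1,bid_qty_1,ask_price_1,ask_qty_1,"
    "bid_price_2,bid_qty_2,ask_price_2,ask_qty_2,"
    "bid_price_3,bid_qty_3,ask_price_3,ask_qty_3,"
    "bid_price_4,bid_qty_4,ask_price_4,ask_qty_4"
)
row = (
    "0,0,2250050,2250000,150,2250100,75,2249900,200,2250200,125,"
    "2249800,100,2250300,100,2249700,50,2250400,80,2249600,25,2250500,60"
)

# DictReader pairs each value with the column name from the header line.
reader = csv.DictReader(io.StringIO(header + "\n" + row + "\n"))
record = {name: int(value) for name, value in next(reader).items()}

print(len(record))                                   # 23
print(record["bid_price_0"], record["ask_price_0"])  # 2250000 2250100
```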

---

# 9. Complete examples

## 9.1 Read full file into memory

```python
from fastreader import MessageCacheReader

reader = MessageCacheReader()

count = reader.load_to_cache("data/orders_trades.bin")

print("Loaded:", count)

summary = reader.get_cache_summary()
print("Summary:", summary)

messages = reader.get_all_messages()

for msg in messages[:3]:
    print(msg)
```

### Expected output

```text
Loaded: 152340
Summary: {'file_source': 'data/orders_trades.bin', 'total_messages': 152340, 'total_orders': 120000, 'total_trades': 32340, 'memory_usage_bytes': 7312320}
Order Message: SeqNo 1, MsgLen 40, MsgType 'N', ExchTs ..., LocalTs ..., OrderId ..., Token 26000, Side 'B', Price 2250000, Quantity 75, Missed 0
Trade Message: SeqNo 2, MsgLen 48, MsgType 'T', ExchTs ..., LocalTs ..., BuyOrderId ..., SellOrderId ..., Token 26000, Price 2250100, Quantity 50, Missed 0
Order Message: SeqNo 3, MsgLen 40, MsgType 'M', ExchTs ..., LocalTs ..., OrderId ..., Token 26000, Side 'S', Price 2250200, Quantity 100, Missed 0
```

---

## 9.2 Stream file message by message

```python
from fastreader import StreamingBinaryLoader

stream = StreamingBinaryLoader()

count = stream.open_stream("data/orders_trades.bin")

print("Total messages:", count)

for _ in range(5):
    print(stream.get_next_message())
```

### Expected output

```text
Total messages: 152340
Order Message: SeqNo 1, ...
Trade Message: SeqNo 2, ...
Order Message: SeqNo 3, ...
Order Message: SeqNo 4, ...
Trade Message: SeqNo 5, ...
```

---

## 9.3 Build order book from cached reader

```python
from fastreader import MessageCacheReader, OrderbookBuilder

reader = MessageCacheReader()
reader.load_to_cache("data/orders_trades.bin")

builder = OrderbookBuilder()
processed = builder.build_from_list(reader)

print("Processed:", processed)

snapshot = builder.get_snapshot(26000, levels=5)

print(snapshot)
```

### Expected output

```text
Processed: 152340
{
    'token': 26000,
    'found': True,
    'mid_price': 2250050,
    'best_bid': (2250000, 150),
    'best_ask': (2250100, 75),
    'spread': 100,
    'bids': [(2250000, 150), (2249900, 200), ...],
    'asks': [(2250100, 75), (2250200, 125), ...]
}
```

---

## 9.4 Build order book from streaming reader

```python
from fastreader import StreamingBinaryLoader, OrderbookBuilder

stream = StreamingBinaryLoader()
stream.open_stream("data/orders_trades.bin")

builder = OrderbookBuilder()

processed = builder.build_from_source(stream)

print("Processed:", processed)

snapshot = builder.get_snapshot(26000, levels=5)

print(snapshot)
```

### Expected output

```text
Processed: 152340
{
    'token': 26000,
    'found': True,
    'mid_price': 2250050,
    'best_bid': (2250000, 150),
    'best_ask': (2250100, 75),
    'spread': 100,
    'bids': [(2250000, 150), (2249900, 200), ...],
    'asks': [(2250100, 75), (2250200, 125), ...]
}
```

---

## 9.5 Export snapshot to CSV

```python
from fastreader import MessageCacheReader, OrderbookBuilder

reader = MessageCacheReader()
reader.load_to_cache("data/orders_trades.bin")

builder = OrderbookBuilder()
builder.build_from_list(reader)

header = builder.snapshot_header()
row = builder.get_snapshot_row(26000, levels=5)

with open("snapshot.csv", "w") as f:
    f.write(header + "\n")
    f.write(row + "\n")
```

### Expected `snapshot.csv`

```csv
local_ts,exch_ts,mid_price,bid_price_0,bid_qty_0,ask_price_0,ask_qty_0,bid_price_1,bid_qty_1,ask_price_1,ask_qty_1,bid_price_2,bid_qty_2,ask_price_2,ask_qty_2,bid_price_3,bid_qty_3,ask_price_3,ask_qty_3,bid_price_4,bid_qty_4,ask_price_4,ask_qty_4
0,0,2250050,2250000,150,2250100,75,2249900,200,2250200,125,2249800,100,2250300,100,2249700,50,2250400,80,2249600,25,2250500,60
```

---

# 10. Internal Rust logic explanation

## 10.1 `format_message(message)`

### Purpose

Converts a parsed Rust `Message` enum into a readable string.

### Why it is needed

Python code cannot directly read Rust structs or raw binary packet fields.

So `format_message()` turns each parsed message into a human-readable string.

### Important Rust detail

The code uses:

```rust
std::ptr::addr_of!(field).read_unaligned()
```

### Why this is important

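Binary packet structs can be packed.

Fields of a packed struct may not sit at naturally aligned memory addresses.

Creating a reference to such a field is undefined behavior, and formatting macros take references to their arguments, so the fields cannot be passed to them directly.

`addr_of!` obtains a raw pointer to the field without creating a reference, and `read_unaligned()` then copies the value out even if it is not stored at a naturally aligned memory address.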

---

## 10.2 `parse_order_packet(bytes)`

### Purpose

Converts raw bytes into an `OrderPacket`.

### Internal logic

```rust
let mut packet: OrderPacket = unsafe {
    std::ptr::read_unaligned(bytes.as_ptr() as *const _)
};
```

This copies raw bytes directly into the Rust packet struct.

Then endian conversion is applied:

```rust
packet.hdr.msg_len = u16::from_le(packet.hdr.msg_len);
packet.hdr.stream_id = u16::from_le(packet.hdr.stream_id);
packet.hdr.seq_no = u32::from_le(packet.hdr.seq_no);
packet.ord.exch_ts = u64::from_le(packet.ord.exch_ts);
packet.ord.order_id = u64::from_le(packet.ord.order_id);
packet.ord.token = u32::from_le(packet.ord.token);
packet.ord.price = u32::from_le(packet.ord.price);
packet.ord.quantity = u32::from_le(packet.ord.quantity);
packet.local_ts = u64::from_le(packet.local_ts);
```

### Why endian conversion is needed

Market-data files usually store numeric values in a specific byte order.

This code assumes little-endian format.

Without `from_le()`, values like price, token, quantity, and timestamp may become incorrect.

Example:

Correct value:

```text
2250000
```

Read with the wrong byte order, the same four bytes become:

```text
274014720
```
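The effect of byte order can be reproduced with Python's standard `struct` module. This sketch packs the example price as a little-endian unsigned 32-bit integer, matching the `u32::from_le` conversion applied to order prices, and then reads the same bytes back both ways.

```python
import struct

price = 2250000
raw = struct.pack("<I", price)            # four bytes, little-endian u32

as_little = struct.unpack("<I", raw)[0]   # correct interpretation
as_big = struct.unpack(">I", raw)[0]      # same bytes, wrong byte order

print(raw.hex())    # 10552200
print(as_little)    # 2250000
print(as_big)       # 274014720
```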

---

## 10.3 `parse_trade_packet(bytes)`

### Purpose

Converts raw bytes into a `TradePacket`.

It performs the same kind of work as `parse_order_packet()`, but for trade fields.

Trade-specific fields include:

- buy order ID
- sell order ID
- token
- trade price
- trade quantity
- local timestamp

### Important detail

Trade fields such as token, price, and quantity are converted using signed integer conversion:

```rust
packet.trd.token = i32::from_le(packet.trd.token);
packet.trd.trade_price = i32::from_le(packet.trd.trade_price);
packet.trd.trade_quantity = i32::from_le(packet.trd.trade_quantity);
```

Order fields use unsigned integer conversion.

---

## 10.4 `read_next_message_from_file(file)`

### Purpose

Reads one valid message from the binary file.

### Step-by-step logic

1. Read the first byte.
2. If first byte is a space, skip it.
3. Read enough bytes for `PeekStructure`.
4. Inspect the message type.
5. If message type is `T`, read a `TradePacket`.
6. If message type is `N`, `M`, or `X`, read an `OrderPacket`.
7. Parse the packet.
8. Return `Some(Message)`.
9. If file ends, return `None`.
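The control flow of the steps above can be sketched in Python on a toy format. This is not the real packet layout: here the type byte comes first and the payload sizes in `TOY_SIZES` are invented for illustration, whereas the Rust code peeks a `PeekStructure` and reads real `OrderPacket` or `TradePacket` sizes. Treating a truncated packet as end of stream is also an assumption of this sketch.

```python
import io

# Invented payload sizes for this sketch only.
TOY_SIZES = {b"T": 8, b"N": 4, b"M": 4, b"X": 4}


def read_next_toy_message(f):
    """Return (msg_type, payload), or None at end of stream or unknown type."""
    while True:
        first = f.read(1)
        if not first:
            return None              # end of file
        if first == b" ":
            continue                 # skip padding space
        size = TOY_SIZES.get(first)
        if size is None:
            return None              # unknown type behaves like end of stream
        payload = f.read(size)
        if len(payload) < size:
            return None              # truncated packet (assumption)
        return first.decode("ascii"), payload


def count_toy_messages(f) -> int:
    """Count messages by reading until read_next_toy_message() returns None."""
    count = 0
    while read_next_toy_message(f) is not None:
        count += 1
    return count


# Two padding spaces, an order, a trade, an unknown 'Q', then another order.
demo = b"  N" + bytes(4) + b"T" + bytes(8) + b"Q" + b"N" + bytes(4)
print(count_toy_messages(io.BytesIO(demo)))  # 2: reading stops at the unknown 'Q'
```

The final order after `Q` is never seen, which is the same behavior described for unknown message types in the limitations section.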

### Why `PeekStructure` is needed

Before reading the full packet, the reader must know whether the packet is an order packet or trade packet.

`PeekStructure` helps inspect the message type first.

Then the reader knows which struct size to read.

---

## 10.5 `count_messages_in_file(path)`

### Purpose

Counts valid messages in the binary file.

### Internal logic

```rust
let mut file = File::open(path)?;
let mut count = 0usize;

while let Some(_msg) = read_next_message_from_file(&mut file)? {
    count += 1;
}

Ok(count)
```

It repeatedly calls `read_next_message_from_file()` until no more messages are found.

---

# 11. Recommended workflows

## Workflow 1: Debug file contents

```python
from fastreader import MessageCacheReader

reader = MessageCacheReader()
reader.load_to_cache("data/orders_trades.bin")

print(reader.get_cache_summary())

for msg in reader.get_all_messages()[:20]:
    print(msg)
```

Use this when you want to confirm what messages exist in the binary file.

---

## Workflow 2: Process huge file with low memory

```python
from fastreader import StreamingBinaryLoader, OrderbookBuilder

stream = StreamingBinaryLoader()
stream.open_stream("huge_file.bin")

builder = OrderbookBuilder()
processed = builder.build_from_source(stream)

print("Processed:", processed)
```

Use this when the file is too large to load fully into RAM.

---

## Workflow 3: Backtest one token

```python
from fastreader import MessageCacheReader, OrderbookBuilder

reader = MessageCacheReader()
reader.load_to_cache("day_2026_05_15.bin")

builder = OrderbookBuilder()
builder.build_from_list(reader)

snapshot = builder.get_snapshot(token=26000, levels=5)

if snapshot["found"]:
    print("Best bid:", snapshot["best_bid"])
    print("Best ask:", snapshot["best_ask"])
    print("Spread:", snapshot["spread"])
else:
    print("Token not found")
```

---

## Workflow 4: Process only selected message types

```python
from fastreader import MessageCacheReader, OrderbookBuilder

reader = MessageCacheReader()
reader.load_to_cache("data/orders_trades.bin")

builder = OrderbookBuilder()

builder.apply_filter(["N", "M", "X"])

processed = builder.build_from_list(reader)

print("Processed order messages:", processed)
```

---

# 12. Performance notes

## `MessageCacheReader`

### Advantages

- fast repeated access
- easy debugging
- useful for backtesting
- all messages available immediately after loading

### Disadvantages

- high RAM usage
- not ideal for extremely large files

---

## `StreamingBinaryLoader`

### Advantages

- low memory usage
- suitable for huge files
- can process only first N messages
- good for production pipelines

### Disadvantages

- repeated analysis requires re-reading from disk
- random access is not available
- only moves forward unless `reset_cursor()` is called

---

## `OrderbookBuilder`

### Advantages

- works with both reader types
- centralizes order book logic
- supports message filtering
- returns Python dictionary snapshots
- returns CSV-style rows

### Disadvantages

- current snapshot timestamps are fixed at `0`
- current snapshot header is hardcoded for 5 levels

---

# 13. Current limitations

## 13.1 Snapshot timestamps are currently zero

In `get_snapshot_row()`:

```rust
let local_ts = 0u64;
let exch_ts = 0u64;
```

So CSV rows currently start with:

```text
0,0
```

If real timestamps are required, the builder should track the latest processed exchange timestamp and local timestamp.
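Until the builder tracks timestamps itself, a caller could stamp rows on the Python side. The sketch below is one possible workaround, not library behavior: `stamp_row` is a hypothetical helper, and the timestamps are assumed to come from the caller (for example, from the last message it processed). It relies on the column order `local_ts,exch_ts,...` given by `snapshot_header()`.

```python
def stamp_row(row: str, local_ts: int, exch_ts: int) -> str:
    """Replace the first two fields (local_ts, exch_ts) of a snapshot row."""
    fields = row.split(",")
    fields[0] = str(local_ts)
    fields[1] = str(exch_ts)
    return ",".join(fields)


row = "0,0,2250050,2250000,150,2250100,75"
stamped = stamp_row(row, local_ts=1710000000000000100, exch_ts=1710000000000000000)
print(stamped)
```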

---

## 13.2 CSV header is fixed for 5 levels

`snapshot_header()` currently returns columns for exactly 5 levels.

If you generate 10 levels using:

```python
builder.get_snapshot_row(token=26000, levels=10)
```

then the row may contain more fields than the 23 columns named in the fixed 5-level header.
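A header for an arbitrary depth can be built in Python by following the column naming pattern of the fixed header. This is a sketch under the assumption that deeper rows keep the same per-level order (`bid_price`, `bid_qty`, `ask_price`, `ask_qty`). `snapshot_header_for` is a hypothetical helper, not part of the library.

```python
def snapshot_header_for(levels: int) -> str:
    """Build a CSV header with the same column pattern for any depth."""
    columns = ["local_ts", "exch_ts", "mid_price"]
    for i in range(levels):
        columns += [
            f"bid_price_{i}", f"bid_qty_{i}",
            f"ask_price_{i}", f"ask_qty_{i}",
        ]
    return ",".join(columns)


print(len(snapshot_header_for(5).split(",")))   # 23
print(len(snapshot_header_for(10).split(",")))  # 43
```

Comparing the field count of a row with the column count of the header before writing is a cheap way to catch a mismatch.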

---

## 13.3 Unknown message types stop reading

The current logic does this:

```rust
_ => return Ok(None),
```

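That means an unknown message type behaves like end of stream: reading stops at the first unknown packet, and every message after it is silently ignored, including in the count returned by `count_messages_in_file()`.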

If your binary file contains other packet types, this behavior should be changed to skip unknown packets safely.

---

## 13.4 `apply_filter()` uses only first character

This:

```python
builder.apply_filter(["New", "Trade"])
```

is treated as:

```python
builder.apply_filter(["N", "T"])
```

Only the first byte of each string is used.

---

# 14. Error handling

Rust errors are converted into Python exceptions.

## Missing file example

```python
reader = MessageCacheReader()
reader.load_to_cache("missing.bin")
```

Possible output:

```text
RuntimeError: No such file or directory
```

## Wrong source type example

```python
builder = OrderbookBuilder()
builder.build_from_source("wrong object")
```

Expected output:

```text
TypeError: build_from_source expects MessageCacheReader or StreamingBinaryLoader
```

---

# 15. Developer mental model

Think of this library in three layers.

---

## Layer 1: Binary parser

Responsible for:

- opening binary files
- reading bytes
- identifying message type
- parsing order packets
- parsing trade packets
- converting endian format

Important functions:

```rust
parse_order_packet()
parse_trade_packet()
read_next_message_from_file()
count_messages_in_file()
```

---

## Layer 2: Reader API

Responsible for exposing file-reading behavior to Python.

Classes:

```python
MessageCacheReader
StreamingBinaryLoader
```

`MessageCacheReader` loads everything.

`StreamingBinaryLoader` reads one message at a time.

---

## Layer 3: Order book engine

Responsible for:

- filtering messages
- processing order messages
- processing trade messages
- maintaining book state
- returning snapshots

Class:

```python
OrderbookBuilder
```

Internally it uses:

```rust
OrderBookManager::new()
process_order_message()
process_trade_message()
get_top_levels()
```

---

# 16. Quick reference

## `MessageCacheReader`

```python
reader = MessageCacheReader()

count = reader.load_to_cache(path)

messages = reader.get_all_messages()

summary = reader.get_cache_summary()
```

---

## `StreamingBinaryLoader`

```python
stream = StreamingBinaryLoader()

count = stream.open_stream(path)

message = stream.get_next_message()

stream.reset_cursor()
```

---

## `OrderbookBuilder`

```python
builder = OrderbookBuilder()

builder.apply_filter(["N", "M", "X", "T"])

processed = builder.build_from_list(reader)

processed = builder.build_from_source(stream, limit=10000)

snapshot = builder.get_snapshot(token, levels=5)

header = builder.snapshot_header()

row = builder.get_snapshot_row(token, levels=5)
```

---

# 17. Full minimal example

```python
from fastreader import MessageCacheReader, OrderbookBuilder

FILE_PATH = "data/orders_trades.bin"
TOKEN = 26000

reader = MessageCacheReader()

loaded = reader.load_to_cache(FILE_PATH)

print("Loaded messages:", loaded)
print("Cache summary:", reader.get_cache_summary())

builder = OrderbookBuilder()

processed = builder.build_from_list(reader)

print("Processed messages:", processed)

snapshot = builder.get_snapshot(TOKEN, levels=5)

if snapshot["found"]:
    print("Token:", snapshot["token"])
    print("Best bid:", snapshot["best_bid"])
    print("Best ask:", snapshot["best_ask"])
    print("Spread:", snapshot["spread"])
    print("Mid price:", snapshot["mid_price"])
    print("Bids:", snapshot["bids"])
    print("Asks:", snapshot["asks"])
else:
    print("No order book found for token:", TOKEN)
```

### Expected output

```text
Loaded messages: 152340
Cache summary: {'file_source': 'data/orders_trades.bin', 'total_messages': 152340, 'total_orders': 120000, 'total_trades': 32340, 'memory_usage_bytes': 7312320}
Processed messages: 152340
Token: 26000
Best bid: (2250000, 150)
Best ask: (2250100, 75)
Spread: 100
Mid price: 2250050
Bids: [(2250000, 150), (2249900, 200), (2249800, 100), (2249700, 50), (2249600, 25)]
Asks: [(2250100, 75), (2250200, 125), (2250300, 100), (2250400, 80), (2250500, 60)]
```

---

# 18. Summary

`fastreader` gives Python users a simple interface over a Rust-powered binary market-data engine.

Use:

- `MessageCacheReader` when you want full-file loading and repeated access.
- `StreamingBinaryLoader` when you want memory-efficient one-by-one reading.
- `OrderbookBuilder` when you want to convert parsed messages into an order book snapshot.

The full design is:

```text
Read binary data
        ↓
Parse order/trade packets
        ↓
Apply filtering
        ↓
Build order book
        ↓
Return Python snapshot or CSV row
```

This makes the library suitable for:

- binary market-data parsing
- order book reconstruction
- market microstructure research
- trading backtests
- production data processing
- PyPI-based Python usage
