Metadata-Version: 2.4
Name: hft-lob
Version: 0.1.9
Summary: Python wrapper for a high-performance Rust orderbook CLI
Home-page: https://github.com/pratima/hft_lob
Author: Pratima Kumari
Author-email: Pratima Kumari <pratimarmoney@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/pratima/hft_lob
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Dynamic: author
Dynamic: home-page
Dynamic: requires-python

# hft-lob

**hft-lob** is a Python library that wraps a high-performance Rust binary. It reads and decodes NSE (National Stock Exchange) binary market feed files and reconstructs a **Limit Order Book (LOB)** for any instrument token. The Python API hides the binary format: there are no C extensions to compile and no extra setup.

---

## Features

| # | Feature | Description |
|---|---------|-------------|
| 1 | [**get_all_messages**](#1-get_all_messages) | Load all LOB messages at once into a Python list |
| 2 | [**get_next_message**](#2-get_next_message) | Stream messages one at a time using a cursor |
| 3 | [**is_end_of_file**](#3-is_end_of_file) | Check whether all messages have been read |

---

## Installation

```bash
pip install hft-lob
```

**Requirements:**
- Linux x86_64
- Python >= 3.7
- No additional dependencies

---

## Quick Start

```python
from hft_lob.cli import Reader

# Step 1: Create a Reader with your file and token
reader = Reader("/path/to/feed.bin", tokens=1333)

# Step 2: Use any of the three features
messages = reader.get_all_messages()   # Feature 1
msg      = reader.get_next_message()   # Feature 2
done     = reader.is_end_of_file()     # Feature 3
```

---

## Feature Details

### 1. `get_all_messages`

**What it does:** Runs the Rust binary once, loads every LOB message from the file into memory, and returns them all as a Python list. Best when you want to process the entire dataset at once.

```python
from hft_lob.cli import Reader

reader = Reader("/path/to/feed.bin", tokens=1333)

messages = reader.get_all_messages()

print(f"Total messages: {len(messages)}")   # e.g. 493000
print(f"Header: {reader.header}")            # column names
print(f"First row: {messages[0]}")           # first CSV row
print(f"Last row:  {messages[-1]}")          # last CSV row
```

**Returns:** `list[str]` — each item is one full CSV row of LOB data  
**After calling:** `is_end_of_file()` becomes `True`

---

### 2. `get_next_message`

**What it does:** Returns one message at a time, advancing an internal cursor on each call. Returns `None` when there are no more messages. Best when you want to process data row by row or stop early.

```python
from hft_lob.cli import Reader

reader = Reader("/path/to/feed.bin", tokens=1333)

# Get just the first message
msg = reader.get_next_message()
print(msg)

# Or loop through all messages one by one
while True:
    msg = reader.get_next_message()
    if msg is None:
        break
    fields = msg.split(",")
    print(f"Mid price: {fields[2]}")
```

**Returns:** `str` (one CSV row) on each call, or `None` when all messages are exhausted  
**After last message:** `is_end_of_file()` becomes `True`
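
The `while True` loop above can also be wrapped in a generator so that callers use a plain `for` loop. The sketch below is one possible way to do that. `iter_messages` and `FakeReader` are hypothetical names that are not part of hft-lob; `FakeReader` only stands in for `Reader` so the example runs without a feed file. The only library behaviour it relies on is that `get_next_message()` returns `None` when exhausted.

```python
from typing import Iterator, Optional


def iter_messages(reader) -> Iterator[str]:
    """Yield CSV rows from a Reader-like object until it returns None."""
    while True:
        msg: Optional[str] = reader.get_next_message()
        if msg is None:
            return
        yield msg


class FakeReader:
    """Hypothetical stand-in for hft_lob.cli.Reader (not part of the library)."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def get_next_message(self):
        # Mirror the documented contract: one row per call, then None.
        if self._pos >= len(self._rows):
            return None
        row = self._rows[self._pos]
        self._pos += 1
        return row


rows = list(iter_messages(FakeReader(["1,2,100.5", "3,4,101.0"])))
print(rows)
```

With a real `Reader`, the same helper would be used as `for msg in iter_messages(reader): ...`, and breaking out of the loop stops reading early.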

---

### 3. `is_end_of_file`

**What it does:** Tells you whether all messages have been read. Returns `True` in two cases — after `get_all_messages()` is called, or after `get_next_message()` reaches the last message. Returns `False` if messages are still available.

```python
from hft_lob.cli import Reader

reader = Reader("/path/to/feed.bin", tokens=1333)

print(reader.is_end_of_file())   # False — nothing read yet

# Drain the remaining messages; get_next_message() returns None when exhausted
while reader.get_next_message() is not None:
    pass

print(reader.is_end_of_file())   # True once all messages have been read
```

**Returns:** `bool` — `True` if finished, `False` if messages remain

---

## How It Works (Architecture)

```
Your Python Code
      |
      v
 hft_lob.cli.Reader          <-- Python class (this library)
      |
      v
 Rust Binary (subprocess)    <-- orderbook-linux-x86_64
      |
      v
 NSE Binary Feed File        <-- e.g., Feed_CM_StreamID_2_29_12_2025.bin
      |
      v
 CSV Output (parsed LOB)     <-- returned back to your Python code
```

The Rust binary is bundled **inside** the Python package. When you call `Reader(...)`, Python internally runs the Rust binary as a subprocess, passes your file path and token, captures the CSV output, and hands it back to you as Python strings. You never have to touch the binary directly.
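
The subprocess step described above can be sketched with the standard library alone. This is an illustration of the general pattern, not the code in `hft_lob/cli.py`. `run_and_split` is a hypothetical name, and because the bundled binary's command-line arguments are not documented here, the demo runs a Python one-liner that prints CSV instead.

```python
import subprocess
import sys
from typing import List, Tuple


def run_and_split(cmd: List[str]) -> Tuple[str, List[str]]:
    """Run a command, capture its CSV stdout, and split header from data rows."""
    result = subprocess.run(
        cmd,
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    lines = [line for line in result.stdout.splitlines() if line.strip()]
    if not lines:
        return "", []
    return lines[0], lines[1:]


# Demo command (an assumption for illustration): a child Python process
# that prints a header row and one data row.
demo_cmd = [
    sys.executable,
    "-c",
    "print('local_ts,exch_ts,mid_price'); print('1,2,100.5')",
]
header, rows = run_and_split(demo_cmd)
print(header)
print(rows)
```

Passing the command as a list rather than a shell string avoids quoting problems with file paths that contain spaces.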

---

## Package Structure

```
hft-lob/                        ← Python package root
│
├── hft_lob/                    ← Main Python package (installed into site-packages)
│   ├── __init__.py             ← Package initializer (marks this as a Python package)
│   ├── cli.py                  ← ALL the logic: Reader class + helper functions
│   └── bin/
│       └── orderbook-linux-x86_64   ← Compiled Rust binary (bundled with the package)
│
├── pyproject.toml              ← Package metadata: name, version, dependencies, entry points
├── setup.py                    ← Legacy build config (used by setuptools)
├── README.md                   ← This file
└── dist/                       ← Built packages (generated by `python -m build`)
    ├── hft_lob-0.1.9.tar.gz   ← Source distribution
    └── hft_lob-0.1.9-py3-none-any.whl  ← Wheel (what pip installs)
```

| File | Purpose |
|------|---------|
| `hft_lob/__init__.py` | Marks `hft_lob` as a Python package. Empty by design — import directly from `hft_lob.cli` |
| `hft_lob/cli.py` | Contains `Reader` class, `get_lob_for_token()` function, and `main()` CLI entry point |
| `hft_lob/bin/orderbook-linux-x86_64` | The Rust binary that does the heavy computation. Bundled so users don't install anything extra |
| `pyproject.toml` | Tells pip how to install the package, what Python version is needed, and what CLI command to register |

---

## Full API Reference

### `Reader(file_path, tokens)`

| Parameter | Type | Description |
|-----------|------|-------------|
| `file_path` | `str` | Path to the NSE binary feed `.bin` file |
| `tokens` | `int` or `list[int]` | Instrument token(s) to extract |

**What happens internally:**
1. Validates the Rust binary exists inside the installed package
2. Validates your `file_path` exists on disk
3. For each token, runs the Rust binary as a subprocess
4. Captures CSV output, strips the header row (stored in `reader.header`)
5. Stores all data rows in memory — ready to read
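
Steps 4 and 5, together with the cursor behaviour of the three methods, can be pictured with a small in-memory model. This is a hypothetical sketch and not the library's source: `InMemoryCursor` is an invented name, and it assumes that `get_all_messages()` returns every row and moves the cursor to the end.

```python
from typing import List, Optional


class InMemoryCursor:
    """Hypothetical model: strip the header, keep rows in memory, track a cursor."""

    def __init__(self, csv_text: str) -> None:
        lines = [ln for ln in csv_text.splitlines() if ln.strip()]
        self.header: str = lines[0] if lines else ""  # first line is the header
        self._rows: List[str] = lines[1:]             # remaining lines are data
        self._pos = 0                                 # index of the next unread row

    def get_all_messages(self) -> List[str]:
        # Assumption: returns all rows and marks everything as read.
        self._pos = len(self._rows)
        return list(self._rows)

    def get_next_message(self) -> Optional[str]:
        if self._pos >= len(self._rows):
            return None
        row = self._rows[self._pos]
        self._pos += 1
        return row

    def is_end_of_file(self) -> bool:
        return self._pos >= len(self._rows)


cursor = InMemoryCursor("local_ts,exch_ts,mid_price\n1,2,100.5\n3,4,101.0\n")
print(cursor.header)
print(cursor.is_end_of_file())
print(cursor.get_next_message())
```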

**Extra attributes and methods:**

| Member | Description |
|--------|-------------|
| `reader.header` | CSV column names as a single comma-separated string, e.g. `local_ts,exch_ts,mid_price,...` |
| `reader.close()` | No-op that is safe to call; there is no file handle to close |

### `get_lob_for_token(file_path, token)`

Lower-level helper that runs the Rust binary directly and returns the raw CSV output as a single string, header row included.

```python
from hft_lob.cli import get_lob_for_token

raw = get_lob_for_token("/path/to/feed.bin", 1333)
print(raw)
```
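
Because the returned string includes the header row, it can be handed to the standard `csv` module without further preparation. The sketch below shows one way to do this. `parse_raw_csv` is a hypothetical helper, and the sample string is made-up data with only three columns that stands in for real output.

```python
import csv
import io
from typing import Dict, List


def parse_raw_csv(raw: str) -> List[Dict[str, str]]:
    """Turn header-plus-rows CSV text into one dict per data row."""
    reader = csv.DictReader(io.StringIO(raw), skipinitialspace=True)
    return [dict(record) for record in reader]


# Made-up sample standing in for the output of get_lob_for_token(...)
sample = "local_ts,exch_ts,mid_price\n1,2,100.5\n3,4,101.0\n"
records = parse_raw_csv(sample)
print(records[0]["mid_price"])
```

All values stay as strings; convert them with `int()` or `float()` where numeric types are needed.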

---

## CSV Output Format

Each message row has **23 comma-separated fields**:

```
local_ts, exch_ts, mid_price,
bid_price_0, bid_qty_0, ask_price_0, ask_qty_0,
bid_price_1, bid_qty_1, ask_price_1, ask_qty_1,
bid_price_2, bid_qty_2, ask_price_2, ask_qty_2,
bid_price_3, bid_qty_3, ask_price_3, ask_qty_3,
bid_price_4, bid_qty_4, ask_price_4, ask_qty_4
```

| Field | Description |
|-------|-------------|
| `local_ts` | Local timestamp (nanoseconds) when the message was received |
| `exch_ts` | Exchange timestamp (nanoseconds) from NSE |
| `mid_price` | (best bid + best ask) / 2 |
| `bid_price_N` | Bid price at depth level N (0 = best bid) |
| `bid_qty_N` | Total quantity at bid level N |
| `ask_price_N` | Ask price at depth level N (0 = best ask) |
| `ask_qty_N` | Total quantity at ask level N |
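
The 23-field layout maps naturally onto a small typed structure. The following is a minimal sketch under stated assumptions: `BookSnapshot` and `parse_row` are hypothetical names, timestamps are assumed to be integers, prices and quantities are parsed as floats, and the sample row is invented.

```python
from dataclasses import dataclass
from typing import List, Tuple

DEPTH = 5  # depth levels 0..4, matching the field list above


@dataclass
class BookSnapshot:
    local_ts: int
    exch_ts: int
    mid_price: float
    bids: List[Tuple[float, float]]  # (price, qty) per level, best bid first
    asks: List[Tuple[float, float]]  # (price, qty) per level, best ask first

    @property
    def spread(self) -> float:
        """Best ask minus best bid."""
        return self.asks[0][0] - self.bids[0][0]


def parse_row(row: str) -> BookSnapshot:
    fields = [f.strip() for f in row.split(",")]
    expected = 3 + 4 * DEPTH  # 3 leading fields + 4 per level = 23
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    bids, asks = [], []
    for level in range(DEPTH):
        base = 3 + 4 * level  # bid_price, bid_qty, ask_price, ask_qty
        bids.append((float(fields[base]), float(fields[base + 1])))
        asks.append((float(fields[base + 2]), float(fields[base + 3])))
    return BookSnapshot(int(fields[0]), int(fields[1]), float(fields[2]), bids, asks)


# Invented sample row with 23 fields
sample_row = (
    "1000,990,100.5,"
    "100.0,10,101.0,20,"
    "99.5,5,101.5,7,"
    "99.0,3,102.0,4,"
    "98.5,2,102.5,1,"
    "98.0,6,103.0,9"
)
snap = parse_row(sample_row)
print(snap.spread)
print(snap.mid_price == (snap.bids[0][0] + snap.asks[0][0]) / 2)
```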

---

## More Examples

### Parse into a dict

```python
from hft_lob.cli import Reader

reader = Reader("/path/to/feed.bin", tokens=1333)
messages = reader.get_all_messages()
columns = reader.header.split(",")

first = dict(zip(columns, messages[0].split(",")))
print(f"Best bid: {first['bid_price_0']} x {first['bid_qty_0']}")
print(f"Best ask: {first['ask_price_0']} x {first['ask_qty_0']}")
```

### Multiple tokens

```python
from hft_lob.cli import Reader

reader = Reader("/path/to/feed.bin", tokens=[1333, 2885, 5900])
all_msgs = reader.get_all_messages()
print(f"Combined messages: {len(all_msgs)}")
```

### Load into pandas

```python
import io

import pandas as pd
from hft_lob.cli import Reader

reader = Reader("/path/to/feed.bin", tokens=1333)
messages = reader.get_all_messages()

df = pd.read_csv(io.StringIO(reader.header + "\n" + "\n".join(messages)))
print(df[["local_ts", "mid_price", "bid_price_0", "ask_price_0"]].head())
```

---

## CLI Usage

```bash
hft-lob /path/to/feed.bin 1333
```

The command takes the feed file path and an instrument token as positional arguments and prints the raw CSV to stdout.

---

## Platform Support

| Platform | Supported |
|----------|-----------|
| Linux x86_64 | Yes |
| macOS | No |
| Windows | No |

---

## License

MIT
