Metadata-Version: 2.4
Name: hft-lob
Version: 0.1.8
Summary: Python wrapper for a high-performance Rust orderbook CLI
Home-page: https://github.com/pratima/hft_lob
Author: Pratima Kumari
Author-email: Pratima Kumari <pratimarmoney@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/pratima/hft_lob
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Dynamic: author
Dynamic: home-page
Dynamic: requires-python

# hft-lob

**hft-lob** is a Python library that wraps a high-performance Rust binary to read and decode NSE (National Stock Exchange) binary market feed files and reconstruct a **Limit Order Book (LOB)** for any instrument token. It exposes a small Python API, so you do not need to understand the binary format, build C extensions, or do any extra setup.

---

## How It Works (Architecture)

```
Your Python Code
      |
      v
 hft_lob.cli.Reader          <-- Python class (this library)
      |
      v
 Rust Binary (subprocess)    <-- orderbook-linux-x86_64
      |
      v
 NSE Binary Feed File        <-- e.g., Feed_CM_StreamID_2_29_12_2025.bin
      |
      v
 CSV Output (parsed LOB)     <-- returned back to your Python code
```

The Rust binary is bundled **inside** the Python package. When you call `Reader(...)`, Python internally runs the Rust binary as a subprocess, passes your file path and token, captures the CSV output, and hands it back to you as Python strings. You never have to touch the binary directly.
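
The subprocess hand-off described above can be sketched roughly as follows. This illustrates the general pattern and is not the code in `hft_lob/cli.py`: `run_and_capture` is a hypothetical helper, and a Python one-liner stands in for the Rust binary so the snippet runs on any machine.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command, return its stdout as text, raise RuntimeError on failure."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"command exited with code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


# Stand-in for the bundled binary (assumption): a one-liner that prints a CSV header.
fake_binary = [sys.executable, "-c", "print('local_ts,exch_ts,mid_price')"]
output = run_and_capture(fake_binary)
print(output)
```

Capturing stdout as text and checking the exit code keeps the caller's view simple: it either receives a CSV string or gets an exception.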

---

## Package Structure

```
hft-lob/                        ← Python package root
│
├── hft_lob/                    ← Main Python package (installed into site-packages)
│   ├── __init__.py             ← Package initializer (marks this as a Python package)
│   ├── cli.py                  ← ALL the logic: Reader class + helper functions
│   └── bin/
│       └── orderbook-linux-x86_64   ← Compiled Rust binary (bundled with the package)
│
├── pyproject.toml              ← Package metadata: name, version, dependencies, entry points
├── setup.py                    ← Legacy build config (used by setuptools)
├── README.md                   ← This file
└── dist/                       ← Built packages (generated by `python -m build`)
    ├── hft_lob-0.1.8.tar.gz   ← Source distribution
    └── hft_lob-0.1.8-py3-none-any.whl  ← Wheel (what pip installs)
```

### What each file does

| File | Purpose |
|------|---------|
| `hft_lob/__init__.py` | Marks `hft_lob` as a Python package. Empty by design — import directly from `hft_lob.cli` |
| `hft_lob/cli.py` | Contains `Reader` class, `get_lob_for_token()` function, and `main()` CLI entry point |
| `hft_lob/bin/orderbook-linux-x86_64` | The Rust binary that does the heavy computation. Bundled so users don't install anything extra |
| `pyproject.toml` | Tells pip how to install the package, what Python version is needed, and what CLI command to register |

---

## Installation

```bash
pip install hft-lob
```

**Requirements:**
- Linux x86_64 (the bundled Rust binary is compiled for this platform)
- Python >= 3.7
- No additional dependencies

---

## Full API Reference

### `Reader` class

The main class for reading LOB data from an NSE binary feed file.

```python
from hft_lob.cli import Reader

reader = Reader(file_path, tokens)
```

**Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `file_path` | `str` | Absolute or relative path to the NSE binary feed `.bin` file |
| `tokens` | `int` or `list[int]` | Instrument token or tokens to extract |

**What happens internally when you create a Reader:**
1. Validates that the Rust binary exists inside the installed package
2. Validates that your `file_path` exists
3. For each token, runs the Rust binary as a subprocess
4. Captures all CSV output, strips the header row (stores it in `reader.header`)
5. Stores all data rows in memory as a list of strings
6. Ready for you to read
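
Steps 4 and 5 amount to separating the first CSV line from the rest. A minimal sketch of that idea, assuming the output for a single token starts with exactly one header line; `split_csv_output` is a hypothetical name, and the sample text is made-up illustration data rather than real feed output:

```python
def split_csv_output(raw):
    """Split raw CSV text into (header, rows), skipping blank lines."""
    lines = [line for line in raw.splitlines() if line.strip()]
    if not lines:
        return None, []
    return lines[0], lines[1:]


# Illustrative text only (assumption): a shortened header and two fake rows.
sample = "local_ts,exch_ts,mid_price\n1,1,100.25\n2,2,100.50\n"
header, rows = split_csv_output(sample)
print(header)
print(len(rows))
```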

---

#### `reader.get_all_messages()`

Returns **all LOB messages** at once as a Python list of CSV strings.

```python
messages = reader.get_all_messages()

print(f"Total messages: {len(messages)}")
print(f"First message: {messages[0]}")
print(f"Last message:  {messages[-1]}")
```

**Returns:** `list[str]` — each string is one CSV row of LOB data  
**Side effect:** After this call, `is_end_of_file()` returns `True`

---

#### `reader.get_next_message()`

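Returns **one message at a time**, advancing an internal cursor. Useful for processing rows sequentially. Note that `Reader` has already loaded every row into memory at construction, so this changes how you consume the data, not how much memory is used.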

```python
msg = reader.get_next_message()
if msg is not None:
    print(msg)
```

**Returns:** `str` (next CSV row) or `None` when all messages are exhausted  
**Side effect:** Advances the internal index. Once the last message has been returned, `is_end_of_file()` returns `True`
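
The cursor methods can be wrapped in a generator so that rows are consumed with a plain `for` loop. This is a sketch under assumptions: `iter_messages` is a hypothetical helper, and `FakeReader` is a stand-in that only mimics the documented `get_next_message()` and `is_end_of_file()` behaviour so the snippet runs without a feed file.

```python
def iter_messages(reader):
    """Yield rows from any object exposing the documented cursor methods."""
    while not reader.is_end_of_file():
        msg = reader.get_next_message()
        if msg is None:
            break
        yield msg


class FakeReader:
    """Stand-in (assumption) that mimics the documented cursor behaviour."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._index = 0

    def get_next_message(self):
        if self._index >= len(self._rows):
            return None
        row = self._rows[self._index]
        self._index += 1
        return row

    def is_end_of_file(self):
        return self._index >= len(self._rows)


collected = list(iter_messages(FakeReader(["row-a", "row-b"])))
print(collected)
```

With the real class, the same loop would read `for msg in iter_messages(Reader(path, tokens=1333))`.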

---

#### `reader.is_end_of_file()`

Check whether all messages have been read.

```python
if reader.is_end_of_file():
    print("No more messages")
```

**Returns:** `bool` — `True` if all messages read or no messages exist, `False` otherwise

---

#### `reader.header`

The CSV column header row, automatically extracted from the Rust binary output.

```python
print(reader.header)
# local_ts,exch_ts,mid_price,bid_price_0,bid_qty_0,ask_price_0,ask_qty_0,...
```

---

#### `reader.close()`

Provided for API completeness. It is a no-op: the Rust binary runs as a subprocess, so the reader holds no persistent file handle.

```python
reader.close()  # safe to call, does nothing
```

---

### `get_lob_for_token()` function

Lower-level helper — directly runs the Rust binary and returns the raw CSV output as a single string.

```python
from hft_lob.cli import get_lob_for_token

raw = get_lob_for_token("/path/to/feed.bin", 1333)
print(raw)  # full CSV output including header
```

**Parameters:** `file_path (str)`, `token (int)`  
**Returns:** `str` — full CSV output from Rust binary  
**Raises:** `FileNotFoundError` if the bundled binary or the feed file is not found; `RuntimeError` if the binary exits with an error

---

## CSV Output Format

Each message row has **23 comma-separated fields**:

```
local_ts, exch_ts, mid_price,
bid_price_0, bid_qty_0, ask_price_0, ask_qty_0,
bid_price_1, bid_qty_1, ask_price_1, ask_qty_1,
bid_price_2, bid_qty_2, ask_price_2, ask_qty_2,
bid_price_3, bid_qty_3, ask_price_3, ask_qty_3,
bid_price_4, bid_qty_4, ask_price_4, ask_qty_4
```

| Field | Description |
|-------|-------------|
| `local_ts` | Local timestamp (nanoseconds) when the message was received |
| `exch_ts` | Exchange timestamp (nanoseconds) from NSE |
| `mid_price` | Mid price = (best bid + best ask) / 2 |
| `bid_price_N` | Bid price at depth level N (0 = best bid) |
| `bid_qty_N` | Total quantity available at bid level N |
| `ask_price_N` | Ask price at depth level N (0 = best ask) |
| `ask_qty_N` | Total quantity available at ask level N |

Levels 0–4 give you **5 levels of order book depth** on each side.
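
The column layout above can be turned into a small row parser. This is a sketch, not part of the library: `COLUMNS`, `parse_row`, and `top_of_book` are hypothetical names, the sample row contains made-up values, and treating prices as floats is an assumption about how the binary prints them.

```python
# Rebuild the 23 documented column names in their documented order.
COLUMNS = ["local_ts", "exch_ts", "mid_price"]
for level in range(5):
    COLUMNS += [
        f"bid_price_{level}",
        f"bid_qty_{level}",
        f"ask_price_{level}",
        f"ask_qty_{level}",
    ]


def parse_row(row):
    """Map one CSV row onto the documented column names (values stay strings)."""
    fields = [field.strip() for field in row.split(",")]
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return dict(zip(COLUMNS, fields))


def top_of_book(record):
    """Compute mid price and spread from level 0 (assumes float-parsable prices)."""
    bid = float(record["bid_price_0"])
    ask = float(record["ask_price_0"])
    return {"mid": (bid + ask) / 2, "spread": ask - bid}


# Made-up row for illustration: level 0 filled in, deeper levels zeroed.
sample_row = "1000,900,100.25,100.0,10,100.5,20," + ",".join(["0"] * 16)
record = parse_row(sample_row)
print(top_of_book(record))
```

Checking the field count up front catches truncated rows before they are silently zipped into a short dict.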

---

## Usage Examples

### Example 1: Read all messages, parse with split

```python
from hft_lob.cli import Reader

reader = Reader("/path/to/Feed_CM_StreamID_2_29_12_2025.bin", tokens=1333)

messages = reader.get_all_messages()
columns = reader.header.split(",")

print(f"Columns: {columns}")
print(f"Total messages: {len(messages)}")

# Parse first message into a dict
first = dict(zip(columns, messages[0].split(",")))
print(f"Best bid: {first['bid_price_0']} x {first['bid_qty_0']}")
print(f"Best ask: {first['ask_price_0']} x {first['ask_qty_0']}")
```

### Example 2: Stream messages one by one

```python
from hft_lob.cli import Reader

reader = Reader("/path/to/feed.bin", tokens=1333)

while not reader.is_end_of_file():
    msg = reader.get_next_message()
    if msg:
        fields = msg.split(",")
        mid_price = fields[2]
        print(f"Mid price: {mid_price}")
```

### Example 3: Multiple tokens in one Reader

```python
from hft_lob.cli import Reader

# Read data for 3 instruments in one call
reader = Reader("/path/to/feed.bin", tokens=[1333, 2885, 5900])

all_msgs = reader.get_all_messages()
print(f"Combined messages across all tokens: {len(all_msgs)}")
```
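
The documented CSV columns include no token field, so rows returned for several tokens may be hard to attribute afterwards. One possible workaround, sketched here under that assumption, is to build one `Reader` per token and key the results by token. `read_per_token` and its `reader_cls` parameter are hypothetical and exist only for this example.

```python
def read_per_token(file_path, tokens, reader_cls=None):
    """Return {token: (header, rows)} using one reader per token."""
    if reader_cls is None:
        # Imported lazily so the helper can also be tried with a stand-in class.
        from hft_lob.cli import Reader as reader_cls
    result = {}
    for token in tokens:
        reader = reader_cls(file_path, tokens=token)
        result[token] = (reader.header, reader.get_all_messages())
    return result
```

Calling `read_per_token("/path/to/feed.bin", [1333, 2885, 5900])` would run the binary once per token, the same number of runs a multi-token `Reader` performs.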

### Example 4: Load into pandas DataFrame

```python
import io
import pandas as pd
from hft_lob.cli import Reader

reader = Reader("/path/to/feed.bin", tokens=1333)
messages = reader.get_all_messages()

csv_data = reader.header + "\n" + "\n".join(messages)
df = pd.read_csv(io.StringIO(csv_data))

print(df.head())
print(df[["local_ts", "mid_price", "bid_price_0", "ask_price_0"]])
```

---

## CLI Usage

After installation, you can also call the Rust binary directly from the terminal via the registered entry point:

```bash
hft-lob /path/to/feed.bin 1333
```

This prints the raw CSV to stdout, identical to the output of the Rust binary itself.

---

## Platform Support

| Platform | Supported |
|----------|-----------|
| Linux x86_64 | Yes |
| macOS | No (binary not compiled for macOS) |
| Windows | No |

---

## License

MIT
