Metadata-Version: 2.3
Name: defistream
Version: 2.7.0
Summary: Python client for the DeFiStream API
Project-URL: Homepage, https://defistream.dev
Project-URL: Documentation, https://docs.defistream.dev
Project-URL: Repository, https://github.com/Eren-Nevin/DeFiStream_PythonClient
Author-email: DeFiStream <support@defistream.dev>
License: MIT
Keywords: api,blockchain,crypto,defi,ethereum,web3
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.9
Requires-Dist: httpx>=0.25.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: pyarrow>=14.0.0
Requires-Dist: pydantic>=2.0.0
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: python-dotenv>=1.0.0; extra == 'dev'
Provides-Extra: polars
Requires-Dist: polars>=0.20.0; extra == 'polars'
Description-Content-Type: text/markdown

# DeFiStream Python Client

Official Python client for the [DeFiStream API](https://defistream.dev).

## Getting an API Key

To use the DeFiStream API, you need to sign up for an account at [defistream.dev](https://defistream.dev) to obtain your API key.

## Installation

```bash
pip install defistream
```

This includes pandas and pyarrow by default for DataFrame support.

With polars support (in addition to pandas):
```bash
pip install "defistream[polars]"
```

## Quick Start

```python
from defistream import DeFiStream

# Initialize client (reads DEFISTREAM_API_KEY from environment if not provided)
client = DeFiStream()

# Or with explicit API key
client = DeFiStream(api_key="dsk_your_api_key")

# Query ERC20 transfers using builder pattern
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_df()
)

print(df.head())
```

## Features

- **Builder pattern**: Fluent query API with chainable methods
- **Aggregate queries**: Bucket events into time or block intervals with summary statistics
- **Type-safe**: Full type hints and Pydantic models
- **Multiple formats**: DataFrame (pandas/polars), CSV, Parquet, JSON
- **Async support**: Native async/await with `AsyncDeFiStream`
- **Chain namespaces**: `client.evm.*`, `client.tron.*`, `client.bitcoin.*`, `client.exchange.*`
- **All protocols**: ERC20, AAVE, Uniswap, Lido, Native tokens, TRC20, Hyperliquid

## Supported Protocols

| Chain | Protocol | Events |
|-------|----------|--------|
| EVM | ERC20 | `transfers` |
| EVM | Native Token | `transfers` |
| EVM | AAVE V3 | `deposits`, `withdrawals`, `borrows`, `repays`, `flashloans`, `liquidations` |
| EVM | Uniswap V3 | `swaps`, `deposits`, `withdrawals`, `collects` |
| EVM | Lido | `deposits`, `withdrawal_requests`, `withdrawals_claimed`, `l2_deposits`, `l2_withdrawal_requests` |
| Tron | TRC20 | `transfers` |
| Tron | Native (TRX) | `transfers` |
| Bitcoin | Native (BTC) | `transfers`, `mined` |
| Exchange | Binance | `raw_trades` (tick data), `ohlcv` (candle data), `book_depth`, `open_interest`, `long_short_ratios`, `funding_rate` |
| Exchange | Hyperliquid | `fills`, `trades`, `ohlcv`, `positions`, `funding`, `transfers`, `vaults`, `sends`, `spot_transfers` |

## Usage Examples

### Builder Pattern

The client uses a fluent builder pattern. The query is only executed when you call a terminal method like `as_df()`, `as_file()`, or `as_dict()`.

```python
from defistream import DeFiStream

client = DeFiStream()

# Build query step by step
query = client.evm.erc20.transfers("USDT")
query = query.network("ETH")
query = query.block_range(21000000, 21010000)
query = query.min_amount(1000)

# Execute and get DataFrame
df = query.as_df()

# Or chain everything
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .min_amount(1000)
    .as_df()
)
```

### ERC20 Transfers

```python
# Get USDT transfers over 10,000 USDT
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .min_amount(10000)
    .as_df()
)

# Query multiple tokens at once (known symbols only, not contract addresses)
df = (
    client.evm.erc20.transfers("USDT", "USDC", "DAI")
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_df()
)

# Or set multiple tokens via chain method
df = (
    client.evm.erc20.transfers()
    .token("USDT", "USDC")
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_df()
)

# Query same token list across networks — skip tokens not on a chain
df = (
    client.evm.erc20.transfers("USDT", "USDC", "BTCB")
    .network("BSC")
    .block_range(48000000, 48010000)
    .ignore_non_existing()
    .as_df()
)

# Filter by sender
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .sender("0x28c6c06298d514db089934071355e5743bf21d60")
    .as_df()
)
```

### AAVE Events

```python
# Get deposits
df = (
    client.evm.aave_v3.deposits()
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_df()
)

# Use a specific market type on ETH (Core, Prime, or EtherFi)
df = (
    client.evm.aave_v3.deposits()
    .network("ETH")
    .block_range(21000000, 21010000)
    .eth_market_type("Prime")
    .as_df()
)
```

### Uniswap Swaps

```python
# Get swaps for WETH/USDC pool with 0.05% fee tier
df = (
    client.evm.uniswap_v3.swaps("WETH", "USDC", 500)
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_df()
)

# Or build with chain methods
df = (
    client.evm.uniswap_v3.swaps()
    .symbol0("WETH")
    .symbol1("USDC")
    .fee(500)
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_df()
)
```

### Native Token Transfers

```python
# Get ETH transfers >= 1 ETH
df = (
    client.evm.native.transfers()
    .network("ETH")
    .block_range(21000000, 21010000)
    .min_amount(1.0)
    .as_df()
)
```

### Tron Transfers

Sender/receiver/involving filters accept **both** `T...` (Base58) and `0x...` (hex).
Response addresses are `T...` by default — call `.address_format("hex")` for the
legacy 0x output.

```python
# TRC20 token transfers (e.g. USDT on Tron) — responses in Base58
df = (
    client.tron.trc20.transfers("USDT")
    .block_range(70000000, 70010000)
    .min_amount(1000)
    .as_df()
)

# Filter by a TRON Base58 wallet — just pass the T-address
df = (
    client.tron.trc20.transfers("USDT")
    .sender("TMuA6YqfCeX8EhbfYEg5y7S4DqzSJireY9")
    .block_range(70000000, 70010000)
    .as_df()
)

# Native TRX transfers
df = (
    client.tron.native.transfers()
    .block_range(70000000, 70010000)
    .min_amount(100)
    .as_df()
)

# Legacy 0x-hex output for back-compat (not recommended for new code)
df = (
    client.tron.native.transfers()
    .address_format("hex")
    .block_range(70000000, 70010000)
    .as_df()
)
```

### Bitcoin Transfers and Mined Payouts

```python
# Native BTC transfers (value flows between addresses — excludes coinbase)
df = (
    client.bitcoin.native.transfers()
    .block_range(880000, 880100)
    .min_amount(1.0)
    .as_df()
)

# Coinbase (block-reward) payouts — no sender field
df = (
    client.bitcoin.native.mined()
    .block_range(928000, 928144)
    .as_df()
)

# Filter by a mining pool's payout address
df = (
    client.bitcoin.native.mined()
    .receiver("bc1qm34lsc65zpw79lxes69zkqmk6ee3ewf0j77s3h")
    .block_range(928000, 928144)
    .as_df()
)

# Daily reward volume (144 blocks ≈ one day on Bitcoin)
df = (
    client.bitcoin.native.mined()
    .block_range(927000, 928000)
    .aggregate(group_by="block_number", period="144")
    .as_df()
)
```

### Hyperliquid Exchange Data

```python
# Hyperliquid trades
df = (
    client.exchange.hyperliquid.trades()
    .tokens("BTC")
    .date_range("2025-01-01", "2025-01-08")
    .as_df()
)

# Hyperliquid funding
df = (
    client.exchange.hyperliquid.funding()
    .tokens("ETH")
    .date_range("2025-01-01", "2025-02-01")
    .as_df()
)

# Hyperliquid fills with multiple tokens
df = (
    client.exchange.hyperliquid.fills()
    .tokens("BTC", "ETH")
    .date_range("2025-01-01", "2025-02-01")
    .as_df()
)

# Hyperliquid position history — carry-forward per-minute position snapshots
# for one or more wallets. At least one of tokens/wallets is required;
# `window` must be >= 5m. Columns: time, wallet, token, side, amount,
# avg_entry, opened_at, mark_price, size, unrealized_pnl, funding, fee,
# exact_avg_price.
df = (
    client.exchange.hyperliquid.position_history()
    .wallets("0x2c8b4450e6e1eb177423841f4435ac917e3b076c")
    .date_range("2025-10-01", "2025-10-02")
    .window("1h")
    .as_df()
)
```

> **Funding sign convention**: in both `funding()` and `position_history()`,
> positive = user received funding, negative = user paid. Matches HL's
> `userFunding.delta.usdc` API. Note that HL's `clearinghouseState.cumFunding.sinceOpen`
> uses the opposite sign — same magnitude, flipped — so cross-checks against
> that field must negate.
>
> `position_history().exact_avg_price` is a boolean: `true` means `avg_entry`
> is derived from a clean open observed within our data window;
> `false` means the wallet was already in a position before HL's fills
> export began (2025-07-27) and has stayed continuously in position since —
> in that case `avg_entry` is approximate (seeded from the first observed
> fill price). `amount`, `side`, `size`, `funding`, and `fee` remain exact
> regardless.
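The negation needed for that cross-check is easy to get backwards. A minimal sketch with synthetic numbers (none of these values come from the API; they only illustrate the sign relationship):

```python
# Synthetic illustration of the sign conventions above.
# funding() reports deltas from the user's perspective:
# positive = received, negative = paid (matches HL's userFunding.delta.usdc).
funding_deltas = [0.12, -0.05, 0.08]   # hypothetical per-event funding values

# HL's clearinghouseState.cumFunding.sinceOpen carries the opposite sign:
cum_funding_since_open = -0.15         # same magnitude, flipped

# A cross-check against that field must therefore negate it:
total_received = sum(funding_deltas)
assert abs(total_received - (-cum_funding_since_open)) < 1e-9
```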

### My Wallets Filter

Use `.my_wallets()` to filter events to only those involving your pre-configured wallet addresses:

```python
# Get USDT transfers involving your wallets
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .my_wallets()
    .as_df()
)

# Works with any chain namespace
df = (
    client.tron.trc20.transfers("USDT")
    .block_range(70000000, 70010000)
    .my_wallets()
    .as_df()
)
```

### Wallet Label Management

Manage your wallet labels with `client.wallets`. Labels and categories are **list columns** (arrays of strings), and the client automatically serializes DataFrames to Parquet for the API.

```python
import pandas as pd
from defistream import DeFiStream

client = DeFiStream()

# Upsert labels — DataFrame with list columns
df = pd.DataFrame([
    {"address": "0x28c6c06298d514db089934071355e5743bf21d60",
     "labels": ["Binance", "Hot Wallet"], "categories": ["exchange", "cex"], "entity": "Binance"},
    {"address": "bc1qxy2kgdygjrsqtzq2n0yrf2493p83kkfjhx0wlh",
     "labels": ["My BTC Wallet"], "categories": ["personal"], "entity": "Me"},
])
client.wallets.upsert(df)  # auto-converts to parquet

# List all your labels
labels_df = client.wallets.list()

# Look up a single address
label = client.wallets.get("0x28c6c06298d514db089934071355e5743bf21d60")

# Delete a label
client.wallets.delete("0x28c6c06298d514db089934071355e5743bf21d60")
```

Addresses can be EVM (`0x...`), Bitcoin (`bc1...`), or Tron hex format. Max 1,000,000 rows per upsert.
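Label sets larger than the row cap need to be split client-side before calling `upsert`. A minimal sketch, assuming `rows` is a list of label dicts; the `chunked` helper and the `MAX_UPSERT_ROWS` constant are illustrative, not part of the client:

```python
# Sketch: split a large row list into upsert-sized batches.
# MAX_UPSERT_ROWS mirrors the documented 1,000,000-row cap.
MAX_UPSERT_ROWS = 1_000_000

def chunked(rows, size=MAX_UPSERT_ROWS):
    """Yield successive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

# Usage (assuming `client` is a DeFiStream and pandas is imported as pd):
#   for batch in chunked(rows):
#       client.wallets.upsert(pd.DataFrame(batch))

# Demonstration with a small cap:
batch_sizes = [len(b) for b in chunked(list(range(10)), size=4)]
```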

### Label & Category Filters

```python
# Get USDT transfers involving Binance wallets
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .involving_label("Binance")
    .as_df()
)

# Get USDT transfers FROM exchanges TO DeFi protocols
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .sender_category("exchange")
    .receiver_category("defi")
    .as_df()
)

# Get AAVE deposits involving exchange addresses
df = (
    client.evm.aave_v3.deposits()
    .network("ETH")
    .block_range(21000000, 21010000)
    .involving_category("exchange")
    .as_df()
)

# Get native ETH transfers FROM Binance or Coinbase (multi-value)
df = (
    client.evm.native.transfers()
    .network("ETH")
    .block_range(21000000, 21010000)
    .sender_label("Binance,Coinbase")
    .as_df()
)
```

### Exclude Filters

Exclude filters let you remove events matching certain addresses, labels, or categories. They can be freely combined with positive filters.

```python
# Get USDT transfers from exchanges, excluding Binance
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .sender_category("exchange")
    .exclude_sender_label("Binance")
    .as_df()
)

# Get all AAVE deposits except those from exchange addresses
df = (
    client.evm.aave_v3.deposits()
    .network("ETH")
    .block_range(21000000, 21010000)
    .exclude_involving_category("exchange")
    .as_df()
)

# Exclude a specific address from results
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .exclude_receiver("0x28c6c06298d514db089934071355e5743bf21d60")
    .as_df()
)
```

### Trade Data (Binance)

Query exchange-sourced tick and OHLCV data from Binance via `client.exchange.binance`. Unlike on-chain events, trade data uses time ranges rather than block ranges.

```python
# Raw tick trades — CSV and Parquet only (no JSON/as_dict)
df = (
    client.exchange.binance.raw_trades()
    .token("BTC")
    .start_time("2024-01-01")
    .end_time("2024-02-01")
    .as_df()
)

# OHLCV candles — all formats supported
df = (
    client.exchange.binance.ohlcv()
    .token("BTC")
    .window("4h")
    .start_time("2024-01-01")
    .end_time("2024-02-01")
    .as_df()
)

# as_dict() also supported for OHLCV
candles = (
    client.exchange.binance.ohlcv()
    .token("ETH")
    .window("1h")
    .start_time("2024-01-01")
    .end_time("2024-01-02")
    .as_dict()
)

# Get a download link
link_info = (
    client.exchange.binance.raw_trades()
    .token("BTC")
    .start_time("2024-01-01")
    .end_time("2024-02-01")
    .as_link("parquet")
)
import polars as pl
df = pl.read_parquet(link_info.link)
```

> **Note:** `raw_trades()` only supports `as_df()`, `as_file()` (CSV/Parquet), and `as_link()`. Calling `as_dict()` or using a `.json` file extension raises a `ValidationError`.
>
> **Time range limit:** Raw trades queries are limited to a maximum range of **7 days**. Exceeding this raises a `ValidationError`. OHLCV queries are limited to **31 days**.

Valid OHLCV window sizes: `1m`, `5m`, `15m`, `30m`, `1h`, `4h`, `1d`.

### Exchange Data (Binance)

Query exchange data including order book depth, open interest, funding rates, and long/short ratios. All endpoints use time ranges and support CSV/Parquet only (no JSON).

```python
# Book depth snapshots (31-day max range)
df = (
    client.exchange.binance.book_depth()
    .token("BTC")
    .time_range("2025-01-01", "2025-02-01")
    .as_df()
)

# Open interest (no range limit)
df = (
    client.exchange.binance.open_interest()
    .token("ETH")
    .time_range("2024-01-01", "2025-01-01")
    .as_df("polars")
)

# Funding rate saved to file
(
    client.exchange.binance.funding_rate()
    .token("BTC")
    .time_range("2024-01-01", "2025-01-01")
    .as_file("funding_rate.parquet")
)

# Long/short ratios as shareable link
link_info = (
    client.exchange.binance.long_short_ratios()
    .token("SOL")
    .time_range("2024-06-01", "2025-01-01")
    .as_link("parquet")
)
```

> **Note:** Exchange data endpoints only support `as_df()`, `as_file()` (CSV/Parquet), and `as_link()`. Calling `as_dict()` raises a `ValidationError`.
>
> **Time range limit:** `book_depth()` is limited to a maximum range of **31 days**. The other endpoints (`open_interest`, `long_short_ratios`, `funding_rate`) have no range limit.

### Aggregate Queries

Use `.aggregate()` to bucket raw events into time or block intervals with summary statistics. All existing filters work before `.aggregate()` is called.

```python
# Aggregate USDT transfers into 2-hour buckets
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21100000)
    .aggregate(group_by="time", period="2h")
    .as_df()
)

# Aggregate by block intervals
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21100000)
    .aggregate(group_by="block", period="100b")
    .as_df()
)

# Combine with filters — large transfers from exchanges, bucketed hourly
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21100000)
    .sender_category("exchange")
    .min_amount(10000)
    .aggregate(group_by="time", period="1h")
    .as_df()
)

# Aggregate Uniswap swaps
df = (
    client.evm.uniswap_v3.swaps("WETH", "USDC", 500)
    .network("ETH")
    .block_range(21000000, 21100000)
    .aggregate(group_by="time", period="1h")
    .as_df()
)

# Tron TRC20 USDT aggregates (hourly buckets) — note group_by="time" or "block_number"
df = (
    client.tron.trc20.transfers("USDT")
    .block_range(80000000, 80001000)
    .aggregate(group_by="time", period="1h")
    .as_df()
)

# Tron native TRX aggregates by 5000-block buckets
df = (
    client.tron.native.transfers()
    .block_range(80000000, 80010000)
    .aggregate(group_by="block_number", period="5000")
    .as_df()
)

# Bitcoin native aggregates (hourly BTC volume)
df = (
    client.bitcoin.native.transfers()
    .block_range(880000, 881000)
    .aggregate(group_by="time", period="1h")
    .as_df()
)
```

Tron and Bitcoin aggregate endpoints return rows of `{time | block_number, token, agg_amount, count}` — `agg_amount` is the SUM of amounts in the bucket. Aggregate queries cost `block_range × 0.5` (min 100) — half the raw event cost.
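The pricing rule above can be restated as a tiny helper; a sketch of the documented formula (the function name is mine, not part of the client):

```python
def aggregate_cost(block_range: int) -> int:
    """Documented aggregate pricing: half the raw block cost, with a floor of 100."""
    return max(int(block_range * 0.5), 100)

# A 100,000-block aggregate costs 50,000 blocks of quota;
# tiny ranges still cost the 100-block minimum.
print(aggregate_cost(100_000))  # 50000
print(aggregate_cost(100))      # 100
```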

You can also discover what aggregate fields are available for a protocol:

```python
schema = client.aggregate_schema("erc20")
print(schema)
```

### Verbose Mode

By default, responses omit metadata fields to reduce payload size. Use `.verbose()` to include all fields:

```python
# Default: compact response (no tx_hash, tx_id, log_index, network, name)
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_df()
)

# Verbose: includes all metadata fields
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .verbose()
    .as_df()
)
```

### Value Enrichment

Use `.with_value()` to enrich events with USD value data. This adds a `value_usd` (amount × price) column to individual events. On aggregate endpoints, it produces an `agg_value_usd` (sum) column.

Supported protocols: AAVE, Uniswap, Lido, ERC20, Native Token.

```python
# Individual events with value data
df = (
    client.evm.aave_v3.deposits()
    .network("ETH")
    .block_range(21000000, 21010000)
    .with_value()
    .as_df()
)
# df now includes 'value_usd' column

# Aggregate with value — adds agg_value_usd column
df = (
    client.evm.aave_v3.deposits()
    .network("ETH")
    .block_range(21000000, 21100000)
    .with_value()
    .aggregate(group_by="time", period="2h")
    .as_df()
)
# df now includes 'agg_value_usd' column
```

### Return as DataFrame

```python
# As pandas DataFrame (default)
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_df()
)

# As polars DataFrame
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_df("polars")
)
```

### Save to File

Format is automatically determined by file extension:

```python
# Save as Parquet (recommended for large datasets)
(
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21100000)
    .as_file("transfers.parquet")
)

# Save as CSV
(
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21100000)
    .as_file("transfers.csv")
)

# Save as JSON
(
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_file("transfers.json")
)
```

### Get Download Link

Get a shareable download link instead of the data directly. Useful for passing to other tools or libraries:

```python
from defistream import DeFiStream

client = DeFiStream()

# Get a download link (CSV format by default)
link_info = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21100000)
    .as_link()
)

print(link_info.filename)  # erc20_transfer_ETH_21000000_21100000.csv
print(link_info.link)      # https://dl.defistream.dev/dh/abc123/...
print(link_info.expiry)    # 2026-02-03 15:30:00
print(link_info.size)      # 1.29 MB

# Get as Parquet link
link_info = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21100000)
    .as_link(format="parquet")
)

# Use with polars (reads directly from URL)
import polars as pl
df = pl.read_parquet(link_info.link)

# Use with pandas
import pandas as pd
df = pd.read_parquet(link_info.link)
```

> **Note:** Links expire after 1 hour. The `as_link()` method only supports `csv` and `parquet` formats.

### Calculate Query Cost

Preview a query's cost (measured in blocks) **before** executing it. No quota is deducted.

```python
# Build a query as usual, then call calculate_cost() instead of as_df()
cost = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .calculate_cost()
)

print(cost.cost)                  # 10000
print(cost.quota_remaining)       # 500000
print(cost.quota_remaining_after) # 490000

# Also works on aggregate queries
cost = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21100000)
    .aggregate(group_by="time", period="1h")
    .calculate_cost()
)
```

### Return as Dictionary (JSON)

For small queries, you can get results as a list of dictionaries:

```python
transfers = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_dict()
)

for transfer in transfers:
    print(f"{transfer['sender']} -> {transfer['receiver']}: {transfer['amount']}")
```

> **Note:** Range limits by format — JSON (`as_dict()`, `as_file("*.json")`) and CSV (`as_file("*.csv")`): max **10,000 blocks or 1 day** per request. Parquet (`as_df()`, `as_file("*.parquet")`): max **100,000 blocks or 7 days** per request (1,000,000 blocks for ARB). For larger ranges, page your calls in weekly windows.
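For block-based queries, the paging advice above amounts to slicing the range into limit-sized windows. A sketch of one way to do it; the `block_windows` helper is illustrative, and only the client calls in the comments are real API (windows here are half-open, so if `block_range` treats its end as inclusive, start each next window at `hi + 1` instead):

```python
# Sketch: page a large block range into parquet-sized windows
# (100,000 blocks each, per the limits above).
def block_windows(start: int, end: int, size: int = 100_000):
    """Yield (lo, hi) sub-ranges covering [start, end)."""
    lo = start
    while lo < end:
        hi = min(lo + size, end)
        yield lo, hi
        lo = hi

# Usage (assuming `client` is a DeFiStream and pandas is imported as pd):
#   frames = [
#       client.evm.erc20.transfers("USDT")
#       .network("ETH").block_range(lo, hi)
#       .as_df()
#       for lo, hi in block_windows(21_000_000, 21_350_000)
#   ]
#   df = pd.concat(frames, ignore_index=True)
```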

### Context Manager

Both sync and async clients support context managers to automatically close connections:

```python
# Sync
with DeFiStream() as client:
    df = (
        client.evm.erc20.transfers("USDT")
        .network("ETH")
        .block_range(21000000, 21010000)
        .as_df()
    )
```

### Async Usage

```python
import asyncio
from defistream import AsyncDeFiStream

async def main():
    async with AsyncDeFiStream() as client:
        df = await (
            client.evm.erc20.transfers("USDT")
            .network("ETH")
            .block_range(21000000, 21010000)
            .as_df()
        )
        print(f"Found {len(df)} transfers")

asyncio.run(main())
```

### List Available Protocols

```python
client = DeFiStream()
protocols = client.protocols()
print(protocols)
# {'evm': ['native', 'erc20', 'aave_v3', 'uniswap_v3', 'lido'],
#  'tron': ['native', 'trc20'],
#  'bitcoin': ['native'],
#  'exchange': ['binance', 'hyperliquid']}
```

## Configuration

### Environment Variables

```bash
export DEFISTREAM_API_KEY=dsk_your_api_key
export DEFISTREAM_BASE_URL=https://api.defistream.dev/v1  # optional
```

```python
from defistream import DeFiStream

# API key from environment
client = DeFiStream()

# Or explicit
client = DeFiStream(api_key="dsk_...", base_url="https://api.defistream.dev/v1")
```

### Timeout and Retries

```python
client = DeFiStream(
    api_key="dsk_...",
    timeout=60.0,  # seconds
    max_retries=3
)
```

## Error Handling

```python
from defistream import DeFiStream
from defistream.exceptions import (
    DeFiStreamError,
    AuthenticationError,
    QuotaExceededError,
    RateLimitError,
    ValidationError
)

client = DeFiStream()

try:
    df = (
        client.evm.erc20.transfers("USDT")
        .network("ETH")
        .block_range(21000000, 21010000)
        .as_df()
    )
except AuthenticationError:
    print("Invalid API key")
except QuotaExceededError as e:
    print(f"Quota exceeded. Remaining: {e.remaining}")
except RateLimitError as e:
    print(f"Rate limited. Retry after: {e.retry_after}s")
except ValidationError as e:
    print(f"Invalid request: {e.message}")
except DeFiStreamError as e:
    print(f"API error: {e}")
```

## Response Headers

Access rate limit and quota information:

```python
df = (
    client.evm.erc20.transfers("USDT")
    .network("ETH")
    .block_range(21000000, 21010000)
    .as_df()
)

# Access response metadata
print(f"Rate limit: {client.last_response.rate_limit}")
print(f"Remaining quota: {client.last_response.quota_remaining}")
print(f"Request cost: {client.last_response.request_cost}")
```

## Builder Methods Reference

### Common Methods (all protocols)

| Method | Description |
|--------|-------------|
| `.network(net)` | Set network (ETH, ARB, BASE, OP, POLYGON, etc.) |
| `.start_block(n)` | Set starting block number |
| `.end_block(n)` | Set ending block number |
| `.block_range(start, end)` | Set both start and end blocks |
| `.start_time(ts)` | Set starting time (ISO format or Unix timestamp) |
| `.end_time(ts)` | Set ending time (ISO format or Unix timestamp) |
| `.time_range(start, end)` | Set both start and end times |
| `.verbose()` | Include all metadata fields |
| `.with_value()` | Enrich events with USD value data (`value_usd` column) |
| `.ignore_non_existing()` | Skip missing tokens in multi-token queries instead of erroring |
| `.my_wallets()` | Filter to events involving your pre-configured wallet addresses |

### Protocol-Specific Parameters

| Method | Protocols | Description |
|--------|-----------|-------------|
| `.token(*symbols)` | ERC20 | Token symbol(s) (USDT, USDC) or contract address. Accepts multiple known symbols for multi-token queries (multi-value). |
| `.sender(*addrs)` | ERC20, Native | Filter by sender address (multi-value) |
| `.receiver(*addrs)` | ERC20, Native | Filter by receiver address (multi-value) |
| `.involving(*addrs)` | All | Filter by any involved address (multi-value) |
| `.from_address(*addrs)` | ERC20, Native | Alias for `.sender()` |
| `.to_address(*addrs)` | ERC20, Native | Alias for `.receiver()` |
| `.min_amount(amt)` | ERC20, Native | Minimum transfer amount |
| `.max_amount(amt)` | ERC20, Native | Maximum transfer amount |
| `.eth_market_type(type)` | AAVE | Market type for ETH: 'Core', 'Prime', 'EtherFi' |
| `.symbol0(sym)` | Uniswap | First token symbol (required) |
| `.symbol1(sym)` | Uniswap | Second token symbol (required) |
| `.fee(tier)` | Uniswap | Fee tier: 100, 500, 3000, 10000 (required) |
| `.window(size)` | Binance OHLCV | Candle window: `1m`, `5m`, `15m`, `30m`, `1h`, `4h`, `1d` |
| `.skip_id(id)` | Binance raw trades | Pagination: skip trades with ID <= id |

### Address Label & Category Filters

Filter events by entity names or categories using the labels database. Available on all protocols.

| Method | Protocols | Description |
|--------|-----------|-------------|
| `.involving_label(label)` | All | Filter where any involved address matches a label substring (e.g., "Binance") |
| `.involving_category(cat)` | All | Filter where any involved address matches a category (e.g., "exchange") |
| `.sender_label(label)` | ERC20, Native | Filter sender by label substring |
| `.sender_category(cat)` | ERC20, Native | Filter sender by category |
| `.receiver_label(label)` | ERC20, Native | Filter receiver by label substring |
| `.receiver_category(cat)` | ERC20, Native | Filter receiver by category |

**Exclude filters** — the negative counterparts that exclude matching events:

| Method | Protocols | Description |
|--------|-----------|-------------|
| `.exclude_involving(*addrs)` | All | Exclude events where any involved address matches (multi-value) |
| `.exclude_involving_label(label)` | All | Exclude events where any involved address matches a label substring |
| `.exclude_involving_category(cat)` | All | Exclude events where any involved address matches a category |
| `.exclude_sender(*addrs)` | ERC20, Native | Exclude events where sender matches (multi-value) |
| `.exclude_sender_label(label)` | ERC20, Native | Exclude events where sender matches a label substring |
| `.exclude_sender_category(cat)` | ERC20, Native | Exclude events where sender matches a category |
| `.exclude_receiver(*addrs)` | ERC20, Native | Exclude events where receiver matches (multi-value) |
| `.exclude_receiver_label(label)` | ERC20, Native | Exclude events where receiver matches a label substring |
| `.exclude_receiver_category(cat)` | ERC20, Native | Exclude events where receiver matches a category |

**Multi-value support:** Pass multiple values as separate arguments (e.g., `.sender_label("Binance", "Coinbase")`) or as a comma-separated string (e.g., `.sender_label("Binance,Coinbase")`). Both forms are equivalent.

**Mutual exclusivity:** Within each slot (involving/sender/receiver), only one of address/label/category can be set. `involving*` filters cannot be combined with `sender*`/`receiver*` filters. The same rules apply to exclude slots. Positive and negative filters are independent and can be freely combined.

### Aggregate Methods

| Method | Description |
|--------|-------------|
| `.aggregate(group_by, period)` | Transition to aggregate query. `group_by`: `"time"` or `"block"` (`"block_number"` on Tron and Bitcoin). `period`: bucket size (e.g. `"1h"`, `"100b"`, or a plain block count such as `"5000"` on Tron/Bitcoin). Returns an `AggregateQueryBuilder` that supports all the same terminal and filter methods. |
| `client.aggregate_schema(protocol)` | Get available aggregate fields for a protocol (e.g. `"erc20"`, `"aave_v3"`). |

### Terminal Methods

| Method | Description |
|--------|-------------|
| `.as_df()` | Execute and return pandas DataFrame |
| `.as_df("polars")` | Execute and return polars DataFrame |
| `.as_file(path)` | Execute and save to file (format from extension) |
| `.as_file(path, format="csv")` | Execute and save with explicit format |
| `.as_dict()` | Execute and return list of dicts (JSON, 10K block limit) |
| `.as_link()` | Execute and return download link (CSV, 1hr expiry) |
| `.as_link(format="parquet")` | Execute and return download link (Parquet) |
| `.calculate_cost()` | Estimate query cost without executing (no quota deducted) |

## Changelog

### 2.5.0 (2026-04-16)

- `as_df("polars")` and `as_df("pandas")` now return timezone-aware
  `datetime` for every time-like column (`time`, `block_time`, `window`)
  regardless of how the server encoded them. Previously, Hyperliquid and
  Binance parquet responses shipped `time` as a `str`, which crashed the
  polars branch with `InvalidOperationError: arithmetic on string and
  numeric not allowed` and silently produced `object`/`NaT` in pandas.
- Server-side fix: parquet output now carries native `TIMESTAMP(μs, UTC)`
  instead of stringified datetimes. Downstream consumers (DuckDB, Spark,
  BigQuery) can read it without re-parsing.
- Client works against both old and new server builds — it dispatches on
  the dtype it sees (native Datetime passthrough, Utf8 string parse, or
  epoch-seconds `from_epoch`) and normalizes all three to the same
  result.
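The three-way dispatch can be illustrated with stdlib-only code. This is a sketch of the normalization idea on single values, not the client's actual implementation (which operates on polars/pandas column dtypes); it assumes naive timestamps are UTC:

```python
from datetime import datetime, timezone

def normalize_time(value):
    """Sketch of the dispatch: datetime passthrough, ISO-string parse,
    or epoch-seconds conversion, all normalized to UTC-aware datetime."""
    if isinstance(value, datetime):
        # Native datetime: pass through, tagging naive values as UTC.
        return value if value.tzinfo else value.replace(tzinfo=timezone.utc)
    if isinstance(value, str):
        # Stringified datetime: parse, treating naive strings as UTC.
        return datetime.fromisoformat(value).replace(tzinfo=timezone.utc)
    # Otherwise assume epoch seconds.
    return datetime.fromtimestamp(float(value), tz=timezone.utc)

t = datetime(2025, 1, 1, tzinfo=timezone.utc)
assert normalize_time(t) == t
assert normalize_time("2025-01-01T00:00:00") == t
assert normalize_time(1735689600) == t
```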

## License

MIT License
