Metadata-Version: 2.4
Name: neudata-api-client
Version: 0.2.0
Summary: Official Python client for the Neudata Developer API.
Project-URL: Homepage, https://gitlab.com/neudata1/scout-api-client
Author: Neudata API Client contributors
License: MIT
License-File: LICENSE
Keywords: alternative-data,api,client,datasets,finance,neudata
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Office/Business :: Financial
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.9
Requires-Dist: click>=8.1
Requires-Dist: requests>=2.31
Description-Content-Type: text/markdown

# neudata-api-client

A Python client for the [Neudata Developer API](https://www.neudata.co/api). It
lets you search and retrieve dataset metadata, additional datasets, filters, and
DDQs — and ships with resilient defaults: it **retries dropped connections**,
**backs off and honours `Retry-After` on rate limits (`429`)**, and
**auto-paginates** list endpoints.

## Installation

```bash
pip install neudata-api-client
```

For development (using [uv](https://docs.astral.sh/uv/)):

```bash
git clone https://github.com/neudata/neudata-api-client
cd neudata-api-client
uv sync
```

## Authentication

The API authenticates with two headers, `Email` and `Api-Token`. Your token is
available at <https://www.neudata.co/users/edit>.

Provide them via environment variables (recommended):

```bash
export NEUDATA_EMAIL="you@example.com"
export NEUDATA_API_TOKEN="your-api-token"
```

…or pass them directly to the client:

```python
from neudata import NeudataClient

client = NeudataClient(email="you@example.com", api_token="your-api-token")
```
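
To make the header scheme concrete, here is a minimal sketch of how the two environment variables could map onto the `Email` and `Api-Token` headers. `auth_headers_from_env` is a hypothetical helper written for illustration; it is not part of the package, and the client's own credential handling may differ.

```python
import os


def auth_headers_from_env(environ=None):
    """Build the two auth headers from NEUDATA_EMAIL / NEUDATA_API_TOKEN.

    Hypothetical helper for illustration only (not part of neudata).
    """
    env = os.environ if environ is None else environ
    required = ("NEUDATA_EMAIL", "NEUDATA_API_TOKEN")
    # Fail early with a clear message instead of sending an unauthenticated request.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"Email": env["NEUDATA_EMAIL"], "Api-Token": env["NEUDATA_API_TOKEN"]}
```

Passing a plain dict as `environ` keeps the helper easy to test without touching the real environment.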

A `.env.example` is included as a template.

## Quickstart

```python
from neudata import NeudataClient

with NeudataClient() as client:
    # Search a single page of datasets
    results = client.search_datasets(q="satellite", per_page=10)

    # Or iterate over *every* matching dataset (pagination handled for you)
    for dataset in client.iter_datasets(equity_region=["uk", "europe"]):
        print(dataset["id"], dataset["title"])

    # Fetch the full detail record for one dataset
    detail = client.get_dataset(1234)
```
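
For intuition about what "pagination handled for you" involves, the following is a rough sketch of a page-walking generator. It assumes a hypothetical `fetch_page(page, per_page)` callable that returns a list of records, and it assumes a short page marks the end of the results; the library's actual stopping rule and response shape are not documented here and may differ.

```python
from typing import Callable, Dict, Iterator, List


def iter_all(
    fetch_page: Callable[[int, int], List[Dict]],  # hypothetical page fetcher
    per_page: int = 99,
) -> Iterator[Dict]:
    """Yield every record by requesting pages 1, 2, 3, ... in turn."""
    page = 1
    while True:
        items = fetch_page(page, per_page)
        yield from items
        # Assumption: a page shorter than per_page (including an empty one)
        # means there is nothing further to fetch.
        if len(items) < per_page:
            return
        page += 1
```

Because it is a generator, callers can stop early (for example with `itertools.islice`) without fetching the remaining pages.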

## API reference

| Method | Endpoint | Description |
| --- | --- | --- |
| `search_datasets(**filters)` | `GET /datasets` | One page of dataset search results |
| `iter_datasets(per_page=99, **filters)` | `GET /datasets` | Iterate all datasets across pages |
| `get_dataset(id)` | `GET /datasets/{id}` | Full dataset detail |
| `list_filters()` | `GET /filters` | Available filter parameter names |
| `get_filter_options(name)` | `GET /filters/{name}` | Allowable options for a filter |
| `search_additional_datasets(**filters)` | `GET /additional-datasets` | One page of additional datasets |
| `iter_additional_datasets(per_page=99, **filters)` | `GET /additional-datasets` | Iterate all additional datasets |
| `get_additional_dataset(id)` | `GET /additional-datasets/{id}` | Full additional-dataset detail |
| `list_ddqs(page, per_page)` | `GET /ddqs` | List DDQs (Sentry subscribers only) |
| `iter_ddqs(per_page=99)` | `GET /ddqs` | Iterate all DDQs |
| `get_ddq(id)` | `GET /ddqs/{id}` | DDQ detail plus a download URL valid for 10 minutes |

### Filters

Search methods accept the documented query parameters as keyword arguments, e.g.
`q`, `page`, `per_page`, `sort`, `sort_by`, `source`, `from_date`,
`data_provider_id`, and the array filters `equity_region`, `asset_class`,
`combined_dataset_types`, `frequency`, `history`, `subscription_price`,
`report_type`, `feed_type`, `output_format`, `esg_topics`, `filter_labels`,
`ticker_indices`, and more.

Array filters are passed as Python lists and serialized into the repeated
`name[]` form the API expects:

```python
client.search_datasets(asset_class=["public-equity", "fixed-income"])
# -> GET /datasets?asset_class[]=public-equity&asset_class[]=fixed-income
```
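
The serialization rule can be sketched with the standard library alone. `serialize_filters` below is a hypothetical stand-in, not the client's internal function; it only illustrates how list values expand into repeated `name[]` pairs while scalar values pass through unchanged.

```python
from urllib.parse import urlencode


def serialize_filters(filters):
    """Expand list-valued filters into repeated ``name[]`` pairs.

    Hypothetical illustration; the client's own implementation may differ.
    """
    pairs = []
    for name, value in filters.items():
        if value is None:
            continue  # skip filters that were not set
        if isinstance(value, (list, tuple)):
            pairs.extend((f"{name}[]", item) for item in value)
        else:
            pairs.append((name, value))
    return pairs


pairs = serialize_filters(
    {"asset_class": ["public-equity", "fixed-income"], "per_page": 10}
)
# safe="[]" keeps the brackets readable instead of percent-encoding them.
print(urlencode(pairs, safe="[]"))
# -> asset_class[]=public-equity&asset_class[]=fixed-income&per_page=10
```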

Use `client.list_filters()` and `client.get_filter_options(name)` to discover the
valid values for each filter at runtime.

## Resilience: retries, backoff & rate limits

The client is configured on construction:

```python
NeudataClient(
    timeout=30,                  # per-request timeout (seconds)
    max_retries=5,               # retries for dropped connections / 5xx
    backoff_factor=0.5,          # exponential backoff base
    max_backoff=60,              # cap on any single backoff sleep
    max_rate_limit_retries=10,   # retries specifically for 429 responses
    requests_per_minute=90,      # client-side throttle (0 disables it)
)
```

- **Dropped/closed connections**, timeouts, and `5xx` server errors are
  retried up to `max_retries` times with jittered exponential backoff.
- **`429` rate limits** are retried up to `max_rate_limit_retries` times. The
  `Retry-After` header is honoured when present; otherwise exponential backoff is
  used. When retries are exhausted a `RateLimitError` is raised.
- A built-in client-side throttle keeps the sustained request rate below
  `requests_per_minute` (the API guideline is ~100/min) so bulk jobs avoid
  tripping the limiter in the first place.

Enable logging to see retry/backoff activity:

```python
import logging
logging.getLogger("neudata").setLevel(logging.INFO)
```
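
As a rough illustration of the retry rules described in this section, the sketch below computes how long to sleep before a given retry. The exact formula the client uses is not documented here, so two things are assumptions: the delay ceiling is `backoff_factor * 2**attempt` capped at `max_backoff`, and the jitter is "full jitter" (a uniform draw between zero and that ceiling). Whether `max_backoff` also caps a server-supplied `Retry-After` is not stated, so this sketch leaves it uncapped.

```python
import random


def backoff_delay(attempt, backoff_factor=0.5, max_backoff=60.0,
                  retry_after=None, rng=random.random):
    """Return the sleep (in seconds) before retry number `attempt` (0-based).

    Hypothetical sketch; not the client's actual implementation.
    """
    if retry_after is not None:
        # Honour the server's Retry-After value as given (assumption: uncapped).
        return max(float(retry_after), 0.0)
    # Exponential growth: 0.5, 1, 2, 4, ... with the defaults, capped at max_backoff.
    ceiling = min(backoff_factor * (2 ** attempt), max_backoff)
    # Full jitter: pick a random point in [0, ceiling) to spread out retries.
    return ceiling * rng()
```

Injecting `rng` makes the function deterministic under test, in the same spirit as the monkeypatched sleep used by the test suite.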

### Exceptions

All exceptions derive from `neudata.NeudataError`:

- `AuthenticationError` — `401`/`403` (bad credentials; not retried)
- `NotFoundError` — `404`
- `RateLimitError` — `429` retries exhausted (carries `retry_after`)
- `NeudataAPIError` — other unexpected API responses (carries `status_code`,
  `response_body`)
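
One way to act on this hierarchy is to check for the specific subclasses before falling back to the base class. So that the snippet runs without the package installed, the classes below are local stand-ins that mirror only the names and attributes listed above; in real code they would be imported from the package instead. The constructor signatures, the flat inheritance, and the `describe_failure` helper are assumptions made for illustration.

```python
class NeudataError(Exception):
    """Stand-in for neudata.NeudataError (base of all client errors)."""


class AuthenticationError(NeudataError):
    """Stand-in: 401/403, not retried."""


class NotFoundError(NeudataError):
    """Stand-in: 404."""


class RateLimitError(NeudataError):
    """Stand-in: 429 retries exhausted."""

    def __init__(self, message, retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after


class NeudataAPIError(NeudataError):
    """Stand-in: any other unexpected API response."""

    def __init__(self, message, status_code=None, response_body=None):
        super().__init__(message)
        self.status_code = status_code
        self.response_body = response_body


def describe_failure(exc):
    """Map an error to a short hint, checking specific classes first."""
    if isinstance(exc, AuthenticationError):
        return "check NEUDATA_EMAIL / NEUDATA_API_TOKEN"
    if isinstance(exc, NotFoundError):
        return "no such record"
    if isinstance(exc, RateLimitError):
        return f"rate limited; retry after {exc.retry_after}s"
    if isinstance(exc, NeudataAPIError):
        return f"unexpected response {exc.status_code}"
    return "other client error"
```

In application code the same ordering applies to `except` clauses: list the specific exceptions first and `NeudataError` last as the catch-all.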

## Example: download the whole catalogue

A console script is installed with the package:

```bash
# Dump all datasets + additional datasets to neudata_export.json
neudata-download

# Custom output, only datasets, with full per-item detail records
neudata-download --output datasets.json --datasets-only --details

# Filter by date or free text
neudata-download --from-date 2026-01-01 --query "credit card"
```

Run `neudata-download --help` for all options. A minimal, readable version using
the library directly is in [`examples/download_all.py`](examples/download_all.py):

```bash
python examples/download_all.py out.json
```

The output JSON has the shape:

```json
{
  "datasets": [ ... ],
  "additional_datasets": [ ... ]
}
```
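
A small sketch of consuming that file: the helper below loads an export and counts the records under each top-level key. `summarise_export` is a hypothetical name, and the use of `.get` reflects an assumption that a key may be absent (for example after `--datasets-only`); the actual behaviour in that case is not documented here.

```python
import json
from pathlib import Path


def summarise_export(path):
    """Return the number of records under each top-level key of an export file.

    Hypothetical helper for illustration; not shipped with the package.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Treat a missing key as an empty list rather than an error (assumption).
    return {
        key: len(data.get(key, []))
        for key in ("datasets", "additional_datasets")
    }
```

For example, `summarise_export("neudata_export.json")` would return a dict with one count per key.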

## Development

```bash
uv sync             # create the venv and install deps + dev tools
uv run pytest       # run the test suite (no network calls)
```

Tests use [`responses`](https://github.com/getsentry/responses) to mock the API
and a monkeypatched sleep, so they run instantly and offline.

There is also an opt-in **live smoke test** that hits every endpoint against the
real API. It is excluded from the default run and requires credentials:

```bash
export NEUDATA_EMAIL="you@example.com"
export NEUDATA_API_TOKEN="your-api-token"
uv run pytest -m smoke -v      # hits the real Neudata API
```

(The DDQ checks skip automatically if your account isn't a Sentry subscriber.)

## Releasing to PyPI

```bash
uv build                          # build sdist + wheel into dist/
uvx twine check dist/*            # validate metadata / long description

# Test upload first
uv publish --publish-url https://test.pypi.org/legacy/ --token "$TEST_PYPI_TOKEN"

# Then the real thing
uv publish --token "$PYPI_TOKEN"
```

## License

[MIT](LICENSE)
