Metadata-Version: 2.4
Name: alphainfo
Version: 1.5.31
Summary: Python client for alphainfo.io — Structure-aware analysis for any time series
Home-page: https://www.alphainfo.io
Author: alphainfo.io
Author-email: "alphainfo.io" <contato@alphainfo.io>
License: MIT
Project-URL: Homepage, https://www.alphainfo.io
Project-URL: Documentation, https://www.alphainfo.io/quickstart
Project-URL: Recipes, https://www.alphainfo.io/recipes
Project-URL: Walkthroughs, https://github.com/info-dev-13/alphainfo-walkthroughs
Project-URL: Notebooks, https://github.com/qgidev/alphainfo-notebooks
Project-URL: Changelog, https://www.alphainfo.io/changelog
Keywords: alphainfo,structural-analysis,regime-detection,time-series,anomaly-detection,signal-processing
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Healthcare Industry
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Office/Business :: Financial :: Investment
Classifier: Typing :: Typed
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.25.0
Provides-Extra: http2
Requires-Dist: h2>=4.0; extra == "http2"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: h2>=4.0; extra == "dev"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# alphainfo

[![PyPI version](https://img.shields.io/pypi/v/alphainfo.svg)](https://pypi.org/project/alphainfo/)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/qgidev/alphainfo-notebooks/blob/main/quickstart.ipynb)

**Python client for the [alphainfo](https://www.alphainfo.io) Structural Intelligence API.**

Detect structural regime changes in any time series — biomedical signals, financial markets, energy grids, seismic data, IoT sensors, network traffic, ML drift. One API, no model training, no per-domain tuning. Every analysis ships with an audit trail.

**[▶ Try it in Google Colab](https://colab.research.google.com/github/qgidev/alphainfo-notebooks/blob/main/quickstart.ipynb)** (2 min, no install) — fetches real SPY data, detects the March 2020 regime change, visualizes the result.

## 30-second try

**Step 1 — [Get a free API key](https://www.alphainfo.io/register)** (50 analyses/month, no credit card).

**Step 2** — Install and ask the most common question — *"does the current signal still behave like a known-good baseline?"*:

```bash
pip install alphainfo
```

```python
from alphainfo import AlphaInfo
import math

client = AlphaInfo(api_key="ai_...")  # your free key

# Baseline: a normal/healthy period (last week, commissioned machine, calm market...).
baseline = [math.sin(i/10) for i in range(200)]

# Signal: what you want to evaluate NOW. Same source, recent window.
# Here we triple the amplitude halfway through to simulate a regime change.
signal   = [math.sin(i/10) for i in range(100)] + [math.sin(i/10) * 3 for i in range(100)]

result = client.compare(signal=signal, baseline=baseline, sampling_rate=100.0)
print(result.confidence_band)   # 'stable' | 'transition' | 'unstable'
print(result.structural_score)  # 0.0 (changed) → 1.0 (preserved)
print(result.analysis_id)       # UUID for audit replay
```

That's it. You just compared a signal against a baseline. 🚀

**No baseline yet?** Use `client.detect_internal_change(signal=..., sampling_rate=...)` for triage on a single signal — the engine looks for change inside the series itself.

**What to try next:** `client.analyze_vector(channels={...})` for multi-channel data (12-lead ECG, 3-axis IMU, multi-asset finance), `client.fingerprint()` for a 5D similarity vector you can index, or `client.guide()` for the full encoding guide (no key needed).

---

## What input does the API accept?

**Native input is a numeric array** — `list[float]` or `list[int]`. That's
the only thing the engine consumes directly. The "universal" part is the
encoding layer in front: anything that can become a numeric structure can
be analysed by the API. Concrete bridge points:

| Your data | Encoding | Recipe |
|---|---|---|
| Raw time series (sensor, returns, price) | None — pass it directly | `client.analyze` / `client.compare` |
| JSON / configs / events | path + type frequency vector | `recipes.schema_drift` |
| Logs / token sequences | n-gram or transition frequency vector | `recipes.event_grammar` |
| Multi-channel (multi-axis sensor, multi-lead ECG) | dict of named numeric channels | `client.analyze_vector` |
| Mixed dataset / table | one column per channel, or feature ensemble | `recipes.feature_ensemble` |
| Don't know which encoder fits | meta-recipe — heuristic recommends ranked encoders | `recipes.encoding_guide.discover_encoding` |
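
As a rough illustration of the "n-gram or transition frequency vector" row, the sketch below turns a token sequence into a fixed-length bigram frequency vector using only the standard library. It is not the `recipes.event_grammar` implementation: the helper name `transition_vector` and the fixed vocabulary ordering are assumptions made for this example.

```python
from collections import Counter
from itertools import product
from typing import List, Sequence


def transition_vector(tokens: Sequence[str], vocab: Sequence[str]) -> List[float]:
    """Relative frequency of each ordered token pair (bigram), in a fixed order.

    Hypothetical helper for illustration; pairs containing a token outside
    `vocab` are ignored.
    """
    known = set(vocab)
    pairs = Counter(
        (a, b) for a, b in zip(tokens, tokens[1:]) if a in known and b in known
    )
    total = sum(pairs.values())
    # One slot per (from, to) pair, so every log window maps to the same layout.
    return [pairs[p] / total if total else 0.0 for p in product(vocab, repeat=2)]


log = ["login", "read", "read", "write", "logout"]
vec = transition_vector(log, vocab=["login", "read", "write", "logout"])
print(len(vec))  # 16 slots: 4 tokens x 4 tokens
```

Because the slot order depends only on `vocab`, vectors built from different log windows are directly comparable, which is what a baseline comparison needs.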

What the SDK rejects as a hard error before the round-trip:

- `None`, `bytes`, `bytearray`, `dict`, nested lists, non-numeric values
- lists with **any** `NaN` or `Inf` values (JSON can't serialise them)
- empty lists

Each rejection message names the specific recipe that addresses the
failure. If you genuinely have raw bytes / images / graphs, the encoding
layer is where the engineering happens — see `recipes/` in the public
repo.
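
The rejection rules above can be approximated locally, for example to fail fast in a data pipeline before any client is created. This is a minimal sketch and not the SDK's own validator: the name `precheck_signal`, the error messages, and the choice to reject `bool` values are all assumptions.

```python
import math
from typing import Any, List


def precheck_signal(values: Any) -> List[float]:
    """Client-side pre-check mirroring the rejection rules listed above.

    Hypothetical helper, not part of the SDK. Returns a list of floats or
    raises ValueError with a short reason.
    """
    if not isinstance(values, list):
        raise ValueError(f"expected a flat list of numbers, got {type(values).__name__}")
    if not values:
        raise ValueError("empty signal")
    cleaned = []
    for i, v in enumerate(values):
        # bool is a subclass of int; rejecting it is this sketch's own choice.
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            raise ValueError(f"non-numeric value at index {i}: {v!r}")
        if not math.isfinite(v):
            raise ValueError(f"NaN or Inf at index {i}")
        cleaned.append(float(v))
    return cleaned


print(precheck_signal([1, 2.5, 3]))  # [1.0, 2.5, 3.0]
```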

---

## Installation

```bash
pip install alphainfo

# Optional: enable HTTP/2 for better throughput on concurrent calls
pip install alphainfo[http2]
```

Requires Python 3.8+. Core dependency: [httpx](https://www.python-httpx.org/).

## Full examples

### 1. Get your API key

[alphainfo.io/register](https://www.alphainfo.io/register) — free tier: 50 analyses/month, no credit card required. Starter paid plans from $49/mo.

### 2. The 3 canonical SDK verbs

These three verbs cover about 90% of use cases. Each maps to a **question** the developer is asking, not a data shape.

| Question | Method | When |
|---|---|---|
| "Does this signal still behave like normal?" | `client.compare(signal, baseline)` | You have a known-good period of reference. **Most common.** |
| "Does this signal alone show internal change?" | `client.detect_internal_change(signal)` | No baseline yet — initial triage. |
| "Did multiple sensors correlate?" | `client.analyze_vector(channels={...})` | Multi-channel: 12-lead ECG, 3-axis IMU, multi-asset OHLCV. |

```python
from alphainfo import AlphaInfo

client = AlphaInfo(api_key="ai_your_key")

# Most common case: compare current observation against a known-good baseline.
result = client.compare(
    signal=current_window,        # what you want to evaluate now
    baseline=last_known_normal,   # a healthy/calm reference window
    sampling_rate=250.0,
    domain="biomedical",          # optional: vertical calibration
)

if result.change_detected:
    print(f"Regime change detected! Band: {result.confidence_band}")
    print(f"Structural score: {result.structural_score:.3f}")
    print(f"Audit ID: {result.analysis_id}")
```

`client.analyze(...)` is the omnibus method that all three verbs above wrap — it accepts both
single-signal and signal+baseline inputs. Use it directly when you want raw control over every
parameter; reach for one of the three named verbs when you want code that reads at a glance.

### 3. Structural fingerprint (fast path)

```python
# Extract the 5D structural fingerprint — skips semantic + multiscale for speed
fp = client.fingerprint(signal=data, sampling_rate=250.0, domain="biomedical")

print(fp.structural_score)    # 0.0 to 1.0
print(fp.confidence_band)     # 'stable', 'transition', 'unstable'

# Always guard before indexing — the fingerprint is None for signals
# the engine can't decompose (too short, constant, etc).
if fp.is_complete:
    print(fp.vector)          # 5D list of floats, each in [0, 1]
else:
    print(f"unavailable: {fp.fingerprint_reason}")

# Use .vector for nearest-neighbor search / ANN indexing — skip incomplete ones
from sklearn.neighbors import NearestNeighbors
vectors = [fp.vector for s in signal_corpus
           if (fp := client.fingerprint(s, 250.0)).is_complete]
nn = NearestNeighbors(n_neighbors=5).fit(vectors)
```

**Minimum signal length for a complete fingerprint:**

| Case | Minimum samples | Constant |
|---|---|---|
| No baseline | 192 | `alphainfo.MIN_FINGERPRINT_SAMPLES` |
| With baseline | 50 | `alphainfo.MIN_FINGERPRINT_SAMPLES_WITH_BASELINE` |

Below those thresholds, `fingerprint_available` comes back `False` with
`fingerprint_reason="signal_too_short"`, and the SDK emits a `UserWarning`
at call time. For shorter inputs, use `client.analyze()`, which still
returns a `structural_score` and `confidence_band`, just not the 5D vector.

See [`examples/fingerprint_handling.py`](examples/fingerprint_handling.py) for
a fuller pattern (falls back to the semantic layer when a fingerprint is
unavailable).
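
The length thresholds in the table can also be checked before making a call, which avoids the `UserWarning` in bulk jobs. The sketch below hard-codes the two documented values so that it runs without the package installed. The helper name `long_enough_for_fingerprint` is hypothetical, and in real code the `alphainfo.MIN_FINGERPRINT_SAMPLES*` constants would be the better source of truth.

```python
from typing import Optional, Sequence

# Values copied from the table above (assumed to match the SDK constants
# alphainfo.MIN_FINGERPRINT_SAMPLES and
# alphainfo.MIN_FINGERPRINT_SAMPLES_WITH_BASELINE).
MIN_SAMPLES = 192
MIN_SAMPLES_WITH_BASELINE = 50


def long_enough_for_fingerprint(
    signal: Sequence[float], baseline: Optional[Sequence[float]] = None
) -> bool:
    """Hypothetical pre-check: True if the signal meets the documented minimum."""
    required = MIN_SAMPLES if baseline is None else MIN_SAMPLES_WITH_BASELINE
    return len(signal) >= required


print(long_enough_for_fingerprint([0.0] * 100))                # False
print(long_enough_for_fingerprint([0.0] * 100, [0.0] * 100))   # True
```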

### 4. Batch analysis

```python
# Analyze up to 100 signals in one call
batch = client.analyze_batch(
    signals=[signal_1, signal_2, signal_3],
    sampling_rate=1000.0,
    domain="sensors",
)

for item in batch.results:
    if item.success:
        print(f"Signal {item.index}: {item.confidence_band} ({item.structural_score:.3f})")
    else:
        print(f"Signal {item.index}: error — {item.error}")
```
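
For a corpus larger than the 100-signal limit, one option is to slice it into chunks and send each chunk as its own batch. The `chunked` helper below is a hypothetical standard-library sketch. It only does the slicing and leaves the actual `client.analyze_batch(...)` call to the caller.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 100  # documented limit; also reported by client.version()


def chunked(
    signals: Sequence[List[float]], size: int = MAX_BATCH_SIZE
) -> Iterator[Sequence[List[float]]]:
    """Yield consecutive slices of at most `size` signals."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(signals), size):
        yield signals[start:start + size]


corpus = [[float(i)] * 200 for i in range(250)]
print([len(chunk) for chunk in chunked(corpus)])  # [100, 100, 50]
```

If `item.index` is relative to each batch, which seems likely but is not stated here, keep the chunk's starting offset so results can be mapped back to the full corpus.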

### 5. Semantic layer (severity, trend, alerts)

```python
result = client.compare(
    signal=data,
    baseline=calm_period,
    sampling_rate=1.0,
    include_semantic=True,
)

if result.semantic:
    print(result.semantic.alert_level)       # 'normal', 'attention', 'alert', 'critical'
    print(result.semantic.severity)          # 'none', 'low', 'moderate', 'high', 'critical'
    print(result.semantic.severity_score)    # 0-100 (higher = more severe)
    print(result.semantic.trend)             # 'stable', 'diverging', 'monitoring'
    print(result.semantic.summary)           # "⚠️ Structural divergence detected (severity: high)"
    print(result.semantic.recommended_action)  # 'log_only', 'monitor', 'human_review', 'immediate_human_review'

# Short signal warning (< 100 samples)
if result.warning:
    print(result.warning)  # "Signal has only 30 samples..."
```

**Severity thresholds:**

| severity | severity_score | Meaning |
|----------|---------------|---------|
| `none` | 0-15 | No structural degradation |
| `low` | 16-35 | Minor deviation, monitor |
| `moderate` | 36-65 | Notable change, investigate |
| `high` | 66-85 | Significant regime shift |
| `critical` | 86-100 | Severe structural breakdown |
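
The API already returns both `severity` and `severity_score`, so no local mapping is needed for live results. When re-bucketing stored scores or writing tests, though, the table can be expressed as a small lookup. This is a sketch: `severity_label` is a hypothetical name, and treating each upper bound as inclusive for fractional scores is an assumption.

```python
def severity_label(score: float) -> str:
    """Map a 0-100 severity_score to the label from the table above.

    Hypothetical helper. The table lists integer ranges; treating each upper
    bound as inclusive for fractional scores is an assumption.
    """
    if not 0 <= score <= 100:
        raise ValueError("severity_score must be between 0 and 100")
    for upper, label in ((15, "none"), (35, "low"), (65, "moderate"), (85, "high")):
        if score <= upper:
            return label
    return "critical"


print([severity_label(s) for s in (0, 16, 50, 85, 86)])
# ['none', 'low', 'moderate', 'high', 'critical']
```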

### 6. Multi-channel (vector) analysis with per-channel baselines

```python
# Multi-lead ECG, multi-axis accelerometer, cross-asset finance...
vector = client.analyze_vector(
    channels={
        "lead_I": ecg_lead_1,
        "lead_II": ecg_lead_2,
        "lead_III": ecg_lead_3,
    },
    sampling_rate=360.0,
    domain="biomedical",
)

print(f"Aggregated score: {vector.structural_score:.3f}")
print(f"Composite band: {vector.confidence_band}")
for name, ch in vector.channels.items():
    print(f"  {name}: {ch.confidence_band} (score={ch.structural_score:.3f})")

# With per-channel baselines (e.g. calm period reference)
vector = client.analyze_vector(
    channels={"SPY": spy_data, "VIX": vix_data, "GLD": gld_data},
    sampling_rate=1.0,
    baselines={"SPY": spy_calm, "VIX": vix_calm, "GLD": gld_calm},
)
```

### 7. Audit trail

```python
# Replay any past analysis
replay = client.audit_replay("550e8400-e29b-41d4-a716-446655440000")
print(f"Original score: {replay.output['structural_score']}")

# List recent analyses
history = client.audit_list(limit=10)
for entry in history:
    print(f"{entry.analysis_id} — {entry.structural_score}")
```

### 8. API guide (discoverability)

```python
# Fetch the full encoding guide — endpoints, patterns, tips, debugging
guide = client.guide()
print(guide["version"])            # "1.1"
print(list(guide.keys()))          # all available sections

# Common mistakes
for m in guide["common_mistakes"]:
    print(f"- {m['mistake']}: {m['fix']}")

# Which endpoint to use
for name, info in guide["endpoints"].items():
    print(f"{name}: {info.get('path', '')} — {info.get('when', '')}")
```

### 9. Version and compatibility

```python
info = client.version()
print(info["api_version"])                      # "2.3.0"
print(info["sdk_compat"]["recommended_version"])  # "1.5.31"
print(info["features"])                          # dict of supported features
print(info["limits"]["max_batch_size"])           # 100
```

## Async Support

```python
from alphainfo import AsyncAlphaInfo

async with AsyncAlphaInfo(api_key="ai_your_key") as client:
    result = await client.analyze(signal=data, sampling_rate=250.0)
    fp = await client.fingerprint(signal=data, sampling_rate=250.0)
```

All methods available on `AlphaInfo` are also available on `AsyncAlphaInfo`.

## Error Handling

```python
from alphainfo import AlphaInfo, AuthError, RateLimitError, ValidationError

client = AlphaInfo(api_key="ai_your_key")

try:
    result = client.analyze(signal=data, sampling_rate=250.0)
except AuthError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except ValidationError as e:
    print(f"Invalid input: {e.message}")
```

**Exception hierarchy:**

| Exception | HTTP Code | When |
|-----------|-----------|------|
| `AuthError` | 401 | Invalid or missing API key |
| `ValidationError` | 400, 413 | Bad input or signal too large |
| `RateLimitError` | 429 | Quota or concurrency limit exceeded |
| `NotFoundError` | 404 | Analysis ID not found (audit) |
| `APIError` | 5xx | Server error |
| `TimeoutError` | — | Request timed out after retries |
| `NetworkError` | — | Connection failed |

All inherit from `AlphaInfoError`.

## Configuration

```python
client = AlphaInfo(
    api_key="ai_your_key",
    base_url="https://www.alphainfo.io",  # default
    timeout=30.0,                      # seconds (default)
    max_retries=3,                     # automatic retry on transient errors
    retry_base_delay=1.0,              # initial backoff delay (seconds)
    retry_max_delay=32.0,              # max delay between retries (seconds)
    http2=None,                        # auto-detect (True if h2 installed)
)
```

The client automatically retries on:
- Network timeouts and connection errors
- HTTP 429 (rate limits) — respects `Retry-After` header
- HTTP 5xx (server errors)

Non-retryable errors (401, 400, 404) are raised immediately.

Backoff is exponential: `retry_base_delay * 2^attempt`, capped at `retry_max_delay`.
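
To see what that formula means for the default settings, the sketch below evaluates it directly. It assumes `attempt` starts at 0 and models only the stated formula, so any jitter or `Retry-After` override the client applies is ignored. `backoff_delays` is a hypothetical name.

```python
from typing import List


def backoff_delays(max_retries: int = 3, base: float = 1.0, cap: float = 32.0) -> List[float]:
    """Delay before each retry: base * 2**attempt, capped at `cap`.

    Assumes `attempt` starts at 0 and ignores any jitter or Retry-After
    override the client may apply.
    """
    return [min(base * 2 ** attempt, cap) for attempt in range(max_retries)]


print(backoff_delays())               # [1.0, 2.0, 4.0]
print(backoff_delays(max_retries=7))  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 32.0]
```

Under these assumptions the defaults add at most 7 seconds of waiting across three retries, and the cap only starts to matter from the sixth retry onward.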

## Rate Limit Info

```python
result = client.analyze(signal=data, sampling_rate=250.0)
info = client.rate_limit_info
if info:
    print(f"Remaining: {info.remaining}/{info.limit}")
```

## Signal Size Guide

| Samples | Behavior | Recommendation |
|---------|----------|----------------|
| < 10 | Rejected (422) | Hard minimum |
| 10-49 | Returns 0.5 + warning | Too short for multiscale |
| 50-99 | Returns 0.5 + warning | Limited confidence |
| 100-199 | Variable scores | Detection active, less reliable |
| **200-500** | **Reliable scores** | **Recommended range** |
| 500+ | Reliable, may dilute point events | Use windowing for point detection |

**Note:** `sampling_rate` controls multiscale window sizing but does not change scores for a given signal. For daily financial data use `sampling_rate=1.0`; for ECG at 250 Hz use `sampling_rate=250.0`.
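
The "use windowing" recommendation for long signals can be sketched as a plain sliding window. The helper below is hypothetical: the window size of 200 comes from the recommended range in the table, while the 50% overlap and the choice to drop a short trailing remainder are assumptions.

```python
from typing import List, Sequence


def sliding_windows(
    signal: Sequence[float], size: int = 200, step: int = 100
) -> List[List[float]]:
    """Split a long signal into overlapping windows of `size` samples.

    Hypothetical helper; a trailing remainder shorter than `size` is dropped.
    """
    if size < 1 or step < 1:
        raise ValueError("size and step must be positive")
    return [list(signal[i:i + size]) for i in range(0, len(signal) - size + 1, step)]


windows = sliding_windows([float(i) for i in range(1000)])
print(len(windows), len(windows[0]))  # 9 200
```

Each window could then be scored on its own (for example through `client.analyze_batch`), so a point event shows up in the window that contains it instead of being diluted across the whole series.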

## Amplitude invariance — what scale changes mean

The engine measures **internal structure**, not absolute scale. The behavior
depends on whether you provide a baseline:

| Mode | Behavior | Why |
|---|---|---|
| **No baseline** (single signal) | Largely amplitude-invariant. `analyze(amp×0.5)` and `analyze(amp×2.0)` of the same shape return nearly identical `structural_score`. | Curvatures normalize against the signal's own statistics — uniform gain doesn't change shape. |
| **With baseline** (observation vs baseline) | Amplitude DOES register. Comparing `amp=1.2` against `amp=1.0` baseline drops the score to ~0.87 in our regression suite — the engine reports a real level/scale shift. | A baseline anchors absolute scale, so a uniform gain becomes a measurable structural deviation. |

**If you want amplitude-invariance in baseline mode** (e.g., comparing two ECG
recordings from different machines): z-normalize or min-max normalize **both**
signals before sending. The recipes layer's `feature_ensemble` includes
`z_norm` as a default channel for exactly this reason.

```python
import numpy as np
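def z_norm(x):
    """Zero-mean, unit-variance copy of x as a plain list of floats."""
    arr = np.asarray(x, dtype=float)
    # .tolist() because the engine's native input is list[float], not ndarray
    return ((arr - arr.mean()) / (arr.std() + 1e-9)).tolist()

result = client.compare(
    signal=z_norm(observation),
    baseline=z_norm(historical),
    sampling_rate=250.0,   # e.g. ECG at 250 Hz; adjust to your recording
    domain="biomedical",
)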
```
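
Min-max normalization, the other option mentioned above, can be sketched with the standard library alone. The example also checks the property that matters here: after normalization, a uniformly amplified copy of a signal is numerically the same as the original. `min_max_norm` is a hypothetical helper, and mapping a constant signal to zeros is an assumption of this sketch.

```python
import math
from typing import List, Sequence


def min_max_norm(x: Sequence[float]) -> List[float]:
    """Rescale to [0, 1]; a constant signal maps to all zeros."""
    lo, hi = min(x), max(x)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in x]
    return [(v - lo) / span for v in x]


reference = [math.sin(i / 10) for i in range(200)]
louder = [1.2 * v for v in reference]  # same shape, 20% more gain

a, b = min_max_norm(reference), min_max_norm(louder)
print(max(abs(p - q) for p, q in zip(a, b)) < 1e-12)  # True
```

Unlike z-normalization, min-max scaling is driven entirely by the two extreme samples, so a single outlier can compress the rest of the signal.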

The full guarantee text is in `client.guide()["deterministic_guarantees"]`.

## Domains

| Domain | Use case |
|--------|----------|
| `generic` | Universal fallback for when no specific domain fits; naming a specific domain gives better calibration |
| `biomedical` | ECG, EEG, EMG, SpO2 |
| `finance` | Market prices, returns, volume |
| `power_grid` | Power grid frequency, load (aliases: `energy`, `power`, `grid`) |
| `seismic` | Earthquakes, vibration sensors (alias: `earthquake`) |
| `sensors` | IoT, industrial machinery, SCADA (aliases: `iot`, `industrial`) |
| `ai_ml` | Model drift, data quality (aliases: `mlops`, `ml`, `ai`) |
| `security` | Network traffic, intrusion (alias: `cyber`) |
| `traffic` | Network / urban traffic flow (aliases: `network`, `net`) |

> **Aliases are auto-resolved.** The API accepts both canonical names and the
> registered aliases on every analyze endpoint — pass `domain="energy"`,
> `"mlops"`, `"industrial"`, `"cyber"`, `"fintech"`, `"biomed"`, `"network"`,
> etc. and the server resolves to the canonical domain (no HTTP 400). The
> live alias map is exposed in `client.version()` under
> `domains.aliases` for inspection.
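
Resolution happens on the server, so nothing below is required. For logging or grouping results by canonical domain on the client side, the table can be mirrored as a local lookup. This is a sketch: `canonical_domain` is a hypothetical helper, the map is transcribed from the table above and not fetched from `client.version()`, and the trimming and lower-casing of input is an assumption.

```python
# Alias map transcribed from the Domains table above. The aliases "fintech"
# and "biomed" are omitted because the table does not list their targets.
ALIASES = {
    "energy": "power_grid", "power": "power_grid", "grid": "power_grid",
    "earthquake": "seismic",
    "iot": "sensors", "industrial": "sensors",
    "mlops": "ai_ml", "ml": "ai_ml", "ai": "ai_ml",
    "cyber": "security",
    "network": "traffic", "net": "traffic",
}
CANONICAL = {
    "generic", "biomedical", "finance", "power_grid", "seismic",
    "sensors", "ai_ml", "security", "traffic",
}


def canonical_domain(name: str) -> str:
    """Hypothetical local lookup; the server performs the real resolution."""
    key = name.strip().lower()
    if key in CANONICAL:
        return key
    if key in ALIASES:
        return ALIASES[key]
    raise KeyError(f"unknown domain: {name!r}")


print(canonical_domain("mlops"))  # ai_ml
```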

## Guides

All guide content is available programmatically via `client.guide()` and the live API at `GET /v1/guide`:

```python
guide = client.guide()  # returns all 15 sections, no auth required

guide["common_mistakes"]   # 10 pitfalls with symptoms and fixes
guide["performance_tips"]  # fast mode, batch vs loop, HTTP/2, retry tuning
guide["debugging_tips"]    # step-by-step troubleshooting + error hierarchy
guide["endpoints"]         # all endpoints — when to use, latency, quota cost
```

Full markdown versions are also included in the installed package under `alphainfo/guides/`.

## Recipes — higher-level patterns

The API surface is intentionally minimal: `analyze`, `analyze_batch`,
`analyze_vector`, `fingerprint`. Higher-level **recipes** compose those
primitives — the live manifest is at [`/v1/recipes`](https://www.alphainfo.io/v1/recipes)
and runnable walkthroughs + probe libraries are public at
[`info-dev-13/alphainfo-walkthroughs`](https://github.com/info-dev-13/alphainfo-walkthroughs):

| Recipe | What it does |
|---|---|
| `feature_ensemble` | Encode a signal multiple ways (raw, z_norm, rms, spectrum, autocorr, histogram, ...) and run them as channels of `analyze_vector`. Per-channel scores reveal *which axis* changed. |
| `windowed` | Slide a window over a long signal and `analyze_batch` each — finds *where* in the signal the change happened. |
| `parameter_search` | Grid → `analyze_batch` with observation as baseline → top-3 contains the truth. Calibration without gradients. |
| `schema_drift` | Hash JSON paths into a frequency vector and `analyze` it — detects field/type drift in event streams. |
| `motif_search` | Slide a window of motif length, batch each against the motif → find a known pattern in a long history. |
| `auto_diagnose` | Probe library + benign-control gate — answers "what KIND of change happened?" Domain-specific probe libraries for `finance`, `biomedical`, `sensors` (industrial), `ai_ml` (mlops drift) and `security` (SOC/log anomalies). |
| `event_grammar` | n-gram / transition encoding for log/event streams — detects grammar drift independent of token values. |
| `intents` | User-intent → recipe dispatcher. `dispatch(intent="regime_change", domain="finance")` chains windowed + auto_diagnose with finance probes. |
| `encoding_guide` ⚡ **meta-recipe** | "I have a signal — which encoder do I use?" `discover_encoding(signal)` profiles your data (range, kurtosis, periodicity, trend, categorical-likeness) and recommends ranked encoders **with reasoning**. `auto_encode()` applies them via `feature_ensemble`. Works for ANY 1-D signal, not just the 5 calibrated verticals. See `client.guide()["encoding_guide"]` for the full prose. |
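
To make the `schema_drift` row more concrete, the sketch below shows one way to hash JSON paths and value types into a fixed-length frequency vector. It is an illustration and not the recipe's actual code: the helper names, the `path:type` string format, the 64-bucket size, and the use of SHA-256 for a stable hash are all assumptions.

```python
import hashlib
from typing import Any, Iterator, List


def iter_paths(obj: Any, prefix: str = "") -> Iterator[str]:
    """Yield 'path:type' strings for every leaf in a parsed JSON value."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from iter_paths(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(obj, list):
        for value in obj:
            yield from iter_paths(value, f"{prefix}[]")
    else:
        yield f"{prefix}:{type(obj).__name__}"


def path_type_vector(events: List[Any], buckets: int = 64) -> List[float]:
    """Hash each path:type string into one of `buckets` slots and normalise."""
    counts = [0] * buckets
    for event in events:
        for item in iter_paths(event):
            # hashlib gives a hash that is stable across runs, unlike hash().
            digest = hashlib.sha256(item.encode("utf-8")).digest()
            counts[int.from_bytes(digest[:8], "big") % buckets] += 1
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


event = {"user": {"id": 1}, "tags": ["a", "b"]}
print(list(iter_paths(event)))  # ['user.id:int', 'tags[]:str', 'tags[]:str']
vector = path_type_vector([event])
```

A field that changes type (say `user.id` becoming a string) produces a different `path:type` string, which usually lands in a different bucket and shifts the vector. Hash collisions are possible with a small bucket count.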

**Walkthroughs end-to-end** (clone the public repo and run against your API key):

```bash
git clone https://github.com/info-dev-13/alphainfo-walkthroughs
cd alphainfo-walkthroughs
pip install -r requirements.txt
export ALPHAINFO_API_KEY=ai_your_key

python walkthroughs/finance_walkthrough.py            # COVID-19 crash on real SPY
python walkthroughs/ecg_walkthrough.py                # 3 known events in synthetic ECG
python walkthroughs/ecg_physionet_walkthrough.py      # real MIT-BIH records 100+200
python walkthroughs/vibration_walkthrough.py          # 3 motor conditions vs baseline
python walkthroughs/encoding_walkthrough.py           # which encoder fits your signal?
```

- Manifest: `GET /v1/recipes` ([live JSON](https://www.alphainfo.io/v1/recipes))
- Public walkthroughs + probes: [`info-dev-13/alphainfo-walkthroughs`](https://github.com/info-dev-13/alphainfo-walkthroughs)
- HTML overview: [alphainfo.io/recipes](https://www.alphainfo.io/recipes)
- Encoding guide prose: `client.guide()["encoding_guide"]`

Recipes are NOT part of the SDK contract — they live in the open repo as
copy/adapt-friendly reference implementations. The right encoding /
windowing decisions are domain-specific; locking them into the SDK
would either bloat the API surface or apply transforms users didn't
ask for.

## Links

- [API Documentation](https://www.alphainfo.io/docs)
- [Recipes](https://www.alphainfo.io/recipes) — composed patterns + walkthroughs
- [Benchmarks](https://www.alphainfo.io/benchmarks)
- [Dashboard](https://www.alphainfo.io/dashboard)

## License

MIT
