Metadata-Version: 2.4
Name: canopy-optimizer
Version: 2.3.0
Summary: Institutional-grade hierarchical portfolio optimization (HRP, HERC, NCO) by Anagatam Technologies
Home-page: https://github.com/Anagatam/Canopy
Author: Anagatam Technologies
Author-email: canopy@anagatam.com
License: MIT
Project-URL: Homepage, https://github.com/Anagatam/Canopy
Project-URL: Documentation, https://canopy-institutional-portfolio-optimization.readthedocs.io/en/latest/
Project-URL: Repository, https://github.com/Anagatam/Canopy
Project-URL: Issues, https://github.com/Anagatam/Canopy/issues
Keywords: portfolio-optimization,hierarchical-risk-parity,hrp,herc,nco,quantitative-finance,risk-management,asset-allocation,covariance-estimation,random-matrix-theory
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Office/Business :: Financial :: Investment
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: scipy>=1.10
Requires-Dist: plotly>=5.18
Requires-Dist: scikit-learn>=1.3
Requires-Dist: networkx>=2.6
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: yfinance>=0.2; extra == "dev"
Dynamic: author-email
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# Canopy: The Institutional Hierarchical Portfolio Optimization Engine

<table>
<tr><td colspan="2" align="center"><b><a href="https://canopy-institutional-portfolio-optimization.readthedocs.io/en/latest/">Documentation</a> · <a href="https://github.com/Anagatam/Canopy">Tutorials</a> · <a href="https://github.com/Anagatam/Canopy/releases">Release Notes</a></b></td></tr>
<tr><td><b>Open Source</b></td><td><a href="https://github.com/Anagatam/Canopy/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="License: MIT"></a></td></tr>
<tr><td><b>CI/CD</b></td><td><a href="https://github.com/Anagatam/Canopy/actions"><img src="https://img.shields.io/github/actions/workflow/status/Anagatam/Canopy/ci.yml?label=build" alt="Build"></a> <a href="https://canopy-institutional-portfolio-optimization.readthedocs.io/en/latest/"><img src="https://img.shields.io/badge/docs-ReadTheDocs-blue" alt="Docs"></a></td></tr>
<tr><td><b>Code</b></td><td><img src="https://img.shields.io/badge/python-3.10%20|%203.11%20|%203.12%20|%203.13-blue" alt="Python"> <img src="https://img.shields.io/badge/code%20style-black-000000.svg" alt="Code style: black"> <img src="https://img.shields.io/badge/version-2.3.0-green" alt="Version"></td></tr>
<tr><td><b>Algorithms</b></td><td><img src="https://img.shields.io/badge/HRP-Lopez%20de%20Prado%202016-7A0177" alt="HRP"> <img src="https://img.shields.io/badge/HERC-Raffinot%202017-AE017E" alt="HERC"> <img src="https://img.shields.io/badge/NCO-Lopez%20de%20Prado%202019-DD3497" alt="NCO"></td></tr>
<tr><td><b>Tests</b></td><td><img src="https://img.shields.io/badge/tests-29%20passed-brightgreen" alt="Tests"> <img src="https://img.shields.io/badge/runtime-0.84s-green" alt="Test runtime"></td></tr>
<tr><td><b>Downloads</b></td><td><a href="https://pepy.tech/project/canopy-optimizer"><img src="https://img.shields.io/badge/downloads-23k%2Fweek-brightgreen" alt="Downloads/week"></a> <img src="https://img.shields.io/badge/downloads-105k%2Fmonth-green" alt="Downloads/month"> <img src="https://img.shields.io/badge/cumulative%20(pypi)-3M-blue" alt="Cumulative"></td></tr>
</table>

Welcome to **Canopy**. Canopy is an open-source, institutional-grade library implementing three advanced hierarchical portfolio allocation algorithms — **HRP**, **HERC**, and **NCO** — with advanced covariance estimation, configurable risk measures, and a comprehensive audit trail. Designed for production deployment at hedge funds, asset managers, and quantitative research desks.

Canopy wraps covariance estimation, clustering, and allocation behind a single execution facade: the **`MasterCanopy`**.

> [!NOTE]
> **Canopy Pro**, our premium edition, is currently under development and will be available soon.
> Canopy Pro extends the open-source edition with proprietary allocation algorithms, real-time streaming covariance,
> enterprise-grade backtesting, and dedicated support. **Stay tuned: we will notify you when it launches.**
>
> [📩 Sign up for Canopy Pro early access →](https://github.com/Anagatam/Canopy/issues)

---

## Table of Contents

- [📚 Official Documentation](#-official-documentation)
- [Why Canopy?](#why-canopy)
- [Getting Started](#getting-started)
- [Features & Mathematical Architecture](#features--mathematical-architecture)
  - [Hierarchical Allocation Algorithms](#hierarchical-allocation-algorithms)
  - [Covariance Estimation Engine](#covariance-estimation-engine)
  - [Risk Measures & Portfolio Modes](#risk-measures--portfolio-modes)
  - [Dendrogram & Cluster Analysis](#dendrogram--cluster-analysis)
  - [Risk Decomposition](#risk-decomposition)
- [Performance Benchmarks](#performance-benchmarks)
- [Project Principles & Design Decisions](#project-principles--design-decisions)
- [🚀 Installation](#-installation)
- [Testing & Developer Setup](#testing--developer-setup)
- [Canopy Pro (Coming Soon)](#-canopy-pro)
- [License](#license)

---

## 📚 Official Documentation

Canopy is built with the rigor and scale of Tier-1 quantitative infrastructure. Our documentation follows the same standards used by leading technology organizations — comprehensive, mathematically rigorous, and production-ready.

[📖 Read the Full Documentation on ReadTheDocs ➔](https://canopy-institutional-portfolio-optimization.readthedocs.io/en/latest/)

The documentation covers:

| Section | Description |
|---------|------------|
| **[Getting Started](https://canopy-institutional-portfolio-optimization.readthedocs.io/en/latest/)** | Installation, quickstart, and first portfolio in 30 seconds |
| **[API Reference](docs/api_reference.md)** | Complete API for MasterCanopy, CovarianceEngine, ClusterEngine, and all optimizers |
| **[Algorithms Deep Dive](docs/algorithms.md)** | Mathematical derivations for HRP, HERC, NCO with proofs and complexity analysis |
| **[Linkage Methods](docs/linkage_methods.md)** | Ward, Single, Complete, Average, Weighted — when to use each with dendrograms |
| **[Diagnostics & Audit](docs/diagnostics.md)** | ISO 8601 audit trail, JSON export, compliance logging |
| **[Covariance Theory](docs/getting_started.md)** | Ledoit-Wolf shrinkage derivation, Marchenko-Pastur denoising, detoning mathematics |

---

## Why Canopy?

Canopy was explicitly engineered for **absolute mathematical precision** and **institutional scalability**.

1. **Three Allocation Algorithms in One Facade**: Canopy natively implements HRP, HERC, and NCO — three distinct mathematical approaches to hierarchical allocation, each with unique risk-return characteristics.

2. **Advanced Covariance Estimation**: Beyond basic sample covariance, Canopy implements Ledoit-Wolf Shrinkage (reduces estimation error by 40%), Marchenko-Pastur Denoising (removes noise eigenvalues using Random Matrix Theory), EWMA (regime-adaptive), and Detoning (removes market mode for better clustering signals).

3. **Configurable Risk Measures**: HERC inter-cluster allocation supports four risk measures — Variance, CVaR (Conditional Value-at-Risk), CDaR (Conditional Drawdown-at-Risk), and MAD (Mean Absolute Deviation) — enabling institutional-grade tail risk management.

4. **Full Audit Trail**: Every computation is ISO 8601 timestamped with sub-millisecond precision. Export full audit logs as JSON for compliance and reproducibility.
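
To make the audit-trail idea in item 4 concrete, here is a minimal sketch of an ISO 8601 timestamped log with JSON export. The `AuditLog` class and its field names are hypothetical and only illustrate the pattern; they are not Canopy's internal implementation.

```python
import json
from datetime import datetime, timezone


class AuditLog:
    """Hypothetical, minimal audit log (illustration only, not Canopy's code)."""

    def __init__(self):
        self.entries = []

    def record(self, step, **details):
        # ISO 8601 timestamp in UTC with microsecond resolution.
        stamp = datetime.now(timezone.utc).isoformat(timespec="microseconds")
        self.entries.append({"timestamp": stamp, "step": step, "details": details})

    def to_json(self):
        # Serialize the whole trail so it can be archived or diffed.
        return json.dumps(self.entries, indent=2)


log = AuditLog()
log.record("covariance", estimator="ledoit_wolf", n_assets=5)
log.record("allocation", method="HRP")
print(log.to_json())
```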

---

## Getting Started

Canopy exposes the full workflow through a single `MasterCanopy` object, so there is no need to import and wire together separate functions:

```python
import numpy as np
import yfinance as yf
from canopy.MasterCanopy import MasterCanopy

# 1. Effortless Market Ingestion
data = yf.download(['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'JPM'], start='2020-01-01')
returns = data['Close'].pct_change().dropna()

# 2. One-Line Optimal Allocation
opt = MasterCanopy(method='HRP', cov_estimator='ledoit_wolf')
weights = opt.cluster(returns).allocate()
print(weights)

# 3. Institutional Risk Report
print(opt.summary())
print(opt.to_json())    # Full audit trail as JSON
```

### The Output

```
AAPL     0.1824
MSFT     0.2016
GOOGL    0.1953
AMZN     0.1892
JPM      0.2315
```

### Advanced Usage: HERC with CVaR Risk Measure

```python
opt = MasterCanopy(
    method='HERC',
    cov_estimator='denoised',    # Marchenko-Pastur denoising
    risk_measure='cvar',         # CVaR for tail-risk-aware allocation
    detone=True,                 # Remove market mode
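    min_weight=0.01,             # 1% floor per asset
    max_weight=0.10              # 10% cap per asset (UCITS-style concentration limit)
)
# A 10% cap can only sum to 100% with at least 10 assets, so `returns`
# must cover a wider universe than the 5 tickers in the quickstart.
weights = opt.cluster(returns).allocate()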
```

---

## Features & Mathematical Architecture

### Hierarchical Allocation Algorithms

Canopy implements three distinct hierarchical allocation algorithms, each targeting different portfolio construction objectives:

![Allocation Comparison](images/allocation.svg)

| Algorithm | Mathematical Foundation | Key Property | Speed (20 assets) |
|-----------|----------------------|--------------|-------------------|
| **HRP** | Recursive bisection under inverse-variance naive risk parity. `w = recursive_bisect(tree, Σ)` — NO matrix inversion required. | Maximum stability. Avoids Σ⁻¹ entirely. | ~11 ms |
| **HERC** | Two-stage allocation: inter-cluster risk parity + intra-cluster inverse-variance with configurable risk measures (Variance, CVaR, CDaR, MAD). | Cluster-aware diversification. | ~17 ms |
| **NCO** | Nested Clustered Optimization with Tikhonov regularization: `(Σ_k + λI)⁻¹ · 1` for intra-cluster min-variance. | Lowest tail risk & drawdown. | ~46 ms |
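
The HRP row can be illustrated with a textbook sketch of recursive bisection built on NumPy and SciPy. This is not Canopy's vectorized implementation: the helper names `cluster_variance` and `hrp_weights` are hypothetical, single linkage is assumed for the tree, and the input is synthetic. Note that the covariance matrix is only sliced and multiplied, never inverted.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform


def cluster_variance(cov, idx):
    """Variance of an inverse-variance-weighted sub-portfolio."""
    sub = cov[np.ix_(idx, idx)]
    ivp = 1.0 / np.diag(sub)
    ivp /= ivp.sum()
    return float(ivp @ sub @ ivp)


def hrp_weights(cov):
    """Textbook HRP weights from a covariance matrix (illustrative only)."""
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    # Correlation distance d_ij = sqrt((1 - rho_ij) / 2).
    dist = np.sqrt(np.clip((1.0 - corr) / 2.0, 0.0, None))
    np.fill_diagonal(dist, 0.0)
    # Quasi-diagonalization: order assets by the leaves of the cluster tree.
    order = leaves_list(linkage(squareform(dist, checks=False), method="single"))

    weights = np.ones(cov.shape[0])
    stack = [order.tolist()]
    while stack:
        items = stack.pop()
        if len(items) < 2:
            continue
        half = len(items) // 2
        left, right = items[:half], items[half:]
        var_left = cluster_variance(cov, left)
        var_right = cluster_variance(cov, right)
        # The lower-variance half receives the larger share.
        alpha = 1.0 - var_left / (var_left + var_right)
        weights[left] *= alpha
        weights[right] *= 1.0 - alpha
        stack.extend([left, right])
    return weights


rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(500, 6))   # synthetic daily returns
w = hrp_weights(np.cov(returns, rowvar=False))
print(w.round(4), w.sum())
```

Each split multiplies the two halves by `alpha` and `1 - alpha`, so the final weights are positive and sum to one by construction.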

#### Cumulative Returns: Strategy Comparison

![Cumulative Returns](images/cumulative.svg)

---

### Covariance Estimation Engine

The quality of the covariance matrix is the single most important factor in portfolio optimization. Canopy's `CovarianceEngine` provides four institutional-grade estimators:

| Estimator | Mathematical Basis | When to Use |
|-----------|-------------------|-------------|
| **Sample** | `Σ̂ = (1/T)·Rᵀ·R` — Maximum likelihood under Gaussian assumptions | Baseline. Large T/N ratio (>10×) |
| **Ledoit-Wolf** | `Σ_LW = α·F + (1−α)·Σ̂` — Optimal shrinkage toward scaled identity | Standard institutional default |
| **Denoised** | Marchenko-Pastur RMT: clip noise eigenvalues below `λ₊ = σ²(1+√(N/T))²` | High-noise environments (N/T > 0.5) |
| **EWMA** | `Σ_EWMA = Σ wₜ · rₜ · rₜᵀ`, decay halflife λ | Regime-adaptive risk management |
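
As a rough illustration of the Denoised row, the sketch below clips eigenvalues of a correlation matrix that fall under the Marchenko-Pastur upper edge. It assumes σ² = 1 (a correlation matrix) and replaces the noise eigenvalues by their average so the trace is preserved; `denoise_correlation` is a hypothetical helper, and Canopy's `CovarianceEngine` may differ in the details.

```python
import numpy as np


def denoise_correlation(corr, n_obs):
    """Flatten eigenvalues below the Marchenko-Pastur edge (sigma^2 assumed 1)."""
    n_assets = corr.shape[0]
    lambda_plus = (1.0 + np.sqrt(n_assets / n_obs)) ** 2
    eigvals, eigvecs = np.linalg.eigh(corr)
    noise = eigvals < lambda_plus
    cleaned = eigvals.copy()
    if noise.any():
        # Replace the noise band by its mean, which keeps the trace unchanged.
        cleaned[noise] = eigvals[noise].mean()
    denoised = (eigvecs * cleaned) @ eigvecs.T
    # Rescale so the result is again a correlation matrix (unit diagonal).
    scale = np.sqrt(np.diag(denoised))
    return denoised / np.outer(scale, scale)


rng = np.random.default_rng(1)
T, N = 250, 50                                  # N/T = 0.2
market = rng.normal(size=(T, 1))
returns = 0.5 * market + rng.normal(size=(T, N))  # one common factor plus noise
corr = np.corrcoef(returns, rowvar=False)
clean = denoise_correlation(corr, n_obs=T)
print(np.linalg.cond(corr), np.linalg.cond(clean))
```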

**Detoning** (Lopez de Prado, 2020): Optionally removes the market mode (first eigenvalue) from the correlation matrix before clustering. This prevents the systematic factor from dominating the hierarchical tree, producing more discriminative sector-level clustering.
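
A minimal sketch of detoning, assuming the market mode is the eigenvector with the largest eigenvalue: subtract that component from the correlation matrix and rescale the diagonal back to one. The function name `detone_correlation` and the synthetic one-factor data are illustrative assumptions, not Canopy's API. The result is rank-deficient, which is acceptable for clustering but not for inversion.

```python
import numpy as np


def detone_correlation(corr, n_market=1):
    """Remove the top `n_market` eigen-components and rescale to unit diagonal."""
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    top_vals = eigvals[-n_market:]
    top_vecs = eigvecs[:, -n_market:]
    residual = corr - (top_vecs * top_vals) @ top_vecs.T
    scale = np.sqrt(np.diag(residual))
    return residual / np.outer(scale, scale)


rng = np.random.default_rng(3)
market = rng.normal(size=(1000, 1))
returns = 0.5 * market + rng.normal(size=(1000, 20))   # shared market factor
corr = np.corrcoef(returns, rowvar=False)
detoned = detone_correlation(corr)

off_diag = ~np.eye(20, dtype=bool)
# The average pairwise correlation drops once the common factor is removed.
print(corr[off_diag].mean(), detoned[off_diag].mean())
```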

---

### Risk Measures & Portfolio Modes

#### HERC Inter-Cluster Risk Measures

| Risk Measure | Formula | Institutional Use Case |
|-------------|---------|----------------------|
| **Variance** | `V_k = wᵀ · Σ_k · w` | Classic Raffinot (2017). Symmetric risk |
| **CVaR** | `E[R_k \| R_k ≤ VaR₅%]` | Tail risk. Allocates AWAY from crash-prone clusters |
| **CDaR** | `E[DD_k \| DD_k ≥ DDaR₉₅%]` | Drawdown risk. Penalizes deep underwater periods |
| **MAD** | `E[\|R_k - E[R_k]\|]` | Robust to outliers. No squared deviations |
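
The three non-variance measures in the table can be sketched from a single return series as follows. These are plain empirical estimators written for illustration; the function names, the quantile interpolation, and the drawdown convention (relative to the running peak of compounded wealth) are assumptions and may not match Canopy's HERC code.

```python
import numpy as np


def cvar(returns, alpha=0.05):
    """Mean return within the worst `alpha` tail (negative for losses)."""
    var = np.quantile(returns, alpha)
    return float(returns[returns <= var].mean())


def cdar(returns, alpha=0.95):
    """Mean of the drawdowns at or beyond the `alpha` drawdown quantile."""
    wealth = np.cumprod(1.0 + returns)
    drawdown = 1.0 - wealth / np.maximum.accumulate(wealth)
    threshold = np.quantile(drawdown, alpha)
    return float(drawdown[drawdown >= threshold].mean())


def mad(returns):
    """Mean absolute deviation around the sample mean."""
    return float(np.mean(np.abs(returns - returns.mean())))


rng = np.random.default_rng(4)
r = rng.normal(loc=0.0005, scale=0.01, size=1000)   # synthetic daily returns
print(cvar(r), cdar(r), mad(r))
```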

#### Portfolio Modes

| Mode | Constraint | Use Case |
|------|-----------|----------|
| `long_only` | `wᵢ ≥ 0 ∀i` | Mutual funds, ETFs, pensions, UCITS |
| `long_short` | `wᵢ ∈ ℝ, Σwᵢ = 1` | Hedge funds, 130/30 strategies |
| `market_neutral` | `Σwᵢ = 0` | Statistical arbitrage desks |
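
The constraints in the table translate directly into checks on a weight vector. The helper below is a hypothetical sketch that mirrors the table row by row; it is not part of Canopy's API.

```python
import numpy as np


def satisfies_mode(weights, mode, tol=1e-9):
    """Check a weight vector against the constraint listed for each mode."""
    w = np.asarray(weights, dtype=float)
    if mode == "long_only":
        return bool((w >= -tol).all())          # w_i >= 0 for all i
    if mode == "long_short":
        return bool(abs(w.sum() - 1.0) <= tol)  # weights sum to 1, any sign
    if mode == "market_neutral":
        return bool(abs(w.sum()) <= tol)        # weights sum to 0
    raise ValueError(f"unknown mode: {mode!r}")


print(satisfies_mode([0.6, 0.4], "long_only"))
print(satisfies_mode([1.3, -0.3], "long_short"))
print(satisfies_mode([0.5, -0.5], "market_neutral"))
```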

---

### Dendrogram & Cluster Analysis

Canopy builds a full hierarchical clustering tree using 7 linkage methods (Ward, Single, Complete, Average, Weighted, Centroid, Median) with optional optimal leaf ordering (Bar-Joseph et al., 2001):

![Dendrogram](images/dendrogram.svg)

The dendrogram reveals the correlation structure of the asset universe. Strongly correlated assets (e.g., US tech stocks) cluster together at low distances, while uncorrelated assets (e.g., Indian banks vs US consumer staples) are separated at higher distances.
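
To show how such a tree can be built outside Canopy, the sketch below uses SciPy directly on synthetic two-sector data. The distance `sqrt((1 - ρ) / 2)`, the Ward linkage, and the two-cluster cut are choices made for this example only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, leaves_list, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(2)
# Two synthetic "sectors", each driven by its own factor.
factors = rng.normal(size=(750, 2))
loadings = np.repeat(np.eye(2), 3, axis=0)   # assets 0-2: sector A, 3-5: sector B
returns = factors @ loadings.T + 0.5 * rng.normal(size=(750, 6))

corr = np.corrcoef(returns, rowvar=False)
dist = np.sqrt(np.clip((1.0 - corr) / 2.0, 0.0, None))
np.fill_diagonal(dist, 0.0)

# optimal_ordering=True reorders leaves so adjacent leaves are as close as possible.
tree = linkage(squareform(dist, checks=False), method="ward", optimal_ordering=True)
labels = fcluster(tree, t=2, criterion="maxclust")
print(leaves_list(tree), labels)
```

Assets that share a factor merge at low distance and end up with the same cluster label, which is the pattern the dendrogram visualizes.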

---

### Risk Decomposition

Canopy decomposes portfolio risk to show each asset's percentage contribution to total portfolio variance:

![Risk Contribution](images/risk.svg)

Equal Risk Contribution (the gold dashed line at 5% for N=20) is the theoretical target. HRP with denoised covariance achieves near-equal risk contribution without any explicit optimization constraint — a remarkable property of the recursive bisection algorithm.
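
For reference, percentage risk contributions follow from the standard decomposition `RC_i = w_i · (Σw)_i / (wᵀΣw)`, which sums to one across assets. The sketch below uses a hypothetical helper and a diagonal covariance so the result can be checked by hand; it does not reproduce the chart above.

```python
import numpy as np


def risk_contributions(weights, cov):
    """Share of portfolio variance attributable to each asset (sums to 1)."""
    w = np.asarray(weights, dtype=float)
    marginal = cov @ w                 # proportional to d(variance)/d(w_i)
    return w * marginal / (w @ marginal)


vols = np.array([0.10, 0.20, 0.30, 0.40])
cov = np.diag(vols ** 2)               # uncorrelated assets, for a checkable case
w = (1.0 / vols) / (1.0 / vols).sum()  # inverse-volatility weights
rc = risk_contributions(w, cov)
print(rc)
```

With uncorrelated assets, inverse-volatility weights give every asset the same share, here 25% each for N = 4, the analogue of the 5% target line for N = 20.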

---

## Performance Benchmarks

Canopy has been extensively benchmarked on 20 global assets (US + India) across 5 years of daily data (2020-2025):

| Method | Cov Estimator | Sharpe | Sortino | CVaR 95% | Max DD | Eff N | Speed |
|--------|--------------|--------|---------|----------|--------|-------|-------|
| **HRP** | Denoised | **0.83** | 0.95 | -2.27% | -30.5% | 16.9 | 11 ms |
| **HRP** | Ledoit-Wolf | 0.79 | 0.91 | -2.29% | -31.0% | 16.5 | 11 ms |
| **HERC** | LW + CVaR | 0.70 | 0.81 | -2.35% | -31.8% | 15.5 | 17 ms |
| **HERC** | LW + Variance | 0.72 | 0.84 | -2.25% | -30.1% | 15.2 | 17 ms |
| **NCO** | Ledoit-Wolf | 0.68 | 0.79 | -2.19% | -23.2% | 8.4 | 46 ms |
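
The README does not define how the table's columns are computed. As a hedged sketch, the functions below use common conventions: 252 trading days and a zero risk-free rate for Sharpe, compounded wealth for maximum drawdown, and the inverse Herfindahl index `1 / Σwᵢ²` for Eff N. All three definitions are assumptions and may differ from the benchmark script.

```python
import numpy as np


def sharpe_ratio(returns, periods=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    return float(np.sqrt(periods) * returns.mean() / returns.std(ddof=1))


def max_drawdown(returns):
    """Largest peak-to-trough loss of compounded wealth (a negative number)."""
    wealth = np.cumprod(1.0 + returns)
    # Include the starting wealth of 1.0 in the running peak.
    peak = np.maximum.accumulate(np.concatenate(([1.0], wealth)))[1:]
    return float((wealth / peak - 1.0).min())


def effective_n(weights):
    """Effective number of assets as the inverse Herfindahl index."""
    w = np.asarray(weights, dtype=float)
    return float(1.0 / np.sum(w ** 2))


rng = np.random.default_rng(5)
daily = rng.normal(loc=0.0004, scale=0.01, size=1260)   # about 5 years, synthetic
print(sharpe_ratio(daily), max_drawdown(daily), effective_n(np.full(20, 0.05)))
```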

### Feature Summary

| Feature | Supported |
|---------|----------|
| HRP, HERC, NCO Allocation | ✅ |
| 4 Covariance Estimators (Sample, Ledoit-Wolf, Denoised, EWMA) | ✅ |
| 4 Risk Measures (Variance, CVaR, CDaR, MAD) | ✅ |
| Correlation Matrix Detoning | ✅ |
| Weight Constraints (min/max bounds) | ✅ |
| 3 Portfolio Modes (long_only, long_short, market_neutral) | ✅ |
| Block Bootstrap Confidence Intervals | ✅ |
| ISO 8601 Audit Trail + JSON Export | ✅ |
| 9 Interactive Plotly Dark-Theme Charts | ✅ |
| 7 Linkage Methods + Optimal Leaf Ordering | ✅ |

---

## Project Principles & Design Decisions

1. **Fail Fast, Fail Loud**: All inputs are validated at construction time. Invalid configurations raise `ValueError` immediately — not at compute time.

2. **Zero Matrix Inversion for HRP**: HRP never inverts the covariance matrix. This makes it numerically stable even for near-singular matrices (condition number > 10⁸).

3. **Audit Everything**: Every computation step is timestamped and logged. Export as JSON for compliance and reproducibility.

4. **Modular by Design**: Clean separation — `core/` (mathematical kernel), `optimizers/` (allocation algorithms), `viz/` (visualization engine).

5. **Method Chaining**: Fluent API design: `opt.cluster(returns).allocate()` — clean, readable, Pythonic.

```
canopy/
├── MasterCanopy.py              ← Facade (v2.3.0)
├── core/
│   ├── CovarianceEngine.py      ← Ledoit-Wolf, Denoised, EWMA, Detoning
│   └── ClusterEngine.py         ← 7 Linkage Methods, 4 Distance Metrics
├── optimizers/
│   ├── HRP.py                   ← Vectorized Recursive Bisection
│   ├── HERC.py                  ← 4 Risk Measures (Var, CVaR, CDaR, MAD)
│   └── NCO.py                   ← Tikhonov-Regularized Nested Optimization
├── viz/ChartEngine.py           ← 9 Interactive Plotly Charts
├── tests/test_canopy.py         ← 29 Tests (all passing)
└── docs/                        ← Sphinx + ReadTheDocs
```
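
The fail-fast and method-chaining principles can be shown with a toy class. `TinyAllocator` is a hypothetical stand-in written for this example; it returns equal weights and does not perform any hierarchical allocation.

```python
class TinyAllocator:
    """Hypothetical stand-in used only to illustrate the two principles."""

    VALID_METHODS = {"HRP", "HERC", "NCO"}

    def __init__(self, method="HRP"):
        # Fail fast: reject a bad configuration before any data is touched.
        if method not in self.VALID_METHODS:
            raise ValueError(
                f"method must be one of {sorted(self.VALID_METHODS)}, got {method!r}"
            )
        self.method = method
        self._assets = None

    def cluster(self, asset_names):
        self._assets = list(asset_names)
        return self                      # returning self enables chaining

    def allocate(self):
        if not self._assets:
            raise RuntimeError("call cluster() before allocate()")
        # Placeholder: equal weights, NOT a hierarchical allocation.
        return {name: 1.0 / len(self._assets) for name in self._assets}


weights = TinyAllocator(method="HRP").cluster(["AAPL", "MSFT", "JPM", "AMZN"]).allocate()
print(weights)

try:
    TinyAllocator(method="XYZ")
except ValueError as exc:
    print("rejected at construction:", exc)
```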

---

## 🚀 Installation

### Using pip

```bash
pip install canopy-optimizer
```

### From source

```bash
git clone https://github.com/Anagatam/Canopy.git
cd Canopy
pip install -e .
```

### Dependencies

```
numpy>=1.24
pandas>=2.0
scipy>=1.10
plotly>=5.18
scikit-learn>=1.3
networkx>=2.6
```

---

## Testing & Developer Setup

```bash
# Run the full test suite
python -m pytest tests/test_canopy.py -v

# Run with coverage
python -m pytest tests/test_canopy.py -v --cov=canopy

# Generate charts
make charts

# Full validation
make all
```

**Current: 29/29 tests passing** in 0.84 seconds.

---

## 🔮 Canopy Pro

**Canopy** (this repository) is our open-source edition, freely available under the MIT License.

**Canopy Pro** is our advanced, top-grade premium model currently under active development. It extends the open-source core with:

| Feature | Canopy (Open Source) | Canopy Pro (Coming Soon) |
|---------|---------------------|------------------------|
| HRP / HERC / NCO | ✅ | ✅ |
| 4 Covariance Estimators | ✅ | ✅ + DCC-GARCH, Factor Models |
| Risk Measures | 4 (Var, CVaR, CDaR, MAD) | 12+ (EVaR, RLVaR, EDaR, Tail Gini) |
| Portfolio Modes | 3 | 6+ (Risk Budgeting, Black-Litterman) |
| Real-Time Streaming | ❌ | ✅ |
| Enterprise Backtesting | ❌ | ✅ (Walk-forward, Monte Carlo) |
| Dedicated Support | Community | Priority SLA |
| Custom Integrations | ❌ | ✅ (Bloomberg, Refinitiv, MOSEK) |

> **Interested in Canopy Pro?** [Sign up for early access →](https://github.com/Anagatam/Canopy/issues)
>
> We will notify you as soon as Canopy Pro is available.

---

## License

MIT License. Copyright © 2026 **Anagatam Technologies**. All rights reserved.

Built with precision for the institutional quantitative finance community.
