Metadata-Version: 2.4
Name: ml4t-diagnostic
Version: 0.1.0a10
Summary: Comprehensive diagnostic and evaluation framework for quantitative finance ML workflows
Project-URL: Homepage, https://github.com/ml4t/diagnostic
Project-URL: Documentation, https://ml4t-diagnostic.readthedocs.io
Project-URL: Repository, https://github.com/ml4t/diagnostic
Project-URL: Issues, https://github.com/ml4t/diagnostic/issues
Project-URL: Changelog, https://github.com/ml4t/diagnostic/blob/main/CHANGELOG.md
Author-email: QuantLab Team <info@quantlab.io>
Maintainer-email: QuantLab Contributors <dev@quantlab.io>
License: MIT
License-File: LICENSE
Keywords: backtesting,cross-validation,embargo,finance,machine-learning,polars,purging,quantitative-finance,statistical-tests,trading,validation
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Office/Business :: Financial :: Investment
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: arch>=7.2.0
Requires-Dist: jinja2>=3.1.0
Requires-Dist: joblib>=1.3.0
Requires-Dist: numba>=0.57.0
Requires-Dist: numpy<2.0.0,>=1.24.0
Requires-Dist: pandas-market-calendars>=4.0.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: polars>=0.20.0
Requires-Dist: pyarrow>=14.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: scikit-learn>=1.3.0
Requires-Dist: scipy<1.16.0,>=1.10.0
Requires-Dist: shap<0.50.0,>=0.41.0
Requires-Dist: statsmodels>=0.14.0
Requires-Dist: tqdm>=4.66.0
Provides-Extra: advanced
Requires-Dist: arch>=6.0.0; extra == 'advanced'
Provides-Extra: all
Requires-Dist: hypothesis>=6.80.0; extra == 'all'
Requires-Dist: ipdb>=0.13.0; extra == 'all'
Requires-Dist: ipython>=8.14.0; extra == 'all'
Requires-Dist: kaleido>=0.2.0; extra == 'all'
Requires-Dist: lightgbm>=4.0.0; extra == 'all'
Requires-Dist: matplotlib>=3.7.0; extra == 'all'
Requires-Dist: myst-parser>=2.0.0; extra == 'all'
Requires-Dist: nbsphinx>=0.9.0; extra == 'all'
Requires-Dist: plotly>=5.15.0; extra == 'all'
Requires-Dist: pre-commit>=3.3.0; extra == 'all'
Requires-Dist: pypdf>=5.0.0; extra == 'all'
Requires-Dist: pytest-benchmark>=4.0.0; extra == 'all'
Requires-Dist: pytest-cov>=4.1.0; extra == 'all'
Requires-Dist: pytest-timeout>=2.1.0; extra == 'all'
Requires-Dist: pytest-xdist>=3.3.0; extra == 'all'
Requires-Dist: pytest>=7.4.0; extra == 'all'
Requires-Dist: ruff>=0.1.0; extra == 'all'
Requires-Dist: seaborn>=0.12.0; extra == 'all'
Requires-Dist: sphinx-autodoc-typehints>=1.24.0; extra == 'all'
Requires-Dist: sphinx-rtd-theme>=1.3.0; extra == 'all'
Requires-Dist: sphinx>=7.0.0; extra == 'all'
Requires-Dist: streamlit>=1.28.0; extra == 'all'
Requires-Dist: ty; extra == 'all'
Requires-Dist: wandb>=0.16.0; extra == 'all'
Requires-Dist: xgboost>=2.0.0; extra == 'all'
Provides-Extra: all-ml
Requires-Dist: cupy-cuda11x>=11.0.0; extra == 'all-ml'
Requires-Dist: lightgbm>=4.0.0; extra == 'all-ml'
Requires-Dist: tensorflow>=2.0.0; extra == 'all-ml'
Requires-Dist: xgboost>=2.0.0; extra == 'all-ml'
Provides-Extra: dashboard
Requires-Dist: streamlit>=1.28.0; extra == 'dashboard'
Provides-Extra: deep
Requires-Dist: tensorflow>=2.0.0; extra == 'deep'
Provides-Extra: dev
Requires-Dist: hypothesis>=6.80.0; extra == 'dev'
Requires-Dist: ipdb>=0.13.0; extra == 'dev'
Requires-Dist: ipython>=8.14.0; extra == 'dev'
Requires-Dist: pre-commit>=3.3.0; extra == 'dev'
Requires-Dist: pytest-benchmark>=4.0.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.1.0; extra == 'dev'
Requires-Dist: pytest-timeout>=2.1.0; extra == 'dev'
Requires-Dist: pytest-xdist>=3.3.0; extra == 'dev'
Requires-Dist: pytest>=7.4.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Requires-Dist: ty; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-gen-files>=0.5.0; extra == 'docs'
Requires-Dist: mkdocs-literate-nav>=0.6.0; extra == 'docs'
Requires-Dist: mkdocs-material>=9.5.0; extra == 'docs'
Requires-Dist: mkdocs>=1.5.0; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=0.24.0; extra == 'docs'
Provides-Extra: gpu
Requires-Dist: cupy-cuda11x>=11.0.0; extra == 'gpu'
Provides-Extra: integration
Provides-Extra: ml
Requires-Dist: lightgbm>=4.0.0; extra == 'ml'
Requires-Dist: xgboost>=2.0.0; extra == 'ml'
Provides-Extra: tracking
Requires-Dist: wandb>=0.16.0; extra == 'tracking'
Provides-Extra: viz
Requires-Dist: kaleido>=0.2.0; extra == 'viz'
Requires-Dist: matplotlib>=3.7.0; extra == 'viz'
Requires-Dist: plotly>=5.15.0; extra == 'viz'
Requires-Dist: pypdf>=5.0.0; extra == 'viz'
Requires-Dist: seaborn>=0.12.0; extra == 'viz'
Description-Content-Type: text/markdown

# ml4t-diagnostic

[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![PyPI](https://img.shields.io/pypi/v/ml4t-diagnostic)](https://pypi.org/project/ml4t-diagnostic/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

Statistical validation and diagnostics for quantitative trading strategies: signal analysis, backtest evaluation, and overfitting detection.

## Part of the ML4T Library Ecosystem

This library is one of five interconnected libraries supporting the machine learning for trading workflow described in [Machine Learning for Trading](https://mlfortrading.io):

![ML4T Library Ecosystem](docs/images/ml4t_ecosystem_workflow_print.jpeg)

Each library addresses a distinct stage: data infrastructure, feature engineering, signal evaluation, strategy backtesting, and live deployment.

## What This Library Does

Evaluating whether a signal or strategy has genuine predictive power requires statistical rigor. ml4t-diagnostic provides:

- Information coefficient (IC) analysis with HAC-adjusted standard errors
- Deflated Sharpe Ratio (DSR) and other multiple-testing corrections
- Combinatorial purged cross-validation (CPCV) for time series
- Feature importance analysis (MDI, PFI, MDA, SHAP)
- Trade-level diagnostics with SHAP-based error pattern discovery
- Portfolio performance metrics and tear sheets

The library implements methods from the academic finance literature, particularly those addressing backtest overfitting and false discovery in strategy research.

![ml4t-diagnostic Architecture](docs/images/ml4t_diagnostic_architecture_print.jpeg)

## Installation

```bash
pip install ml4t-diagnostic
```

Optional dependencies:

```bash
pip install "ml4t-diagnostic[ml]"   # LightGBM, XGBoost
pip install "ml4t-diagnostic[viz]"  # Plotly, Matplotlib, Seaborn, static image/PDF export
pip install "ml4t-diagnostic[all]"  # ml, viz, dashboard, tracking, and dev tooling (no GPU/TensorFlow extras)
```

## Quick Start

### Signal Analysis

```python
from ml4t.diagnostic import analyze_signal

result = analyze_signal(
    factor=factor_data,  # columns: date, asset, factor
    prices=price_data,   # columns: date, asset, price
    periods=(1, 5, 21),
)

print(f"IC (1D): {result.ic['1D']:.4f}")
print(f"IC t-stat (1D): {result.ic_t_stat['1D']:.2f}")
print(f"Q5-Q1 spread (1D): {result.spread['1D']:.2%}")
```
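
For intuition about the IC values reported above, the following is a minimal standard-library sketch of a single-date rank IC, assuming the IC is the Spearman rank correlation between factor values and forward returns (a common convention; the library's exact definition is not shown here). The helper names `average_ranks`, `pearson`, and `rank_ic` are hypothetical and not part of the ml4t-diagnostic API.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def rank_ic(factor, forward_returns):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(factor), average_ranks(forward_returns))


# Illustrative cross-section for one date (made-up numbers).
ic_aligned = rank_ic([1.0, 2.0, 3.0, 4.0], [0.01, 0.02, 0.03, 0.05])
ic_inverted = rank_ic([1.0, 2.0, 3.0, 4.0], [0.05, 0.03, 0.02, 0.01])
```

In practice this would be computed per date across assets and then averaged over time, which is where autocorrelation-robust standard errors become relevant.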

### Deflated Sharpe Ratio

```python
from ml4t.diagnostic.evaluation.stats import deflated_sharpe_ratio

# Accounts for multiple testing
dsr_result = deflated_sharpe_ratio(
    returns=strategy_returns,
    benchmark_sharpe=0.0,
    n_trials=100,
)

print(f"Sharpe: {dsr_result.sharpe_ratio:.2f}")
print(f"Deflated Sharpe: {dsr_result.deflated_sharpe:.2f}")
print(f"Significant: {dsr_result.is_significant}")
```
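
To make the adjustment concrete, here is a hedged standard-library sketch of the Deflated Sharpe Ratio formula from Bailey and Lopez de Prado (2014). It is not the library's implementation: the function names and the `trial_sharpe_std` parameter (the standard deviation of Sharpe ratios across the trials) are assumptions made for illustration, and the Sharpe ratio is per period, not annualized.

```python
import math
import random
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329


def expected_max_sharpe(n_trials, trial_sharpe_std):
    """Expected maximum Sharpe ratio among n_trials strategies with no skill."""
    if n_trials < 2:
        return 0.0  # a single trial needs no deflation
    z = NormalDist()
    return trial_sharpe_std * (
        (1 - EULER_GAMMA) * z.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * z.inv_cdf(1 - 1 / (n_trials * math.e))
    )


def deflated_sharpe_sketch(returns, n_trials, trial_sharpe_std):
    """Probability that the true Sharpe ratio exceeds the expected maximum under the null."""
    t = len(returns)
    mean = sum(returns) / t
    dev = [r - mean for r in returns]
    sd = math.sqrt(sum(d ** 2 for d in dev) / t)
    skew = sum(d ** 3 for d in dev) / t / sd ** 3
    kurt = sum(d ** 4 for d in dev) / t / sd ** 4  # non-excess kurtosis
    sharpe = mean / sd
    threshold = expected_max_sharpe(n_trials, trial_sharpe_std)
    # Standard error term accounts for skewness and fat tails.
    denom = math.sqrt(1 - skew * sharpe + (kurt - 1) / 4 * sharpe ** 2)
    return NormalDist().cdf((sharpe - threshold) * math.sqrt(t - 1) / denom)


# Synthetic returns for illustration only.
rng = random.Random(42)
demo_returns = [0.001 + 0.01 * rng.gauss(0.0, 1.0) for _ in range(500)]
psr_single = deflated_sharpe_sketch(demo_returns, n_trials=1, trial_sharpe_std=0.05)
dsr_many = deflated_sharpe_sketch(demo_returns, n_trials=100, trial_sharpe_std=0.05)
```

With a single trial the threshold is zero and the result reduces to the Probabilistic Sharpe Ratio; raising `n_trials` raises the threshold and lowers the resulting probability.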

### Feature Importance

```python
from ml4t.diagnostic.evaluation import analyze_ml_importance

# Combines MDI, PFI, MDA, SHAP methods
results = analyze_ml_importance(model, X, y)
print(results.consensus_ranking)
```

### Trade Diagnostics

```python
from ml4t.diagnostic.evaluation import TradeAnalysis, TradeShapAnalyzer

analyzer = TradeAnalysis(trade_records)
worst_trades = analyzer.worst_trades(n=20)

# SHAP-based error pattern discovery
shap_analyzer = TradeShapAnalyzer(model, features_df, shap_values)
result = shap_analyzer.explain_worst_trades(worst_trades)

for pattern in result.error_patterns:
    print(f"Pattern: {pattern.hypothesis}")
    print(f"Potential savings: ${pattern.potential_impact:,.2f}")
```

## Diagnostic Framework

```
Tier 1: Feature Analysis (Pre-Modeling)
├── Time series diagnostics (stationarity, ACF, volatility)
├── Distribution analysis (moments, normality, tails)
├── Feature importance (MDI, PFI, MDA, SHAP)
└── Feature interactions (conditional IC, H-stat)

Tier 2: Signal Analysis (Model Outputs)
├── IC analysis (time series, histogram, decay)
├── Quantile returns (spreads, monotonicity)
├── Turnover analysis
└── Multi-signal comparison

Tier 3: Backtest Analysis (Post-Modeling)
├── Trade analysis (win/loss, holding periods)
├── Statistical validity (DSR, RAS, PBO)
├── Trade-SHAP diagnostics
└── Excursion analysis (TP/SL optimization)

Tier 4: Portfolio Analysis (Production)
├── Performance metrics (Sharpe, Sortino, Calmar)
├── Drawdown analysis
├── Rolling metrics
└── Risk metrics (VaR, CVaR)
```
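
As one example of a Tier 4 metric, the sketch below computes maximum drawdown from a series of simple periodic returns using only the standard library. The function name `max_drawdown` is hypothetical and the library's own drawdown analysis may differ in conventions (for instance sign or compounding).

```python
def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (a negative fraction)."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1 + r
        peak = max(peak, equity)                # running high-water mark
        worst = min(worst, equity / peak - 1)   # drawdown relative to that mark
    return worst


# Made-up return path: the deepest trough follows the first period's peak.
demo_drawdown = max_drawdown([0.10, -0.20, 0.05, -0.10, 0.30])
```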

## Statistical Methods

| Method | Purpose |
|--------|---------|
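| DSR (Deflated Sharpe Ratio) | Adjusts Sharpe ratio significance for the number of trials and non-normal returns |
| CPCV (Combinatorial Purged CV) | Leak-free time series validation over multiple train/test combinations |
| RAS (Rademacher Anti-Serum) | Rademacher-complexity-based correction for backtest overfitting |
| PBO (Probability of Backtest Overfitting) | Estimates how often the best in-sample strategy underperforms out of sample |
| HAC-adjusted IC | Autocorrelation-robust inference on the information coefficient |
| FDR control (Benjamini-Hochberg) | Controls the false discovery rate across multiple comparisons |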
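
As an illustration of the last row, here is a minimal standard-library sketch of the Benjamini-Hochberg step-up procedure. The function name `benjamini_hochberg` is hypothetical and does not refer to the library's API.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True where it is rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    # Reject every hypothesis ranked at or below the cutoff.
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Made-up p-values for ten candidate signals.
demo_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
demo_rejected = benjamini_hochberg(demo_p, alpha=0.05)
```

Because the rule is step-up, a hypothesis whose own p-value misses its threshold can still be rejected when a larger p-value further down the ranking meets its threshold.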

## Cross-Validation

```python
from ml4t.diagnostic.splitters import WalkForwardCV, CombinatorialCV
from ml4t.diagnostic.visualization import plot_cv_folds

# Walk-forward with purging
cv = WalkForwardCV(n_splits=5, train_size=252, test_size=63, purge_days=21)

# Visualize fold structure
fig = plot_cv_folds(cv, dates)
fig.show()
```
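
The idea behind purging can be sketched without the library. The example below is an assumption-laden illustration, not the behavior of `WalkForwardCV`: it places test windows back to back at the end of the sample, treats the purge as a number of rows, and uses the hypothetical function name `purged_walk_forward`.

```python
def purged_walk_forward(n_samples, n_splits, train_size, test_size, purge):
    """Return (train_range, test_range) pairs separated by a gap of `purge` rows."""
    splits = []
    first_test_start = n_samples - n_splits * test_size
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train_end = test_start - purge       # exclusive end; rows in the gap are dropped
        train_start = train_end - train_size
        if train_start < 0:
            raise ValueError("not enough history for the requested windows")
        splits.append(
            (range(train_start, train_end), range(test_start, test_start + test_size))
        )
    return splits


# Same window sizes as the example above, on 1000 rows of hypothetical data.
demo_splits = purged_walk_forward(
    n_samples=1000, n_splits=5, train_size=252, test_size=63, purge=21
)
```

Dropping the rows between the training and test windows keeps labels that span several days (for example 21-day forward returns) from leaking test-period information into training.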

## Technical Characteristics

- **Polars-based**: Native Polars DataFrames throughout
- **HAC standard errors**: Newey-West adjustment for autocorrelated data
- **Time-aware validation**: Purged and embargoed cross-validation splits
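
To show what the Newey-West adjustment does, here is a standard-library sketch of a HAC standard error for the mean of a series such as daily IC values. It uses Bartlett kernel weights; the function name `newey_west_se` and the fixed `max_lag` are assumptions for illustration, and the library's lag selection is not shown here.

```python
import math
import random


def newey_west_se(series, max_lag):
    """Standard error of the mean with Bartlett-weighted autocovariances."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    long_run_var = sum(d * d for d in dev) / n  # lag-0 autocovariance
    for lag in range(1, max_lag + 1):
        gamma = sum(dev[t] * dev[t - lag] for t in range(lag, n)) / n
        weight = 1 - lag / (max_lag + 1)        # Bartlett kernel weight
        long_run_var += 2 * weight * gamma
    return math.sqrt(long_run_var / n)


# Synthetic, positively autocorrelated IC series (AR(1) with coefficient 0.8).
rng = random.Random(0)
demo_ic = [0.0]
for _ in range(499):
    demo_ic.append(0.8 * demo_ic[-1] + rng.gauss(0.0, 0.02))

naive_se = newey_west_se(demo_ic, max_lag=0)  # ignores autocorrelation
hac_se = newey_west_se(demo_ic, max_lag=5)
```

For a positively autocorrelated series the HAC standard error is larger than the naive one, so t-statistics based on it are more conservative.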

## Related Libraries

- **ml4t-data**: Market data acquisition and storage
- **ml4t-engineer**: Feature engineering and technical indicators
- **ml4t-backtest**: Event-driven backtesting
- **ml4t-live**: Live trading with broker integration

## Development

```bash
git clone https://github.com/ml4t/diagnostic.git
cd diagnostic
uv sync
uv run pytest tests/ -q -n auto
uv run ty check
```

## References

- Lopez de Prado, M. (2018). *Advances in Financial Machine Learning*. Wiley.
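- Bailey, D., & Lopez de Prado, M. (2012). "The Sharpe Ratio Efficient Frontier." *Journal of Risk*.
- Bailey, D., & Lopez de Prado, M. (2014). "The Deflated Sharpe Ratio: Correcting for Selection Bias, Backtest Overfitting, and Non-Normality." *Journal of Portfolio Management*.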

## License

MIT License - see [LICENSE](LICENSE) for details.
