Metadata-Version: 2.4
Name: charting-by-machines
Version: 0.2.0
Summary: ML-based portfolio selection from historical price patterns (Murray, Xia, Xiao 2024)
License: MIT
License-File: LICENSE
Keywords: machine-learning,portfolio-selection,finance,deep-learning,technical-analysis
Author: Your Name
Author-email: your.email@example.com
Requires-Python: >=3.10,<4.0
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Dist: hydra-core (>=1.3.0,<2.0.0)
Requires-Dist: loguru (>=0.7.0,<0.8.0)
Requires-Dist: matplotlib (>=3.8.0,<4.0.0)
Requires-Dist: mlflow (>=2.8.0,<3.0.0)
Requires-Dist: numpy (>=1.24.0,<2.0.0)
Requires-Dist: omegaconf (>=2.3.0,<3.0.0)
Requires-Dist: pandas (>=2.0.0,<3.0.0)
Requires-Dist: polars (>=1.0.0,<2.0.0)
Requires-Dist: pyarrow (>=14.0.0,<15.0.0)
Requires-Dist: pydantic (>=2.0.0,<3.0.0)
Requires-Dist: pydantic-settings (>=2.0.0,<3.0.0)
Requires-Dist: rich (>=13.0.0,<14.0.0)
Requires-Dist: scikit-learn (>=1.3.0,<2.0.0)
Requires-Dist: scipy (>=1.11.0,<2.0.0)
Requires-Dist: seaborn (>=0.13.0,<0.14.0)
Requires-Dist: statsmodels (>=0.14.0,<0.15.0)
Requires-Dist: torch (>=2.0.0,<3.0.0)
Requires-Dist: tqdm (>=4.66.0,<5.0.0)
Requires-Dist: typer (>=0.9.0,<0.10.0)
Requires-Dist: yfinance (>=0.2.28,<0.3.0)
Description-Content-Type: text/markdown

# Charting by Machines

[![Tests](https://img.shields.io/badge/tests-47%20passed-brightgreen)](tests/)
[![Coverage](https://img.shields.io/badge/coverage-41%25-yellow)](htmlcov/)
[![Python](https://img.shields.io/badge/python-3.10%2B-blue)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/license-MIT-green)](LICENSE)
[![PyPI](https://img.shields.io/pypi/v/charting-by-machines)](https://pypi.org/project/charting-by-machines/)

A Python package reproducing the ML-based portfolio selection methodology from **"Charting by Machines"** by Murray, Xia, and Xiao (2024, Journal of Financial Economics).

## Overview

This package implements a machine learning approach to testing the efficient market hypothesis: it forecasts stock returns from historical price patterns alone. Following the paper, CNN-LSTM neural networks generate the return forecasts, which the authors report to be strong predictors of the cross-section of future stock returns.

## Key Features

- **Multiple ML Architectures**: FNN, CNN, LSTM, CNN-LSTM
- **Flexible Data Sources**: Yahoo Finance, WRDS (CRSP), local files
- **Portfolio Construction**: Univariate, bivariate, and trivariate quintile sorting
- **Risk Analysis**: Factor models (CAPM, FF3, FF5, Carhart), Sharpe ratios
- **Experiment Tracking**: MLflow integration for reproducibility
- **Multiple Interfaces**: Python API, CLI, Jupyter notebooks
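
To make the Portfolio Construction and Risk Analysis items above concrete, here is a minimal standard-library sketch of a univariate quintile sort, a long-short spread, and an annualized Sharpe ratio. It is an illustration under stated assumptions, not the package's implementation: the function names, tickers, and numbers are all hypothetical, and the package itself may break ties, weight stocks, and annualize differently.

```python
import math
from statistics import mean, stdev


def sort_into_portfolios(forecasts, n_portfolios=5):
    """Map each ticker to a portfolio 1..n_portfolios by forecast rank (1 = lowest)."""
    ranked = sorted(forecasts, key=forecasts.get)
    n = len(ranked)
    return {t: (i * n_portfolios) // n + 1 for i, t in enumerate(ranked)}


def equal_weighted_return(assignments, realized, portfolio):
    """Average realized return of the tickers assigned to one portfolio."""
    return mean(realized[t] for t, p in assignments.items() if p == portfolio)


def annualized_sharpe(monthly_excess_returns):
    """Annualized Sharpe ratio: mean / sample std of monthly excess returns, times sqrt(12)."""
    return mean(monthly_excess_returns) / stdev(monthly_excess_returns) * math.sqrt(12)


# Hypothetical one-month cross-section: model forecast and realized return per ticker.
forecasts = {"A": 0.01, "B": -0.02, "C": 0.03, "D": 0.00, "E": 0.05,
             "F": -0.01, "G": 0.02, "H": 0.04, "I": -0.03, "J": 0.06}
realized = {"A": 0.01, "B": 0.00, "C": 0.01, "D": 0.00, "E": 0.03,
            "F": 0.01, "G": 0.02, "H": 0.03, "I": -0.02, "J": 0.01}

assignments = sort_into_portfolios(forecasts)

# Long the highest-forecast quintile, short the lowest-forecast quintile.
long_short = (equal_weighted_return(assignments, realized, 5)
              - equal_weighted_return(assignments, realized, 1))

# Hypothetical series of monthly long-short excess returns.
sharpe = annualized_sharpe([0.01, 0.03, -0.01, 0.02])
```

Bivariate and trivariate sorts follow the same pattern: sort on a control variable first, then apply the forecast sort within each control group.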

## Installation

```bash
# Using pip
pip install charting-by-machines

# Using poetry (recommended for development)
git clone https://github.com/yourusername/charting-by-machines.git
cd charting-by-machines
poetry install
```

## Quick Start

```python
from cbm import PortfolioEngine

# Initialize the engine
engine = PortfolioEngine()

# Load data (using Yahoo Finance)
engine.load_data(
    tickers=["AAPL", "MSFT", "GOOGL"],  # extend the list as needed, or use universe="sp500"
    start_date="2010-01-01",
    end_date="2023-12-31"
)

# Train the CNN-LSTM model
model_id = engine.train_model(
    architecture="cnn_lstm",
    loss_function="mse",
    weighting="ewpm",
    optimization_period=("2010-01", "2018-12")
)

# Generate forecasts
forecasts = engine.forecast(model_id=model_id)

# Construct portfolios sorted by ML forecasts
portfolios = engine.construct_portfolios(
    forecasts=forecasts,
    n_portfolios=10,
    weighting="value"
)

# Analyze performance
performance = engine.analyze_performance(portfolios)
print(performance.summary())
```

## CLI Usage

```bash
# Train a model
cbm train --config config/default.yaml

# Generate forecasts
cbm forecast --model-id <model_id> --output forecasts.parquet

# Run backtest
cbm backtest --config config/backtest.yaml
```

## Methodology

Following Murray, Xia, and Xiao (2024), the default pipeline is set up as follows:

1. **Input Features**: Uses 12 cumulative monthly returns as input
2. **Neural Network**: CNN-LSTM architecture with MSE loss
3. **Weighting**: Equal-weighted per month (EWPM)
4. **Target Variable**: Normalized excess returns (RetNorm)
5. **Ensemble**: Averages 30 model fits for robust forecasts
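
The steps above can be sketched in a few lines of standard-library Python. This is a hedged illustration, not the package's code: the function names are hypothetical, the cumulative returns are assumed to compound from the first month in the window onward, and the within-month z-score is only a stand-in for RetNorm, whose exact definition is given in the paper.

```python
from collections import Counter
from statistics import mean, pstdev


def cumulative_returns(monthly_returns):
    """Turn a window of monthly simple returns into cumulative returns, one per month."""
    cumulative, growth = [], 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
        cumulative.append(growth - 1.0)
    return cumulative


def ewpm_weights(months):
    """Weight each observation by 1/N_m so that every month carries the same total weight."""
    counts = Counter(months)
    return [1.0 / counts[m] for m in months]


def normalize_cross_section(excess_returns):
    """ASSUMED stand-in for RetNorm: z-score of excess returns within one month."""
    mu, sigma = mean(excess_returns), pstdev(excess_returns)
    return [(r - mu) / sigma for r in excess_returns]


def ensemble_forecast(forecasts_per_fit):
    """Average forecasts across model fits, stock by stock."""
    return [mean(per_stock) for per_stock in zip(*forecasts_per_fit)]


# Hypothetical inputs, shortened for readability (the package uses 12 months and 30 fits).
features = cumulative_returns([0.10, -0.10])
weights = ewpm_weights(["2020-01", "2020-01", "2020-02"])
targets = normalize_cross_section([0.02, -0.01, 0.05])
forecast = ensemble_forecast([[0.01, 0.03], [0.03, 0.01]])
```

With EWPM weights, a month with many stocks contributes no more to the training loss than a month with few, which is why each observation is scaled by the inverse of its month's stock count.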

## Project Structure

```
charting-by-machines/
├── src/cbm/
│   ├── core/           # Configuration, types, main engine
│   ├── data/           # Data adapters, feature engineering
│   ├── ml/             # Neural network models, training
│   ├── portfolio/      # Portfolio construction, analysis
│   ├── api/            # CLI, Python API
│   └── utils/          # Logging, metrics, helpers
├── tests/              # Unit and integration tests
├── config/             # Hydra configuration files
├── examples/           # Jupyter notebooks
└── docs/               # Documentation
```

## Citation

If you use this package in your research, please cite:

```bibtex
@article{murray2024charting,
  title={Charting by Machines: Machine Learning-Based Portfolio Selection from Historical Price Patterns},
  author={Murray, Scott and Xia, Yusen and Xiao, Houping},
  journal={Journal of Financial Economics},
  year={2024}
}
```

## License

MIT License - see [LICENSE](LICENSE) for details.

