Metadata-Version: 2.4
Name: paper-portfolio
Version: 0.1.1
Summary: A portfolio construction and analysis tool for asset-pricing strategies.
Author-email: Lorenzo Varese <55581163+lorenzovarese@users.noreply.github.com>
License-File: LICENSE
Requires-Python: >=3.11
Requires-Dist: matplotlib>=3.10.0
Requires-Dist: polars>=1.30.0
Requires-Dist: pydantic>=2.11.5
Requires-Dist: pyyaml>=6.0.2
Description-Content-Type: text/markdown

# paper-portfolio: Portfolio Construction & Performance Evaluation 📈

[![codecov](https://codecov.io/github/lorenzovarese/paper-asset-pricing/graph/badge.svg?token=ZUDEPEPJFK)](https://codecov.io/github/lorenzovarese/paper-asset-pricing)
[![PyPI version](https://badge.fury.io/py/paper-portfolio.svg)](https://badge.fury.io/py/paper-portfolio)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/release/python-3110/)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

`paper-portfolio` is the final component of the P.A.P.E.R (Platform for Asset Pricing Experimentation and Research) monorepo. It provides a configuration-driven framework for constructing investment portfolios from the predictive outputs of machine learning models and for evaluating their economic significance.

Using the prediction files generated by `paper-model`, this package allows you to backtest various long-short portfolio strategies, calculate standard performance metrics, and generate insightful reports and visualizations.

---

## ✨ Features

*   **Flexible Portfolio Construction:**
    *   **Long-Short Strategies:** Easily construct long-short portfolios by sorting assets based on their predicted returns each month.
    *   **Quantile-Based Selection:** Define portfolio legs using specific quantile ranges (e.g., long top 10%, short bottom 10%).
    *   **Weighting Schemes:** Supports both **`equal`** (equally-weighted) and **`value`** (e.g., market-cap weighted) portfolio construction.
*   **Comprehensive Performance Evaluation:**
    *   **Standard Metrics:** Calculates `annualized_sharpe_ratio`, `expected_shortfall` (CVaR), and tracks `cumulative_return`.
    *   **Benchmarking:** Automatically compares strategies against the risk-free rate and an optional, user-provided **market index benchmark**.
*   **In-Depth Analysis:**
    *   **Cross-Sectional Analysis:** Optionally generates plots showing the cumulative performance of assets sorted into deciles by prediction. This helps check whether the model's predictions are monotonically related to realized returns.
*   **Configuration-Driven Workflow:**
    *   Define all portfolio strategies, the models to test, benchmarks, and metrics to calculate in a single, human-readable `portfolio-config.yaml` file. This ensures reproducibility and simplifies experimentation.
*   **Automated Reporting & Visualization:**
    *   Generates detailed summary reports in text files for each model-strategy combination.
    *   Automatically creates and saves PNG plots of cumulative returns, providing a clear visual comparison of the long, short, and combined portfolios against benchmarks.
    *   Saves detailed monthly portfolio returns to Parquet files for deeper, custom analysis.
*   **Seamless Integration:**
    *   Directly consumes the `.parquet` prediction files produced by `paper-model`.
    *   Orchestrated by the `paper-asset-pricing` CLI for a smooth, end-to-end research pipeline.

---

## 🚀 Installation

`paper-portfolio` is designed to be part of the larger `PAPER` monorepo.

**Recommended (as part of `paper-asset-pricing`):**

This method ensures `paper-portfolio` is available to the main `paper` CLI orchestrator.

```bash
# Using pip
pip install "paper-asset-pricing[portfolio]"

# Using uv
uv pip install "paper-asset-pricing[portfolio]"
```

**Standalone Installation:**

Use this method if you only need `paper-portfolio` and its core functionality for a different project.

```bash
# Using pip
pip install paper-portfolio

# Using uv
uv pip install paper-portfolio
```

**From Source (for development within the monorepo):**

Navigate to the root of your `PAPER` monorepo and install `paper-portfolio` in editable mode.

```bash
# Using pip
pip install -e ./paper-portfolio

# Using uv
uv pip install -e ./paper-portfolio
```

---

## 📖 Usage Workflow

The `paper-portfolio` pipeline is the final step in the P.A.P.E.R workflow.

### 1. Prerequisites: Data and Model Pipelines

Before running the portfolio phase, you must first run the data and model pipelines to generate the necessary inputs.

```bash
# Assuming you are in your project directory (e.g., MyProjectExample)

# 1. Run the data phase
paper execute data

# 2. Run the models phase
paper execute models
```

After these steps, your project's `models/predictions/` directory should contain files like `OLS_model_predictions.parquet`.

### 2. Portfolio Configuration (`portfolio-config.yaml`)

Create or edit the `portfolio-config.yaml` file in your project's `configs` directory. This file defines which models to test and which portfolio strategies to apply.

```yaml
# MyProjectExample/configs/portfolio-config.yaml

input_data:
  # List of model names whose predictions you want to evaluate.
  # These must match the names from models-config.yaml.
  prediction_model_names:
    - "OLS_model"
    - "GBRT_tuned"

  # The base name of the processed dataset used by the models.
  processed_dataset_name: "processed_panel_data"

  # Column names required for calculations.
  date_column: "date"
  id_column: "permno"
  risk_free_rate_col: "rf"
  value_weight_col: "marketcap" # For value-weighting

# Optional: Define a market index for benchmark comparison.
# The CSV file must be placed in the `portfolios/indexes/` directory.
market_benchmark:
  name: "Market Index"
  file_name: "market_index.csv"
  date_column: "caldt"
  return_column: "vwretd"
  date_format: "%Y%m%d"

# A list of portfolio strategies to backtest for each model.
strategies:
  - name: "Decile_Sort_Equal_Weighted"
    weighting_scheme: "equal"
    long_quantiles: [0.9, 1.0]   # Long the top 10%
    short_quantiles: [0.0, 0.1]  # Short the bottom 10%

  - name: "Decile_Sort_Value_Weighted"
    weighting_scheme: "value"
    long_quantiles: [0.9, 1.0]
    short_quantiles: [0.0, 0.1]

# A list of performance metrics to calculate and report.
metrics:
  - "sharpe_ratio"
  - "expected_shortfall"
  - "cumulative_return"

# Enable the generation of cross-sectional decile return plots.
cross_sectional_analysis: true
```

### 3. Running the Portfolio Pipeline

Execute the portfolio phase using the `paper-asset-pricing` CLI from your project directory.

```bash
# Assuming you are in your project directory (e.g., MyProjectExample)
paper execute portfolio
```

### 4. Expected Output

**Console Output:**

The console will show a high-level success message.

```
>>> Executing Portfolio Phase <<<
Portfolio phase completed successfully. Additional information in 'MyProjectExample/logs.log'
```

**`MyProjectExample/portfolios/results/` Directory:**

The `results` directory will be populated with detailed reports and plots for each model-strategy combination.

```
MyProjectExample/portfolios/results/
├── cross_sectional_analysis/
│   ├── GBRT_tuned_cross_sectional_returns.png
│   └── OLS_model_cross_sectional_returns.png
├── GBRT_tuned_Decile_Sort_Equal_Weighted_cumulative_return.png
├── GBRT_tuned_Decile_Sort_Equal_Weighted_monthly_returns.parquet
├── GBRT_tuned_Decile_Sort_Equal_Weighted_report.txt
├── GBRT_tuned_Decile_Sort_Value_Weighted_cumulative_return.png
└── ... (and so on for all models and strategies)
```

**Example Report (`OLS_model_Decile_Sort_Value_Weighted_report.txt`):**

```
--- Portfolio Performance Report ---
Model: OLS_model
Strategy: Decile_Sort_Value_Weighted
------------------------------
sharpe_ratio: 1.2543
expected_shortfall: -0.0312
final_cumulative_return: 8.1234
------------------------------
```
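
If you want to collect metrics from several reports for your own comparison, a small parser is usually enough. The sketch below assumes the report layout shown in the example above (one `key: value` pair per line); `parse_report` is a hypothetical helper and is not part of `paper-portfolio`.

```python
def parse_report(text: str) -> dict[str, float]:
    """Extract numeric `key: value` lines from a report's text.

    Assumes the layout of the example report; header lines, separators,
    and non-numeric fields such as `Model:` are skipped.
    """
    metrics: dict[str, float] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # Title and separator lines contain no "key: value" pair.
        try:
            metrics[key.strip()] = float(value)
        except ValueError:
            continue  # Non-numeric fields (model and strategy names).
    return metrics


example_report = """--- Portfolio Performance Report ---
Model: OLS_model
Strategy: Decile_Sort_Value_Weighted
------------------------------
sharpe_ratio: 1.2543
expected_shortfall: -0.0312
final_cumulative_return: 8.1234
------------------------------
"""
parsed = parse_report(example_report)
```

Here `parsed` holds only the three numeric entries, keyed by the metric names as they appear in the report.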

---

## ⚙️ Configuration Reference

The `portfolio-config.yaml` file controls the entire portfolio evaluation process.

### `input_data`

*   `prediction_model_names` (list, required): A list of model names. The manager will look for prediction files named `{model_name}_predictions.parquet`.
*   `processed_dataset_name` (string, required): The base name of the processed dataset used for modeling. This is needed to fetch columns like the risk-free rate and value-weighting characteristic.
*   `date_column`, `id_column`, `risk_free_rate_col`, `value_weight_col` (string, optional): Names of key columns.
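
Before running the portfolio phase, it can be useful to confirm that every model listed in `prediction_model_names` has a prediction file. The following is a minimal sketch, assuming the `models/predictions/` location and the `{model_name}_predictions.parquet` naming described in this README; `missing_prediction_files` is a hypothetical helper, not a function exported by the package.

```python
from pathlib import Path


def missing_prediction_files(project_root: str, model_names: list[str]) -> list[str]:
    """Return the model names that have no prediction file on disk.

    Assumes predictions live in `<project_root>/models/predictions/` and are
    named `{model_name}_predictions.parquet`.
    """
    predictions_dir = Path(project_root) / "models" / "predictions"
    return [
        name
        for name in model_names
        if not (predictions_dir / f"{name}_predictions.parquet").is_file()
    ]
```

For example, `missing_prediction_files("MyProjectExample", ["OLS_model", "GBRT_tuned"])` returns an empty list when both files are present.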

### `market_benchmark` (optional)

*   `name` (string, required): Display name for the benchmark.
*   `file_name` (string, required): The name of the CSV file in the `portfolios/indexes/` directory.
*   `date_column`, `return_column`, `date_format` (string, required): Column names and date format for the benchmark file.
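
The `date_format` value uses the standard `strptime` directives, so `"%Y%m%d"` matches dates such as `20200131`. The sketch below illustrates how these three settings map onto a benchmark CSV using only the standard library. The sample rows are made-up illustrative values, and `load_benchmark` is a hypothetical helper rather than the package's own loader.

```python
import csv
import io
from datetime import date, datetime


def load_benchmark(
    csv_text: str, date_column: str, return_column: str, date_format: str
) -> list[tuple[date, float]]:
    """Parse benchmark rows into (date, return) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (
            datetime.strptime(row[date_column], date_format).date(),
            float(row[return_column]),
        )
        for row in reader
    ]


# Illustrative data only; a real file would come from portfolios/indexes/.
sample_csv = "caldt,vwretd\n20200131,0.0123\n20200228,-0.0810\n"
benchmark_rows = load_benchmark(sample_csv, "caldt", "vwretd", "%Y%m%d")
```

If `strptime` raises a `ValueError` on your file, the `date_format` in the configuration most likely does not match the dates in the CSV.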

### `strategies`

A list of portfolio strategies to backtest. Each strategy requires:

*   `name` (string, required): A unique name for the strategy (e.g., `"Value_Weighted_Decile"`).
*   `weighting_scheme` (string, required): Must be either `"equal"` or `"value"`.
*   `long_quantiles` (list of two floats, required): The lower and upper quantile boundaries for the long leg (e.g., `[0.9, 1.0]` for the top 10%).
*   `short_quantiles` (list of two floats, required): The lower and upper quantile boundaries for the short leg (e.g., `[0.0, 0.1]` for the bottom 10%).
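
To make the quantile and weighting settings concrete, here is a minimal pure-Python sketch of a single month's long-short return. It is an illustration under stated assumptions, not the package's implementation: the function names are hypothetical, assets are ranked by predicted return, and quantile boundaries are converted to rank positions by rounding, which may differ from how `paper-portfolio` handles ties and boundaries.

```python
from typing import Optional, Sequence


def select_leg(preds: Sequence[float], lower: float, upper: float) -> list[int]:
    """Return indices of assets whose prediction rank lies in [lower, upper)."""
    n = len(preds)
    order = sorted(range(n), key=lambda i: preds[i])  # Ascending by prediction.
    return order[int(round(lower * n)):int(round(upper * n))]


def leg_return(
    indices: Sequence[int],
    returns: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Equal-weighted mean if `weights` is None, else weight-proportional mean."""
    if not indices:
        return 0.0
    if weights is None:
        return sum(returns[i] for i in indices) / len(indices)
    total = sum(weights[i] for i in indices)
    return sum(weights[i] * returns[i] for i in indices) / total


def long_short_return(
    preds: Sequence[float],
    returns: Sequence[float],
    long_quantiles: tuple[float, float],
    short_quantiles: tuple[float, float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Return of the long leg minus the return of the short leg."""
    long_idx = select_leg(preds, *long_quantiles)
    short_idx = select_leg(preds, *short_quantiles)
    return leg_return(long_idx, returns, weights) - leg_return(short_idx, returns, weights)
```

Passing `weights=None` corresponds to the `"equal"` scheme, while passing the values of the `value_weight_col` column (for example, market capitalization) corresponds to `"value"`.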

### `metrics`

A list of performance metrics to compute. Supported values: `"sharpe_ratio"`, `"expected_shortfall"`, `"cumulative_return"`.
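
This README does not spell out the exact formulas behind these metrics, so the sketch below uses common textbook definitions as an assumption: a Sharpe ratio on monthly excess returns annualized by the square root of 12, expected shortfall as the mean of the worst 5% of returns, and cumulative return as the compounded growth minus one. The function names and the default `alpha` and `periods_per_year` values are illustrative choices, not the package's API.

```python
import math
import statistics
from typing import Sequence


def annualized_sharpe_ratio(
    returns: Sequence[float], risk_free: Sequence[float], periods_per_year: int = 12
) -> float:
    """Mean excess return over its sample standard deviation, annualized."""
    excess = [r - rf for r, rf in zip(returns, risk_free)]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


def expected_shortfall(returns: Sequence[float], alpha: float = 0.05) -> float:
    """Mean of the worst `alpha` fraction of returns (at least one observation)."""
    ordered = sorted(returns)
    tail_size = max(1, math.floor(alpha * len(ordered)))
    return statistics.mean(ordered[:tail_size])


def cumulative_return(returns: Sequence[float]) -> float:
    """Compounded return over the whole sample."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0
```

With these definitions, expected shortfall is negative whenever the worst returns are losses, which matches the sign convention in the example report.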

### `cross_sectional_analysis` (optional)

*   Set to `true` to enable the generation of decile-sorted performance plots for each model. Defaults to `false`.
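
The idea behind the decile plots can be sketched for a single cross-section: sort assets by prediction, split them into ten equally sized groups, and check whether the average realized return rises from the lowest to the highest group. This is a simplified illustration with hypothetical function names. The package itself plots cumulative performance over time, which this sketch does not reproduce.

```python
from typing import Sequence


def decile_mean_returns(
    preds: Sequence[float], returns: Sequence[float], n_bins: int = 10
) -> list[float]:
    """Mean realized return per prediction-sorted bin, lowest bin first."""
    n = len(preds)
    if n < n_bins:
        raise ValueError("Need at least as many assets as bins.")
    order = sorted(range(n), key=lambda i: preds[i])  # Ascending by prediction.
    means = []
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        means.append(sum(returns[i] for i in members) / len(members))
    return means


def is_monotonic_increasing(values: Sequence[float]) -> bool:
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))
```

A model whose predictions carry information about returns should tend to produce bin means for which `is_monotonic_increasing` holds, at least approximately.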

---

## 🤝 Contributing

Contributions to `paper-portfolio` are highly welcome! If you have ideas for new performance metrics, portfolio construction techniques, or reporting features, please feel free to open an issue or submit a pull request.

---

## 📄 License

`paper-portfolio` is distributed under the MIT License. See the `LICENSE` file for more information.

---
