Metadata-Version: 2.4
Name: catboost-cli
Version: 0.1.0
Summary: Production-quality CLI for training, evaluating, and predicting with CatBoost models from tabular data
Author: CatBoost CLI Contributors
Maintainer: CatBoost CLI Contributors
License: MIT
Project-URL: Homepage, https://github.com/yourusername/catboost-cli
Project-URL: Documentation, https://github.com/yourusername/catboost-cli#readme
Project-URL: Repository, https://github.com/yourusername/catboost-cli
Project-URL: Issues, https://github.com/yourusername/catboost-cli/issues
Project-URL: Changelog, https://github.com/yourusername/catboost-cli/blob/main/CHANGELOG.md
Keywords: catboost,machine-learning,cli,mlops,gradient-boosting,tabular-data,classification,regression,ml-pipeline,data-science
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Typing :: Typed
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: catboost>=1.2
Requires-Dist: polars>=0.19.0
Requires-Dist: scikit-learn>=1.3.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: typer>=0.9.0
Requires-Dist: numpy>=1.24.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Dynamic: license-file

# CatBoost CLI

[![CI](https://github.com/yourusername/catboost-cli/actions/workflows/ci.yml/badge.svg)](https://github.com/yourusername/catboost-cli/actions/workflows/ci.yml)
[![PyPI version](https://badge.fury.io/py/catboost-cli.svg)](https://badge.fury.io/py/catboost-cli)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

Production-quality Python CLI for training, evaluating, and predicting with CatBoost models from raw tabular CSV or Parquet files.

**Ready for production** with automated CI/CD, PyPI publishing, and comprehensive testing on Python 3.12.

## Highlights

- 🚀 **Easy Installation**: `pip install catboost-cli` or `uv pip install catboost-cli`
- ⚡ **uv Support**: 10-100x faster installs with [uv](https://github.com/astral-sh/uv) - see [UV_GUIDE.md](UV_GUIDE.md)
- 🔄 **Automated CI/CD**: GitHub Actions for testing and publishing
- 📦 **Modern Stack**: Polars, Pydantic v2, Typer
- 🎯 **Production-Ready**: Structured metadata, logging, validation
- 🧪 **Well-Tested**: Comprehensive CI testing on Python 3.12
- 📚 **Comprehensive Docs**: Quick start, publishing guide, contribution guide

## Features

- **Multiple Commands**: `train`, `eval`, `predict`, `info`
- **Data Formats**: CSV and Parquet support via Polars
- **Task Types**: Classification and Regression
- **Validation Strategies**: Holdout split, K-Fold Cross-Validation, **or predefined splits** 🆕
- **Lazy Loading**: Only loads needed columns for massive performance gains on wide datasets ⚡
- **Categorical Features**: Auto-detection and manual specification
- **Comprehensive Metrics**:
  - Classification: accuracy, precision, recall, f1, log_loss, roc_auc, confusion matrix
  - Regression: RMSE, MAE, R², MAPE, explained variance
- **Flexible Predictions**: Raw, probability, or class predictions
- **Model Metadata**: Automatic metadata tracking and persistence
- **Extensive CatBoost Parameters**: 30+ CLI flags for model tuning
- **Structured Output**: Pydantic-validated JSON metrics

## Installation

### From PyPI (Recommended)

```bash
# Using pip
pip install catboost-cli

# Or using uv (faster)
uv pip install catboost-cli
```

### From Source

```bash
# Clone the repository
git clone https://github.com/yourusername/catboost-cli.git
cd catboost-cli

# Using uv (recommended - faster)
uv sync

# Or using pip
pip install -e .

# With development dependencies
uv sync  # includes dev dependencies by default
# or
pip install -e ".[dev]"
```

### Requirements

- Python 3.9+
- CatBoost >= 1.2
- Polars >= 0.19.0
- scikit-learn >= 1.3.0
- Pydantic >= 2.0.0
- Typer >= 0.9.0
- NumPy >= 1.24.0

## Quick Start

**New to CatBoost CLI?** See [QUICKSTART.md](QUICKSTART.md) for a 5-minute tutorial.

```bash
# Install from PyPI
pip install catboost-cli
# or: uv pip install catboost-cli

# Train a classification model
catboost-cli train \
  --data-path data/train.csv \
  --model-path models/my_model.cbm \
  --target label \
  --task classification \
  --primary-metric f1_macro \
  --iterations 500

# Generate predictions
catboost-cli predict \
  --data-path data/test.csv \
  --model-path models/my_model.cbm \
  --out-path predictions.csv \
  --prediction-type probability

# Evaluate on labeled test set
catboost-cli eval \
  --data-path data/test.csv \
  --model-path models/my_model.cbm \
  --target label \
  --metrics-path metrics.json

# View model information
catboost-cli info --model-path models/my_model.cbm
```

## Usage Examples

### Example 1: Train Regression Model from Parquet with RMSE Primary Metric

```bash
catboost-cli train \
  --data-path housing_data.parquet \
  --model-path models/housing_model.cbm \
  --target price \
  --task regression \
  --primary-metric rmse \
  --iterations 1000 \
  --learning-rate 0.05 \
  --depth 8 \
  --test-size 0.2 \
  --metrics-path housing_metrics.json \
  --verbose
```

**What this does:**
- Loads housing data from Parquet file
- Trains a regression model to predict `price`
- Uses RMSE as the primary metric
- Splits data 80/20 for train/test
- Saves model, metadata, and metrics

### Example 2: Train Binary Classification from CSV with Auto-Detected Categorical Columns

```bash
catboost-cli train \
  --data-path customer_churn.csv \
  --model-path models/churn_model.cbm \
  --target churn \
  --task classification \
  --auto-cat \
  --primary-metric f1_macro \
  --average macro \
  --iterations 800 \
  --learning-rate 0.03 \
  --depth 6 \
  --early-stopping-rounds 50 \
  --use-best-model \
  --test-size 0.25 \
  --stratify \
  --random-seed 42 \
  --metrics-path churn_metrics.json
```

**What this does:**
- Auto-detects categorical columns (string/categorical dtypes)
- Uses stratified train/test split (respects class balance)
- Optimizes F1-score with macro averaging
- Enables early stopping with best model selection
- Primary metric reported: f1_macro
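The stratified holdout used here corresponds to scikit-learn's `train_test_split` with `stratify`. A minimal sketch of the behavior on synthetic labels (this illustrates the split semantics, not the CLI's internal code):

```python
# Sketch: stratified 75/25 split, as implied by --test-size 0.25 --stratify.
# Synthetic, imbalanced labels; not real churn data.
from collections import Counter

from sklearn.model_selection import train_test_split

X = [[i] for i in range(100)]
y = [0] * 80 + [1] * 20  # 80/20 class imbalance

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# The 80/20 class ratio is preserved in both partitions.
print(Counter(y_train))  # Counter({0: 60, 1: 15})
print(Counter(y_test))   # Counter({0: 20, 1: 5})
```

Without `stratify`, a small minority class can end up under-represented (or absent) in the test partition, which skews every classification metric.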

### Example 3: Predict Probabilities and Write Parquet

```bash
catboost-cli predict \
  --data-path new_customers.parquet \
  --model-path models/churn_model.cbm \
  --out-path predictions.parquet \
  --prediction-type probability \
  --append \
  --id-cols customer_id account_id
```

**What this does:**
- Loads new customer data from Parquet
- Generates probability predictions for each class
- Appends prediction columns to original data
- Includes ID columns in output
- Saves as Parquet file with columns: `customer_id`, `account_id`, `proba_0`, `proba_1`, etc.
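The `proba_<class>` column layout can be sketched in plain Python. `build_proba_columns` below is a hypothetical helper for illustration, not part of catboost-cli's API:

```python
# Sketch: turn an (n_rows, n_classes) probability matrix into named columns,
# matching the proba_0, proba_1, ... convention described above.
# build_proba_columns is a hypothetical name; the CLI's internals may differ.
def build_proba_columns(probas: list[list[float]]) -> dict[str, list[float]]:
    n_classes = len(probas[0])
    return {
        f"proba_{c}": [row[c] for row in probas]
        for c in range(n_classes)
    }

probas = [[0.9, 0.1], [0.2, 0.8]]
print(build_proba_columns(probas))
# {'proba_0': [0.9, 0.2], 'proba_1': [0.1, 0.8]}
```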

### Example 4: Evaluate Saved Model on Labeled Test File

```bash
catboost-cli eval \
  --data-path test_set.csv \
  --model-path models/churn_model.cbm \
  --target churn \
  --metrics-path test_metrics.json \
  --predictions-path test_predictions.csv \
  --prediction-type class
```

**What this does:**
- Loads the trained model and its metadata
- Evaluates on labeled test data
- Computes all classification metrics
- Saves structured metrics to JSON
- Optionally saves predictions with class labels

### Example 5: Cross-Validation with 5 Folds

```bash
catboost-cli train \
  --data-path iris.csv \
  --model-path models/iris_model.cbm \
  --target species \
  --task classification \
  --cv-folds 5 \
  --fit-final \
  --auto-cat \
  --primary-metric accuracy \
  --iterations 500 \
  --learning-rate 0.05 \
  --depth 4 \
  --random-seed 42 \
  --metrics-path iris_cv_metrics.json
```

**What this does:**
- Performs 5-fold stratified cross-validation
- Reports mean and std for all metrics across folds
- Fits final model on full dataset (--fit-final)
- Saves final model and CV results in metrics JSON
- Primary metric: accuracy

**Output includes:**
```
Cross-Validation Summary:
----------------------------------------------------------------------
Metric                         Mean            Std
----------------------------------------------------------------------
* accuracy                     0.960000        0.032660
  precision_macro              0.963333        0.034993
  recall_macro                 0.960000        0.032660
  f1_macro                     0.960000        0.032660
----------------------------------------------------------------------
```
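The per-fold aggregation shown above amounts to a mean and standard deviation over fold scores. A stdlib sketch (illustrative fold scores, and whether the CLI uses sample or population standard deviation is an assumption here):

```python
# Sketch: per-metric mean/std aggregation across CV folds.
# fold_scores are made-up numbers, not real CatBoost output.
from statistics import mean, stdev

fold_scores = {"accuracy": [0.93, 0.97, 1.00, 0.93, 0.97]}

summary = {
    metric: {"mean": mean(scores), "std": stdev(scores)}
    for metric, scores in fold_scores.items()
}
print(round(summary["accuracy"]["mean"], 2))  # 0.96
print(round(summary["accuracy"]["std"], 2))   # 0.03
```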

### Example 6: Advanced Training with Many CatBoost Parameters

```bash
catboost-cli train \
  --data-path fraud_detection.parquet \
  --model-path models/fraud_model.cbm \
  --target is_fraud \
  --task classification \
  --features transaction_amount merchant_id user_age device_type \
  --cat-cols merchant_id device_type \
  --drop-cols transaction_id timestamp \
  --primary-metric roc_auc \
  --test-size 0.3 \
  --stratify \
  --iterations 2000 \
  --learning-rate 0.01 \
  --depth 10 \
  --l2-leaf-reg 5.0 \
  --random-strength 2.0 \
  --bootstrap-type Bayesian \
  --bagging-temperature 0.5 \
  --rsm 0.8 \
  --min-data-in-leaf 5 \
  --grow-policy Lossguide \
  --early-stopping-rounds 100 \
  --use-best-model \
  --auto-class-weights Balanced \
  --thread-count 8 \
  --random-seed 123 \
  --verbose \
  --metrics-path fraud_metrics.json
```

**What this does:**
- Explicitly selects features and categorical columns
- Drops unwanted columns
- Uses Balanced class weights for imbalanced data
- Configures advanced CatBoost parameters:
  - Bayesian bootstrap with temperature
  - Random subspace method (RSM) for feature sampling
  - Lossguide tree growing policy
  - Custom regularization and leaf parameters
- Uses 8 threads for faster training

### Example 7: Multiclass Classification

```bash
catboost-cli train \
  --data-path wine_quality.csv \
  --model-path models/wine_model.cbm \
  --target quality_class \
  --task classification \
  --auto-cat \
  --primary-metric f1_weighted \
  --average weighted \
  --test-size 0.2 \
  --iterations 1000 \
  --learning-rate 0.05 \
  --depth 6 \
  --class-weights "1.0,1.5,2.0" \
  --metrics-path wine_metrics.json
```

**What this does:**
- Trains multiclass classifier
- Uses weighted F1 (accounts for class imbalance)
- Specifies custom class weights
- Primary metric: f1_weighted
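The `--class-weights "1.0,1.5,2.0"` value is presumably parsed into a list of floats, one per class in label order. A sketch of that parsing (`parse_class_weights` is a hypothetical name; the CLI's actual parser may differ):

```python
# Sketch: parsing a --class-weights value like "1.0,1.5,2.0" into floats.
# parse_class_weights is hypothetical, not catboost-cli's real function.
def parse_class_weights(value: str) -> list[float]:
    return [float(part) for part in value.split(",")]

print(parse_class_weights("1.0,1.5,2.0"))  # [1.0, 1.5, 2.0]
```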

### Example 8: Regression with Feature Engineering

```bash
catboost-cli train \
  --data-path sales_data.parquet \
  --model-path models/sales_forecast.cbm \
  --target sales \
  --task regression \
  --features temperature day_of_week holiday store_id product_category \
  --cat-cols day_of_week holiday store_id product_category \
  --primary-metric mae \
  --cv-folds 10 \
  --fit-final \
  --iterations 1500 \
  --learning-rate 0.02 \
  --depth 8 \
  --subsample 0.8 \
  --random-seed 42 \
  --metrics-path sales_cv_metrics.json
```

**What this does:**
- Explicitly selects features for the model
- Marks categorical features manually
- Uses 10-fold cross-validation
- Optimizes MAE (robust to outliers)
- Subsamples 80% of data per iteration

### Example 9: Predefined Train/Test Split (Time-Based) 🆕

```bash
catboost-cli train \
  --data-path timeseries_data.parquet \
  --model-path models/time_model.cbm \
  --target demand \
  --task regression \
  --auto-cat \
  --split-col time_split \
  --train-split-value train \
  --test-split-value test \
  --primary-metric rmse \
  --iterations 1000 \
  --learning-rate 0.05 \
  --depth 8 \
  --metrics-path time_metrics.json
```

**What this does:**
- Uses a **predefined split column** instead of random splitting
- Perfect for time series (train on past, test on future)
- Great for preventing data leakage (e.g., same user in train/test)
- `time_split` column contains "train" or "test" values
- See [ADVANCED_FEATURES.md](ADVANCED_FEATURES.md) for details
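Conceptually, a predefined split just partitions rows by the value in `--split-col`. A plain-Python illustration (the CLI presumably does this with Polars filters internally):

```python
# Sketch: partitioning rows on a split column, as --split-col time_split
# with --train-split-value train / --test-split-value test would.
# Toy rows; not real demand data.
rows = [
    {"time_split": "train", "demand": 10},
    {"time_split": "train", "demand": 12},
    {"time_split": "test", "demand": 11},
]

train = [r for r in rows if r["time_split"] == "train"]
test = [r for r in rows if r["time_split"] == "test"]
print(len(train), len(test))  # 2 1
```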

### Example 10: Lazy Loading on Wide Dataset ⚡

```bash
catboost-cli train \
  --data-path wide_dataset.parquet \
  --model-path models/wide_model.cbm \
  --target target \
  --features f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 \
  --cat-cols f1 f2 \
  --task classification \
  --primary-metric f1_macro \
  --test-size 0.2
```

**What this does:**
- Dataset has 1000+ columns but only uses 10
- **Lazy loading**: Only reads the 10 feature columns + target
- 10-100x faster loading on wide datasets
- Dramatically reduces memory usage
- Works best with Parquet format
- See [ADVANCED_FEATURES.md](ADVANCED_FEATURES.md) for performance tips

## CLI Commands Reference

### `train` - Train a Model

**Required:**
- `--data-path`: Path to training data
- `--model-path`: Path to save model
- `--target`: Target column name

**Key Options:**
- `--task`: `classification` or `regression` (default: classification)
- `--features`: Feature columns (repeatable)
- `--drop-cols`: Columns to exclude
- `--cat-cols`: Categorical columns (repeatable)
- `--auto-cat`: Auto-detect categorical columns from dtypes
- `--test-size`: Holdout test size (0-1)
- `--cv-folds`: Number of CV folds (if > 1, runs CV)
- `--fit-final`: Fit final model on full data after CV
- `--primary-metric`: Main metric to track
- `--average`: Averaging strategy (macro/weighted/micro)
- `--metrics-path`: Save metrics JSON

**CatBoost Parameters:**
- Core: `--iterations`, `--learning-rate`, `--depth`
- Regularization: `--l2-leaf-reg`, `--random-strength`
- Sampling: `--bootstrap-type`, `--subsample`, `--bagging-temperature`, `--rsm`
- Leaves: `--min-data-in-leaf`, `--leaf-estimation-method`, `--grow-policy`
- Early stopping: `--od-type`, `--od-wait`, `--early-stopping-rounds`, `--use-best-model`
- Class weights: `--class-weights`, `--auto-class-weights`, `--scale-pos-weight`
- System: `--thread-count`, `--verbose`, `--random-seed`

### `eval` - Evaluate a Model

**Required:**
- `--data-path`: Path to evaluation data
- `--model-path`: Path to trained model
- `--target`: Target column name
- `--metrics-path`: Path to save metrics JSON

**Optional:**
- `--predictions-path`: Save predictions
- `--prediction-type`: `raw`, `probability`, or `class`

### `predict` - Generate Predictions

**Required:**
- `--data-path`: Path to input data
- `--model-path`: Path to trained model
- `--out-path`: Path to save predictions

**Options:**
- `--prediction-type`: `raw` (default), `probability`, or `class`
- `--append`: Append predictions to original data
- `--id-cols`: ID columns to include (repeatable)

### `info` - View Model Information

**Required:**
- `--model-path`: Path to trained model

## Model Metadata

Each trained model generates a sidecar `.meta.json` file containing:

```json
{
  "task": "classification",
  "target_column": "label",
  "features": ["feature1", "feature2", "..."],
  "categorical_features": ["cat_feature1"],
  "label_mapping": {"ClassA": 0, "ClassB": 1},
  "training_timestamp": "2024-01-15T10:30:00",
  "catboost_params": {"iterations": 1000, "..."},
  "metric_config": {
    "primary_metric": "f1_macro",
    "average": "macro"
  },
  "data_shape": {"rows": 10000, "cols": 25},
  "model_path": "models/my_model.cbm",
  "best_iteration": 847,
  "tree_count": 848
}
```
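The sidecar file is plain JSON, so it can be consumed with the standard library. For example, inverting `label_mapping` decodes integer class predictions back to the original labels (field names follow the JSON above; the snippet inlines a truncated copy rather than reading from disk):

```python
# Sketch: decoding encoded class predictions via the sidecar metadata.
import json

# Metadata as shown above, truncated to the relevant fields.
meta = json.loads("""
{
  "task": "classification",
  "target_column": "label",
  "label_mapping": {"ClassA": 0, "ClassB": 1}
}
""")

# Invert label_mapping so encoded predictions map back to class names.
inverse = {v: k for k, v in meta["label_mapping"].items()}
print([inverse[p] for p in [0, 1, 1]])  # ['ClassA', 'ClassB', 'ClassB']
```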

## Metrics Output Format

Metrics JSON structure:

```json
{
  "task": "classification",
  "primary_metric": "f1_macro",
  "primary_metric_value": 0.8567,
  "metrics": {
    "accuracy": 0.8734,
    "precision_macro": 0.8512,
    "recall_macro": 0.8645,
    "f1_macro": 0.8567,
    "log_loss": 0.3421,
    "roc_auc": 0.9123,
    "confusion_matrix": [[450, 50], [45, 455]]
  },
  "cv_results": [...],  // If CV was used
  "cv_summary": {       // If CV was used
    "f1_macro": {"mean": 0.8567, "std": 0.0234},
    "accuracy": {"mean": 0.8734, "std": 0.0189}
  },
  "run_context": {
    "train_rows": 8000,
    "test_rows": 2000,
    "features_count": 20,
    "categorical_features_count": 3
  },
  "timestamp": "2024-01-15T10:45:30.123456"
}
```

## Primary Metrics

**Classification:**
- `accuracy`
- `precision_macro`, `precision_weighted`, `precision_micro`
- `recall_macro`, `recall_weighted`, `recall_micro`
- `f1_macro`, `f1_weighted`, `f1_micro`
- `roc_auc`
- `log_loss`

**Regression:**
- `rmse` (Root Mean Squared Error)
- `mae` (Mean Absolute Error)
- `r2` (R² Score)
- `mape` (Mean Absolute Percentage Error)
- `explained_variance`
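The regression metrics above correspond to standard scikit-learn functions. A sketch on toy arrays (`rmse` is computed as `sqrt(MSE)` to stay compatible across scikit-learn versions; whether the CLI computes them exactly this way is an assumption):

```python
# Sketch: mapping the listed regression metrics onto scikit-learn.
from math import sqrt

from sklearn.metrics import (
    explained_variance_score,
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 3.0, 8.0]

metrics = {
    "rmse": sqrt(mean_squared_error(y_true, y_pred)),
    "mae": mean_absolute_error(y_true, y_pred),
    "r2": r2_score(y_true, y_pred),
    "mape": mean_absolute_percentage_error(y_true, y_pred),
    "explained_variance": explained_variance_score(y_true, y_pred),
}
print(round(metrics["mae"], 3))  # 0.5
```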

## Tips and Best Practices

1. **Start Simple**: Begin with default parameters and iterate
2. **Use Cross-Validation**: More reliable than single holdout split
3. **Primary Metric**: Choose based on your business objective
   - Imbalanced data: Use `f1_macro`, `roc_auc`, or weighted metrics
   - Regression: `mae` for robustness, `rmse` for penalizing large errors
4. **Categorical Features**: Always specify or auto-detect them for best performance
5. **Early Stopping**: Enable with `--early-stopping-rounds` and `--use-best-model`
6. **Class Imbalance**: Use `--auto-class-weights Balanced` or custom `--class-weights`
7. **Reproducibility**: Always set `--random-seed`

## File Formats

- **Input**: CSV (`.csv`) and Parquet (`.parquet`, `.pq`)
- **Output**: Same as input, inferred from file extension
- **Models**: CatBoost binary format (`.cbm` recommended)
- **Metrics**: JSON (`.json`)
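Extension-based format inference can be sketched with `pathlib`. `infer_format` is a hypothetical helper; the CLI's actual dispatch may differ:

```python
# Sketch: inferring the data format from the file extension, as described above.
from pathlib import Path

def infer_format(path: str) -> str:
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return "csv"
    if suffix in (".parquet", ".pq"):
        return "parquet"
    raise ValueError(f"Unsupported file extension: {suffix}")

print(infer_format("data/train.pq"))  # parquet
```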

## Architecture

```
catboost_cli/
├── __init__.py       # Package initialization
├── cli.py            # Typer CLI commands (main entry point)
├── io.py             # Polars-based data I/O
├── schema.py         # Pydantic models for config/metadata/metrics
├── features.py       # Feature selection and Pool creation
├── train.py          # Training logic (holdout + CV)
├── eval.py           # Evaluation and metrics computation
├── predict.py        # Prediction pipeline
├── meta.py           # Metadata read/write
└── utils.py          # Logging, validation, formatting
```

## Development

### Setup Development Environment

```bash
# Clone the repository
git clone https://github.com/yourusername/catboost-cli.git
cd catboost-cli

# Using Make (easiest)
make sync

# Or using uv directly
uv sync

# Or using pip with venv
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
```

### Makefile Commands

The project includes a comprehensive Makefile for common tasks:

```bash
# Show all available commands
make help

# Development workflow
make sync          # Install dependencies with uv
make test          # Run tests
make lint          # Run linting
make format        # Format code with black
make check         # Run all checks

# Release workflow (runs checks, tags, and pushes)
make release-patch           # 0.1.0 -> 0.1.1
make release-minor           # 0.1.0 -> 0.2.0
make release-major           # 0.1.0 -> 1.0.0
make release V=0.2.0         # Specific version

# Build and publish (manual - usually automated via GitHub Actions)
make build         # Build distribution packages
make publish-test  # Publish to TestPyPI
make clean         # Clean build artifacts
make version       # Show current version
make info          # Show project info
```

### Running Tests

```bash
# Generate sample data
python generate_sample_data.py

# Run basic workflow test
catboost-cli train \
  --data-path sample_data/classification_sample.csv \
  --model-path test_model.cbm \
  --target target \
  --auto-cat \
  --iterations 10

# Test prediction
catboost-cli predict \
  --data-path sample_data/classification_sample.csv \
  --model-path test_model.cbm \
  --out-path test_predictions.csv \
  --prediction-type probability
```

### Code Quality

```bash
# Format code with Black
uv run --with black black catboost_cli/

# Lint with Ruff
uv run --with ruff ruff check catboost_cli/

# Type check with Pyright (optional)
uv run --with pyright pyright catboost_cli/
```

## Publishing to PyPI

This project uses GitHub Actions for automated publishing to PyPI.

### Automated Release (Recommended)

Simply run one of the release commands:

```bash
# Bump patch version (0.1.0 -> 0.1.1)
make release-patch

# Bump minor version (0.1.0 -> 0.2.0)
make release-minor

# Bump major version (0.1.0 -> 1.0.0)
make release-major

# Or specify exact version
make release V=0.2.0
```

**What happens:**
1. Runs pre-release checks (format, lint, tests)
2. Updates version in `pyproject.toml` and `__init__.py`
3. Commits changes
4. Creates git tag (e.g., `v0.2.0`)
5. Pushes tag to GitHub
6. **GitHub Actions automatically**:
   - Runs full test suite
   - Builds the package
   - Publishes to PyPI
   - Creates GitHub Release with auto-generated changelog

### Manual Publishing

If you need to publish manually:

```bash
# Build and publish to TestPyPI
make publish-test

# Build and publish to PyPI (with confirmation)
make publish
```

### PyPI API Token Setup

For GitHub Actions to publish to PyPI, configure an API token:

1. **Create PyPI API token**: https://pypi.org/manage/account/token/
   - Token name: `catboost-cli-github-actions`
   - Scope: Entire account (first publish) or Project (subsequent)

2. **Add to GitHub Secrets**: `https://github.com/yourusername/catboost-cli/settings/secrets/actions`
   - Name: `PYPI_API_TOKEN`
   - Value: Your PyPI token

See [PUBLISHING.md](PUBLISHING.md) for detailed instructions.

## CI/CD

This project uses GitHub Actions for continuous integration and deployment:

- **CI Workflow** (`.github/workflows/ci.yml`):
  - Runs on every push and pull request
  - Tests on Python 3.12 (ubuntu-latest)
  - Lints with Ruff and Black
  - Tests CLI installation and basic workflow

- **Release Workflow** (`.github/workflows/release.yml`):
  - Triggers on git tag push (e.g., `v0.1.0`)
  - Runs full test suite
  - Builds distribution packages
  - Publishes to PyPI using the configured `PYPI_API_TOKEN` secret (see below)
  - Creates GitHub Release with auto-generated changelog
  - Attaches distribution files to release

## Contributing

Contributions are welcome! Here's how to contribute:

1. **Fork the repository**
2. **Create a feature branch**: `git checkout -b feature/amazing-feature`
3. **Make your changes** and add tests
4. **Run code quality checks**:
   ```bash
   black catboost_cli/
   ruff check catboost_cli/
   ```
5. **Commit your changes**: `git commit -m 'Add amazing feature'`
6. **Push to the branch**: `git push origin feature/amazing-feature`
7. **Open a Pull Request**

### Contribution Guidelines

- Follow existing code style (Black formatting, type hints)
- Add tests for new features
- Update documentation (README, docstrings)
- Update CHANGELOG.md
- Ensure CI passes

## License

MIT License - see [LICENSE](LICENSE) file for details

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for version history and release notes.

## Documentation

- **[Quick Start](QUICKSTART.md)** - 5-minute getting started guide
- **[Makefile Guide](MAKEFILE_GUIDE.md)** - All make commands and workflows 🆕
- **[Advanced Features](ADVANCED_FEATURES.md)** - Lazy loading, predefined splits, performance tips
- **[uv Guide](UV_GUIDE.md)** - Using uv for 10-100x faster installs ⚡
- **[Publishing Guide](PUBLISHING.md)** - How to publish to PyPI
- **[Deployment & CI/CD](docs/DEPLOYMENT.md)** - Complete CI/CD documentation
- **[Contributing Guide](CONTRIBUTING.md)** - How to contribute
- **[Changelog](CHANGELOG.md)** - Version history and release notes

## Repository Structure

```
catboost-cli/
├── catboost_cli/              # Main package
│   ├── cli.py                 # CLI commands
│   ├── train.py               # Training logic
│   ├── eval.py                # Evaluation
│   ├── predict.py             # Predictions
│   └── ...
├── .github/
│   ├── workflows/
│   │   ├── ci.yml             # CI workflow
│   │   └── release.yml        # Release/PyPI publishing
│   ├── ISSUE_TEMPLATE/        # Issue templates
│   └── pull_request_template.md
├── scripts/
│   ├── prepare_release.py     # Release automation
│   └── check_release_ready.sh # Pre-release checks
├── docs/
│   └── DEPLOYMENT.md          # Deployment docs
├── pyproject.toml             # Package configuration
├── README.md                  # This file
├── CHANGELOG.md               # Version history
├── CONTRIBUTING.md            # Contribution guide
├── PUBLISHING.md              # PyPI publishing guide
└── LICENSE                    # MIT License
```

## Support

- **Issues**: [GitHub Issues](https://github.com/yourusername/catboost-cli/issues)
- **Discussions**: [GitHub Discussions](https://github.com/yourusername/catboost-cli/discussions)
- **Documentation**: See links above

## Acknowledgments

- Built with [CatBoost](https://catboost.ai/)
- Uses [Polars](https://www.pola.rs/) for fast data processing
- CLI powered by [Typer](https://typer.tiangolo.com/)
- Validation with [Pydantic](https://pydantic.dev/)
- CI/CD with [GitHub Actions](https://github.com/features/actions)
