Metadata-Version: 2.4
Name: ntrees_tuning
Version: 0.1.0
Summary: The package **ntrees_tuning** is an **extension to sklearn**. It adds an `ntrees` parameter to Random Forests and Gradient Boosting that controls how many trees are used for prediction. The main benefit is that `ntrees` can be tuned w.r.t. the OOB error without retraining a new model for each value of `ntrees`.
Author-email: Schalwal <walmitschal@proton.me>
License-Expression: MIT
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: scikit-learn>=1.3.2
Requires-Dist: scipy>=1.7.0
Requires-Dist: numpy>=1.21.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Dynamic: license-file

The package **ntrees_tuning** is an **extension to sklearn**. It adds an `ntrees` parameter to Random Forests and Gradient Boosting that controls how many trees are used for prediction. The main benefit is that `ntrees` can be tuned w.r.t. the OOB error without retraining a new model for each value of `ntrees`.

The package provides subclasses of the `sklearn` Random Forest and Gradient Boosting classes (`RandomForestClassifier`, `RandomForestRegressor`, `GradientBoostingClassifier`, `GradientBoostingRegressor`). Each subclass adds two new methods, `predict_ntrees` and `tune_ntrees`, which predict with a chosen number of trees and tune the `ntrees` parameter.
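The idea that a single fitted ensemble already contains the predictions for every smaller number of trees can be illustrated with plain scikit-learn. The sketch below uses only the standard `GradientBoostingRegressor.staged_predict` method; it is not the package's implementation, and the variable names are chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=200, n_features=20, random_state=42)

# Fit once with the maximum number of trees.
gb = GradientBoostingRegressor(n_estimators=50, random_state=0).fit(X, y)

# staged_predict yields the prediction after 1, 2, ..., n_estimators trees,
# so no refitting is needed to evaluate a smaller ensemble.
staged = list(gb.staged_predict(X))
pred_first_10 = staged[9]   # prediction using only the first 10 trees
pred_all = staged[-1]       # prediction using all 50 trees
```

Because boosting stages are added sequentially, "the first 10 trees" is a well-defined sub-model here; for Random Forests, any subset of trees is itself a valid forest.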

## Example usage:


# 1. Create data:

```python
from sklearn.datasets import make_classification, make_regression
Xcls, ycls = make_classification(n_samples=200, n_features=20, n_classes=3, random_state=42, n_clusters_per_class=3, n_informative=5)
Xreg, yreg = make_regression(n_samples=200, n_features=20, random_state=42)
```

# 2. Create and Fit RandomForest and GradientBoosting models for Regression and Classification

To tune the `ntrees` parameter, the package introduces custom classes. They are direct subclasses of the `sklearn` classes (`RandomForestClassifier`, `RandomForestRegressor`, `GradientBoostingClassifier`, `GradientBoostingRegressor`) and accept the same constructor arguments.

```python
import ntree_tuning as ntt

rf_cls = ntt.Ntree_RandForest_Classifier(n_estimators=100)
rf_cls.fit(Xcls, ycls)

rf_reg = ntt.Ntree_RandForest_Regressor(n_estimators=100)
rf_reg.fit(Xreg, yreg)

# subsample < 1.0 so that scikit-learn computes out-of-bag estimates during fitting
gb_cls = ntt.Ntree_GradBoost_Classifier(n_estimators=100, subsample=0.8)
gb_cls.fit(Xcls, ycls)

gb_reg = ntt.Ntree_GradBoost_Regressor(n_estimators=100, subsample=0.8)
gb_reg.fit(Xreg, yreg)
```
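The Gradient Boosting models above are fitted with `subsample=0.8` because scikit-learn only records out-of-bag (OOB) estimates when each stage is trained on a subsample. As a rough sketch of what those estimates look like in plain scikit-learn (this is not the package's own code, and `best_ntrees_oob` is a name chosen for illustration):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=200, n_features=20, random_state=42)

# With subsample < 1.0, each stage leaves some rows out; these rows give OOB estimates.
gb = GradientBoostingRegressor(n_estimators=100, subsample=0.8, random_state=0).fit(X, y)

# oob_improvement_[i] is the OOB loss improvement obtained by adding tree i + 1.
cumulative_oob_gain = np.cumsum(gb.oob_improvement_)

# The number of trees with the largest cumulative OOB improvement.
best_ntrees_oob = int(np.argmax(cumulative_oob_gain)) + 1
print(best_ntrees_oob)
```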

# 3. Tune ntrees

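You can then call the `tune_ntrees` method to get a dictionary that maps each `ntrees` value to its OOB error. In the example, the Gradient Boosting models are tuned without arguments, while the Random Forest models take the training data and the range of `ntrees` values to evaluate.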

```python
# Gradient Boosting
print(gb_reg.tune_ntrees())
print(gb_cls.tune_ntrees())


# Random Forests: evaluate ntrees values between min_trees and max_trees in steps of delta_trees
min_trees = 20
max_trees = 80
delta_trees = 5

print(rf_reg.tune_ntrees(Xreg, yreg, min_trees, max_trees, delta_trees))
print(rf_cls.tune_ntrees(Xcls, ycls, min_trees, max_trees, delta_trees))
```
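Once the dictionary is available, choosing the final value of `ntrees` is a matter of picking the key with the smallest OOB error. A minimal sketch, assuming the result has the shape `{ntrees: oob_error}` described above; the numbers are made up for illustration and are not output of the package:

```python
# Hypothetical tuning result in the form {ntrees: oob_error}.
oob_errors = {20: 0.31, 25: 0.29, 30: 0.27, 35: 0.28, 40: 0.28}

# Pick the ntrees value with the lowest OOB error.
best_ntrees = min(oob_errors, key=oob_errors.get)
print(best_ntrees, oob_errors[best_ntrees])  # 30 0.27
```

The selected value can then be passed as `ntrees` when predicting.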

# 4. Predict with ntrees

```python
# Predict using only 10 of the 100 fitted trees
print(gb_reg.predict_ntrees(Xreg, ntrees=10))
```
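For Random Forests, predicting with fewer trees amounts to averaging over a subset of the fitted trees. The sketch below shows this with plain scikit-learn; `predict_first_n` is a hypothetical helper, and taking the first `n` trees is an assumption about how a subset might be chosen, not a description of the package's internals.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=200, n_features=20, random_state=42)
rf = RandomForestRegressor(n_estimators=30, random_state=0).fit(X, y)


def predict_first_n(forest, X, n):
    """Average the predictions of the first n fitted trees (hypothetical helper)."""
    trees = forest.estimators_[:n]
    return np.mean([tree.predict(X) for tree in trees], axis=0)


pred_10 = predict_first_n(rf, X, 10)                     # uses 10 trees
pred_all = predict_first_n(rf, X, len(rf.estimators_))   # uses all 30 trees
```

With all trees included, the result matches `rf.predict(X)`, since a regression forest predicts the mean of its trees.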


