Metadata-Version: 2.4
Name: evo-gafs
Version: 0.1.0
Summary: Genetic Algorithm Feature Selector — a scikit-learn-compatible wrapper feature selector for tabular data (evo-suite)
Author: Axel Skrauba
License: MIT
Project-URL: Repository, https://github.com/AxelSkrauba/evo-suite
Project-URL: Documentation, https://github.com/AxelSkrauba/evo-suite/tree/main/packages/evo-gafs
Project-URL: Bug Tracker, https://github.com/AxelSkrauba/evo-suite/issues
Keywords: feature selection,genetic algorithm,DEAP,NSGA-II,machine learning,scikit-learn,wrapper method,evolutionary computation
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=1.5
Requires-Dist: scikit-learn>=1.6
Requires-Dist: deap>=1.4
Provides-Extra: viz
Requires-Dist: matplotlib>=3.6; extra == "viz"
Provides-Extra: dev
Requires-Dist: pytest>=7.4; extra == "dev"
Requires-Dist: pytest-cov>=4.1; extra == "dev"
Requires-Dist: ruff>=0.6; extra == "dev"
Requires-Dist: mypy>=1.8; extra == "dev"
Requires-Dist: matplotlib>=3.6; extra == "dev"

# evo-gafs — Genetic Algorithm Feature Selector

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](../../LICENSE)

A **scikit-learn-compatible** wrapper feature selector for tabular data, powered
by [DEAP](https://github.com/DEAP/deap). `evo-gafs` searches for the subset of
features that maximises a cross-validated score of your model, and lets you
explicitly trade raw performance for a smaller feature set, which is useful
for edge deployment.

Part of the [`evo-suite`](../../README.md) family (import name: `evo_gafs`).

## Why evo-gafs?

| Capability | evo-gafs |
|------------|----------|
| Single-objective **weighted** fitness with a configurable `alpha` (performance ↔ compression) | ✓ |
| **Multi-objective** NSGA-II with an accessible Pareto front | ✓ |
| **Repair operator** guaranteeing a minimum number of features | ✓ |
| Evaluation **cache** to skip repeated genomes | ✓ |
| Native scikit-learn `fit`/`transform`/`get_support`, usable in a `Pipeline` | ✓ |
| Built-in multi-dataset `BenchmarkRunner` | ✓ |
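
To make two of the rows above concrete, here is a minimal sketch of what a repair operator and an evaluation cache typically look like in a genetic-algorithm feature selector. It is an illustration only, not evo-gafs's actual implementation: the names `repair` and `make_cached`, and the representation of a genome as a list of 0/1 flags, are assumptions made for this example.

```python
import random


def repair(genome, min_features, rng=random):
    """Return a copy of `genome` with at least `min_features` bits set.

    `genome` is a list of 0/1 flags, one per feature (an assumed encoding).
    Missing bits are switched on at randomly chosen positions.
    """
    if min_features > len(genome):
        raise ValueError("min_features exceeds the number of features")
    repaired = list(genome)
    off = [i for i, bit in enumerate(repaired) if bit == 0]
    missing = min_features - sum(repaired)
    if missing > 0:
        for i in rng.sample(off, missing):
            repaired[i] = 1
    return repaired


def make_cached(evaluate):
    """Wrap `evaluate` so that each distinct genome is scored only once."""
    cache = {}

    def cached(genome):
        key = tuple(genome)  # lists are not hashable, tuples are
        if key not in cache:
            cache[key] = evaluate(genome)
        return cache[key]

    cached.cache = cache  # exposed so callers can inspect hit counts
    return cached


rng = random.Random(0)
print(repair([0, 0, 0, 0, 0], min_features=2, rng=rng))  # two bits switched on
print(repair([1, 0, 1, 0, 0], min_features=2, rng=rng))  # already valid, unchanged

calls = []
score = make_cached(lambda g: calls.append(g) or sum(g))  # stand-in for a CV score
score([1, 0, 1])
score([1, 0, 1])  # served from the cache, the wrapped function is not called again
print(len(calls))
```

In a real run the wrapped function would be an expensive cross-validation, which is why skipping repeated genomes matters.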

## Installation

```bash
pip install evo-gafs            # core
pip install "evo-gafs[viz]"     # + matplotlib for the plotting helpers (quotes keep zsh from globbing the brackets)
```

## Quickstart

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier
from evo_gafs import GAFeatureSelector, GAConfig

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

selector = GAFeatureSelector(
    estimator=DecisionTreeClassifier(random_state=42),
    config=GAConfig(population_size=30, n_generations=20, alpha=0.8, verbose=False),
)
selector.fit(X, y)

print(selector.summary())
X_reduced = selector.transform(X)
print("Selected:", selector.get_support(indices=True))
```

### Multi-objective (Pareto front)

```python
# Reuses the imports and X, y from the Quickstart above.
config = GAConfig(mode="multiobjective", population_size=40, n_generations=30, verbose=False)
selector = GAFeatureSelector(estimator=DecisionTreeClassifier(random_state=42), config=config)
selector.fit(X, y)

for point in selector.result_.pareto_front:
    print(point["n_features"], point["cv_score"])
```
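
A Pareto front still leaves the choice of a single subset to you. One common rule is to take the smallest subset whose score is within a tolerance of the best score on the front. The sketch below assumes each point behaves like a dict with the `n_features` and `cv_score` keys used in the loop above, and that the front can be iterated more than once; the helper `pick_compact_point` and the sample data are hypothetical and not part of evo-gafs.

```python
def pick_compact_point(front, tolerance=0.01):
    """Return the point with the fewest features whose cv_score is
    within `tolerance` of the best cv_score on the front."""
    best_score = max(p["cv_score"] for p in front)
    good_enough = [p for p in front if p["cv_score"] >= best_score - tolerance]
    # Fewest features first; ties are broken in favour of the higher score.
    return min(good_enough, key=lambda p: (p["n_features"], -p["cv_score"]))


# Made-up front for illustration; in practice pass selector.result_.pareto_front.
front = [
    {"n_features": 3, "cv_score": 0.91},
    {"n_features": 6, "cv_score": 0.945},
    {"n_features": 12, "cv_score": 0.95},
]

chosen = pick_compact_point(front, tolerance=0.01)
print(chosen["n_features"], chosen["cv_score"])
```

With this sample data the 6-feature point is chosen: it gives up 0.005 of score relative to the 12-feature point while halving the feature count.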

### In a scikit-learn pipeline

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# `config`, `X`, `y`, and the evo_gafs imports come from the snippets above.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("selector", GAFeatureSelector(estimator=DecisionTreeClassifier(), config=config)),
    ("clf", SVC()),
])
pipe.fit(X, y)
```

## The `alpha` trade-off (single-objective)

```
fitness = alpha * cv_score + (1 - alpha) * compression
compression = 1 - n_selected / n_total
```

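- `alpha = 1.0` → pure wrapper: fitness equals `cv_score`, so subset size is ignored
- `alpha ≈ 0.7` → balanced, good default for edge deployment
- lower `alpha` → more weight on `compression`, so smaller subsets are favoured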
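
As a worked example of the formula, the sketch below scores two candidate subsets under two values of `alpha`. The function names and the candidate scores are made up for illustration; only the two formulas come from the text above.

```python
def compression(n_selected: int, n_total: int) -> float:
    """Fraction of features removed: 1 - n_selected / n_total."""
    return 1.0 - n_selected / n_total


def weighted_fitness(cv_score: float, n_selected: int, n_total: int, alpha: float) -> float:
    """fitness = alpha * cv_score + (1 - alpha) * compression."""
    return alpha * cv_score + (1.0 - alpha) * compression(n_selected, n_total)


# Two hypothetical candidates on a 30-feature dataset (scores are invented).
full = dict(cv_score=0.95, n_selected=30, n_total=30)
small = dict(cv_score=0.93, n_selected=6, n_total=30)

for alpha in (1.0, 0.7):
    f_full = weighted_fitness(alpha=alpha, **full)
    f_small = weighted_fitness(alpha=alpha, **small)
    winner = "small" if f_small > f_full else "full"
    print(f"alpha={alpha}: full={f_full:.3f} small={f_small:.3f} -> {winner}")
```

At `alpha = 1.0` the full set wins on score alone (0.950 against 0.930). At `alpha = 0.7` the 6-feature subset wins (0.891 against 0.665), because its compression of 0.8 outweighs the 0.02 drop in score.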

## Citation

```bibtex
@software{evo_gafs,
  author    = {Skrauba, Axel},
  title     = {evo-gafs: Genetic Algorithm Feature Selector for tabular data},
  year      = {2026},
  version   = {0.1.0},
  url       = {https://github.com/AxelSkrauba/evo-suite}
}
```

## License

[MIT](../../LICENSE)
