Metadata-Version: 2.4
Name: fsga
Version: 1.1.8
Summary: Feature selection using a genetic algorithm
Author-email: Piotr Krzysztof <piotr@codextechnologies.org>
License: MIT
Project-URL: Homepage, https://github.com/zweiss/feature-selection-via-genetic-algorithm
Project-URL: Documentation, https://github.com/zweiss/feature-selection-via-genetic-algorithm/blob/main/README.md
Project-URL: Repository, https://github.com/zweiss/feature-selection-via-genetic-algorithm
Project-URL: Issues, https://github.com/zweiss/feature-selection-via-genetic-algorithm/issues
Keywords: genetic-algorithm,feature-selection,machine-learning,optimization,evolutionary-computation
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.13
Description-Content-Type: text/markdown
Requires-Dist: numpy>=2.3.5
Requires-Dist: pandas>=2.3.3
Requires-Dist: scikit-learn>=1.7.2
Requires-Dist: matplotlib>=3.10.7
Requires-Dist: seaborn>=0.13.2
Requires-Dist: pyyaml>=6.0.3
Requires-Dist: scipy>=1.16.3
Provides-Extra: dev
Requires-Dist: pytest>=9.0.1; extra == "dev"
Requires-Dist: pytest-cov>=7.0.0; extra == "dev"
Requires-Dist: ruff>=0.14.5; extra == "dev"
Requires-Dist: mypy>=1.18.2; extra == "dev"
Requires-Dist: black>=25.11.0; extra == "dev"
Provides-Extra: viz
Requires-Dist: plotly>=6.4.0; extra == "viz"
Requires-Dist: ipywidgets>=8.1.8; extra == "viz"
Provides-Extra: ml
Requires-Dist: xgboost>=3.1.1; extra == "ml"
Requires-Dist: lightgbm>=4.6.0; extra == "ml"
Provides-Extra: notebooks
Requires-Dist: jupyterlab>=4.4.10; extra == "notebooks"
Provides-Extra: all
Requires-Dist: fsga[dev,ml,notebooks,viz]; extra == "all"

# Feature Selection via Genetic Algorithm (FSGA)

[![PyPI version](https://badge.fury.io/py/fsga.svg)](https://pypi.org/project/fsga/)
[![Total Downloads](https://static.pepy.tech/badge/fsga)](https://pepy.tech/project/fsga)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/fsga)](https://pypi.org/project/fsga/)

A university project implementing feature selection using Genetic Algorithms, with evaluation and visualization tools.

## Quick Start

```bash
# Installation
git clone https://github.com/zweiss/feature-selection-via-genetic-algorithm
cd feature-selection-via-genetic-algorithm
uv venv && source .venv/bin/activate
uv pip install -e .

# Run example
python experiments/run_comparison.py
```

## Basic Usage

```python
from fsga.core.genetic_algorithm import GeneticAlgorithm
from fsga.datasets.loader import load_dataset
from fsga.evaluators.accuracy_evaluator import AccuracyEvaluator
from fsga.ml.models import ModelWrapper
from fsga.mutations.bitflip_mutation import BitFlipMutation
from fsga.operators.uniform_crossover import UniformCrossover
from fsga.selectors.tournament_selector import TournamentSelector

# Load the data, then set up the model and evaluator
X_train, X_test, y_train, y_test, _ = load_dataset('iris', split=True)
model = ModelWrapper('rf', n_estimators=50, random_state=42)
evaluator = AccuracyEvaluator(X_train, y_train, X_test, y_test, model)

# Configure and run the GA
ga = GeneticAlgorithm(
    num_features=X_train.shape[1],
    evaluator=evaluator,
    selector=TournamentSelector(evaluator, tournament_size=3),
    crossover_operator=UniformCrossover(),
    mutation_operator=BitFlipMutation(probability=0.01),
    population_size=50,
    num_generations=100,
    early_stopping_patience=10
)

results = ga.evolve()
print(f"Accuracy: {results['best_fitness']:.2%}")
print(f"Features: {results['best_chromosome'].sum()}/{X_train.shape[1]}")
```
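
The snippet above reports `results['best_chromosome'].sum()` as the number of selected features, which suggests the chromosome is a binary mask over the feature columns. The sketch below shows what scoring such a mask could look like using plain scikit-learn. `mask_fitness` is a hypothetical helper written for illustration and is not part of the `fsga` API; the 70/30 split and the random forest settings are assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def mask_fitness(chromosome, X_train, y_train, X_test, y_test):
    """Score a binary feature mask by test accuracy (hypothetical helper, not fsga API)."""
    mask = np.asarray(chromosome, dtype=bool)
    if not mask.any():
        # An empty feature set cannot be trained on, so give it the worst score.
        return 0.0
    model = RandomForestClassifier(n_estimators=50, random_state=42)
    model.fit(X_train[:, mask], y_train)
    return float(model.score(X_test[:, mask], y_test))


X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Keep only the two petal measurements (columns 2 and 3 in scikit-learn's iris).
petal_only = np.array([0, 0, 1, 1])
score = mask_fitness(petal_only, X_train, y_train, X_test, y_test)
print(f"Features kept: {petal_only.sum()}/{X.shape[1]}, accuracy: {score:.2%}")
```

Returning 0.0 for an all-zero mask is one simple way to keep the search away from empty feature sets; the package's own evaluators may handle this case differently.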

## Key Features

- **Modular Design**: Swappable operators, selectors, and evaluators
- **Multiple Operators**: 5 crossover types, 5 selection strategies, 3 fitness functions
- **Baseline Comparisons**: Built-in RFE, LASSO, Mutual Information, Chi², ANOVA
- **Statistical Testing**: Wilcoxon, Mann-Whitney, Cohen's d, Jaccard stability
- **Visualization**: 9 plot functions for analysis and comparison
- **Experiment Framework**: `ExperimentRunner` for reproducible experiments
- **Configuration**: YAML-based configuration system
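
To make the statistical testing bullet concrete, here is a rough sketch of the four measures using SciPy and NumPy directly. It does not use the `fsga` API: `cohens_d` and `jaccard_stability` are hypothetical helper names, and the accuracy values and masks are made-up illustration data, not results from this project.

```python
import numpy as np
from scipy.stats import mannwhitneyu, wilcoxon


def cohens_d(a, b):
    """Cohen's d effect size using a pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled_var = (
        (len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)
    ) / (len(a) + len(b) - 2)
    return float((a.mean() - b.mean()) / np.sqrt(pooled_var))


def jaccard_stability(masks):
    """Mean pairwise Jaccard similarity between binary feature masks."""
    masks = [np.asarray(m, dtype=bool) for m in masks]
    scores = []
    for i in range(len(masks)):
        for j in range(i + 1, len(masks)):
            union = np.logical_or(masks[i], masks[j]).sum()
            inter = np.logical_and(masks[i], masks[j]).sum()
            # Two empty masks are treated as identical.
            scores.append(inter / union if union else 1.0)
    return float(np.mean(scores))


# Made-up per-run accuracies for two methods (illustration only).
ga_acc = np.array([0.960, 0.972, 0.951, 0.983, 0.965, 0.974, 0.990, 0.958])
base_acc = np.array([0.941, 0.950, 0.946, 0.952, 0.930, 0.961, 0.949, 0.944])

# Paired test: the same runs/splits are compared under both methods.
_, p_wilcoxon = wilcoxon(ga_acc, base_acc)
# Unpaired test: the two samples are treated as independent.
_, p_mwu = mannwhitneyu(ga_acc, base_acc, alternative="two-sided")
effect = cohens_d(ga_acc, base_acc)

# Made-up masks selected in three runs over four features.
stability = jaccard_stability([[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0]])

print(f"Wilcoxon p={p_wilcoxon:.4f}, Mann-Whitney p={p_mwu:.4f}")
print(f"Cohen's d={effect:.2f}, Jaccard stability={stability:.3f}")
```

Wilcoxon fits when both methods are evaluated on the same runs or splits; Mann-Whitney is the alternative when the samples are independent.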

## Architecture

```
fsga/
├── core/          # GA engine (genetic_algorithm, population)
├── operators/     # Crossover: uniform, single-point, two-point, multi-point
├── mutations/     # Mutation: bitflip
├── selectors/     # Selection: tournament, roulette, ranking, elitism
├── evaluators/    # Fitness: accuracy, F1, balanced accuracy
├── ml/            # Model wrappers (sklearn integration)
├── datasets/      # Dataset loaders (iris, wine, breast_cancer, digits)
├── analysis/      # Baselines + ExperimentRunner
├── visualization/ # 9 plot functions
└── utils/         # Config, metrics, serialization, logging
```
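
For readers new to genetic algorithms, the three operator families in the tree above can be sketched in a few lines of NumPy. This is a minimal illustration of tournament selection, uniform crossover, and bit-flip mutation on binary chromosomes. The function names are hypothetical and the code is independent of the classes in `fsga/`; the toy fitness (number of selected bits) is a stand-in for a real evaluator.

```python
import numpy as np

rng = np.random.default_rng(42)


def tournament_select(population, fitness, tournament_size=3):
    """Return a copy of the fittest of `tournament_size` randomly drawn individuals."""
    contenders = rng.choice(len(population), size=tournament_size, replace=False)
    winner = contenders[np.argmax(fitness[contenders])]
    return population[winner].copy()


def uniform_crossover(parent_a, parent_b):
    """Take each gene from either parent with equal probability."""
    take_from_a = rng.random(parent_a.shape) < 0.5
    return np.where(take_from_a, parent_a, parent_b)


def bitflip_mutation(chromosome, probability=0.01):
    """Flip each bit independently with the given probability."""
    flips = rng.random(chromosome.shape) < probability
    return np.where(flips, 1 - chromosome, chromosome)


# 50 random binary chromosomes over 30 features.
population = rng.integers(0, 2, size=(50, 30))
# Toy fitness: number of selected bits (a real run would score a trained model).
fitness = population.sum(axis=1).astype(float)

parent_a = tournament_select(population, fitness)
parent_b = tournament_select(population, fitness)
child = bitflip_mutation(uniform_crossover(parent_a, parent_b))
print(child)
```

One generation repeats the select, cross, and mutate steps until a new population is filled, then re-evaluates fitness.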

## Documentation

- **[Getting Started](docs/GETTING_STARTED.md)** - Installation and basic usage
- **[Tutorial](docs/TUTORIAL.md)** - Step-by-step guide with examples
- **[Architecture](docs/ARCHITECTURE.md)** - System design and extension points
- **[Project Plan](docs/about/project-plan.md)** - Status and roadmap
- **Module READMEs** - See `fsga/*/README.md` for component details

## Example Results

**Breast Cancer Dataset** (30 features → 12 features):
- GA Accuracy: **98.3%** with **40% of features**
- All Features: 95.7% with 100% of features
- **+2.6 percentage points accuracy, 60% dimensionality reduction**

**Iris Dataset** (4 features → 2 features):
- GA Accuracy: **98.3%** with **50% of features**
- Selected: petal length, petal width

**Wine Dataset** (13 features → 6.5 features on average):
- GA Accuracy: **100%** with **50% of features**
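
The Breast Cancer summary figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the helper names are hypothetical.

```python
def feature_fraction(selected, total):
    """Fraction of the original features that were kept."""
    return selected / total


def percentage_point_gain(acc_selected, acc_all):
    """Absolute accuracy difference, expressed in percentage points."""
    return (acc_selected - acc_all) * 100


# Breast Cancer figures reported above: 12 of 30 features, 98.3% vs 95.7%.
kept = feature_fraction(12, 30)
gain = percentage_point_gain(0.983, 0.957)
print(f"kept {kept:.0%}, removed {1 - kept:.0%}, gain {gain:+.1f} pp")
# prints: kept 40%, removed 60%, gain +2.6 pp
```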

## Running Experiments

```bash
# Full analysis (all datasets, all visualizations)
python experiments/run_experiment.py

# Quick test (single dataset, fewer runs)
python experiments/run_experiment.py --quick

# Specific datasets only
python experiments/run_experiment.py --datasets iris wine

# Without visualizations (faster)
python experiments/run_experiment.py --no-plots

# Results saved to: results/{mode}/{dataset}/
```
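
If you want to build a similar command-line interface for your own experiments, the flags above map naturally onto `argparse`. The following is a hypothetical sketch and not the actual contents of `experiments/run_experiment.py`: the mode names `quick` and `full` and the helper `output_dirs` are assumptions made for illustration.

```python
import argparse
from pathlib import Path

# Dataset names taken from the architecture overview in this README.
KNOWN_DATASETS = ["iris", "wine", "breast_cancer", "digits"]


def build_parser():
    """Build a parser mirroring the flags shown above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Run feature-selection experiments")
    parser.add_argument("--quick", action="store_true",
                        help="single dataset, fewer runs")
    parser.add_argument("--datasets", nargs="+", choices=KNOWN_DATASETS,
                        default=KNOWN_DATASETS, help="datasets to run")
    parser.add_argument("--no-plots", action="store_true",
                        help="skip visualizations")
    return parser


def output_dirs(args, root="results"):
    """Map parsed arguments to results/{mode}/{dataset}/ paths."""
    mode = "quick" if args.quick else "full"  # assumed mode names
    return [Path(root) / mode / name for name in args.datasets]


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--datasets", "iris", "wine", "--no-plots"])
for path in output_dirs(args):
    print(path)
```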

## Tests

```bash
# Run all tests
uv run pytest tests/ -v

# With coverage
uv run pytest tests/ --cov=fsga --cov-report=html

# Current: 280+ tests, 82% coverage
```

## Configuration

Example config (`configs/default.yaml`):

```yaml
population_size: 50
num_generations: 100
mutation_rate: 0.01
crossover_rate: 0.8
early_stopping_patience: 10

dataset:
  name: iris
  split_ratio: 0.7
```

Load with:

```python
from fsga.utils.config import Config
config = Config.from_file('configs/default.yaml')
```
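
To inspect the same settings without going through `Config`, the YAML can be read directly with PyYAML, which is already a dependency. This is a minimal sketch, assuming the file contains exactly the keys shown above; it parses an inline copy of the config so it runs without the repository checked out, and how `Config` itself exposes these values is not shown here.

```python
import yaml

# Inline copy of the example config, so no file on disk is needed.
CONFIG_TEXT = """
population_size: 50
num_generations: 100
mutation_rate: 0.01
crossover_rate: 0.8
early_stopping_patience: 10

dataset:
  name: iris
  split_ratio: 0.7
"""

# safe_load parses plain data types only and never constructs arbitrary objects.
config = yaml.safe_load(CONFIG_TEXT)

# Pick out the settings that share names with GeneticAlgorithm arguments in Basic Usage.
ga_kwargs = {
    key: config[key]
    for key in ("population_size", "num_generations", "early_stopping_patience")
}
print(ga_kwargs)
print(config["dataset"]["name"], config["dataset"]["split_ratio"])
```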

## License

MIT License - see LICENSE file for details.

## Contributing

Contributions welcome! See module READMEs for extension points:
- New operators: `fsga/operators/README.md`
- New selectors: `fsga/selectors/README.md`
- New evaluators: `fsga/evaluators/README.md`
