Metadata-Version: 2.4
Name: music-classifier-cleaner
Version: 0.1.0
Summary: Classify and organise a music library by artist genre via MusicBrainz
Requires-Python: >=3.10,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Dist: mutagen
Requires-Dist: rapidfuzz
Requires-Dist: requests
Description-Content-Type: text/markdown

# Music Classifier & Cleaner

Classifies a music library by artist genre (via MusicBrainz) and reorganises folders by genre.

## Installation

```bash
# Install in editable mode (recommended for development)
pip install -e .

# Or with Poetry
poetry install
```

## Commands

| Command | Purpose |
|---|---|
| `classify-organize` | Scan library, classify artists by genre via MusicBrainz, reorganise into genre folders |
| `scan-library` | Scan library for artists with few songs, output a CSV for manual review |
| `discover-from-library` | Process the review CSV — remove artist folders or explore top tracks via Deezer |
| `tag-library-genres` | Tag every `.mp3` and `.flac` file with up to 3 MusicBrainz genres and a language tag |

After `pip install -e .`, all commands are available on the `PATH` of the environment you installed into. With Poetry, run them as `poetry run <command>`.

---

### `classify-organize` — Classify and organise by genre

Scans the library root for artist folders (any top-level directory that isn't a genre folder) and for loose MP3s that have no parent artist folder. For each artist:

1. **Deduplicates** similar folder names using fuzzy matching (e.g. `"Greenday"` → `"Green Day"`)
2. **Renames** the folder to the canonical name, merging if the target already exists
3. **Updates** the `artist` ID3 tag in every MP3 inside the folder to match the canonical name
4. **Queries MusicBrainz** for the artist's genre tags
5. **Classifies** the tags into one of the predefined genre buckets via keyword matching
6. **Moves** the entire artist folder into the matching genre subfolder (or `other/` if nothing matched)
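
Step 5 can be pictured as a small keyword matcher. The sketch below is illustrative only: the bucket names, the keyword lists, and the `classify_tags` helper are hypothetical and not taken from `music_classifier/utils.py`, and the scoring rule (the bucket with the most keyword hits wins) is an assumption.

```python
# Hypothetical buckets and keywords; the project's real lists may differ.
GENRE_KEYWORDS = {
    "rock": ("rock", "punk", "grunge"),
    "jazz": ("jazz", "swing", "bebop"),
    "electronic": ("electronic", "house", "techno"),
}


def classify_tags(tags, buckets=GENRE_KEYWORDS, fallback="other"):
    """Return the bucket whose keywords match the most tags, or `fallback`."""
    scores = {}
    for tag in tags:
        lowered = tag.lower()
        for bucket, keywords in buckets.items():
            # A tag counts at most once per bucket, even if several keywords hit.
            if any(keyword in lowered for keyword in keywords):
                scores[bucket] = scores.get(bucket, 0) + 1
    if not scores:
        return fallback
    return max(scores, key=scores.get)


print(classify_tags(["Pop Punk", "alternative rock"]))
print(classify_tags(["spoken word"]))
```

Substring matching keeps the rule set short, but it also means a keyword such as `"house"` would match unrelated tags that happen to contain it, so keyword lists need some care.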

```bash
classify-organize /path/to/music/library
```
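
The fuzzy deduplication in step 1 can be sketched as follows. Note the substitution: the project depends on `rapidfuzz`, but this example uses `difflib` from the standard library so it runs without extra packages. The `normalise` and `canonical_name` helpers and the `0.9` threshold are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalise(name):
    """Lowercase and drop non-alphanumerics so 'Green Day' == 'Greenday'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def canonical_name(folder_name, known_names, threshold=0.9):
    """Return the closest entry in `known_names`, or `folder_name` if none is close."""
    target = normalise(folder_name)
    best_name, best_score = folder_name, 0.0
    for candidate in known_names:
        score = SequenceMatcher(None, target, normalise(candidate)).ratio()
        if score > best_score:
            best_name, best_score = candidate, score
    # Only accept the match when it clears the (assumed) similarity threshold.
    return best_name if best_score >= threshold else folder_name


print(canonical_name("Greenday", ["Green Day", "Metallica"]))
```

A high threshold errs on the side of leaving folders alone, which matters here because a false match leads to a rename and merge.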

---

### `scan-library` — Scan for sparse artists

Walks every genre subfolder, counts songs per artist, and writes a CSV listing artists with few songs (judged against a configurable threshold, 4 by default) for manual review.

- Skips empty artist folders (prompts to delete them)
- Also checks loose MP3s in the library root (tagged as genre `"root"`)
- Outputs `artists_to_review.csv` in the library root with columns: `artist`, `genre`, `song_count`, `path`, `decision`
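
The scan described above might look roughly like this. It is a minimal sketch under stated assumptions: the function names are hypothetical, only `.mp3` and `.flac` files are counted as songs, the threshold is treated as inclusive, and the handling of empty folders and loose root-level MP3s is left out.

```python
import csv
from pathlib import Path

# Assumption: only these suffixes count as songs; the real tool may differ.
AUDIO_SUFFIXES = {".mp3", ".flac"}
COLUMNS = ["artist", "genre", "song_count", "path", "decision"]


def find_sparse_artists(library_root, threshold=4):
    """Return one row (as a dict) per artist folder with few songs."""
    rows = []
    root = Path(library_root)
    for genre_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for artist_dir in sorted(p for p in genre_dir.iterdir() if p.is_dir()):
            song_count = sum(
                1
                for f in artist_dir.rglob("*")
                if f.is_file() and f.suffix.lower() in AUDIO_SUFFIXES
            )
            # Assumption: the threshold is inclusive and empty folders are
            # dealt with separately.
            if 0 < song_count <= threshold:
                rows.append({
                    "artist": artist_dir.name,
                    "genre": genre_dir.name,
                    "song_count": song_count,
                    "path": str(artist_dir),
                    "decision": "",  # filled in by hand during review
                })
    return rows


def write_review_csv(rows, csv_path):
    """Write the rows with the column order listed above."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```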

```bash
# Default threshold: 4 songs
scan-library /path/to/music/library

# Custom threshold
scan-library /path/to/music/library -t 3
```

After filling in the `decision` column (`remove` or `explore`), process the CSV with `discover-from-library`.

---

### `discover-from-library` — Process review decisions

Reads the CSV produced by `scan-library` and acts on each row:

- **`remove`** — permanently deletes the entire artist folder via `shutil.rmtree` (nothing is moved to a trash folder)
- **`explore`** — looks up the artist on Deezer and prints their top 5 tracks with durations
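
A minimal sketch of that row-by-row dispatch is shown below. The `process_decisions` name and the `explore` callback are hypothetical: the callback stands in for the Deezer lookup so that the example stays offline, and the real command may structure this differently.

```python
import csv
import shutil
from pathlib import Path


def process_decisions(csv_path, explore=print):
    """Act on each CSV row; return the artists that were removed and explored."""
    removed, explored = [], []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            decision = (row.get("decision") or "").strip().lower()
            if decision == "remove":
                folder = Path(row["path"])
                if folder.is_dir():
                    shutil.rmtree(folder)  # irreversible: deletes the whole tree
                    removed.append(row["artist"])
            elif decision == "explore":
                explore(row["artist"])  # placeholder for the Deezer lookup
                explored.append(row["artist"])
            # Rows with any other value (including blank) are left untouched.
    return removed, explored
```

Because `shutil.rmtree` cannot be undone, it is worth double-checking the `path` column before running the command on a reviewed CSV.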

```bash
discover-from-library /path/to/artists_to_review.csv
```

---

### `tag-library-genres` — Tag library with genres from MusicBrainz

Recursively walks every `.mp3` and `.flac` file in the library. For each file:

1. **Reads the artist** from the file's metadata (EasyID3 for MP3, FLAC Vorbis comments for FLAC)
2. **Looks up the artist** on MusicBrainz — fetches genre tags and detects language from tags
3. **Verifies the match** — tags the file only if the artist name returned by MusicBrainz equals the file's artist tag (case-insensitive); otherwise the file is skipped, e.g. when MusicBrainz returned a different artist
4. **Writes tags** — up to 3 genre tags and an ISO 639-2 language code
5. **Skips already-tagged files** — existing tags are checked before any MusicBrainz query or write

Caches MusicBrainz results per artist so files by the same artist only trigger one API lookup.
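
The per-artist cache and the case-insensitive match check could be sketched like this. Both helpers are hypothetical, `fetch` stands in for whatever function performs the MusicBrainz request, and normalising the cache key by case and whitespace is an assumption rather than documented behaviour.

```python
def make_cached_lookup(fetch):
    """Wrap `fetch` so each artist triggers at most one underlying call."""
    cache = {}

    def lookup(artist):
        # Assumption: artists differing only by case or padding share a result.
        key = artist.strip().casefold()
        if key not in cache:
            cache[key] = fetch(artist)
        return cache[key]

    return lookup


def names_match(file_artist, matched_name):
    """Case-insensitive equality between the file tag and the returned name."""
    return file_artist.strip().casefold() == matched_name.strip().casefold()


def fake_fetch(artist):
    # Stand-in for a network call; returns a fixed, made-up result.
    return {"name": "Green Day", "genres": ["punk rock"]}


lookup = make_cached_lookup(fake_fetch)
result = lookup("green day")
print(names_match("green day", result["name"]))
```

Strict equality after case folding is deliberately conservative: a near miss such as a tribute act is skipped rather than tagged with the wrong artist's genres.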

```bash
# Preview only (no files are modified)
tag-library-genres /path/to/music/library -n

# Tag all files
tag-library-genres /path/to/music/library
```

**Tags written:**

| Format | Genre | Language |
|---|---|---|
| MP3 | `TCON` frame — comma-separated string (e.g. `"Swing, Jazz, Big Band"`) | `TLAN` frame — ISO 639-2 code (e.g. `"eng"`) |
| FLAC | Multiple `GENRE` Vorbis comments | `LANGUAGE` Vorbis comment |
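
To make the difference between the two formats concrete, the sketch below builds the values that would be written for each one. The `build_tag_values` helper is hypothetical and deliberately stops short of touching files; writing the values to disk with `mutagen` is omitted.

```python
def build_tag_values(genres, language, file_format, limit=3):
    """Return the tag names and values for one file, per the table above."""
    selected = list(genres)[:limit]
    if file_format == "mp3":
        # One TCON frame holding a comma-separated string.
        return {"TCON": ", ".join(selected), "TLAN": language}
    if file_format == "flac":
        # One GENRE Vorbis comment per genre.
        return {"GENRE": selected, "LANGUAGE": language}
    raise ValueError(f"unsupported format: {file_format!r}")


print(build_tag_values(["Swing", "Jazz", "Big Band", "Bop"], "eng", "mp3"))
print(build_tag_values(["Swing", "Jazz", "Big Band", "Bop"], "eng", "flac"))
```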

---

## Running tests

```bash
# Install test dependencies
pip install pytest pytest-cov

# Run all tests
pytest tests/ -v

# Run with coverage report
pytest tests/ --cov=music_classifier.utils -v

# Run with coverage + line numbers of misses
pytest tests/ --cov=music_classifier.utils --cov-report=term-missing -v

# Generate HTML coverage report
pytest tests/ --cov=music_classifier.utils --cov-report=html
open htmlcov/index.html  # macOS; on Linux use xdg-open instead
```

## Project structure

```
music-classifier-cleaner/
├── pyproject.toml          # Poetry config + console_scripts entry points
├── README.md
├── music_classifier/       # Installable Python package
│   ├── __init__.py
│   ├── utils.py            # Core utilities (MusicBrainz, tagging, classification)
│   └── cli.py              # CLI entry point functions (argparse + main)
└── tests/
    ├── __init__.py
    └── test_utils.py       # 87 tests covering non-API code paths
```

