Metadata-Version: 2.4
Name: songchain
Version: 0.2.0
Summary: Chain together ML models for music, LangChain-style
Author-email: Anand Sampat <anands@cs.stanford.edu>
License: MIT License
        
        Copyright (c) 2026 Anand Sampat
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/asampat3090/songchain
Keywords: music,generative-ai,midi,audio,musicgen,chains
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Multimedia :: Sound/Audio
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: torch>=2.0.0
Requires-Dist: soundfile
Requires-Dist: librosa
Requires-Dist: pretty_midi
Requires-Dist: pypianoroll
Requires-Dist: tqdm
Provides-Extra: generation
Requires-Dist: audiocraft; extra == "generation"
Provides-Extra: transcription
Requires-Dist: basic-pitch; extra == "transcription"
Requires-Dist: transformers; extra == "transcription"
Provides-Extra: datasets
Requires-Dist: pydub; extra == "datasets"
Provides-Extra: plot
Requires-Dist: matplotlib; extra == "plot"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: coverage-badge; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: flake8; extra == "dev"
Requires-Dist: pydub; extra == "dev"
Requires-Dist: matplotlib; extra == "dev"
Dynamic: license-file

# 🎵 SongChain 🔗

![Coverage](./coverage.svg)
![Tests](https://github.com/asampat3090/songchain/actions/workflows/ci.yml/badge.svg)

[LangChain](https://python.langchain.com/docs/get_started/introduction.html) made it easy to string together language models and their data processing. **SongChain does the same for music models** — voice, audio, and MIDI-based generation — so musicians and developers can compose AI music pipelines as easily as language pipelines.

Key differences from language chaining:

* **Focus on audio rather than language**: musicians work with audio and MIDI clips, and craft songs from variations of them
* **Audio and MIDI need their own encodings**: clips expose waveforms, spectrograms, piano rolls, note sequences, and EnCodec tokens — the representations music models actually consume
* **Chaining is temporal too**: outputs are concatenated, layered, and looped in time to craft larger pieces

## Installation

To use SongChain in your own project, install it from GitHub or a local clone (it is not yet on PyPI):

```bash
pip install git+https://github.com/asampat3090/songchain.git   # from GitHub
pip install /path/to/songchain                                  # from a local clone

# optional model backends (work with either form)
pip install "songchain[generation] @ git+https://github.com/asampat3090/songchain.git"   # MusicGen / AudioGen
pip install "/path/to/songchain[transcription]"                  # Basic Pitch, Pop2Piano
```

To work on songchain itself:

```bash
git clone https://github.com/asampat3090/songchain.git && cd songchain
pip install -e ".[dev]"
```

Then in your downstream code:

```python
# my_project/make_jingle.py
from songchain import Chain, PromptTemplate, concat
from songchain.models import NoteSequence, Transpose, MIDISynth, Fade

jingle_chain = Chain([
    PromptTemplate("{n1}5:0.2 {n2}5:0.2 {n3}5:0.2 {n3}5:0.6"),
    NoteSequence(),
    Transpose(semitones=-5),
    MIDISynth(sample_rate=22050),
    Fade(fade_out=0.3),
])

a = jingle_chain.run(n1="C", n2="E", n3="G")
b = jingle_chain.run(n1="D", n2="F", n3="A")
concat([a, b], crossfade=0.05).save("jingle.wav")
```

## Quickstart

Chain a prompt through MIDI generation, transformation, and synthesis to audio — no pretrained models needed:

```python
from songchain import Chain, PromptTemplate
from songchain.models import NoteSequence, Transpose, MIDISynth, Fade

chain = Chain([
    PromptTemplate("{root}4 {third}4 {fifth}4 {root}5:1.0"),
    NoteSequence(duration=0.25),     # text  -> MIDI
    Transpose(semitones=-12),        # MIDI  -> MIDI
    MIDISynth(sample_rate=44100),    # MIDI  -> audio
    Fade(fade_out=0.5),              # audio -> audio
])

clip = chain.run(root="C", third="E", fifth="G")
clip.save("arpeggio.wav")
```
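
To make the note-text step concrete, here is a minimal standalone sketch of how tokens such as `C4` or `C5:1.0` could be turned into MIDI pitches. It does not use songchain: the grammar (note name, optional `#`/`b`, octave, optional `:duration`), the C4 = 60 convention, and the helper names `parse_token` and `render` are all assumptions for illustration, and the duration unit is left as whatever the library uses.

```python
import re

# Semitone offsets of the natural notes within an octave.
_SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
# Assumed token grammar: NAME [#|b] OCTAVE [:DURATION]
_TOKEN = re.compile(r"^([A-G])([#b]?)(-?\d+)(?::(\d+(?:\.\d+)?))?$")


def parse_token(token, default_duration=0.25):
    """Parse 'C4' or 'C5:1.0' into (midi_pitch, duration)."""
    match = _TOKEN.match(token)
    if match is None:
        raise ValueError(f"unrecognised note token: {token!r}")
    name, accidental, octave, duration = match.groups()
    # Common convention: C4 (middle C) is MIDI pitch 60.
    pitch = 12 * (int(octave) + 1) + _SEMITONES[name]
    pitch += {"#": 1, "b": -1, "": 0}[accidental]
    return pitch, float(duration) if duration else default_duration


def render(template, **values):
    """Fill the template, then parse each whitespace-separated token."""
    return [parse_token(tok) for tok in template.format(**values).split()]


notes = render("{root}4 {third}4 {fifth}4 {root}5:1.0",
               root="C", third="E", fifth="G")
# Transposing by -12 semitones is a plain shift of every pitch.
transposed = [(pitch - 12, dur) for pitch, dur in notes]
```

Under these assumptions the arpeggio becomes pitches 60, 64, 67, 72, and the transposed version sits one octave lower.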

Generate music with a pretrained model (requires `[generation]`):

```python
from songchain import Chain, PromptTemplate, concat
from songchain.models import MusicGen

chain = Chain([
    PromptTemplate("{genre} beat with {mood} vibes, {bpm} BPM"),
    MusicGen(size="small", duration=8.0),
])

verse = chain.run(genre="lofi", mood="chill", bpm=80)
chorus = chain.run(genre="lofi", mood="uplifting", bpm=80)
song = concat([verse, chorus, verse], crossfade=0.5)
song.save("song.wav")
```

Chains validate modalities up front — a MIDI model feeding an audio effect fails at construction time, before any weights load. Compose chains with `|`: `Chain([...]) | MIDISynth()`.
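
The up-front check can be pictured with a small standalone sketch. `Step` and `SketchChain` below are hypothetical stand-ins rather than songchain classes, and raising `TypeError` is an assumption; the point is only that adjacent modalities are compared in the constructor, and that `|` builds a new chain that goes through the same check.

```python
class Step:
    """A pipeline stage that declares what it consumes and produces."""

    def __init__(self, name, input_modality, output_modality):
        self.name = name
        self.input_modality = input_modality
        self.output_modality = output_modality


class SketchChain:
    def __init__(self, steps):
        steps = list(steps)
        # Compare each adjacent pair before anything expensive happens.
        for prev, nxt in zip(steps, steps[1:]):
            if prev.output_modality != nxt.input_modality:
                raise TypeError(
                    f"{prev.name} outputs {prev.output_modality}, "
                    f"but {nxt.name} expects {nxt.input_modality}"
                )
        self.steps = steps

    def __or__(self, other):
        # `chain | step` and `chain | chain` both build a new, re-validated chain.
        extra = other.steps if isinstance(other, SketchChain) else [other]
        return SketchChain(self.steps + extra)


to_midi = Step("NoteSequence", "text", "midi")
synth = Step("MIDISynth", "midi", "audio")
fade = Step("Fade", "audio", "audio")

ok = SketchChain([to_midi]) | synth | fade

try:
    SketchChain([to_midi, fade])  # MIDI fed straight into an audio effect
    error = None
except TypeError as exc:
    error = str(exc)
```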

## Data: AudioClip and MIDIClip

Clips wrap a file (or in-memory data) and expose the representations that music models commonly consume, so models can be swapped without rewriting preprocessing:

```python
from songchain import AudioClip, MIDIClip

audio = AudioClip("song.wav")
audio.waveform          # (channels x frames) float32 tensor
audio.spectrogram       # STFT power spectrogram
audio.mel_spectrogram   # mel-scaled spectrogram
audio.encodec           # EnCodec discrete codes (requires [generation])
audio.to_mono().resample(32000).slice(0, 10).save("intro.wav")

midi = MIDIClip("melody.mid")
midi.pianoroll(type="binary")     # (128 x steps) — MidiNet, MuseGAN style
midi.pianoroll(type="velocity")   # velocity-aware — Music Transformer style
midi.notes(type="index")          # monophonic — MelodyRNN, MusicVAE style
midi.notes(type="note")           # note names — DeepBach style
midi.transpose(5).time_stretch(0.5).synthesize().save("variation.wav")
```
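
For readers unfamiliar with the binary piano-roll layout, the following standalone sketch builds a 128 x steps grid from a list of notes. It is not songchain's implementation: the `(pitch, start, end)` tuple format, the `steps_per_second` parameter, and the function name `binary_pianoroll` are assumptions chosen for illustration.

```python
def binary_pianoroll(notes, steps_per_second=4):
    """Build a 128 x steps grid of 0/1 from (pitch, start, end) notes."""
    total = max(end for _, _, end in notes)
    n_steps = round(total * steps_per_second)
    # One row per MIDI pitch (0..127), one column per time step.
    roll = [[0] * n_steps for _ in range(128)]
    for pitch, start, end in notes:
        first = round(start * steps_per_second)
        last = round(end * steps_per_second)
        for step in range(first, last):
            roll[pitch][step] = 1  # the note is sounding during this step
    return roll


# C4, E4, G4 played one after another over two seconds.
notes = [(60, 0.0, 0.5), (64, 0.5, 1.0), (67, 1.0, 2.0)]
roll = binary_pianoroll(notes)
```

A velocity-aware roll would store the note velocity in each cell instead of `1`.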

## Models

All models share one interface — declare input/output modality, implement `generate` — and register by name (`get_model("musicgen", duration=8.0)`).

| Model | Modality (input → output) | Backend | Extra |
|---|---|---|---|
| `NoteSequence` | text → MIDI | built-in | — |
| `Transpose`, `MIDITimeStretch` | MIDI → MIDI | built-in | — |
| `MIDISynth` | MIDI → audio | built-in | — |
| `Gain`, `Reverse`, `Fade` | audio → audio | built-in | — |
| `MusicGen` | text → audio | Meta audiocraft | `[generation]` |
| `AudioGen` | text → audio | Meta audiocraft | `[generation]` |
| `BasicPitch` | audio → MIDI | Spotify basic-pitch | `[transcription]` |
| `Pop2Piano` | audio → MIDI | HuggingFace | `[transcription]` |

Add your own by subclassing a base (e.g. `TextToAudioModel`) and decorating with `@register("name")`.
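
The register-by-name pattern can be sketched without songchain installed. The names `register`, `get_model`, and `TextToAudioModel` mirror the ones mentioned above, but everything in this block is a local stand-in written for illustration; the `Silence` model, its parameters, and the error behaviour are hypothetical.

```python
_REGISTRY = {}


def register(name):
    """Class decorator that stores the class under `name`."""
    def decorator(cls):
        if name in _REGISTRY:
            raise ValueError(f"model {name!r} is already registered")
        _REGISTRY[name] = cls
        return cls
    return decorator


def get_model(name, **kwargs):
    """Look up a registered class and instantiate it with `kwargs`."""
    try:
        cls = _REGISTRY[name]
    except KeyError:
        raise KeyError(f"unknown model {name!r}") from None
    return cls(**kwargs)


class TextToAudioModel:
    # Stand-in base class: declares modalities, leaves `generate` abstract.
    input_modality = "text"
    output_modality = "audio"

    def generate(self, prompt):
        raise NotImplementedError


@register("silence")
class Silence(TextToAudioModel):
    """Toy model that ignores the prompt and returns silent samples."""

    def __init__(self, duration=1.0, sample_rate=8000):
        self.duration = duration
        self.sample_rate = sample_rate

    def generate(self, prompt):
        return [0.0] * int(self.duration * self.sample_rate)


model = get_model("silence", duration=0.5)
samples = model.generate("anything")
```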

## Temporal composition

```python
from songchain import concat, overlay, loop, concat_midi

track = concat([intro, verse, chorus], crossfade=0.25)  # join in time
mix = overlay([drums, bass, melody])                    # layer tracks
beat = loop(bar, times=8)                               # repeat
medley = concat_midi([melody_a, melody_b])              # join MIDI
```
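
What these operations do to the underlying samples can be shown on plain Python lists. This is a sketch only: the function names, the linear crossfade curve, and the zero-padding in the overlay are assumptions, and songchain's own `concat`, `overlay`, and `loop` may behave differently in the details.

```python
def concat_samples(clips, crossfade=0.0, sample_rate=4):
    """Join clips in time, linearly cross-fading at each seam."""
    overlap = int(crossfade * sample_rate)
    out = list(clips[0])
    for clip in clips[1:]:
        n = min(overlap, len(out), len(clip))
        base = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming clip
            out[base + i] = out[base + i] * (1 - w) + clip[i] * w
        out.extend(clip[n:])       # the rest of the incoming clip follows
    return out


def overlay_samples(clips):
    """Layer clips by summing sample-wise; shorter clips count as silence."""
    length = max(len(c) for c in clips)
    return [sum(c[i] for c in clips if i < len(c)) for i in range(length)]


def loop_samples(clip, times):
    """Repeat a clip back to back."""
    return list(clip) * times


a = [1.0, 1.0, 1.0, 1.0]
b = [0.0, 0.0, 0.0, 0.0]
joined = concat_samples([a, b], crossfade=0.5, sample_rate=4)
mixed = overlay_samples([[1.0, 1.0, 1.0], [0.5]])
beat = loop_samples([1.0, 2.0], times=3)
```

With a crossfade, the joined clip is shorter than the sum of its parts, because the overlapping region is shared by both clips.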

## Dataset preparation

Split songs into fixed-size chunks with text descriptions — the layout audiocraft-style fine-tuning expects:

```python
from songchain.util import split_audio_and_save

split_audio_and_save(30, "my_songs/", "chill bollywood beats with vocals")
# -> my_songs/output/000_000.wav, 000_000.txt, ...
```
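
The chunking layout can be planned on paper before touching any audio. The sketch below is not `split_audio_and_save`; it assumes the first argument is the chunk length in seconds, that file stems follow a `<song index>_<chunk index>` pattern as the example output suggests, and that a trailing partial chunk is dropped. The function name `chunk_plan` is hypothetical.

```python
def chunk_plan(song_durations, chunk_seconds, description):
    """List (wav_name, txt_name, start, end, text) for every full chunk."""
    plan = []
    for song_idx, duration in enumerate(song_durations):
        # Assumption: only full-length chunks are kept.
        n_chunks = int(duration // chunk_seconds)
        for chunk_idx in range(n_chunks):
            stem = f"{song_idx:03d}_{chunk_idx:03d}"
            start = chunk_idx * chunk_seconds
            plan.append(
                (stem + ".wav", stem + ".txt",
                 start, start + chunk_seconds, description)
            )
    return plan


# Two songs of 95 s and 61.5 s, split into 30 s chunks.
plan = chunk_plan([95.0, 61.5], 30, "chill bollywood beats with vocals")
```

Each `.txt` file would hold the shared description, so every audio chunk is paired with a caption.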

## End-to-end example

`examples/make_song.py` composes a complete short song — a four-chord progression (C–Am–F–G) with an arpeggiated melody chain overlaid on a bass chain, arranged bar-by-bar, looped, and faded — and writes both the audio and the MIDI:

```bash
python examples/make_song.py
# wrote ./song.wav (15.7s) and ./song.mid
```

This exact workflow is verified by the integration tests in `tests/test_integration.py`, which run the example and assert the artifacts are real: audible audio at the right duration, and MIDI that reloads with the full arrangement and can itself be fed back into a chain.

## Development

```bash
pip install -e ".[dev]"
pytest --cov=songchain               # run tests with coverage
coverage-badge -o coverage.svg -f    # refresh the badge
black songchain tests && flake8      # format + lint
```

The test suite runs without any heavy model downloads — pretrained backends are exercised through fakes.
