Metadata-Version: 2.3
Name: fiddledyn
Version: 0.0.1
Summary: A dynamic config loader and serializer based on Fiddle.
Author: acecchini
Author-email: acecchini <ale.cecchini.valette@gmail.com>
Requires-Dist: fiddle>=0.3.0,<0.4.0
Requires-Dist: nemo-run>=0.8.1,<0.9.0 ; extra == 'nemo'
Requires-Python: >=3.12, <3.13
Provides-Extra: nemo
Description-Content-Type: text/markdown

# FiddleDyn

**Structure-Aware Configuration for Fiddle and NeMo Run**

FiddleDyn extends [Fiddle](https://github.com/google/fiddle) and [NeMo Run](https://github.com/NVIDIA/nemo_run) with CLI overrides, global DAG references, and round-trip serialization.

## Installation

```bash
# Core installation (Fiddle backend only)
pip install fiddledyn

# With NeMo Run support (recommended)
pip install fiddledyn[nemo]
```

> **Note**: `nemo_run` is an optional dependency. Without it, only the Fiddle backend (`Backend.FIDDLE`) is available. The NEMO backend requires `pip install fiddledyn[nemo]`.

---

## Quick Start

```python
# main.py
import fiddledyn as dyn
import fiddle as fdl

config = dyn.parse_cli()
trainer = fdl.build(config)
```

```bash
python main.py -f config.yaml model.lr=0.001 optimizer=@adam.yaml
```

---

## Features

| Feature | Description |
|---------|-------------|
| **CLI Overrides** | Deep nested values via dot notation (`model.encoder.dim=1024`) |
| **DAG References** | Share instances across files with `_id_`/`_ref_` |
| **Partial Configs** | Deferred instantiation with `_partial_: true` |
| **Callable References** | Pass any callable (class, function, method) using `_call_: false` |
| **File Overrides** | Replace branches with `key=@file.yaml` |
| **Positional Args** | Support for `*args` via `_args_` |
| **Include Defaults** | Serialize configs with parameter defaults |
| **Shallow/Deep Defaults** | Control recursive expansion of callable defaults |
| **Round-Trip Safe** | Full serialization preserving all metadata |
| **Backend Agnostic** | Works with both `nemo_run` and `fiddle` |

---

## YAML Syntax

FiddleDyn uses special keys wrapped in underscores (for example `_target_`) to control configuration behavior:

| Key | Type | Description |
|-----|------|-------------|
| `_target_` | `str` | Dotted path to the class/function to instantiate |
| `_partial_` | `bool` | If `true`, creates a `Partial` instead of a `Config` |
| `_call_` | `bool` | If `false`, returns the raw class/function reference |
| `_id_` | `str` | Registers the object in the global registry |
| `_ref_` | `str` | References a registered object by its ID |
| `_args_` | `list` | Positional arguments to pass to the target |

---

## Detailed Feature Guide

### 1. Basic Configuration (`_target_`)

Define Python objects in YAML using `_target_`:

```yaml
# model.yaml
_target_: mylib.Model
hidden_size: 512
dropout: 0.1
```

```python
import fiddledyn as dyn
import fiddle as fdl

config = dyn.load_yaml("model.yaml")
model = fdl.build(config)  # Creates Model(hidden_size=512, dropout=0.1)
```
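
Resolving a `_target_` string comes down to importing a module and walking attributes. The block below is a minimal standard-library sketch of that idea. `resolve_dotted_path` is a hypothetical helper written for illustration; it is not FiddleDyn's implementation and may differ from how `dyn.resolve_target` behaves.

```python
import importlib


def resolve_dotted_path(path: str):
    """Resolve a dotted path such as 'collections.OrderedDict' to an object.

    Hypothetical helper: tries the longest importable module prefix,
    then walks the remaining names as attributes.
    """
    parts = path.split(".")
    for i in range(len(parts), 0, -1):
        module_name = ".".join(parts[:i])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            continue  # Not a module; try a shorter prefix.
        for attr in parts[i:]:
            obj = getattr(obj, attr)
        return obj
    raise ImportError(f"Cannot resolve {path!r}")


ordered_dict_cls = resolve_dotted_path("collections.OrderedDict")
join_fn = resolve_dotted_path("os.path.join")
```

Trying the longest prefix first lets the same function handle both `package.module.Class` and `module.Class.method` style paths.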

### 2. Nested Configurations

Configurations can be arbitrarily nested:

```yaml
# trainer.yaml
_target_: mylib.Trainer
model:
  _target_: mylib.Model
  encoder:
    _target_: mylib.Encoder
    vocab_size: 50000
    hidden_dim: 768
  dropout: 0.1
optimizer:
  _target_: torch.optim.Adam
  lr: 0.001
max_epochs: 100
```
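
To make the nesting semantics concrete, here is a rough sketch of how a nested `_target_` dictionary can be turned into objects bottom-up. It is an illustration under simplifying assumptions: it instantiates objects directly and uses standard-library targets, whereas FiddleDyn produces `Config` objects that are built later with `fdl.build`. The `build` function here is hypothetical.

```python
import importlib


def build(node):
    """Recursively instantiate a nested dict of `_target_` entries (illustrative only)."""
    if isinstance(node, list):
        return [build(item) for item in node]
    if not isinstance(node, dict):
        return node  # Scalars pass through unchanged.
    # Build children first so inner objects exist before their parent is called.
    kwargs = {key: build(value) for key, value in node.items() if key != "_target_"}
    if "_target_" not in node:
        return kwargs
    module_name, _, attr = node["_target_"].rpartition(".")
    target = getattr(importlib.import_module(module_name), attr)
    return target(**kwargs)


spec = {
    "_target_": "datetime.datetime",
    "year": 2024,
    "month": 1,
    "day": 2,
    "tzinfo": {
        "_target_": "datetime.timezone",
        "offset": {"_target_": "datetime.timedelta", "hours": 2},
    },
}
stamp = build(spec)  # timedelta is built first, then timezone, then datetime
```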

### 3. Partial Configurations (`_partial_: true`)

Create partial function applications that can be called later with additional arguments:

```yaml
# optimizer.yaml
_target_: torch.optim.Adam
_partial_: true
lr: 0.001
weight_decay: 0.01
```

```python
config = dyn.load_yaml("optimizer.yaml")
# config is a Partial - missing the 'params' argument

# Later, complete the partial (`model` is an existing torch.nn.Module)
optimizer = fdl.build(config)(params=model.parameters())
```
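
The deferred-call pattern is the same one `functools.partial` provides in the standard library. The sketch below shows it without any configuration layer; `make_optimizer` is a hypothetical stand-in for an optimizer constructor, used only so the example runs without PyTorch.

```python
import functools


def make_optimizer(params, lr=0.01, weight_decay=0.0):
    """Stand-in for an optimizer constructor (hypothetical)."""
    return {"params": list(params), "lr": lr, "weight_decay": weight_decay}


# Bind the known keyword arguments now; `params` is still missing.
optimizer_factory = functools.partial(make_optimizer, lr=0.001, weight_decay=0.01)

# Later, supply the remaining argument to obtain the final object.
optimizer = optimizer_factory(params=[1.0, 2.0])
```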

### 4. Callable References (`_call_: false`)

Pass raw classes or functions instead of instantiating them:

```yaml
# factory.yaml
_target_: mylib.OptimizerFactory
optimizer_cls:
  _target_: torch.optim.Adam
  _call_: false  # Returns the Adam class, not an instance
scheduler_cls:
  _target_: torch.optim.lr_scheduler.CosineAnnealingLR
  _call_: false
```

```python
config = dyn.load_yaml("factory.yaml")
factory = fdl.build(config)
# factory.optimizer_cls is torch.optim.Adam (the class itself)
# factory.scheduler_cls is CosineAnnealingLR (the class itself)
```

### 5. DAG References (`_id_` and `_ref_`)

Share object instances across your configuration using `_id_` and `_ref_`:

```yaml
# config.yaml
shared_encoder:
  _target_: mylib.Encoder
  _id_: enc  # Register with ID "enc"
  vocab_size: 50000

model1:
  _target_: mylib.Model
  encoder:
    _ref_: enc  # Reference the shared encoder

model2:
  _target_: mylib.Model
  encoder:
    _ref_: enc  # Same encoder instance!
```

```python
ctx = dyn.ParserContext()
config = dyn.load_yaml("config.yaml", ctx)
dyn.resolve_placeholders(config, ctx.registry)

# Both models share the exact same encoder object
assert config["model1"].encoder is config["model2"].encoder
```

**Cross-File References**: References also resolve across files, provided the files are loaded with the same `ParserContext`:

```yaml
# backbone.yaml
_target_: mylib.Backbone
_id_: backbone
dim: 512
```

```yaml
# heads.yaml
classifier:
  _target_: mylib.Classifier
  backbone:
    _ref_: backbone
detector:
  _target_: mylib.Detector
  backbone:
    _ref_: backbone
```

```python
ctx = dyn.ParserContext()
backbone = dyn.load_yaml("backbone.yaml", ctx)
heads = dyn.load_yaml("heads.yaml", ctx)
dyn.resolve_placeholders(heads, ctx.registry)

# Both heads share the same backbone
assert heads["classifier"].backbone is heads["detector"].backbone
```
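
The mechanism can be pictured as two passes over the data: one that records every `_id_` in a registry, and one that swaps each `_ref_` placeholder for the registered object. The following sketch works on plain dictionaries and is only an approximation of the idea; `collect_ids` and `resolve_refs` are hypothetical names, not FiddleDyn functions.

```python
def collect_ids(node, registry):
    """Register every dict carrying an `_id_` key under that ID."""
    if isinstance(node, dict):
        if "_id_" in node:
            registry[node["_id_"]] = node
        for value in node.values():
            collect_ids(value, registry)
    elif isinstance(node, list):
        for item in node:
            collect_ids(item, registry)


def resolve_refs(node, registry):
    """Replace each `{'_ref_': id}` placeholder with the registered object."""
    if isinstance(node, dict):
        if set(node) == {"_ref_"}:
            return registry[node["_ref_"]]
        for key, value in node.items():
            node[key] = resolve_refs(value, registry)
    elif isinstance(node, list):
        node[:] = [resolve_refs(item, registry) for item in node]
    return node


registry = {}
backbone = {"_target_": "mylib.Backbone", "_id_": "backbone", "dim": 512}
heads = {
    "classifier": {"_target_": "mylib.Classifier", "backbone": {"_ref_": "backbone"}},
    "detector": {"_target_": "mylib.Detector", "backbone": {"_ref_": "backbone"}},
}
collect_ids(backbone, registry)  # plays the role of loading backbone.yaml
collect_ids(heads, registry)     # plays the role of loading heads.yaml
resolve_refs(heads, registry)
```

Because both placeholders are replaced by the same registered dictionary, the result is a DAG rather than a tree with two copies.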

### 6. Positional Arguments (`_args_`)

Use `_args_` for targets that require positional arguments:

```yaml
# layer.yaml
_target_: mylib.create_layer
_args_: [64, 128]  # Positional args
bias: true          # Keyword arg
```

Equivalent to: `create_layer(64, 128, bias=True)`
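
As a sketch of that equivalence, the snippet below splits a specification into positional and keyword parts before calling the target. Both `call_target` and the local `create_layer` are hypothetical stand-ins introduced for illustration.

```python
def call_target(target, spec):
    """Call `target` with `_args_` as positionals and the other keys as keywords."""
    args = spec.get("_args_", [])
    # Skip control keys such as `_target_` or `_args_`.
    kwargs = {
        key: value
        for key, value in spec.items()
        if not (key.startswith("_") and key.endswith("_"))
    }
    return target(*args, **kwargs)


def create_layer(in_features, out_features, bias=False):
    """Hypothetical stand-in for `mylib.create_layer`."""
    return (in_features, out_features, bias)


layer = call_target(create_layer, {"_args_": [64, 128], "bias": True})
```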

### 7. Callable Serialization

Any callable (class, function, built-in, method) passed as an argument value in a Config is serialized with `_target_` and `_call_: false`:

```python
from math import sqrt

import fiddle as fdl
import fiddledyn as dyn

class Model:
    def __init__(self, activation=sqrt):
        ...

config = fdl.Config(Model, activation=sqrt)
dyn.config_to_dict(config)
# {"_target_": "mylib.Model", "activation": {"_target_": "math.sqrt", "_call_": False}}
```

This works for all callable types:
- Classes: `cls=MyClass` → `{"_target_": "mylib.MyClass", "_call_": False}`
- Functions: `fn=my_func` → `{"_target_": "mylib.my_func", "_call_": False}`
- Built-ins: `fn=sqrt` → `{"_target_": "math.sqrt", "_call_": False}`
- Methods: `fn=obj.method` → `{"_target_": "mylib.MyClass.method", "_call_": False}`
- C-extension functions: `fn=torch.abs` → `{"_target_": "torch.abs", "_call_": False}`

**C-Extension Support**: Functions from packages like `torch` that originate from internal C++ classes are correctly handled. The serializer uses the public module path rather than internal qualified names (e.g. `torch.abs` instead of `torch._C._VariableFunctions.abs`).
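
For ordinary Python callables, a target name can be derived from `__module__` and `__qualname__`. The sketch below shows that naive approach; it is an assumption about the general technique, not FiddleDyn's serializer, and it does not include the public-path handling for C extensions described above. `qualified_name` and `callable_to_dict` are hypothetical names.

```python
import collections
import math


def qualified_name(obj) -> str:
    """Return 'module.qualname' for a class, function, or built-in."""
    return f"{obj.__module__}.{obj.__qualname__}"


def callable_to_dict(obj) -> dict:
    """Serialize a callable reference in the `_target_` / `_call_` form."""
    return {"_target_": qualified_name(obj), "_call_": False}


sqrt_entry = callable_to_dict(math.sqrt)
ordered_dict_name = qualified_name(collections.OrderedDict)
```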

### 8. Include Defaults (`include_defaults`)

Pass `include_defaults=True` to serialize every parameter, including those left at their signature defaults:

```python
class Encoder:
    def __init__(self, vocab_size: int, hidden_dim: int = 256, num_layers: int = 4):
        ...

config = fdl.Config(Encoder, vocab_size=50000)

# Without defaults - only explicit values
dyn.config_to_dict(config)
# {"_target_": "...", "vocab_size": 50000}

# With defaults - all values
dyn.config_to_dict(config, include_defaults=True)
# {"_target_": "...", "vocab_size": 50000, "hidden_dim": 256, "num_layers": 4}
```
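
Signature defaults can be read with `inspect.signature` from the standard library. The following sketch shows how explicit values and declared defaults might be merged. It illustrates the idea only and makes no claim about how `config_to_dict` is implemented; `parameter_defaults` is a hypothetical helper.

```python
import inspect


def parameter_defaults(fn) -> dict:
    """Map each parameter of `fn` that declares a default to that default."""
    return {
        name: param.default
        for name, param in inspect.signature(fn).parameters.items()
        if param.default is not inspect.Parameter.empty
    }


class Encoder:
    def __init__(self, vocab_size: int, hidden_dim: int = 256, num_layers: int = 4):
        ...


explicit = {"vocab_size": 50000}
# Explicit values take precedence over declared defaults.
merged = {**parameter_defaults(Encoder), **explicit}
```

Parameters without a default, such as `vocab_size`, appear in the result only when they are set explicitly.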

### 9. Shallow vs Deep Defaults (`deep_defaults`)

Control how callable defaults are expanded using `deep_defaults`:

```python
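# Variant of Encoder in which every parameter declares a default
class Encoder:
    def __init__(self, vocab_size: int = 32000, hidden_dim: int = 256, num_layers: int = 4):
        ...

class Factory:
    def __init__(self, cls: type = Encoder, name: str = "default"):
        ...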

config = fdl.Config(Factory)

# Deep (default): Callable defaults include THEIR parameter defaults
dyn.config_to_dict(config, include_defaults=True, deep_defaults=True)
# {
#   "_target_": "Factory",
#   "cls": {
#     "_target_": "Encoder",
#     "_call_": False,
#     "vocab_size": 32000,   # Encoder's defaults included
#     "hidden_dim": 256,
#     "num_layers": 4
#   },
#   "name": "default"
# }

# Shallow: Callable defaults are NOT expanded
dyn.config_to_dict(config, include_defaults=True, deep_defaults=False)
# {
#   "_target_": "Factory",
#   "cls": {"_target_": "Encoder", "_call_": False},  # No Encoder defaults
#   "name": "default"
# }
```

---

## CLI Usage

### With Factory File (`-f`)

```bash
# Load base config
python main.py -f config.yaml

# Override scalar values
python main.py -f config.yaml model.lr=0.001 epochs=100

# Override nested values with dot notation
python main.py -f config.yaml model.encoder.vocab_size=100000

# Override with file content
python main.py -f config.yaml optimizer=@adamw.yaml

# Multiple overrides
python main.py -f config.yaml model=@large_model.yaml optimizer=@adamw.yaml epochs=200
```

### Without Factory File (Direct `@` Syntax)

Build configurations entirely from the CLI, without a base file:

```bash
# Load each top-level branch from its own file
python main.py model=@model.yaml optimizer=@optimizer.yaml

# Mix file loading with overrides
python main.py model=@model.yaml model.dropout=0.2 optimizer.lr=0.001

# Inline YAML values
python main.py model="{_target_: mylib.Model, hidden_size: 512}"
```

### List Index Overrides

```bash
# Override specific list elements
python main.py -f config.yaml callbacks.0.patience=10 callbacks.1.save_path=/new/path
```
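
The override syntax can be read as "walk the dotted path, treating numeric segments as list indices, then assign". Here is a small sketch of that reading on plain dictionaries and lists. It is an assumption-laden illustration: `apply_override` and `parse_value` are hypothetical, values are parsed as JSON as a standard-library stand-in for YAML, and `@file` overrides are not handled.

```python
import json


def parse_value(text: str):
    """Parse a CLI value as JSON when possible, else keep the raw string."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def apply_override(config, assignment: str) -> None:
    """Apply one `dotted.path=value` assignment to nested dicts and lists."""
    path, _, raw = assignment.partition("=")
    *parents, last = path.split(".")
    node = config
    for key in parents:
        # Numeric segments index into lists; everything else is a dict key.
        node = node[int(key)] if isinstance(node, list) else node[key]
    if isinstance(node, list):
        node[int(last)] = parse_value(raw)
    else:
        node[last] = parse_value(raw)


config = {
    "model": {"encoder": {"dim": 512}},
    "callbacks": [{"patience": 5}, {"save_path": "checkpoints/"}],
}
for arg in ["model.encoder.dim=1024", "callbacks.0.patience=10", "callbacks.1.save_path=/new/path"]:
    apply_override(config, arg)
```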

---

## API Reference

### CLI

```python
config = dyn.parse_cli()                    # Load from CLI args (NEMO backend)
config = dyn.parse_cli(backend="fiddle")    # Use Fiddle backend
data = dyn.parse_cli(as_dict=True)          # Return raw dict (no Config objects)
```

### I/O

```python
# Load YAML
config = dyn.load_yaml("config.yaml")                      # Auto-create context
config = dyn.load_yaml("config.yaml", ctx)                 # With explicit context
data = dyn.load_yaml("config.yaml", as_dict=True)          # Return raw dict

# Save YAML
dyn.dump_yaml(config, "output.yaml")                       # Write to file
yaml_str = dyn.dump_yaml(config)                           # Return string
dyn.dump_yaml(config, "full.yaml", include_defaults=True)  # Include defaults (deep)
dyn.dump_yaml(config, "shallow.yaml", include_defaults=True, deep_defaults=False)  # Shallow
```

### Parsing

```python
# Dictionary to Config
config = dyn.dict_to_config(data)                          # Fiddle backend (default)
config = dyn.dict_to_config(data, backend=dyn.Backend.NEMO)
config = dyn.dict_to_config(data, ctx=ctx)                 # With shared context

# Class-based API
parser = dyn.ConfigParser(backend=dyn.Backend.FIDDLE)
config = parser.parse(data, ctx)
```

### Serialization

```python
# Config to dictionary
data = dyn.config_to_dict(config)
data = dyn.config_to_dict(config, include_defaults=True)              # Include defaults (deep)
data = dyn.config_to_dict(config, include_defaults=True, deep_defaults=False)  # Shallow defaults

# Class-based API
serializer = dyn.ConfigSerializer(include_defaults=True, deep_defaults=True)
data = serializer.serialize(config)
yaml_str = serializer.to_yaml(config)
```

### Reference Resolution

```python
# Resolve DeferredReference placeholders
ctx = dyn.ParserContext()
config1 = dyn.load_yaml("file1.yaml", ctx)
config2 = dyn.load_yaml("file2.yaml", ctx)
dyn.resolve_placeholders(config2, ctx.registry)

# Class-based API
resolver = dyn.ReferenceResolver()
resolver.resolve(config2, ctx.registry)
```

### Utilities

```python
# Resolve string path to Python object
cls = dyn.resolve_target("torch.optim.Adam")

# Get qualified name of callable
name = dyn.get_target_name(torch.optim.Adam)
# "torch.optim.adam.Adam"
```

---

## Backend Selection

FiddleDyn supports two backends:

| Backend | Config Type | Partial Type | Requirement |
|---------|-------------|--------------|-------------|
| `fiddle` (default) | `fiddle.Config` | `fiddle.Partial` | (included) |
| `nemo` | `nemo_run.Config` | `nemo_run.Partial` | `pip install fiddledyn[nemo]` |

```python
# Check if NeMo Run is available
import fiddledyn as dyn
from fiddledyn import HAS_NEMO_RUN
if HAS_NEMO_RUN:
    config = dyn.parse_cli(backend="nemo")
else:
    config = dyn.parse_cli(backend="fiddle")

# With ParserContext
ctx = dyn.ParserContext(backend=dyn.Backend.FIDDLE)
config = dyn.load_yaml("config.yaml", ctx)
```
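
A flag like `HAS_NEMO_RUN` is typically derived by checking whether the optional package can be located. The sketch below shows one common way to do that with `importlib.util.find_spec`; whether FiddleDyn computes its flag this way is an assumption, and `is_installed` is a hypothetical helper.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Choose a backend name based on what is installed.
backend = "nemo" if is_installed("nemo_run") else "fiddle"
```

Checking for the module spec avoids paying the import cost of the optional package when only its availability matters.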

---

## Example: Complete Training Configuration

```yaml
# train_config.yaml
shared_encoder:
  _target_: mylib.Encoder
  _id_: encoder
  vocab_size: 50000
  hidden_dim: 768
  num_layers: 12

model:
  _target_: mylib.TransformerModel
  _id_: model
  encoder:
    _ref_: encoder
  decoder:
    _target_: mylib.Decoder
    hidden_dim: 768

optimizer:
  _target_: torch.optim.AdamW
  _id_: optimizer
  _partial_: true
  lr: 0.0001
  weight_decay: 0.01

trainer:
  _target_: mylib.Trainer
  model:
    _ref_: model
  optimizer:
    _ref_: optimizer
  callbacks:
    - _target_: mylib.EarlyStopping
      patience: 5
    - _target_: mylib.ModelCheckpoint
      save_path: checkpoints/
```

```bash
python train.py \
  -f train_config.yaml \
  shared_encoder.vocab_size=100000 \
  optimizer.lr=0.00005 \
  trainer.callbacks.0.patience=10
```

---

## Architecture

```
src/fiddledyn/
├── core/           # Backend, ParserContext, DeferredReference
├── parsing/        # dict_to_config, ConfigParser
├── serialization/  # config_to_dict, ConfigSerializer
├── resolution/     # resolve_placeholders, ReferenceResolver
├── cli.py          # parse_cli
├── io.py           # load_yaml, dump_yaml
└── utils.py        # resolve_target, get_target_name
```

---

## License

MIT
