Loader

Loader(unit=None, number_type=NumberType.float)

Abstract base class for loading music representations.

Loader provides a unified interface for loading one or more source files and aggregating their events into an EventData. Metadata about the sources is stored in the PyArrow table’s schema metadata for determinism.

Subclasses must implement:

- _load_source(): Parse a single source file into event rows.
- _default_unit: The default time unit for this loader type.

Attributes:
    events: The EventData containing all loaded events.
    sources: List of loaded source file paths.
    unit: The time unit for coordinates.
    number_type: The number type for coordinates.

Examples:

>>> # Subclass implementation
>>> class MidiLoader(Loader):
...     _default_unit = TimeUnit.ticks
...     _event_data_class = EventData
...
...     def _load_source(self, path):
...         # Parse MIDI file, return (metadata_dict, event_rows)
...         return {"format": "midi"}, [{"id": "n1", ...}]
>>>
>>> loader = MidiLoader()
>>> loader.load("piece.mid")
>>> print(loader.event_summary())
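The subclassing contract described above can be illustrated without the library itself. The following is a minimal sketch under stated assumptions: `SketchLoader` and `DictSketchLoader` are hypothetical stand-ins, the "files" are an in-memory dict, and events are kept in a plain list rather than an EventData. It is not the actual Loader implementation.

```python
from abc import ABC, abstractmethod


class SketchLoader(ABC):
    """Hypothetical stand-in for Loader, reduced to the subclass contract."""

    _default_unit = None  # subclasses are expected to override this

    def __init__(self, unit=None):
        # Fall back to the subclass default when no unit is passed.
        self.unit = unit if unit is not None else self._default_unit
        self.sources = []
        self.metadata = {}
        self.events = []  # assumption: a plain list instead of an EventData

    @abstractmethod
    def _load_source(self, source):
        """Parse one source and return (metadata_dict, event_rows)."""

    def load(self, *sources):
        for source in sources:
            meta, rows = self._load_source(source)
            self.sources.append(source)
            self.metadata[source] = meta  # metadata is kept per source
            self.events.extend(rows)      # events are aggregated
        return self  # allows method chaining


class DictSketchLoader(SketchLoader):
    """Hypothetical subclass that 'parses' in-memory data instead of files."""

    _default_unit = "seconds"
    FAKE_FILES = {
        "a.txt": [("n1", 0.0), ("n2", 0.5)],
        "b.txt": [("n3", 1.0)],
    }

    def _load_source(self, source):
        if source not in self.FAKE_FILES:
            raise FileNotFoundError(source)
        rows = [{"id": i, "start": s} for i, s in self.FAKE_FILES[source]]
        return {"format": "fake"}, rows


loader = DictSketchLoader().load("a.txt", "b.txt")
print(loader.unit, len(loader.events), loader.sources)
```

The base class owns aggregation and bookkeeping, so a subclass only has to supply the parsing step and a default unit.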

Attributes

Name Description
events The EventData containing all loaded events.
metadata Aggregated metadata from all sources.
number_type The number type for coordinates.
sources List of loaded source file paths.
store Return an EventStore wrapping the loader’s events.
unit The time unit for coordinates.

Methods

Name Description
clear Clear all loaded sources and events.
count_events_by_temporal_type Count events grouped by temporal_type (instant/interval).
count_events_by_type Count events grouped by event_type.
create_bundle Create an AlignmentBundle. Override in subclasses that support bundles.
create_cmap Create a ConversionMap from two coordinate columns.
create_group Create a TimelineGroup. Override in subclasses that support groups.
create_timeline Create a Timeline from the loaded events.
create_timelines Create all timelines, optionally filtered by regex pattern.
event_summary Get a summary of loaded events.
from_file Load one or more files and return the loader (convenience constructor).
from_parquet Load a Loader from a Parquet file.
load Load one or more source files.
to_parquet Save the loaded events to a Parquet file.

clear

Loader.clear()

Clear all loaded sources and events.

count_events_by_temporal_type

Loader.count_events_by_temporal_type()

Count events grouped by temporal_type (instant/interval).

Returns: Dict mapping “instant”/“interval” to counts.

count_events_by_type

Loader.count_events_by_type()

Count events grouped by event_type.

Returns: Dict mapping event type names to counts.
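As a rough illustration of the shape of the dicts returned by the two counting methods, the sketch below groups plain row dicts with `collections.Counter`. The rows and the event type names ("Note", "Barline") are made up for the example, and the real methods operate on the loader's EventData rather than on a list.

```python
from collections import Counter

# Hypothetical event rows; the column names follow the descriptions above.
rows = [
    {"event_type": "Note", "temporal_type": "interval"},
    {"event_type": "Note", "temporal_type": "interval"},
    {"event_type": "Barline", "temporal_type": "instant"},
]

# Group by a column value and count occurrences.
by_type = dict(Counter(row["event_type"] for row in rows))
by_temporal_type = dict(Counter(row["temporal_type"] for row in rows))

print(by_type)
print(by_temporal_type)
```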

create_bundle

Loader.create_bundle(**kwargs)

Create an AlignmentBundle. Override in subclasses that support bundles.

Returns: An AlignmentBundle, or None if the loader does not produce bundles.

create_cmap

Loader.create_cmap(source_column, target_column, *, map_type=None, **kwargs)

Create a ConversionMap from two coordinate columns.

This method builds the ConversionMap (C-Map) from loaded coordinate data, enabling conversion between different coordinate systems (e.g., seconds to pixels).

Both columns must contain coordinate data (either core coordinates like ‘start’/‘end’, or CoordinateField extra columns).

Args:
    source_column: Name of the source coordinate column.
    target_column: Name of the target coordinate column.
    map_type: The map class to create. Defaults to TableMap. Supported: TableMap, LinearMap, ScalarMap.
    **kwargs: Additional arguments passed to the map constructor.
        For TableMap: kind, extrapolate.
        For LinearMap: none (computed automatically from data).

Returns: A ConversionMap instance.

Raises:
    ValueError: If either column doesn’t exist or isn’t a coordinate column.
    ValueError: If there are insufficient data points for the map type.

Examples:

>>> # Load data with dual coordinates
>>> loader.load("data.tsv")

>>> # Create TableMap (default) from start -> x_pixels
>>> cmap = loader.create_cmap("start", "x_pixels")

>>> # Create LinearMap (fits y = ax + b to data)
>>> cmap = loader.create_cmap("start", "x_pixels", map_type=LinearMap)

>>> # TableMap with custom interpolation
>>> cmap = loader.create_cmap("start", "x_pixels", kind="cubic")
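To make the difference between a table-based and a linear map concrete, here is a standard-library sketch of the two underlying ideas: a least-squares fit of y = ax + b, and a piecewise-linear table lookup. The functions `fit_line` and `table_lookup` and the sample data are hypothetical; they are not the TableMap or LinearMap classes, and options such as `kind="cubic"` or `extrapolate` are not reproduced.

```python
from bisect import bisect_right


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    if n < 2:
        raise ValueError("a line needs at least two data points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("source values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def table_lookup(xs, ys, x):
    """Piecewise-linear lookup; xs must be strictly increasing."""
    if len(xs) < 2:
        raise ValueError("a table needs at least two data points")
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the table (no extrapolation here)")
    # Index of the right-hand neighbour, clamped so x == xs[-1] works.
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical paired columns, e.g. "start" in seconds and "x_pixels".
start = [0.0, 1.0, 2.0, 4.0]
x_pixels = [10.0, 60.0, 110.0, 210.0]

a, b = fit_line(start, x_pixels)
linear_at_3 = a * 3.0 + b
table_at_3 = table_lookup(start, x_pixels, 3.0)
print(a, b, linear_at_3, table_at_3)
```

Because the sample data lie on a straight line, both approaches agree here. With irregular data, a table follows every point while a fitted line smooths over them.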

create_group

Loader.create_group(**kwargs)

Create a TimelineGroup. Override in subclasses that support groups.

Returns: A TimelineGroup, or None if the loader does not produce groups.

create_timeline

Loader.create_timeline(
    uid=None,
    store_filters=None,
    include_stores=None,
    exclude_stores=None,
    flatten=False,
)

Create a Timeline from the loaded events.

Convenience method that delegates to self.store.create_timeline().

Args:
    uid: Unique ID for the parent timeline. Auto-generated if None.
    store_filters: Per-store filter kwargs to apply before timeline creation. Example: {"notes": {"event_type": "Note"}}.
    include_stores: Only include these stores (default: all non-empty).
    exclude_stores: Exclude these stores from the timeline.
    flatten: If True, merge all events into a single parent timeline.

Returns: A Timeline containing the loaded events.

Examples:

>>> loader = Ms3Loader()
>>> loader.load("notes.tsv")
>>> timeline = loader.create_timeline(uid="my_score")
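One plausible reading of how `store_filters`, `include_stores`, and `exclude_stores` interact is sketched below on plain dicts. Everything here is an assumption made for illustration: the helper `select_rows` is hypothetical, filters are treated as column-equals-value tests, and empty results are dropped in all cases. The library's actual filtering semantics may differ.

```python
def select_rows(stores, store_filters=None, include_stores=None, exclude_stores=None):
    """Hypothetical helper: pick stores, then filter their rows by equality."""
    store_filters = store_filters or {}
    selected = {}
    for name, rows in stores.items():
        if include_stores is not None and name not in include_stores:
            continue
        if exclude_stores is not None and name in exclude_stores:
            continue
        criteria = store_filters.get(name, {})
        # Assumption: each filter kwarg means "column == value".
        kept = [
            row for row in rows
            if all(row.get(col) == val for col, val in criteria.items())
        ]
        if kept:  # assumption: stores left empty are skipped
            selected[name] = kept
    return selected


# Made-up stores and event types for the example.
stores = {
    "notes": [
        {"id": "n1", "event_type": "Note"},
        {"id": "r1", "event_type": "Rest"},
    ],
    "harmonies": [{"id": "h1", "event_type": "Harmony"}],
    "empty": [],
}

selected = select_rows(
    stores,
    store_filters={"notes": {"event_type": "Note"}},
    exclude_stores=["harmonies"],
)
print(selected)
```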

create_timelines

Loader.create_timelines(id_pattern=None)

Create all timelines, optionally filtered by regex pattern.

The default implementation returns a single-element list containing the result of create_timeline(). Subclasses with multi-timeline output (e.g., TiliaJsonLoader, MatchfileLoader) override this.

Args: id_pattern: Optional regex pattern to filter timeline IDs.

Returns: List of Timeline objects.
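Filtering IDs by a regex pattern can be sketched as follows. The helper `filter_ids` and the timeline IDs are hypothetical, and the use of `re.search` (match anywhere in the ID) is an assumption; the library may anchor the pattern differently, for example with `re.match` or `re.fullmatch`.

```python
import re


def filter_ids(timeline_ids, id_pattern=None):
    """Hypothetical helper: keep IDs matching id_pattern (all if None)."""
    if id_pattern is None:
        return list(timeline_ids)
    regex = re.compile(id_pattern)
    # Assumption: a match anywhere in the ID counts.
    return [tid for tid in timeline_ids if regex.search(tid)]


ids = ["score/notes", "score/harmonies", "audio/beats"]
score_ids = filter_ids(ids, r"^score/")
print(score_ids)
```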

event_summary

Loader.event_summary()

Get a summary of loaded events.

Returns: Dict with event counts, types, coordinate range, etc.

from_file

Loader.from_file(*paths, **kwargs)

Load one or more files and return the loader (convenience constructor).

This combines instantiation and loading into a single call.

Args:
    *paths: Paths to source files.
    **kwargs: Additional keyword arguments passed to __init__.

Returns: A new Loader instance with the files already loaded.

Examples:

>>> loader = Ms3Loader.from_file("notes.tsv")
>>> len(loader.events)
42
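The "instantiate, then load" pattern behind a convenience constructor like this one is commonly written as a classmethod. The sketch below shows that pattern with hypothetical classes (`SketchLoader`, `TsvSketchLoader`) whose `load` only records paths and does not read any file. It illustrates the routing of `*paths` and `**kwargs`, not the library's code.

```python
class SketchLoader:
    """Hypothetical stand-in; load() only records the paths."""

    def __init__(self, unit=None):
        self.unit = unit
        self.sources = []

    def load(self, *sources):
        self.sources.extend(sources)
        return self

    @classmethod
    def from_file(cls, *paths, **kwargs):
        # kwargs go to __init__, paths go to load().
        return cls(**kwargs).load(*paths)


class TsvSketchLoader(SketchLoader):
    pass


loader = TsvSketchLoader.from_file("notes.tsv", unit="quarters")
print(type(loader).__name__, loader.sources, loader.unit)
```

Because `from_file` uses `cls`, calling it on a subclass yields an instance of that subclass.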

from_parquet

Loader.from_parquet(path)

Load a Loader from a Parquet file.

Note: This creates a new Loader with the EventData loaded, but source paths may not be accessible for re-loading.

Args: path: Path to the Parquet file.

Returns: A new Loader with events loaded from the file.

load

Loader.load(*sources)

Load one or more source files.

Events from all sources are aggregated into the EventData. Metadata for each source is recorded separately.

Supports both vectorized (column dict) and legacy (row dicts) modes:

- Vectorized: _load_source returns dict[str, np.ndarray | pa.Array]
- Legacy: _load_source returns list[dict[str, Any]]

Args: *sources: Paths to source files.

Returns: Self, for method chaining.

Raises:
    FileNotFoundError: If any source doesn’t exist.
    ValueError: If any source is invalid.
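The two return shapes accepted from `_load_source` carry the same information in different layouts. The sketch below converts both into a common column layout. The helper names are hypothetical, plain lists stand in for `np.ndarray` and `pa.Array` columns, and filling missing keys with None is an assumption.

```python
def rows_to_columns(rows):
    """Turn row dicts into a dict of column lists (missing values -> None)."""
    names = []
    for row in rows:
        for key in row:
            if key not in names:
                names.append(key)  # keep first-seen column order
    return {name: [row.get(name) for row in rows] for name in names}


def normalize(result):
    """Accept either layout and return a dict of column lists."""
    if isinstance(result, dict):  # vectorized: already column-oriented
        return {name: list(values) for name, values in result.items()}
    return rows_to_columns(result)  # legacy: list of row dicts


legacy = [
    {"id": "n1", "start": 0.0, "end": 0.5},
    {"id": "n2", "start": 0.5, "end": 1.0},
]
vectorized = {
    "id": ["n1", "n2"],
    "start": [0.0, 0.5],
    "end": [0.5, 1.0],
}

print(normalize(legacy) == normalize(vectorized))
```

A column layout avoids building one dict per event, which is typically the reason a vectorized path exists alongside a row-based one.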

to_parquet

Loader.to_parquet(path)

Save the loaded events to a Parquet file.

The metadata (including source info) is preserved in the file.

Args: path: Path to write the Parquet file.