Metadata-Version: 2.4
Name: typespecs
Version: 8.0.0
Summary: Data specifications via type hints
Project-URL: homepage, https://astropenguin.github.io/typespecs
Project-URL: repository, https://github.com/astropenguin/typespecs
Author-email: Akio Taniguchi <mail@taniguchiak.io>
License: MIT License
        
        Copyright (c) 2025-2026 Akio Taniguchi
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: annotation,dataclass,dataframe,namedtuple,python,specification,typeddict,typing
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: <4.0,>=3.10
Requires-Dist: packaging>=24
Requires-Dist: pandas>=2
Requires-Dist: readonlydict>=1
Requires-Dist: typing-extensions>=4
Description-Content-Type: text/markdown

# Typespecs

[![Release](https://img.shields.io/pypi/v/typespecs?label=Release&color=cornflowerblue&style=flat-square)](https://pypi.org/project/typespecs/)
[![Python](https://img.shields.io/pypi/pyversions/typespecs?label=Python&color=cornflowerblue&style=flat-square)](https://pypi.org/project/typespecs/)
[![Downloads](https://img.shields.io/pypi/dm/typespecs?label=Downloads&color=cornflowerblue&style=flat-square)](https://pepy.tech/project/typespecs)
[![DOI](https://img.shields.io/badge/DOI-10.5281/zenodo.17681195-cornflowerblue?style=flat-square)](https://doi.org/10.5281/zenodo.17681195)
[![Tests](https://img.shields.io/github/actions/workflow/status/astropenguin/typespecs/tests.yaml?label=Tests&style=flat-square)](https://github.com/astropenguin/typespecs/actions)

Data specifications via type hints

## Overview

**Typespecs** is a lightweight Python library that leverages [`typing.Annotated`](https://docs.python.org/3.14/library/typing.html#typing.Annotated) to manage metadata (category, description, units, ...) within the type hints of your data structures.
It offers a dedicated read-only dictionary called a **type specification** to attach your metadata to your type hints.
This approach keeps your code clean and coexists with other `Annotated`-based libraries such as [Pydantic](https://pydantic.dev/).
Finally, the attached metadata can be extracted and aggregated into a [`pandas.DataFrame`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object called a **specification DataFrame**, so that you can manage the metadata with the rich PyData ecosystem.
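
For readers new to `typing.Annotated`, the following standard-library sketch shows the general mechanism described above: metadata rides along with a type hint and can be read back at runtime. The plain `dict` metadata and the `Point` class are stand-ins chosen for illustration; they are not part of the Typespecs API.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_type_hints


@dataclass
class Point:
    # Plain dicts stand in for type specifications (an assumption of this sketch).
    x: Annotated[float, {"units": "m"}]
    y: Annotated[float, {"units": "m"}]


# include_extras=True keeps the Annotated metadata instead of stripping it.
hints = get_type_hints(Point, include_extras=True)
bare_type, metadata = get_args(hints["x"])

print(bare_type)  # <class 'float'>
print(metadata)  # {'units': 'm'}

# Without include_extras, only the bare types remain.
print(get_type_hints(Point))  # {'x': <class 'float'>, 'y': <class 'float'>}
```

Because the metadata lives inside the hint, tools that do not know about it simply see `float`.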

## Installation

```bash
pip install typespecs
```

## Basic Usage

You can create a type specification, [`typespecs.Spec(key=value, ...)`](https://astropenguin.github.io/typespecs/_apidoc/typespecs.html#typespecs.Spec), and attach it to a type hint in a data structure such as a [Python data class](https://docs.python.org/3.14/library/dataclasses.html) or a [Pydantic model](https://pydantic.dev/docs/validation/latest/concepts/models/).
The `Spec` object acts as a read-only dictionary, so your metadata cannot be modified at runtime.
Once your data structure is defined, use [`typespecs.from_annotated(obj)`](https://astropenguin.github.io/typespecs/_apidoc/typespecs.html#typespecs.from_annotated) to extract and aggregate the attached metadata into a specification DataFrame.
By default, the actual data and the metadata-stripped type hints will also be stored in the `data` and `type` columns, respectively (you can control this behavior using the `data` and `type` parameters in `from_annotated`).
```python
import typespecs as ts
from dataclasses import dataclass
from typing import Annotated as Ann, TypeVar


@dataclass
class Weather:
    temp: Ann[list[float], ts.Spec(category="data", name="Temperature", units="K")]
    wind: Ann[list[float], ts.Spec(category="data", name="Wind speed", units="m/s")]
    loc: Ann[str, ts.Spec(category="info", name="Observed location")]


weather = Weather([273.15, 280.15], [5.0, 10.0], "Tokyo")
specs = ts.from_annotated(weather)
print(specs)
```
```
      category              data               name           type  units
temp      data  [273.15, 280.15]        Temperature    list[float]      K
wind      data       [5.0, 10.0]         Wind speed    list[float]    m/s
loc       info             Tokyo  Observed location  <class 'str'>   <NA>
```
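
To picture what a read-only dictionary means in practice, the sketch below uses the standard library's `types.MappingProxyType` as a stand-in. It is only an analogy: how `Spec` is implemented, and which exception it raises on a write attempt, are not stated in this document.

```python
from types import MappingProxyType

# A read-only view over a dict; a stand-in for a type specification.
spec = MappingProxyType({"category": "data", "units": "K"})

print(spec["units"])  # K
print(dict(spec))  # {'category': 'data', 'units': 'K'}

try:
    spec["units"] = "degC"  # writes are rejected
except TypeError as error:
    print(f"rejected: {error}")
```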

You can attach multiple `Spec` objects to a single type hint, for example by wrapping an already annotated alias in another `Annotated`.
If the same key appears in more than one of them, the last one takes precedence.
```python
Temp = Ann[list[float], ts.Spec(category="data", name="Temperature")]
Wind = Ann[list[float], ts.Spec(category="data", name="Wind speed")]
Loc = Ann[str, ts.Spec(category="info", name="Observed location")]


@dataclass
class Weather:
    temp: Ann[Temp, ts.Spec(units="K")]
    wind: Ann[Wind, ts.Spec(units="m/s")]
    loc: Ann[Loc, ts.Spec(name="City")]


weather = Weather([273.15, 280.15], [5.0, 10.0], "Tokyo")
specs = ts.from_annotated(weather)
print(specs)
```
```
      category              data         name           type  units
temp      data  [273.15, 280.15]  Temperature    list[float]      K
wind      data       [5.0, 10.0]   Wind speed    list[float]    m/s
loc       info             Tokyo         City  <class 'str'>   <NA>
```
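
The precedence rule can be pictured with the standard library alone. Python flattens nested `Annotated` hints into one metadata tuple, ordered from innermost to outermost, so folding that tuple into a dictionary from left to right lets later entries win. This is a sketch with plain `dict` metadata as stand-ins, not a description of the Typespecs internals.

```python
from typing import Annotated, get_args

Loc = Annotated[str, {"category": "info", "name": "Observed location"}]
Hint = Annotated[Loc, {"name": "City"}]

# Nested Annotated is flattened: one bare type, then all metadata in order.
bare_type, *extras = get_args(Hint)
print(bare_type)  # <class 'str'>
print(len(extras))  # 2

merged = {}
for extra in extras:
    merged.update(extra)  # later (outer) metadata overrides earlier (inner)

print(merged)  # {'category': 'info', 'name': 'City'}
```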

## Advanced Usage

### Handling Nested Types

Typespecs also handles metadata attached to types nested inside a hint, such as the element type of a list.
By default, the metadata attached to nested types is merged into the single row of the parent field.
```python
Float = Ann[float, ts.Spec(dtype="f8")]


@dataclass
class Weather:
    temp: Ann[list[Float], ts.Spec(category="data", name="Temperature", units="K")]
    wind: Ann[list[Float], ts.Spec(category="data", name="Wind speed", units="m/s")]
    loc: Ann[str, ts.Spec(category="info", name="Observed location")]


weather = Weather([273.15, 280.15], [5.0, 10.0], "Tokyo")
specs = ts.from_annotated(weather)
print(specs)
```
```
      category              data  dtype               name           type  units
temp      data  [273.15, 280.15]     f8        Temperature    list[float]      K
wind      data       [5.0, 10.0]     f8         Wind speed    list[float]    m/s
loc       info             Tokyo   <NA>  Observed location  <class 'str'>   <NA>
```

You can disable this merging behavior using `merge=False` in `from_annotated`; each nested type then gets its own row, indexed by a path such as `temp/0`.
```python
specs = ts.from_annotated(weather, merge=False)
print(specs)
```
```
        category              data  dtype               name             type  units
temp        data  [273.15, 280.15]   <NA>        Temperature      list[float]      K
temp/0      <NA>              <NA>     f8               <NA>  <class 'float'>   <NA>
wind        data       [5.0, 10.0]   <NA>         Wind speed      list[float]    m/s
wind/0      <NA>              <NA>     f8               <NA>  <class 'float'>   <NA>
loc         info             Tokyo   <NA>  Observed location    <class 'str'>   <NA>
```
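
One way to picture the two layouts above is a recursive walk over a hint and its type arguments. The sketch below uses only the standard library with plain `dict` metadata; the `walk` helper is hypothetical and is not the algorithm Typespecs uses. The `parent/index` paths simply mimic the index shown in the `merge=False` output.

```python
from typing import Annotated, get_args, get_origin

Float = Annotated[float, {"dtype": "f8"}]
Temp = Annotated[list[Float], {"category": "data", "units": "K"}]


def walk(hint, path):
    """Yield (path, metadata) for a hint and each of its nested type arguments."""
    metadata = {}
    if get_origin(hint) is Annotated:
        hint, *extras = get_args(hint)
        for extra in extras:
            if isinstance(extra, dict):
                metadata.update(extra)
    yield path, metadata
    # Recurse into type arguments, e.g. the float in list[float].
    for index, arg in enumerate(get_args(hint)):
        yield from walk(arg, f"{path}/{index}")


# One row per (nested) type, as with merge=False.
rows = dict(walk(Temp, "temp"))
print(rows)
# {'temp': {'category': 'data', 'units': 'K'}, 'temp/0': {'dtype': 'f8'}}

# Folding every nested row into the parent gives the merged layout.
merged = {}
for metadata in rows.values():
    merged.update(metadata)
print(merged)  # {'category': 'data', 'units': 'K', 'dtype': 'f8'}
```

The example avoids conflicting keys between parent and nested metadata; how such conflicts are resolved in Typespecs is not covered by this sketch.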

Finally, you can include the nested type itself as part of the metadata using the special `typespecs.ITSELF` object.
This is useful when you want to handle the inner type alongside other metadata within the specification DataFrame.
```python
Dtype = Ann[TypeVar("T"), ts.Spec(dtype=ts.ITSELF)]


@dataclass
class Weather:
    temp: Ann[list[Dtype[float]], ts.Spec(category="data", name="Temperature", units="K")]
    wind: Ann[list[Dtype[float]], ts.Spec(category="data", name="Wind speed", units="m/s")]
    loc: Ann[str, ts.Spec(category="info", name="Observed location")]


weather = Weather([273.15, 280.15], [5.0, 10.0], "Tokyo")
specs = ts.from_annotated(weather)
print(specs)
```
```
      category              data            dtype               name           type  units
temp      data  [273.15, 280.15]  <class 'float'>        Temperature    list[float]      K
wind      data       [5.0, 10.0]  <class 'float'>         Wind speed    list[float]    m/s
loc       info             Tokyo             <NA>  Observed location  <class 'str'>   <NA>
```
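
The `Dtype[float]` pattern above relies on a standard `typing` feature: an `Annotated` alias built around a `TypeVar` is generic, and subscripting it substitutes the type while keeping the metadata. The sketch below shows that substitution, plus one hypothetical way a sentinel could be resolved to the bare type. `SELF` and `resolve` are illustrative names, not the Typespecs implementation of `ITSELF`.

```python
from typing import Annotated, TypeVar, get_args

T = TypeVar("T")
SELF = object()  # hypothetical sentinel, standing in for a marker like ITSELF

Dtype = Annotated[T, {"dtype": SELF}]
hint = Dtype[float]  # substitutes T and keeps the metadata

bare_type, metadata = get_args(hint)
print(bare_type)  # <class 'float'>


def resolve(bare_type, metadata):
    """Replace every sentinel value with the bare type it is attached to."""
    return {key: bare_type if value is SELF else value for key, value in metadata.items()}


print(resolve(bare_type, metadata))  # {'dtype': <class 'float'>}
```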

### Handling Missing Values

By default, missing metadata is filled with `pandas.NA` in a specification DataFrame.
You can specify custom fallback values by using the `default` parameter in `from_annotated`.

```python
specs = ts.from_annotated(weather, default={"dtype": None, "units": "1"})
print(specs)
```
```
      category              data            dtype               name           type  units
temp      data  [273.15, 280.15]  <class 'float'>        Temperature    list[float]      K
wind      data       [5.0, 10.0]  <class 'float'>         Wind speed    list[float]    m/s
loc       info             Tokyo             None  Observed location  <class 'str'>      1
```
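
The effect of per-key defaults can be sketched with plain dictionaries: every row is expanded to the full set of columns, and a gap is filled with that column's default if one is given, or with a generic placeholder otherwise. The rows and the `MISSING` placeholder are illustrative assumptions; this is not how `from_annotated` is implemented.

```python
MISSING = "<NA>"  # placeholder for this sketch; stands in for pandas.NA

# Illustrative per-row metadata with gaps.
rows = {
    "temp": {"dtype": float, "name": "Temperature", "units": "K"},
    "loc": {},
}
default = {"dtype": None, "units": "1"}

# Expand every row to all columns, filling gaps from the defaults.
columns = sorted({key for row in rows.values() for key in row})
filled = {
    name: {col: row.get(col, default.get(col, MISSING)) for col in columns}
    for name, row in rows.items()
}

print(filled["temp"])  # {'dtype': <class 'float'>, 'name': 'Temperature', 'units': 'K'}
print(filled["loc"])  # {'dtype': None, 'name': '<NA>', 'units': '1'}
```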

### Handling Type Hint(s) Directly

You can create a specification DataFrame from a single type hint using [`typespecs.from_annotation`](https://astropenguin.github.io/typespecs/_apidoc/typespecs.html#typespecs.from_annotation), or from a dictionary of named type hints using [`typespecs.from_annotations`](https://astropenguin.github.io/typespecs/_apidoc/typespecs.html#typespecs.from_annotations).
This is useful when you want to handle type hints directly without defining them within a data structure.
Because no data is involved, the result has no `data` column, and the row for a single type hint is named `root`.
```python
annotations = {
    "temp": Ann[list[Dtype[float]], ts.Spec(category="data", name="Temperature", units="K")],
    "wind": Ann[list[Dtype[float]], ts.Spec(category="data", name="Wind speed", units="m/s")],
    "loc": Ann[str, ts.Spec(category="info", name="Observed location")],
}
specs = ts.from_annotations(annotations)
print(specs)
```
```
      category            dtype               name           type  units
temp      data  <class 'float'>        Temperature    list[float]      K
wind      data  <class 'float'>         Wind speed    list[float]    m/s
loc       info             <NA>  Observed location  <class 'str'>   <NA>
```
```python
specs = ts.from_annotation(annotations["temp"])
print(specs)
```
```
      category            dtype         name         type  units
root      data  <class 'float'>  Temperature  list[float]      K
```
