Metadata-Version: 2.3
Name: ontoweaver
Version: 1.0.0
Summary: Exports from tables to Semantic Knowledge Graphs, with a simple declarative mapping.
Author: Johann Dreo
Author-email: Johann Dreo <johann.dro@pasteur.fr>, Marko Baric <marko.baric@pasteur.fr>, Claire Laudy <claire.laudy@thalesgroup.com>, Matthieu Najm <matthieu.najm@pasteur.fr>, Benno Schwikowski <benno.schwikowski@pasteur.fr>
Requires-Python: >=3.10,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: alive-progress (>=3.2,<4.0)
Requires-Dist: biocypher (>=0.11.0,<0.12.0)
Requires-Dist: fsspec (>=2025.10.0,<2026.0.0)
Requires-Dist: jsonargparse (>=4.39,<5.0)
Requires-Dist: numpy (>=2.2.4,<3.0.0)
Requires-Dist: owlready2 (>=0.49,<0.50)
Requires-Dist: pandas (>=2.3.1,<3.0.0)
Requires-Dist: pandera[io] (>=0.27.0,<0.28.0)
Requires-Dist: pooch (>=1.7.0,<2.0.0)
Requires-Dist: xdg-base-dirs (>=6.0.2,<7.0.0)
Description-Content-Type: text/markdown

# OntoWeaver

OntoWeaver is a tool for transforming tabular data into
Semantic Knowledge Graph (SKG) databases.

With OntoWeaver, you write a simple declarative mapping that expresses how columns from
a table should be converted to typed nodes, edges, or properties in an SKG.

![Diagram showing that OntoWeaver needs ontologies, tabular data and graph schema to produce a Semantic Knowledge Graph.](https://raw.githubusercontent.com/oncodash/ontoweaver/refs/heads/main/docs/OntoWeaver__simple-summary.svg)


SKG databases allow for easy integration of very heterogeneous data, and
OntoWeaver brings a reproducible approach to building them.

With OntoWeaver, you can easily write a script that automatically
reconfigures a new SKG from the input data each time you need it.

OntoWeaver has been tested on large-scale biomedical use cases, and we can
guarantee that it is simple to operate for anyone with a basic knowledge
of programming.


## Basics

OntoWeaver provides a simple layer of abstraction on top of [BioCypher](https://biocypher.org),
which remains responsible for doing the ontology alignment,
supporting several graph database backends,
and allowing reproducible and configurable builds.

With a pure BioCypher approach, you would have to write a whole adapter by hand.
With OntoWeaver, you just have to express a mapping in YAML, which looks like this:
```yaml
row: # The meaning of an entry in the input table.
   map:
      column: <column name in your CSV>
      to_subject: <ontology node type to use for representing a row>

transformers: # How to map cells to nodes and edges.
    - map: # Map a column to a node.
        column: <column name>
        to_object: <ontology node type to use for representing a column>
        via_relation: <edge type for linking subject and object nodes>
    - map: # Map a column to a property.
        column: <another name>
        to_property: <property name>
        for_object: <type holding the property>

metadata: # Optional properties added to every node and edge.
    - source: "My OntoWeaver adapter"
    - version: "v1.2.3"
```
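
To make the semantics of such a mapping concrete, here is a minimal sketch in plain Python of what a mapping of this shape expresses. It is not OntoWeaver's implementation and does not use its API: the `apply_mapping` function, the column names (`patient_id`, `variant`, `age`) and the type names are all hypothetical, and the metadata is written as a plain dictionary for brevity.

```python
import csv
import io

# Hypothetical in-memory counterpart of a YAML mapping like the one above.
# Column names and ontology types are invented for this illustration.
MAPPING = {
    "row": {"column": "patient_id", "to_subject": "patient"},
    "transformers": [
        {"column": "variant", "to_object": "variant",
         "via_relation": "patient_has_variant"},
        {"column": "age", "to_property": "age", "for_object": "patient"},
    ],
    "metadata": {"source": "My OntoWeaver adapter"},
}

CSV_TEXT = """patient_id,variant,age
P1,V1,42
P1,V2,42
P2,V1,37
"""


def apply_mapping(rows, mapping):
    """Convert an iterable of row dictionaries into node and edge dictionaries."""
    nodes = {}  # (type, id) -> node; identical cells yield a single node
    edges = []
    meta = mapping.get("metadata", {})

    def get_node(node_type, node_id):
        # Create the node on first sight, reuse it afterwards.
        return nodes.setdefault(
            (node_type, node_id),
            {"id": node_id, "type": node_type, "properties": dict(meta)},
        )

    for row in rows:
        # Each row is represented by one subject node.
        subject = get_node(mapping["row"]["to_subject"],
                           row[mapping["row"]["column"]])
        in_row = {subject["type"]: subject}  # nodes seen in this row, by type

        # First pass: columns mapped to object nodes, linked to the subject.
        for t in mapping["transformers"]:
            if "to_object" in t:
                obj = get_node(t["to_object"], row[t["column"]])
                in_row[obj["type"]] = obj
                edges.append({"source": subject["id"], "target": obj["id"],
                              "type": t["via_relation"],
                              "properties": dict(meta)})

        # Second pass: columns mapped to properties of a node of the given type.
        for t in mapping["transformers"]:
            if "to_property" in t and t["for_object"] in in_row:
                node = in_row[t["for_object"]]
                node["properties"][t["to_property"]] = row[t["column"]]

    return list(nodes.values()), edges


nodes, edges = apply_mapping(csv.DictReader(io.StringIO(CSV_TEXT)), MAPPING)
print(len(nodes), "nodes,", len(edges), "edges")
```

In this sketch, repeated identifiers are merged into a single node, so a patient appearing on several rows is created once and linked to each of its variants. How OntoWeaver itself handles duplicates and merging is described in its documentation.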

OntoWeaver can read anything that [Pandas](https://pandas.pydata.org/) can load,
which means a lot of tabular formats.


### Installation

The project is written in Python and uses [Poetry](https://python-poetry.org).
You can install the necessary dependencies in a virtual environment like this:

```
git clone https://github.com/oncodash/ontoweaver.git
cd ontoweaver
poetry install
```

Poetry will create a virtual environment according to your configuration (either
centrally or in the project folder). You can activate it by running
`poetry shell` inside the project directory (on Poetry 2.0 and later, this
command is provided by the separate `poetry-plugin-shell` plugin).

In principle, the graph can be imported into
any (graph) database supported by BioCypher
(Neo4j, ArangoDB, CSV, RDF, PostgreSQL, SQLite, NetworkX, …;
see [BioCypher's documentation](https://biocypher.org/output/index.html)).


### Usage Documentation

Full documentation, including tutorials and a more detailed installation guide, is available
[on the OntoWeaver website](https://ontoweaver.readthedocs.io/en/latest/).


### Tests

Tests are located in the `tests/` subdirectory and may be a good starting point
to see OntoWeaver in practice. You may start with `tests/test_simplest.py`, which
shows the simplest example of mapping tabular data through BioCypher.

To run tests, use `pytest`:
```
poetry run pytest
```


### Contributing

If you have questions or suggestions for improvement, feel free to open an issue or a pull request!


