Metadata-Version: 2.3
Name: ontoweaver
Version: 0.2.1
Summary: Exports from tables to Semantic Knowledge Graphs, with a simple declarative mapping.
Author: Johann Dreo
Author-email: Johann Dreo <johann.dro@pasteur.fr>, Marko Baric <marko.baric@pasteur.fr>, Claire Laudy <claire.laudy@thalesgroup.com>, Matthieu Najm <matthieu.najm@pasteur.fr>, Benno Schwikowski <benno.schwikowski@pasteur.fr>
Requires-Python: >=3.12,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: biocypher (>=0.9.3,<0.10.0)
Requires-Dist: jsonargparse (>=4.38,<5.0)
Requires-Dist: numpy (>=2.2.4,<3.0.0)
Requires-Dist: owlready2 (>=0.47,<0.48)
Requires-Dist: pandas (>=2.0.1,<3.0.0)
Requires-Dist: pandera[io] (>=0.23.1,<0.24.0)
Requires-Dist: pooch (>=1.7.0,<2.0.0)
Requires-Dist: xdg-base-dirs (>=6.0.2,<7.0.0)
Description-Content-Type: text/markdown

# OntoWeaver

OntoWeaver is a tool for importing tabular data into
Semantic Knowledge Graph (SKG) databases.

With OntoWeaver, you write a simple declarative mapping that expresses how columns of
a [Pandas](https://pandas.pydata.org/) table are converted to typed nodes
or edges in an SKG.

![OntoWeaver logo](docs/OntoWeaver_logo__big.svg)

It provides a simple layer of abstraction on top of [BioCypher](https://biocypher.org),
which remains responsible for aligning ontologies,
supporting several graph database backends,
and allowing reproducible and configurable builds.

With a pure BioCypher approach, you would have to write a whole adapter by hand.
With OntoWeaver, you only have to express a mapping in YAML, which looks like this:
```yaml
row: # The meaning of an entry in the input table.
   map:
      column: <column name in your CSV>
      to_subject: <ontology node type to use for representing a row>

transformers: # How to map cells to nodes and edges.
    - map: # Map a column to a node.
        column: <column name>
        to_object: <ontology node type to use for representing a column>
        via_relation: <edge type for linking subject and object nodes>
    - map: # Map a column to a property.
        column: <another name>
        to_property: <property name>
        for_object: <type holding the property>

metadata: # Optional properties added to every node and edge.
    - source: "My OntoWeaver adapter"
    - version: "v1.2.3"
```
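
To make the idea behind such a mapping more concrete, here is a minimal, self-contained Python sketch of how a table row could be turned into nodes, edges and properties. It is an illustration of the concept only: it does not use OntoWeaver's or BioCypher's API, and the table, the column names, the type names and the `apply_mapping` helper are all hypothetical. The optional `metadata` section is left out for brevity.

```python
import csv
import io

# Hypothetical input table; the columns and values are made up for illustration.
CSV_TEXT = """patient_id,variant,age
P1,V1,52
P2,V2,61
"""

# A mapping in the spirit of the YAML template above, written as a plain dict.
# All type and relation names are assumptions, not taken from a real ontology.
MAPPING = {
    "row": {"map": {"column": "patient_id", "to_subject": "patient"}},
    "transformers": [
        {"map": {"column": "variant", "to_object": "variant",
                 "via_relation": "patient_has_variant"}},
        {"map": {"column": "age", "to_property": "age",
                 "for_object": "patient"}},
    ],
}


def apply_mapping(rows, mapping):
    """Build (nodes, edges) from table rows. Conceptual sketch only."""
    nodes = {}  # (node type, identifier) -> dict of properties
    edges = []  # (subject key, relation type, object key)
    subject = mapping["row"]["map"]
    specs = [transformer["map"] for transformer in mapping["transformers"]]

    for row in rows:
        # Each row yields one subject node, identified by the chosen column.
        subject_key = (subject["to_subject"], row[subject["column"]])
        nodes.setdefault(subject_key, {})
        row_nodes = {subject["to_subject"]: subject_key}

        # First pass: columns mapped to object nodes, linked to the subject.
        for spec in specs:
            if "to_object" in spec:
                object_key = (spec["to_object"], row[spec["column"]])
                nodes.setdefault(object_key, {})
                row_nodes[spec["to_object"]] = object_key
                edges.append((subject_key, spec["via_relation"], object_key))

        # Second pass: columns mapped to properties of a node of this row.
        for spec in specs:
            if "to_property" in spec:
                holder = row_nodes[spec["for_object"]]
                nodes[holder][spec["to_property"]] = row[spec["column"]]

    return nodes, edges


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
nodes, edges = apply_mapping(rows, MAPPING)
```

In this sketch, each row produces one `patient` node carrying an `age` property, one `variant` node, and one `patient_has_variant` edge between them. In a real setup, the type names would come from your ontology, and OntoWeaver together with BioCypher would take care of this work.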

### Installation 

The project is written in Python and uses [Poetry](https://python-poetry.org).
You can install the necessary dependencies in a virtual environment like this:

```
git clone https://github.com/oncodash/ontoweaver.git
cd ontoweaver
poetry install
```

Poetry will create a virtual environment according to your configuration (either
centrally or in the project folder). You can activate it by running
`poetry shell` inside the project directory.

In principle, the graph can be imported into
any (graph) database supported by BioCypher
(Neo4j, ArangoDB, CSV, RDF, PostgreSQL, SQLite, NetworkX, …;
see [BioCypher's documentation](https://biocypher.org/output/index.html)).

### Usage Documentation

Full documentation, including tutorials and a more detailed installation guide, is available
[on the OntoWeaver website](https://ontoweaver.readthedocs.io/en/latest/).

### Tests

Tests are located in the `tests/` subdirectory and may be a good starting point
to see OntoWeaver in practice. You may start with `tests/test_simplest.py`, which
shows the simplest example of mapping tabular data through BioCypher.

To run tests, use `pytest`:
```
poetry run pytest
```
or, alternatively:
```
poetry shell
pytest
```

### Contributing

If you have questions or suggestions for improvement, feel free to open an issue or a pull request!

