Metadata-Version: 2.4
Name: ckanext-csvtocsvw
Version: 1.0.4
Summary: CKAN extension automatically generating CSVW metadata for uploaded textual tabular data and importing the first table into the CKAN datastore.
Author-email: Thomas Hanke <thomas.hanke@iwm.fraunhofer.de>
License: AGPL-3.0-or-later
Project-URL: Homepage, https://github.com/Mat-O-Lab/ckanext-csvtocsvw
Project-URL: Repository, https://github.com/Mat-O-Lab/ckanext-csvtocsvw
Keywords: CKAN,CSVW,CSV,JSON-LD,RDF,metadata,datastore,Mat-O-Lab
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: ckanapi
Requires-Dist: requests>=2.4.2
Requires-Dist: requests-toolbelt
Requires-Dist: rdflib
Requires-Dist: pandas
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-ckan; extra == "dev"
Dynamic: license-file

[![Tests](https://github.com/Mat-O-Lab/ckanext-csvtocsvw/actions/workflows/test.yml/badge.svg)](https://github.com/Mat-O-Lab/ckanext-csvtocsvw/actions/workflows/test.yml)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.XXXXXXX.svg)](https://doi.org/10.5281/zenodo.XXXXXXX)

# ckanext-csvtocsvw

This extension automatically generates CSVW metadata for uploaded textual tabular data. It also uploads the data of the first documented table into the CKAN datastore for the source CSV file.
**It should be used as a replacement for DataPusher.**

## Requirements
Needs a running instance of the [CSVToCSVW Application](https://github.com/Mat-O-Lab/CSVToCSVW).
Point the extension at it through environment variables.
An API token is also needed for an account with sufficient privileges, so that the background job works on private datasets and resources.

```bash
CKANINI__CKANEXT__CSVTOCSVW_URL=http://${CSVTOCSVW_HOST}:${CSVTOCSVW_APP_PORT}
CKANINI__CKANEXT__CSVTOCSVW__CKAN_TOKEN=${CKAN_API_TOKEN}
```
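As a quick sanity check before starting CKAN, the two settings above can be verified from Python. This is a minimal sketch, not part of the extension: the variable names are copied from the block above, while `missing_settings` is a hypothetical helper introduced only for illustration.

```python
import os

# Variable names as shown in this README; the check itself is illustrative.
URL_VAR = "CKANINI__CKANEXT__CSVTOCSVW_URL"
TOKEN_VAR = "CKANINI__CKANEXT__CSVTOCSVW__CKAN_TOKEN"


def missing_settings(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in (URL_VAR, TOKEN_VAR) if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```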

You can set the default formats to annotate by setting the env variable `CSVTOCSVW_FORMATS`, for example:
```bash
CKANINI__CKANEXT__CSVTOCSVW__FORMATS="csv txt asc"
```
Otherwise it reacts to the following formats: "csv", "txt", "asc", "tsv".
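The format selection can be pictured roughly as follows. This is a sketch under assumptions: it supposes the value is a whitespace-separated list and that matching is case-insensitive, neither of which is confirmed here. `configured_formats` and `should_annotate` are hypothetical names, not functions of the extension.

```python
import os

FORMATS_VAR = "CKANINI__CKANEXT__CSVTOCSVW__FORMATS"
# Fallback list as documented in this README.
FALLBACK_FORMATS = ["csv", "txt", "asc", "tsv"]


def configured_formats(env=None):
    """Split the configured value on whitespace, or fall back to the defaults."""
    env = os.environ if env is None else env
    formats = [item.lower() for item in env.get(FORMATS_VAR, "").split()]
    return formats or list(FALLBACK_FORMATS)


def should_annotate(resource_format, env=None):
    """Assumed rule: annotate when the resource format is in the configured list."""
    return resource_format.strip().lower() in configured_formats(env)
```

With `FORMATS_VAR` set to `"csv txt asc"`, a resource of format `tsv` would be skipped under this assumed rule.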

If you need to process files that are not served over HTTPS (CKAN is not configured for SSL), you can disable SSL verification:
```bash
CKANINI__CKANEXT__CSVTOCSVW__SSL_VERIFY=False
```
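Environment variables are always strings, and in Python `bool("False")` is `True`, so a flag like this has to be parsed explicitly. The sketch below shows one way to do that. `parse_bool` is a hypothetical helper, and the set of accepted spellings is an assumption; how the extension itself parses the value is not documented here.

```python
# Spellings treated as "off"; this set is an assumption for illustration.
FALSE_VALUES = {"false", "0", "no", "off"}


def parse_bool(value, default=True):
    """Turn an environment string into a bool, using `default` when unset or empty."""
    if value is None or not value.strip():
        return default
    return value.strip().lower() not in FALSE_VALUES


# The naive conversion gets it wrong: any non-empty string is truthy.
naive = bool("False")
parsed = parse_bool("False")
print(naive, parsed)
```

A flag parsed this way could then be handed to the `verify` argument of `requests.get`, which is the Requests parameter that controls certificate verification.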

## Purpose
Reacts to uploaded CSV files in the configured formats (see Requirements for the defaults). It creates two routes for each resource:
- `/annotate` creates a CSVW annotation file for a CSV in JSON-LD format named `<csv_filename>-metadata.json` and uploads table-1 to the CKAN datastore so you can explore it with Recline views
- `/transform` uses the CSVW metadata to transform the whole content of the CSV file to RDF; the output is `<csv_filename>.ttl`

The plugin's default behavior includes a trigger on CSV file uploads, so annotation runs automatically on upload.
The transformation is a bonus feature and outputs standard tabular data as described in the W3C CSVW documentation. It must be triggered manually.
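To give a feel for what a CSVW annotation contains, the sketch below builds a very small metadata document from a CSV header using only the standard library. It is not the output of the CSVToCSVW application, which is richer. The sample data, the file name `example.csv`, and the helper names are hypothetical, and the datatype guess is deliberately naive.

```python
import csv
import io
import json


def guess_datatype(value):
    """Naive guess: values that parse as a float are 'number', the rest 'string'."""
    try:
        float(value)
    except ValueError:
        return "string"
    return "number"


def sketch_csvw_metadata(csv_text, url):
    """Build a minimal CSVW table description from the header and first data row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, first_row = rows[0], rows[1]
    columns = [
        {
            "name": title.strip().replace(" ", "_"),
            "titles": title.strip(),
            "datatype": guess_datatype(value),
        }
        for title, value in zip(header, first_row)
    ]
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": url,
        "tableSchema": {"columns": columns},
    }


sample = "specimen,force kN\nA1,12.5\n"
metadata = sketch_csvw_metadata(sample, "example.csv")
print(json.dumps(metadata, indent=2))
```

The `@context`, `url`, and `tableSchema.columns` keys follow the W3C CSVW metadata vocabulary; everything else about the real annotation files should be checked against the files the extension actually produces.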

Compatibility with core CKAN versions:

| CKAN version    | Compatible?   |
| --------------- | ------------- |
| 2.9 and earlier | not tested    |
| 2.10            | ✓ CI tested   |
| 2.11            | ✓ CI tested   |


## Installation

To install the extension:

1. Activate your CKAN virtual environment, for example:
```bash
. /usr/lib/ckan/default/bin/activate
```
2. Use pip to install the package:
```bash
pip install ckanext-csvtocsvw
```
3. Add `csvtocsvw` to the `ckan.plugins` setting in your CKAN
   config file (by default the config file is located at
   `/etc/ckan/default/ckan.ini`).

4. Restart CKAN. For example, if you've deployed CKAN with Apache on Ubuntu:
```bash
sudo service apache2 reload
```
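Step 3 above can be double-checked with a few lines of Python. This sketch parses an in-memory sample rather than a real config file. The plugin list in `SAMPLE_INI` is made up for illustration, and `enabled_plugins` is a hypothetical helper. It assumes the usual CKAN layout where `ckan.plugins` lives in the `[app:main]` section.

```python
import configparser

# Hypothetical excerpt of a CKAN config file.
SAMPLE_INI = """
[app:main]
ckan.plugins = image_view datastore csvtocsvw
"""


def enabled_plugins(ini_text):
    """Return the whitespace-separated plugin names from [app:main]."""
    # interpolation=None avoids errors on values such as %(here)s in real configs.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.get("app:main", "ckan.plugins", fallback="").split()


print("csvtocsvw" in enabled_plugins(SAMPLE_INI))
```

To check a real installation, the contents of the config file would be read and passed to `enabled_plugins` instead of the sample string.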

## Developer installation

To install ckanext-csvtocsvw for development, activate your CKAN virtualenv and do:
```bash
git clone https://github.com/Mat-O-Lab/ckanext-csvtocsvw.git
cd ckanext-csvtocsvw
pip install -e ".[dev]"
```

## Tests

To run the tests, do:
```bash
pytest --ckan-ini=test.ini
```

## Citation

If you use this software, please cite it. GitHub shows a **"Cite this repository"** button (top right of the repo page) that exports the [CITATION.cff](CITATION.cff) in APA or BibTeX format.

After the first Zenodo release, a DOI-specific BibTeX entry will be available on the Zenodo record. Until then:

```bibtex
@software{hanke_ckanext_csvtocsvw,
  author       = {Hanke, Thomas},
  title        = {ckanext-csvtocsvw},
  url          = {https://github.com/Mat-O-Lab/ckanext-csvtocsvw},
  license      = {AGPL-3.0-or-later},
}
```

## License

[AGPL](https://www.gnu.org/licenses/agpl-3.0.en.html)

## Acknowledgments
The authors would like to thank the Federal Government and the Heads of Government of the Länder for their funding and support within the framework of the [Platform Material Digital](https://www.materialdigital.de) consortium. Funded by the German [Federal Ministry of Education and Research (BMBF)](https://www.bmbf.de/bmbf/en/) through the [MaterialDigital](https://www.bmbf.de/SharedDocs/Publikationen/de/bmbf/5/31701_MaterialDigital.pdf?__blob=publicationFile&v=5) Call in Project [KupferDigital](https://www.materialdigital.de/project/1) - project id 13XP5119.
