Metadata-Version: 2.4
Name: nci_cidc_schemas
Version: 0.28.13
Summary: The CIDC data model and tools for working with it.
Home-page: https://github.com/NCI-CIDC/cidc-schemas
Author: NCI
Author-email: nci-cidc-tools-admin@mail.nih.gov
License: MIT license
Keywords: cidc_schemas
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.13,<3.14
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: argparse~=1.4.0
Requires-Dist: dateparser~=1.2.2
Requires-Dist: deepdiff~=8.6.1
Requires-Dist: jinja2~=3.1.6
Requires-Dist: jsonmerge~=1.6.1
Requires-Dist: jsonpointer~=3.0.0
Requires-Dist: jsonschema~=3.2.0
Requires-Dist: openpyxl~=3.1.5
Requires-Dist: pandas~=3.0.0
Requires-Dist: pyyaml~=6.0.3
Requires-Dist: xlsxwriter~=3.2.9
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# cidc-schemas

This repository contains formal definitions of the CIDC metadata model using [json-schema](https://json-schema.org/) syntax and vocabulary.

### View documentation at https://nci-cidc.github.io/cidc-schemas/

## Installation

To install the latest released version, run:

```bash
pip install nci-cidc-schemas
```

## Development

### Project Structure

- **`cidc_schemas/`** - a python module for generating, validating, and reading manifest and assay templates.
  - **`schemas/`** - json specifications defining the CIDC metadata model.
    - `templates/` - schemas for generating and validating manifest and assay templates.
    - `assays/` - schemas for defining assay data models.
    - `artifacts/` - schemas for defining artifacts.
- **`docs/`** - the most recent build of the data model documentation, along with templates and scripts for re-generating the documentation.
- **`template_examples/`** - example populated Excel files for template specifications in `schemas/templates`, and `.csv`s auto-generated from those `.xlsx`s that allow to transparently keep track of changes in them.
- **`tests/`** - tests for the `cidc_schemas` module.

### Developer Setup

Install necessary dependencies.

```bash
pip install -r requirements.dev.txt
```

Install and configure pre-commit hooks.

```bash
pre-commit install
```

### Running tests

This repository has unit tests in the _tests_ folder. After installing dependencies
the tests can be run via the command

```bash
pytest tests
```

### Building documentation

Pre-commit hooks ensure documentation is automatically up-to date. To build the documentation manually, run the following commands:

```bash
python setup.py install # install helpers from the cidc_schemas library
python docs/generate_docs.py
```

This will output the generated html documents in `docs/docs`. If the updated docs are pushed up and merged into master, they will be viewable at https://nci-cidc.github.io/cidc-schemas/.

## Using the Command-Line Interface

This project comes with a command-line interface for validating schemas and generating/validating assay and manifest templates.

### Install the CLI

Clone the repository and cd into it

```bash
git clone git@github.com:NCI-CIDC/cidc-schemas.git
cd cidc-schemas
```

Install the `cidc_schemas` package (this adds the `cidc_schemas` CLI to your console)

```bash
python setup.py install
```

Run `cidc_schemas --help` to see available options.

If you're making changes to the module and want those changes to be reflected in the CLI without reinstalling the `cidc_schemas` module every time, run

```bash
python3 -m cidc_schemas.cli [args]
```

### Creating a new assay or analysis type ###

In order to create a new assay type, your best bet is to just search for an existing assay and copy it.

Preferably, look at scrnaseq and copy exactly what it does. Make changes in the assay schema and template for your
particular assay and/or analysis schema.

Once you update and update the version of this repo, update api-gae. You should only need to copy what scrnaseq
did in api-gae in order for files to show up on the portal. Make sure to update the api-gae version. Update the api-gae version used
in cloud-functions.

Finally, make sure to update the cli tool to include the new assay.

There are a lot of gotchas and hidden parsing going on behind the scenes. Listing them all would be hard, so the
practical advice is to follow an existing working template.

Be sure to regenerate the docs after creating your schema, so the new schema is added to the reference docs.

### Generate templates

Create a template for a given template configuration.

```bash
cidc_schemas generate_template -m templates/manifests/pbmc_template.json -o pbmc.xlsx
```

### Validate filled-out templates

Check that a populated template file is valid with respect to a template specification.

```bash
cidc_schemas validate_template -m templates/manifests/pbmc_template.json -x template_examples/pbmc_template.xlsx
```

### Validate JSON schemas

Check that a JSON schema conforms to the JSON Schema specifications.

```bash
cidc_schemas validate_schema -f shipping_core.json
```
