Metadata-Version: 2.4
Name: urgap
Version: 3.3.9
Project-URL: Homepage, https://github.com/urgap/urgap
Project-URL: Documentation, https://urgap.github.io/urgap/
Project-URL: Repository, https://github.com/urgap/urgap.git
Project-URL: Issues, https://github.com/urgap/urgap/issues
Project-URL: Changelog, https://github.com/urgap/urgap/blob/main/CHANGELOG.md
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: Operating System :: POSIX :: SunOS/Solaris
Classifier: Operating System :: Unix
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Chemistry
Requires-Dist: click==8.3.1
Requires-Dist: fastapi==0.133.0
Requires-Dist: fastexcel==0.19.0
Requires-Dist: flask-wtf==1.2.2
Requires-Dist: flask==3.1.3
Requires-Dist: jinja2==3.1.6
Requires-Dist: jsonschema==4.26.0
Requires-Dist: mcp==1.26.0
Requires-Dist: nbformat>=4.2.0
Requires-Dist: networkx<3.5,>=3.4.2; python_version == '3.10'
Requires-Dist: networkx==3.6.1; python_version >= '3.11'
Requires-Dist: openpyxl
Requires-Dist: packaging==25.0
Requires-Dist: pandas<2.4.0,>=1.4.2
Requires-Dist: plotly
Requires-Dist: polars>=1.33.0
Requires-Dist: pygithub==2.8.1
Requires-Dist: requests<2.33,>=2.26
Requires-Dist: sqlalchemy==2.0.47
Requires-Dist: tqdm==4.67.3
Requires-Dist: uvicorn==0.40.0
Requires-Dist: xlsxwriter==3.2.9
Provides-Extra: all
Requires-Dist: apache-beam==2.70.0; extra == 'all'
Requires-Dist: azure-identity==1.25.2; extra == 'all'
Requires-Dist: azure-keyvault-secrets==4.10.0; extra == 'all'
Requires-Dist: azure-monitor-opentelemetry==1.8.6; extra == 'all'
Requires-Dist: azure-servicebus==7.14.3; extra == 'all'
Requires-Dist: azure-storage-blob==12.28.0; extra == 'all'
Requires-Dist: azure-storage-file-datalake==12.22.0; extra == 'all'
Requires-Dist: azure-storage-file-share==12.24.0; extra == 'all'
Requires-Dist: black>=18.3a0; extra == 'all'
Requires-Dist: chemical-composition~=1.0.6; extra == 'all'
Requires-Dist: cloud-sql-python-connector~=1.0; extra == 'all'
Requires-Dist: fastmcp==3.0.2; extra == 'all'
Requires-Dist: flowkit; extra == 'all'
Requires-Dist: google-apitools==0.5.35; extra == 'all'
Requires-Dist: google-cloud-logging==3.13.0; extra == 'all'
Requires-Dist: google-cloud-secret-manager==2.26.0; extra == 'all'
Requires-Dist: google-cloud-storage==3.9.0; extra == 'all'
Requires-Dist: google-crc32c==1.8.0; extra == 'all'
Requires-Dist: ipython; extra == 'all'
Requires-Dist: mcp==1.26.0; extra == 'all'
Requires-Dist: nbsphinx; extra == 'all'
Requires-Dist: opentelemetry-exporter-otlp-proto-grpc==1.39.0; extra == 'all'
Requires-Dist: opentelemetry-sdk==1.39.0; extra == 'all'
Requires-Dist: pg8000~=1.29; extra == 'all'
Requires-Dist: prefect-kubernetes==0.7.3; extra == 'all'
Requires-Dist: prefect==3.6.13; extra == 'all'
Requires-Dist: psycopg2-binary; extra == 'all'
Requires-Dist: pymongo<5.0.0,>=3.8.0; extra == 'all'
Requires-Dist: pysmb==1.2.13; extra == 'all'
Requires-Dist: pytest; extra == 'all'
Requires-Dist: pytest-asyncio>=0.23; extra == 'all'
Requires-Dist: pyvis==0.3.2; extra == 'all'
Requires-Dist: sphinx; extra == 'all'
Requires-Dist: sphinx-rtd-theme; extra == 'all'
Requires-Dist: unimod-mapper~=0.6.4; extra == 'all'
Requires-Dist: uparma~=1.0.1; extra == 'all'
Provides-Extra: cloud
Requires-Dist: apache-beam==2.70.0; extra == 'cloud'
Requires-Dist: azure-identity==1.25.2; extra == 'cloud'
Requires-Dist: azure-keyvault-secrets==4.10.0; extra == 'cloud'
Requires-Dist: azure-monitor-opentelemetry==1.8.6; extra == 'cloud'
Requires-Dist: azure-servicebus==7.14.3; extra == 'cloud'
Requires-Dist: azure-storage-blob==12.28.0; extra == 'cloud'
Requires-Dist: azure-storage-file-datalake==12.22.0; extra == 'cloud'
Requires-Dist: azure-storage-file-share==12.24.0; extra == 'cloud'
Requires-Dist: cloud-sql-python-connector~=1.0; extra == 'cloud'
Requires-Dist: fastmcp==3.0.2; extra == 'cloud'
Requires-Dist: google-apitools==0.5.35; extra == 'cloud'
Requires-Dist: google-cloud-logging==3.13.0; extra == 'cloud'
Requires-Dist: google-cloud-secret-manager==2.26.0; extra == 'cloud'
Requires-Dist: google-cloud-storage==3.9.0; extra == 'cloud'
Requires-Dist: google-crc32c==1.8.0; extra == 'cloud'
Requires-Dist: mcp==1.26.0; extra == 'cloud'
Requires-Dist: opentelemetry-exporter-otlp-proto-grpc==1.39.0; extra == 'cloud'
Requires-Dist: opentelemetry-sdk==1.39.0; extra == 'cloud'
Requires-Dist: pg8000~=1.29; extra == 'cloud'
Requires-Dist: prefect-kubernetes==0.7.3; extra == 'cloud'
Requires-Dist: prefect==3.6.13; extra == 'cloud'
Requires-Dist: pysmb==1.2.13; extra == 'cloud'
Requires-Dist: pyvis==0.3.2; extra == 'cloud'
Provides-Extra: databases
Requires-Dist: psycopg2-binary; extra == 'databases'
Requires-Dist: pymongo<5.0.0,>=3.8.0; extra == 'databases'
Provides-Extra: dev
Requires-Dist: black>=18.3a0; extra == 'dev'
Requires-Dist: pytest; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Provides-Extra: docs
Requires-Dist: ipython; extra == 'docs'
Requires-Dist: nbsphinx; extra == 'docs'
Requires-Dist: sphinx; extra == 'docs'
Requires-Dist: sphinx-rtd-theme; extra == 'docs'
Provides-Extra: legacy
Requires-Dist: chemical-composition~=1.0.6; extra == 'legacy'
Requires-Dist: flowkit; extra == 'legacy'
Requires-Dist: unimod-mapper~=0.6.4; extra == 'legacy'
Requires-Dist: uparma~=1.0.1; extra == 'legacy'
Description-Content-Type: text/x-rst

Urgap - node wrapping framework
===============================

Urgap is a cloud-native framework for file-based data engineering. It provides abstraction layers for data and metadata, extensive re-run skipping logic, and data versioning. Urgap can be integrated with any scheduling or pipelining tool, which keeps pipeline development independent of business logic and data storage, and its standardized logging and execution make monitoring and debugging easier.

Urgap provides the governance constraints required for decentralized data domain autonomy. It enforces a shared, common data IO layer for storage, a common metadata capturing process exposed as an interface that can be plugged into any existing process, and global data lineage.

|build-status-azure|

.. |build-status-azure| image:: https://dev.azure.com/DevOps-RD/RD-DSO/_apis/build/status%2Fgsk-tech.urgap?repoName=gsk-tech%2Furgap&branchName=refs%2Fpull%2F388%2Fmerge
   :target: https://dev.azure.com/DevOps-RD/RD-DSO/_build?definitionId=13513
   :alt: ADO CI status

Learn More
----------

Watch our introduction talk **urgap - unified resource governance and data provenance** by Christian Fufezan to get a comprehensive overview of urgap's design and capabilities:

.. image:: https://img.youtube.com/vi/63pYK1xZPx8/0.jpg
   :target: https://www.youtube.com/watch?v=63pYK1xZPx8
   :alt: Watch the video
   :width: 560

How to Set Up
-------------

Prerequisites
~~~~~~~~~~~~~

We recommend using a virtual environment for Python projects. This guide uses ``uv`` for dependency management.
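
As a minimal sketch, assuming ``uv`` is already installed and a POSIX shell is used, a virtual environment can be created and activated as follows. The directory name ``.venv`` is uv's default; on Windows the activation script lives under ``.venv\Scripts`` instead.

.. code-block:: bash

    # Create a virtual environment in ./.venv (uv's default location)
    uv venv

    # Activate it for the current shell session (POSIX shells)
    . .venv/bin/activate

Once the environment is active, the ``uv pip install`` commands in this guide install into it.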

Installation
~~~~~~~~~~~~

**Basic Installation** (local file system access only):

.. code-block:: bash

    uv pip install -e .

**With Cloud Storage Support:**

.. code-block:: bash

    uv pip install -e ".[cloud]"

**With All Optional Dependencies:**

.. code-block:: bash

    uv pip install -e ".[all]"

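Available extras:

* ``cloud``: Azure and Google Cloud storage backends and related cloud tooling (e.g. Prefect, OpenTelemetry)
* ``databases``: database drivers (``psycopg2-binary``, ``pymongo``)
* ``dev``: development tools (``black``, ``pytest``, ``pytest-asyncio``)
* ``docs``: documentation tooling (``sphinx``, ``sphinx-rtd-theme``, ``nbsphinx``, ``ipython``)
* ``legacy``: ``chemical-composition``, ``flowkit``, ``unimod-mapper``, ``uparma``
* ``all``: All optional dependencies

Extras can be combined in a single install, for example ``".[cloud,databases]"``.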

Running Tests
~~~~~~~~~~~~~

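Install the test dependencies (alternatively, the ``dev`` extra provides ``pytest`` together with ``pytest-asyncio`` and ``black``):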

.. code-block:: bash

    uv pip install pytest

Run the test suite:

.. code-block:: bash

    pytest tests

Quickstart: Writing Your First Pipeline
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The best way to learn urgap is through a complete example. Check out the end-to-end filter CSV pipeline:

* **Location:** ``tests/integrationtests/end2end/test_filter_csv_pipeline.py``
* **What it demonstrates:** Complete pipeline setup, node configuration, and execution
* **Requirements:** Everything needed to run this example is included in the repository

This example can be run entirely on your local machine without any external dependencies.

To run the example:

.. code-block:: bash

    pytest tests/integrationtests/end2end/test_filter_csv_pipeline.py

Documentation
--------------

The documentation is built with Sphinx from the ``docs`` folder.


.. note::

    CI does not currently push the documentation to Read the Docs. To build it locally:

    #. Check out the repository
    #. ``pip install -e .``
    #. ``cd docs``
    #. ``make html``
    #. Open ``docs/build/index.html``
