Metadata-Version: 2.4
Name: datannurpy
Version: 0.24.1
Summary: Python library for datannur catalog metadata management
Project-URL: Homepage, https://github.com/datannur/datannurpy
Project-URL: Repository, https://github.com/datannur/datannurpy
Project-URL: Documentation, https://github.com/datannur/datannurpy#readme
Project-URL: Issues, https://github.com/datannur/datannurpy/issues
Author: datannur
License: MIT
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: >=3.9
Requires-Dist: cryptography>=46.0.7
Requires-Dist: fsspec>=2024.0
Requires-Dist: ibis-framework[duckdb]>=11.0
Requires-Dist: jsonjsdb>=0.8.7
Requires-Dist: openpyxl>=3.0
Requires-Dist: polars>=1.0
Requires-Dist: python-dotenv>=1.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: typing-extensions>=4.0; python_version < '3.11'
Requires-Dist: xlrd>=2.0
Provides-Extra: azure
Requires-Dist: adlfs>=2024.0; (python_version >= '3.10') and extra == 'azure'
Provides-Extra: cloud
Requires-Dist: adlfs>=2024.0; (python_version >= '3.10') and extra == 'cloud'
Requires-Dist: gcsfs>=2024.0; (python_version >= '3.10') and extra == 'cloud'
Requires-Dist: s3fs>=2024.0; (python_version >= '3.10') and extra == 'cloud'
Provides-Extra: databases
Requires-Dist: ibis-framework[mssql,mysql,oracle,postgres]>=11.0; extra == 'databases'
Provides-Extra: delta
Requires-Dist: deltalake>=0.18.0; extra == 'delta'
Provides-Extra: gcs
Requires-Dist: gcsfs>=2024.0; (python_version >= '3.10') and extra == 'gcs'
Provides-Extra: iceberg
Requires-Dist: pyiceberg>=0.10.0; (python_version >= '3.10') and extra == 'iceberg'
Requires-Dist: requests>=2.33.0; (python_version >= '3.10') and extra == 'iceberg'
Provides-Extra: mssql
Requires-Dist: ibis-framework[mssql]>=11.0; extra == 'mssql'
Provides-Extra: mysql
Requires-Dist: ibis-framework[mysql]>=11.0; extra == 'mysql'
Provides-Extra: oracle
Requires-Dist: ibis-framework[oracle]>=11.0; extra == 'oracle'
Provides-Extra: postgres
Requires-Dist: ibis-framework[postgres]>=11.0; extra == 'postgres'
Provides-Extra: s3
Requires-Dist: s3fs>=2024.0; (python_version >= '3.10') and extra == 's3'
Provides-Extra: ssh
Requires-Dist: paramiko>=4.0; extra == 'ssh'
Provides-Extra: stat
Requires-Dist: pyreadstat<=1.2.8,>=1.2.0; (python_version < '3.10') and extra == 'stat'
Requires-Dist: pyreadstat>=1.2.0; (python_version >= '3.10') and extra == 'stat'
Description-Content-Type: text/markdown

<picture>
  <source media="(prefers-color-scheme: dark)" srcset="https://raw.githubusercontent.com/datannur/datannur/main/public/assets/main-banner-dark.png">
  <img alt="datannur logo" src="https://raw.githubusercontent.com/datannur/datannur/main/public/assets/main-banner.png">
</picture>

[![MIT License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![PyPI version](https://img.shields.io/pypi/v/datannurpy.svg)](https://pypi.org/project/datannurpy/)
[![Python](https://img.shields.io/badge/python-≥3.9-blue.svg)](https://pypi.org/project/datannurpy/)
[![CI](https://github.com/datannur/datannurpy/actions/workflows/ci.yml/badge.svg)](https://github.com/datannur/datannurpy/actions/workflows/ci.yml)
[![codecov](https://codecov.io/gh/datannur/datannurpy/branch/main/graph/badge.svg)](https://codecov.io/gh/datannur/datannurpy)

# datannurpy

datannurpy is the Python builder for [datannur](https://github.com/datannur/datannur) catalogs: it scans files and databases, extracts metadata and statistics, and exports a ready-to-use catalog.

**Key features:**

- **Broad format support** - CSV, Excel, Parquet, Delta Lake, Iceberg, SAS, SPSS, Stata
- **Database introspection** - PostgreSQL, MySQL, Oracle, SQL Server, SQLite, DuckDB
- **Remote & cloud storage** - SFTP, S3, Azure Blob, GCS via fsspec
- **Rich metadata** - schema, statistics, frequencies, enumerations, auto-tagging
- **Incremental scans** - only rescan what changed between runs
- **YAML or Python API** - declarative configuration or programmatic control

## Quick start

```bash
pip install datannurpy
```
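
The package metadata also declares optional extras (`azure`, `cloud`, `databases`, `delta`, `gcs`, `iceberg`, `mssql`, `mysql`, `oracle`, `postgres`, `s3`, `ssh`, `stat`) that pull in the drivers for specific backends. As a minimal sketch, the helper below builds the requirement string to hand to `pip install`. The names `KNOWN_EXTRAS` and `requirement` are hypothetical and not part of datannurpy.

```python
# Extras listed under Provides-Extra in the package metadata.
KNOWN_EXTRAS = frozenset({
    "azure", "cloud", "databases", "delta", "gcs", "iceberg", "mssql",
    "mysql", "oracle", "postgres", "s3", "ssh", "stat",
})


def requirement(*extras):
    """Build a pip requirement such as 'datannurpy[postgres,s3]'.

    Hypothetical helper for illustration; not part of datannurpy.
    """
    unknown = sorted(set(extras) - KNOWN_EXTRAS)
    if unknown:
        raise ValueError(f"unknown extras: {', '.join(unknown)}")
    if not extras:
        return "datannurpy"
    # Sort and de-duplicate so the same selection always yields the same string.
    return f"datannurpy[{','.join(sorted(set(extras)))}]"


print(requirement("s3", "postgres"))  # prints: datannurpy[postgres,s3]
```

Quote the result when passing it to a shell, for example `pip install "datannurpy[postgres,s3]"`. According to the metadata, the `azure`, `cloud`, `gcs`, `s3`, and `iceberg` extras only install their dependencies on Python 3.10 or later.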

```yaml
# catalog.yml
app_path: ./my-catalog
open_browser: true

add:
  - folder: ./data
    include: ["*.csv", "*.xlsx", "*.parquet"]

  - database: sqlite:///mydb.sqlite
```

```bash
python -m datannurpy catalog.yml
```
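
To launch the same command from a Python script (for example in a scheduled job), one option is to wrap it with `subprocess`. This is only a sketch: `build_command` and `run_catalog` are hypothetical helper names, not datannurpy functions, and the error handling assumes the command-line entry point exits with a non-zero status on failure.

```python
import subprocess
import sys
from pathlib import Path


def build_command(config_path):
    """Return the argv list equivalent to `python -m datannurpy <config>`."""
    # sys.executable keeps the call inside the current virtual environment.
    return [sys.executable, "-m", "datannurpy", str(config_path)]


def run_catalog(config_path):
    """Run the builder on a YAML config file (hypothetical wrapper)."""
    config = Path(config_path)
    if not config.is_file():
        raise FileNotFoundError(f"config not found: {config}")
    # Assumption: a failed run exits non-zero, which check=True turns
    # into subprocess.CalledProcessError.
    subprocess.run(build_command(config), check=True)
```

Calling `run_catalog("catalog.yml")` then behaves like the shell command above, with a missing config file reported before the builder starts.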

## Documentation

📖 **Full documentation:** [docs.datannur.com/builder](https://docs.datannur.com/builder/)

🌐 **Website:** [datannur.com](https://datannur.com)

🚀 **Demo:** [dev.datannur.com](https://dev.datannur.com/)

## Contributing

For development documentation and contributing guidelines, see [`CONTRIBUTING.md`](CONTRIBUTING.md).

## License

MIT - see [LICENSE](LICENSE). All dependencies are distributed under MIT-, Apache 2.0-, or BSD-compatible licenses.
