Metadata-Version: 2.4
Name: bijux-canon-ingest
Version: 0.3.6
Summary: Deterministic document ingestion, chunking, and retrieval preparation for the bijux-canon package family from Bijux.
Project-URL: Homepage, https://bijux.io/bijux-canon/bijux-canon-ingest/
Project-URL: Website, https://bijux.io/
Project-URL: Repository, https://github.com/bijux/bijux-canon
Project-URL: Documentation, https://bijux.io/bijux-canon/bijux-canon-ingest/
Project-URL: Issues, https://github.com/bijux/bijux-canon/issues
Project-URL: Changelog, https://github.com/bijux/bijux-canon/blob/main/packages/bijux-canon-ingest/CHANGELOG.md
Project-URL: Security, https://github.com/bijux/bijux-canon/blob/main/SECURITY.md
Project-URL: Funding, https://github.com/sponsors/bijux
Project-URL: PackageMap, https://bijux.io/bijux-canon/01-bijux-canon/foundation/package-map/
Project-URL: CompatibilityGuide, https://bijux.io/bijux-canon/08-compat-packages/migration/migration-guidance/
Author-email: Bijan Mousavi <bijan@bijux.io>
Maintainer-email: Bijan Mousavi <bijan@bijux.io>
License: Apache-2.0
License-File: LICENSE
License-File: NOTICE
Keywords: bijux,bijux-canon,bijux.io,chunking,document-processing,indexing,ingest,retrieval,retrieval-augmented-generation
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: <4.0,>=3.11
Requires-Dist: fastapi<1.0,>=0.115
Requires-Dist: msgpack<2,>=1.0
Requires-Dist: numpy<3,>=1.26
Requires-Dist: pydantic<3,>=2.7
Requires-Dist: pyyaml<7,>=6
Requires-Dist: uvicorn<1.0,>=0.32
Provides-Extra: dev
Requires-Dist: bandit<2.0,>=1.7.10; extra == 'dev'
Requires-Dist: build<2.0,>=1.2; extra == 'dev'
Requires-Dist: deptry<1.0,>=0.10.0; extra == 'dev'
Requires-Dist: hypothesis<7.0,>=6.91; extra == 'dev'
Requires-Dist: interrogate<2.0,>=1.7.0; extra == 'dev'
Requires-Dist: mkdocs-click<1,>=0.8; extra == 'dev'
Requires-Dist: mkdocs-gen-files<1,>=0.5; extra == 'dev'
Requires-Dist: mkdocs-git-revision-date-localized-plugin<2,>=1.2; extra == 'dev'
Requires-Dist: mkdocs-glightbox>=0.4.0; extra == 'dev'
Requires-Dist: mkdocs-include-markdown-plugin; extra == 'dev'
Requires-Dist: mkdocs-literate-nav<1,>=0.6; extra == 'dev'
Requires-Dist: mkdocs-material<10,>=9.5; extra == 'dev'
Requires-Dist: mkdocs-minify-plugin<1,>=0.8; extra == 'dev'
Requires-Dist: mkdocs-redirects<2,>=1.2; extra == 'dev'
Requires-Dist: mkdocs-section-index<1,>=0.3; extra == 'dev'
Requires-Dist: mkdocs<2,>=1.6; extra == 'dev'
Requires-Dist: mkdocstrings[python]<2,>=0.29; extra == 'dev'
Requires-Dist: mypy<2.0,>=1.11; extra == 'dev'
Requires-Dist: openapi-spec-validator<1.0,>=0.7.1; extra == 'dev'
Requires-Dist: pip-audit<3.0,>=2.7.3; extra == 'dev'
Requires-Dist: prance>=25.4.0.0; extra == 'dev'
Requires-Dist: pymdown-extensions<11,>=10.8; extra == 'dev'
Requires-Dist: pytest-asyncio<2.0,>=1.0.0; extra == 'dev'
Requires-Dist: pytest-cov<8.0,>=6.0; extra == 'dev'
Requires-Dist: pytest-timeout<3.0,>=2.3; extra == 'dev'
Requires-Dist: pytest<10.0,>=9.0.3; extra == 'dev'
Requires-Dist: ruff<1.0,>=0.6.9; extra == 'dev'
Requires-Dist: schemathesis<5.0,>=4.0; extra == 'dev'
Requires-Dist: twine<7.0,>=6.1.0; extra == 'dev'
Requires-Dist: types-pyyaml<7.0,>=6.0.12; extra == 'dev'
Requires-Dist: vulture<3.0,>=2.7; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-click<1,>=0.8; extra == 'docs'
Requires-Dist: mkdocs-gen-files<1,>=0.5; extra == 'docs'
Requires-Dist: mkdocs-git-revision-date-localized-plugin<2,>=1.2; extra == 'docs'
Requires-Dist: mkdocs-glightbox>=0.4.0; extra == 'docs'
Requires-Dist: mkdocs-literate-nav<1,>=0.6; extra == 'docs'
Requires-Dist: mkdocs-material<10,>=9.5; extra == 'docs'
Requires-Dist: mkdocs-minify-plugin<1,>=0.8; extra == 'docs'
Requires-Dist: mkdocs-redirects<2,>=1.2; extra == 'docs'
Requires-Dist: mkdocs-section-index<1,>=0.3; extra == 'docs'
Requires-Dist: mkdocs<2,>=1.6; extra == 'docs'
Requires-Dist: mkdocstrings[python]<2,>=0.29; extra == 'docs'
Requires-Dist: pymdown-extensions<11,>=10.8; extra == 'docs'
Description-Content-Type: text/markdown

# bijux-canon-ingest

<!-- bijux-canon-badges:generated:start -->
[![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-3776AB?logo=python&logoColor=white)](https://pypi.org/project/bijux-canon-ingest/)
[![Typing: typed](https://img.shields.io/badge/typing-typed%20(PEP%20561)-0A7BBB)](https://pypi.org/project/bijux-canon-ingest/)
[![License: Apache-2.0](https://img.shields.io/badge/license-Apache--2.0-0F766E)](https://github.com/bijux/bijux-canon/blob/main/LICENSE)
[![CI Status](https://github.com/bijux/bijux-canon/actions/workflows/verify.yml/badge.svg)](https://github.com/bijux/bijux-canon/actions/workflows/verify.yml)
[![GitHub Repository](https://img.shields.io/badge/github-bijux%2Fbijux--canon-181717?logo=github)](https://github.com/bijux/bijux-canon)

[![bijux-canon-ingest](https://img.shields.io/pypi/v/bijux-canon-ingest?label=ingest&logo=pypi)](https://pypi.org/project/bijux-canon-ingest/)
[![bijux-canon-runtime](https://img.shields.io/pypi/v/bijux-canon-runtime?label=runtime&logo=pypi)](https://pypi.org/project/bijux-canon-runtime/)
[![bijux-canon-agent](https://img.shields.io/pypi/v/bijux-canon-agent?label=agent&logo=pypi)](https://pypi.org/project/bijux-canon-agent/)
[![bijux-canon-reason](https://img.shields.io/pypi/v/bijux-canon-reason?label=reason&logo=pypi)](https://pypi.org/project/bijux-canon-reason/)
[![bijux-canon-index](https://img.shields.io/pypi/v/bijux-canon-index?label=index&logo=pypi)](https://pypi.org/project/bijux-canon-index/)
[![agentic-flows](https://img.shields.io/pypi/v/agentic-flows?label=agentic--flows&logo=pypi)](https://pypi.org/project/agentic-flows/)
[![bijux-agent](https://img.shields.io/pypi/v/bijux-agent?label=bijux--agent&logo=pypi)](https://pypi.org/project/bijux-agent/)
[![bijux-rag](https://img.shields.io/pypi/v/bijux-rag?label=bijux--rag&logo=pypi)](https://pypi.org/project/bijux-rag/)
[![bijux-rar](https://img.shields.io/pypi/v/bijux-rar?label=bijux--rar&logo=pypi)](https://pypi.org/project/bijux-rar/)
[![bijux-vex](https://img.shields.io/pypi/v/bijux-vex?label=bijux--vex&logo=pypi)](https://pypi.org/project/bijux-vex/)

[![bijux-canon-ingest](https://img.shields.io/badge/ingest-ghcr-181717?logo=github)](https://github.com/bijux/bijux-canon/pkgs/container/bijux-canon%2Fbijux-canon-ingest)
[![bijux-canon-runtime](https://img.shields.io/badge/runtime-ghcr-181717?logo=github)](https://github.com/bijux/bijux-canon/pkgs/container/bijux-canon%2Fbijux-canon-runtime)
[![bijux-canon-agent](https://img.shields.io/badge/agent-ghcr-181717?logo=github)](https://github.com/bijux/bijux-canon/pkgs/container/bijux-canon%2Fbijux-canon-agent)
[![bijux-canon-reason](https://img.shields.io/badge/reason-ghcr-181717?logo=github)](https://github.com/bijux/bijux-canon/pkgs/container/bijux-canon%2Fbijux-canon-reason)
[![bijux-canon-index](https://img.shields.io/badge/index-ghcr-181717?logo=github)](https://github.com/bijux/bijux-canon/pkgs/container/bijux-canon%2Fbijux-canon-index)
[![agentic-flows](https://img.shields.io/badge/agentic--flows-ghcr-181717?logo=github)](https://github.com/bijux/bijux-canon/pkgs/container/bijux-canon%2Fagentic-flows)
[![bijux-agent](https://img.shields.io/badge/bijux--agent-ghcr-181717?logo=github)](https://github.com/bijux/bijux-canon/pkgs/container/bijux-canon%2Fbijux-agent)
[![bijux-rag](https://img.shields.io/badge/bijux--rag-ghcr-181717?logo=github)](https://github.com/bijux/bijux-canon/pkgs/container/bijux-canon%2Fbijux-rag)
[![bijux-rar](https://img.shields.io/badge/bijux--rar-ghcr-181717?logo=github)](https://github.com/bijux/bijux-canon/pkgs/container/bijux-canon%2Fbijux-rar)
[![bijux-vex](https://img.shields.io/badge/bijux--vex-ghcr-181717?logo=github)](https://github.com/bijux/bijux-canon/pkgs/container/bijux-canon%2Fbijux-vex)

[![bijux-canon-ingest docs](https://img.shields.io/badge/docs-ingest-2563EB?logo=materialformkdocs&logoColor=white)](https://bijux.io/bijux-canon/bijux-canon-ingest/)
[![bijux-canon-runtime docs](https://img.shields.io/badge/docs-runtime-2563EB?logo=materialformkdocs&logoColor=white)](https://bijux.io/bijux-canon/bijux-canon-runtime/)
[![bijux-canon-agent docs](https://img.shields.io/badge/docs-agent-2563EB?logo=materialformkdocs&logoColor=white)](https://bijux.io/bijux-canon/bijux-canon-agent/)
[![bijux-canon-reason docs](https://img.shields.io/badge/docs-reason-2563EB?logo=materialformkdocs&logoColor=white)](https://bijux.io/bijux-canon/bijux-canon-reason/)
[![bijux-canon-index docs](https://img.shields.io/badge/docs-index-2563EB?logo=materialformkdocs&logoColor=white)](https://bijux.io/bijux-canon/bijux-canon-index/)
<!-- bijux-canon-badges:generated:end -->

`bijux-canon-ingest` is the package that turns raw documents into deterministic
ingest artifacts and retrieval-ready structures. It is where cleaning,
chunking, package-local retrieval assembly, and ingest-facing boundaries live.

This package should help a maintainer answer practical questions such as:

- how does a source document become stable ingest output?
- where do retrieval-oriented assembly steps belong?
- which code is pure transformation logic and which code is adapter work?

## Legacy continuity

- compatibility package: [`bijux-rag`](https://pypi.org/project/bijux-rag/)
- legacy import root: `bijux_rag`
- legacy command: `bijux-rag`
- canonical migration guide: <https://bijux.io/bijux-canon/08-compat-packages/migration/migration-guidance/>
- retired repository target: <https://github.com/bijux/bijux-rag>

## What this package owns

- document cleaning, normalization, and chunking
- ingest-local retrieval and indexing assembly
- package-local CLI and HTTP boundaries
- ingest-specific adapters, safeguards, and observability helpers
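
To make "deterministic" concrete, here is a minimal sketch of cleaning and chunking as pure functions whose output depends only on the input text and the parameters. It is not the package's implementation: `Chunk`, `clean`, `chunk`, the character-window strategy, and the SHA-256 based IDs are all assumptions chosen for illustration.

```python
import hashlib
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Illustrative chunk record; not a type from bijux-canon-ingest."""

    chunk_id: str
    start: int
    text: str


def clean(text: str) -> str:
    """Normalize Unicode to NFC and collapse each whitespace run to one space."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def chunk(text: str, size: int = 40, overlap: int = 10) -> list[Chunk]:
    """Split cleaned text into fixed-size character windows with overlap."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    cleaned = clean(text)
    step = size - overlap
    chunks: list[Chunk] = []
    for start in range(0, len(cleaned), step):
        piece = cleaned[start:start + size]
        # The ID is derived only from offset and content, so reruns reproduce it.
        digest = hashlib.sha256(f"{start}:{piece}".encode("utf-8")).hexdigest()
        chunks.append(Chunk(chunk_id=digest[:16], start=start, text=piece))
        if start + size >= len(cleaned):
            break  # this window already reaches the end of the text
    return chunks
```

Because nothing here reads the clock, the filesystem, or random state, two inputs that differ only in whitespace produce identical chunks and IDs. That is the property that separates pure transformation logic from adapter work.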

## What this package does not own

- standalone vector execution semantics
- runtime-wide governance, persistence, or replay authority
- repository tooling and release automation

## Source map

- [`src/bijux_canon_ingest/processing`](https://github.com/bijux/bijux-canon/tree/main/packages/bijux-canon-ingest/src/bijux_canon_ingest/processing) for deterministic document transforms
- [`src/bijux_canon_ingest/retrieval`](https://github.com/bijux/bijux-canon/tree/main/packages/bijux-canon-ingest/src/bijux_canon_ingest/retrieval) for retrieval-oriented models and assembly
- [`src/bijux_canon_ingest/application`](https://github.com/bijux/bijux-canon/tree/main/packages/bijux-canon-ingest/src/bijux_canon_ingest/application) for package workflows
- [`src/bijux_canon_ingest/infra`](https://github.com/bijux/bijux-canon/tree/main/packages/bijux-canon-ingest/src/bijux_canon_ingest/infra) and [`src/bijux_canon_ingest/integrations`](https://github.com/bijux/bijux-canon/tree/main/packages/bijux-canon-ingest/src/bijux_canon_ingest/integrations) for adapters
- [`src/bijux_canon_ingest/interfaces`](https://github.com/bijux/bijux-canon/tree/main/packages/bijux-canon-ingest/src/bijux_canon_ingest/interfaces) for CLI and HTTP edges
- [`tests`](https://github.com/bijux/bijux-canon/tree/main/packages/bijux-canon-ingest/tests) for behavior, layout, and corpus-backed checks

## Read this next

- [Package guide](https://bijux.io/bijux-canon/bijux-canon-ingest/)
- [Package overview](https://bijux.io/bijux-canon/bijux-canon-ingest/foundation/package-overview/)
- [Ownership boundary](https://bijux.io/bijux-canon/bijux-canon-ingest/foundation/ownership-boundary/)
- [Architecture overview](https://bijux.io/bijux-canon/bijux-canon-ingest/architecture/)
- [Operator workflows](https://bijux.io/bijux-canon/bijux-canon-ingest/interfaces/operator-workflows/)
- [Compatibility packages](https://bijux.io/bijux-canon/08-compat-packages/)
- [Changelog](https://github.com/bijux/bijux-canon/blob/main/packages/bijux-canon-ingest/CHANGELOG.md)

## Primary entrypoint

- console script: `bijux-canon-ingest`

## Release readiness

- upcoming release line: `0.3.6`
- package changelog: [`CHANGELOG.md`](https://github.com/bijux/bijux-canon/blob/main/packages/bijux-canon-ingest/CHANGELOG.md)
