Metadata-Version: 2.4
Name: natc
Version: 0.1.2
Summary: NeuroSymbolic Adaptive Tensor Compression for CPU-first dynamic inference.
Author: NATC Contributors
License-Expression: MIT
Project-URL: Homepage, https://github.com/Jatinverma0786/NATC
Project-URL: Documentation, https://github.com/Jatinverma0786/NATC#readme
Project-URL: Repository, https://github.com/Jatinverma0786/NATC
Keywords: tensor-compression,inference,llm,sparse-attention,neural-cache
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Requires-Dist: torch>=2.1
Requires-Dist: transformers>=4.40
Requires-Dist: accelerate>=0.28
Requires-Dist: sentence-transformers>=2.6
Requires-Dist: faiss-cpu>=1.8
Requires-Dist: numba>=0.59
Requires-Dist: onnxruntime>=1.17
Requires-Dist: safetensors>=0.4
Requires-Dist: scipy>=1.10
Provides-Extra: openvino
Requires-Dist: openvino>=2024.0; extra == "openvino"
Provides-Extra: triton
Requires-Dist: triton>=2.3; extra == "triton"
Provides-Extra: llama-cpp
Requires-Dist: llama-cpp-python>=0.2; extra == "llama-cpp"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-cov>=5.0; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Requires-Dist: mypy>=1.8; extra == "dev"
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Dynamic: license-file

# NATC

NATC stands for **NeuroSymbolic Adaptive Tensor Compression**. It is an
experimental Python framework for CPU-first dynamic inference. It compresses
neural tensors into a latent "Knowledge DNA" representation, reconstructs the
required weights on demand, routes prompts through reasoning capsules, reduces
attention work through predictive sparse attention, and reuses repeated
reasoning fragments through a neural cache.

NATC is not a traditional quantization library. It is a modular research
framework for experimenting with new inference pipelines around compression,
routing, reconstruction, caching, and CPU execution.

## Status

This repository is at the alpha stage. It contains a working package scaffold
with real, testable implementations of the core NATC modules, and is suitable
for local experiments, API development, benchmarking, and future integration
work.

## Installation

From PyPI, once the package is published:

```bash
pip install natc
```

From this repository:

```bash
git clone https://github.com/Jatinverma0786/NATC.git
cd NATC
pip install -e ".[dev]"
```

## Quick Start

```python
from natc import NATCModel

model = NATCModel.from_pretrained("distilgpt2")
model.enable_dna()
model.enable_capsules()
model.enable_sparse_attention()
model.enable_cache()
model.enable_cpu_acceleration()

response = model.generate("Explain quantum mechanics")
print(response)
```

For offline experiments, build the model from a local state dictionary:

```python
import numpy as np
from natc import NATCModel

state = {
    "embed.weight": np.random.default_rng(0).normal(size=(32, 16)),
    "mlp.weight": np.random.default_rng(1).normal(size=(16, 16)),
}

model = NATCModel.from_state_dict(state, rank=8)
print(model.generate("Explain a Python matrix multiplication pattern"))
print(model.stats())
```

## Main Features

- **Public NATC model API**
  - `NATCModel.from_pretrained(...)`
  - `NATCModel.from_state_dict(...)`
  - `NATCModel.generate(...)`
  - `enable_dna()`
  - `enable_capsules()`
  - `enable_sparse_attention()`
  - `enable_cache()`
  - `enable_cpu_acceleration()`
  - `export_dna(...)` and `import_dna(...)`
  - runtime statistics through `model.stats()`

- **Knowledge DNA encoder**
  - low-rank tensor decomposition with SVD
  - tensor clustering for row-level pattern summaries
  - model/state-dict encoding
  - tensor fingerprints
  - reconstruction quality metrics per layer
  - compression ratio reporting
  - dense reconstruction through `DNADecoder`
  - portable `.npz` export/import
  - safetensors import/export helpers

- **Dynamic weight synthesis**
  - on-demand reconstruction of selected layers
  - sparse materialization by thresholding small weights
  - prompt-aware synthesis hook
  - runtime model loading for torch-style modules

- **Reasoning capsules**
  - prompt scoring and expert routing
  - built-in capsules for math, code, creative writing, science, and medical context
  - activation vectors for downstream routing
  - capsule execution summaries

- **Predictive sparse attention**
  - token routing from deterministic text embeddings
  - embedding routing for precomputed vectors
  - top-k sparse graph construction
  - threshold, local-window, and hybrid sparse graph policies
  - sparse scaled dot-product attention
  - adjacency and weight outputs for inspection

- **Neural cache layer**
  - RAM cache plus persistent SQLite fragment store
  - deterministic cache keys
  - similarity lookup from hashing-vector text embeddings
  - optional FAISS-backed similarity lookup when `faiss-cpu` is installed
  - cache hit ratio tracking
  - LRU, FIFO, and LFU RAM eviction policies
  - reusable pattern storage for prompts, reasoning fragments, and outputs

- **Fractal tensor storage**
  - recursive tensor block encoding
  - low-variance region compression
  - dense reconstruction
  - node counting and estimated compression ratio

- **Inference compiler**
  - prompt analysis
  - task detection
  - capsule selection
  - required layer planning
  - sparse attention graph planning
  - execution cost estimation
  - plan optimization

- **CPU optimization layer**
  - CPU capability detection
  - portable NUMA topology detection
  - best-effort CPU affinity pinning
  - SIMD-aware kernel selection labels
  - numpy-backed matrix multiplication
  - optional numba-accelerated matrix multiplication hot path
  - batched matrix multiplication and vector dot helpers
  - quantized matrix multiplication helper
  - sparse thresholding
  - threaded execution engine for work units

- **HuggingFace integration**
  - lazy `transformers` loading
  - `AutoModelForCausalLM` support
  - `AutoTokenizer` support
  - adapter through `natc.integrations.huggingface`
  - offline deterministic fallback backend for local tests
  - local sentence-transformer prompt embeddings through `NATC_EMBEDDING_MODEL`

- **Optional runtime integrations**
  - ONNX Runtime adapter
  - OpenVINO adapter
  - llama.cpp adapter

- **Benchmark suite**
  - synthetic tensor compression benchmark
  - local HuggingFace checkpoint benchmark mode
  - memory estimate reporting
  - latency and throughput reporting
  - tokens/sec from the generation path
  - cache hit ratio reporting
  - reconstruction error reporting
  - JSON and markdown benchmark report export
  - CLI entry point: `natc-benchmark`

- **Packaging and CI**
  - `pyproject.toml` package metadata
  - typed package marker with `py.typed`
  - pytest test suite
  - GitHub Actions build/test workflow
  - PyPI Trusted Publishing workflow using GitHub OIDC
  - security workflow with dependency audit and OSSF Scorecard
  - GitHub release notes workflow
  - version bump and changelog helper scripts
  - API documentation generator

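The "top-k sparse graph construction" and "sparse scaled dot-product attention"
items above can be illustrated with a small NumPy sketch. This is not NATC's
implementation: the function name `topk_sparse_attention` and its return values
are hypothetical, and it only assumes that each query keeps its `top_k`
highest-scoring keys before the softmax.

```python
import numpy as np


def topk_sparse_attention(q, k, v, top_k):
    """Scaled dot-product attention where each query keeps only its top_k keys.

    Hypothetical helper for illustration; not part of the natc package.
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    scores = q @ k.T / np.sqrt(q.shape[-1])          # (n_queries, n_keys)
    top_k = min(top_k, scores.shape[1])
    # Column indices of the top_k largest scores in each query row.
    keep = np.argpartition(scores, -top_k, axis=1)[:, -top_k:]
    adjacency = np.zeros(scores.shape, dtype=bool)
    np.put_along_axis(adjacency, keep, True, axis=1)
    # Dropped edges get -inf so they receive zero weight after the softmax.
    masked = np.where(adjacency, scores, -np.inf)
    masked = masked - masked.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(masked)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, adjacency, weights


rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(6, 8))
v = rng.normal(size=(6, 8))

out, adjacency, weights = topk_sparse_attention(q, k, v, top_k=2)
print(out.shape)               # one output row per query
print(adjacency.sum(axis=1))   # number of keys kept per query
```

Returning the adjacency mask and the weights alongside the output mirrors the
"adjacency and weight outputs for inspection" item, so the sparse graph can be
examined without recomputing it.
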
## Module Map

```text
natc/
  dna/            Knowledge DNA encoding, decoding, export, import
  synthesis/      Dynamic weight reconstruction and sparse materialization
  capsules/       Prompt-routed reasoning capsules
  attention/      Predictive sparse attention routing
  cache/          RAM and SQLite neural fragment cache
  fractal/        Recursive fractal tensor storage
  compiler/       Prompt-to-execution-plan compiler
  cpu/            CPU kernels and threaded execution
  integrations/   HuggingFace integration
  benchmark/      Benchmark runner and CLI
  scripts/        Docs, version, and changelog automation
  model.py        Public NATCModel facade
  config.py       Runtime configuration
  utils.py        Shared helpers
```
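
As a rough illustration of what the `cache/` module's similarity lookup over
hashing-vector text embeddings might look like, the sketch below uses only
`hashlib` and NumPy. The names `hash_embed` and `SimilarityCache`, the vector
dimension, and the threshold are all assumptions made for this example; the
real cache also has a SQLite store and eviction policies that are left out
here.

```python
import hashlib
from typing import Optional

import numpy as np


def hash_embed(text: str, dim: int = 256) -> np.ndarray:
    """Deterministic bag-of-words embedding built with the hashing trick."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "little") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


class SimilarityCache:
    """Hypothetical in-memory cache keyed by cosine similarity of prompts."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold
        self.entries = []  # list of (embedding, value) pairs
        self.hits = 0
        self.lookups = 0

    def put(self, prompt: str, value: str) -> None:
        self.entries.append((hash_embed(prompt), value))

    def get(self, prompt: str) -> Optional[str]:
        self.lookups += 1
        if not self.entries:
            return None
        query = hash_embed(prompt)
        # Embeddings are unit-length, so the dot product is the cosine similarity.
        sims = [float(query @ emb) for emb, _ in self.entries]
        best = int(np.argmax(sims))
        if sims[best] >= self.threshold:
            self.hits += 1
            return self.entries[best][1]
        return None

    def hit_ratio(self) -> float:
        return self.hits / self.lookups if self.lookups else 0.0


cache = SimilarityCache(threshold=0.8)
cache.put("explain quantum mechanics", "cached answer")

# Same tokens after lowercasing, so the similarity is 1.0 and this is a hit.
print(cache.get("Explain quantum mechanics"))
# Unrelated prompt; expected to miss unless token hashes happen to collide.
print(cache.get("bake a chocolate cake"))
print(cache.hit_ratio())
```

Because the embedding depends only on token hashes, the same prompt always maps
to the same vector across runs, which is what makes the lookup reproducible in
tests.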

## Knowledge DNA Example

```python
import numpy as np
from natc.dna import encode_model, decode_model, export_dna, import_dna

state = {
    "linear.weight": np.random.default_rng(0).normal(size=(64, 32)),
}

dna = encode_model(state, rank=8)
reconstructed = decode_model(dna)

print(dna.compression_ratio())
print(reconstructed["linear.weight"].shape)

export_dna(dna, "model-dna.npz")
loaded = import_dna("model-dna.npz")
```
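
To make the idea behind the `rank` argument concrete, here is a minimal NumPy
sketch of low-rank compression with a truncated SVD, the decomposition named in
the feature list. It does not use the NATC API: `low_rank_factors` is a
hypothetical helper, and the ratio below counts only the stored factor
elements, which may differ from what `dna.compression_ratio()` reports.

```python
import numpy as np


def low_rank_factors(weight: np.ndarray, rank: int) -> tuple[np.ndarray, np.ndarray]:
    """Return (a, b) so that a @ b is the best rank-`rank` approximation of weight.

    Hypothetical helper for illustration; not part of the natc package.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    k = min(rank, s.size)
    a = u[:, :k] * s[:k]   # shape (rows, k), singular values folded in
    b = vt[:k, :]          # shape (k, cols)
    return a, b


weight = np.random.default_rng(0).normal(size=(64, 32))
a, b = low_rank_factors(weight, rank=8)
approx = a @ b

stored = a.size + b.size            # 64*8 + 8*32 = 768 values
ratio = weight.size / stored        # 2048 / 768
rel_error = np.linalg.norm(weight - approx) / np.linalg.norm(weight)

print(f"compression ratio: {ratio:.2f}")
print(f"relative Frobenius error: {rel_error:.3f}")
```

A random Gaussian matrix has little low-rank structure, so the reconstruction
error here is large. The sketch shows the trade-off between rank and storage,
not the quality to expect on trained weights.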

## Compiler Example

```python
from natc.compiler import InferenceCompiler

compiler = InferenceCompiler()
plan = compiler.compile_prompt("Write Python code to solve a matrix problem")
plan = compiler.optimize_execution(plan)

print(plan.task)
print(plan.capsules)
print(plan.required_layers)
```

## Benchmark

Run the synthetic tensor compression benchmark:

```bash
natc-benchmark --layers 4 --rows 128 --cols 128 --rank 16
```

Run a local HuggingFace checkpoint benchmark:

```bash
natc-benchmark --model-name distilgpt2 --allow-download --max-tensors 12 --output reports/natc.md
```

Example output:

```json
{
  "compression_ratio": 1.84,
  "memory_saved": 0.25,
  "speedup": 1.12,
  "cpu_efficiency": 0.07,
  "tokens_per_sec": 3000.0,
  "cache_hit_ratio": 0.5,
  "latency": 0.02,
  "throughput": 190.0,
  "reconstruction_error": 0.01,
  "layer_errors": {}
}
```

## Documentation Automation

```bash
python scripts/generate_api_docs.py
python scripts/bump_version.py 0.2.0
python scripts/update_changelog.py 0.2.0 --since v0.1.0
```

## Tests

```bash
pip install -e ".[dev]"
pytest
```

## Completed Roadmap

- Larger benchmark scenarios for local transformer checkpoints.
- Reconstruction quality metrics per layer.
- Local sentence-transformer prompt embeddings through `NATC_EMBEDDING_MODEL`.
- Optional FAISS-backed similarity indexing for cache lookup.
- Safetensors import/export helpers.
- ONNX Runtime and OpenVINO execution adapters.
- Optional llama.cpp integration.
- Advanced sparse attention policies beyond top-k similarity.
- NUMA inspection and best-effort CPU affinity controls.
- Additional CPU kernels and optional numba hot path.
- Capsule plugin registration for custom experts.
- LRU, FIFO, and LFU cache eviction policies.
- JSON and markdown benchmark report export.
- API documentation generation.
- Examples for HuggingFace, cache, compiler plans, and DNA files.
- Release notes and changelog automation.
- Security and supply-chain checks in CI.
- Package version bump helper automation.

## External Release Step

The code and GitHub Actions workflow are ready for PyPI Trusted Publishing. The
upload itself still depends on two things outside the codebase: GitHub Actions
must be allowed to run for the repository account, and the PyPI Trusted
Publisher configuration must be active.

## License

MIT License. See [LICENSE](LICENSE).
