Metadata-Version: 2.4
Name: kiste
Version: 0.9.2
Summary: Stabilized repo, data, GitOps, cloud, Argo CD, self-hosting, and release inspector
License-Expression: Apache-2.0
Project-URL: Homepage, https://github.com/KisteBox/kiste-py
Project-URL: Repository, https://github.com/KisteBox/kiste-py
Project-URL: Issues, https://github.com/KisteBox/kiste-py/issues
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: github
Requires-Dist: PyGithub; extra == "github"
Provides-Extra: hf
Requires-Dist: huggingface-hub; extra == "hf"
Provides-Extra: sops
Requires-Dist: subprocess-tee; extra == "sops"
Dynamic: license-file

# Kiste

Kiste is a Python CLI for inspecting application repositories and preparing them
for Kubernetes/GitOps packaging.

It can scan local, GitHub, and Hugging Face repositories; combine many repos into
one workspace graph; inspect dataset dependencies; check token capabilities;
generate a GitOps repo; prepare cloud and Argo CD plans; and create SOPS-style
encrypted Kubernetes Secret manifests.

Kiste is also an importable Python library. The CLI is one interface over the
same scanner, workspace, GitOps, token, and secret-management APIs.

## Install

Install `uv` first if it is not already available. On Windows, run this in
PowerShell (see the uv documentation for other platforms):

```powershell
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

From the repo root:

```bash
uv sync
```

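Install optional extras as needed. The `github` and `hf` extras add provider SDKs
for scanning remote repositories, the `sops` extra supports the secrets workflow,
and the `dev` group adds development tools. Each `uv sync` run installs exactly
the set it is given, so pass several flags in one command to combine them (for
example `uv sync --extra github --extra hf`):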

```bash
uv sync --extra github
uv sync --extra hf
uv sync --extra sops
uv sync --group dev
```

The GitHub provider prefers PyGithub when the `github` extra is installed. The
Hugging Face provider prefers `huggingface_hub` when the `hf` extra is installed.
Kiste keeps a stdlib HTTP fallback for minimal installs.

If `kiste` is not found after install, either run through Python:

```bash
uv run python -m kiste_cli.main --help
```

or use the uv-managed console script:

```bash
uv run kiste --help
```

## Quick Start

Local repo scan:

```bash
kiste init --name robot-api
kiste scan .
kiste repo summary
kiste check --repo . --target k8s
```

Remote scans:

```bash
kiste scan github owner/repo --ref main
kiste scan github owner/repo --ref main --token github-dev
kiste scan hf owner/model-name
kiste scan hf owner/dataset-name --type dataset
kiste provider hf inspect owner/space-name --type space
```

Token metadata:

```bash
kiste token add github --name github-dev --scope repo:read,actions:read,packages:write --expires 2026-08-01 --metadata-only
kiste token add hf --name hf-dev --scope repo:read --expires 2026-08-01 --metadata-only
kiste token list
kiste token check github-dev --need repo:read
```

Workspace workflow:

```bash
kiste workspace init --name robot-platform
kiste workspace add github owner/frontend --name frontend --type frontend --token github-dev
kiste workspace add github owner/api --name api --type backend --token github-dev
kiste workspace add hf owner/model --name model --type model --token hf-dev
kiste connect api model --relation loads-model
kiste workspace scan
kiste workspace graph
kiste workspace token matrix
kiste workspace check --target k8s
```

Image and target security planning:

```bash
kiste image inspect docker.io/owner/api:latest
kiste image tags docker.io/owner/api
kiste image check-pull docker.io/owner/api:latest
kiste link image api docker.io/owner/api:latest
kiste target plan --service api
kiste workspace check --include-images --target-plan
```

Dataset workflow:

```bash
kiste token add roboflow --name roboflow-dev --scope dataset:read,export:read --expires 2026-08-01 --metadata-only
kiste data add kaggle owner/dataset --name traffic-data
kiste data add roboflow workspace/project/1 --name weld-detection --token roboflow-dev
kiste data add hf owner/dataset --name hf-dataset
kiste data add dvc ./data.dvc --name training-data
kiste data add local ./datasets/weld-seams --name weld-seams
kiste data inspect weld-detection
kiste data scan
kiste data connect weld-detection model --relation trains-model
kiste data check weld-detection
kiste data card weld-detection --output docs/others/datasets/weld-detection.md
kiste data export-plan weld-detection --format yolov8
kiste workspace graph --include-data
kiste workspace check --include-data
```

Standard v0.9.2 contract, bootstrap, cloud, Argo CD, and self-hosting workflow:

```bash
kiste init --name robot-platform
kiste config validate
kiste config print
kiste doctor
kiste module list
kiste module doctor
kiste module lock --write
kiste module verify
kiste bootstrap plan
kiste bootstrap apply --confirm
kiste bootstrap validate
kiste bootstrap status
kiste target plan
kiste bootstrap apply --confirm --level argocd
kiste cloud plan --provider aws
kiste cloud check --provider aws
kiste argocd init --repo-url https://github.com/example/kiste-gitops.git
kiste argocd generate
kiste argocd validate
kiste self init
kiste self scan
kiste self check
kiste self improve
kiste architecture check
kiste architecture graph
kiste architecture doctor
kiste release check
kiste release freeze
kiste release package
kiste release candidate --version 0.9.2-rc1
```

Phase 9.2 / Kiste v0.9.2 follows the canonical public spec in [`KisteBox/kiste_spec`](https://github.com/KisteBox/kiste_spec). In this release, `kiste doctor <path>` is the unified quality gate for config, repo/workspace, lint, security, performance-readiness, integration, release, and configured custom-script diagnostics.

GitOps and secrets workflow:

```bash
kiste gitops init --workspace robot-platform --output ./robot-platform-gitops
kiste secrets init sops --age
kiste gitops plan
kiste gitops generate --env dev
kiste secrets add api --from-env DATABASE_URL --from-env SECRET_KEY --encrypt sops
kiste secrets scan
kiste gitops validate
kiste review
```

## Features

- Stable Python API under `kiste.scanner`, `kiste.workspace`, `kiste.gitops`,
  `kiste.secrets`, and `kiste.tokens`
- Typed data models under `kiste.models`
- Local Docker repo scanning
- GitHub repo scanning that reads a selected set of important files
- Hugging Face model, dataset, and Space scanning with type inference
- App spec extraction: language, framework, ports, services, env vars, GPU hints
- Token metadata and capability checks
- Multi-repo workspace registry
- Workspace dependency graph with auto and manual connections
- Token access matrix for scan/build/package readiness
- GitOps repo generation with Kustomize-style folders
- SOPS-style encrypted Secret manifests
- Plaintext secret detection before GitOps commit
- Phase 4.5 architecture, command, data model, security, GitOps, and roadmap reviews
- Container image reference validation and mutable-tag/digest warnings
- Source repo to image linking and Kubernetes/serverless target recommendation
- Dataset registry for Kaggle, Roboflow, Hugging Face Datasets, DVC, local
  folders, and cloud object-store references
- Dataset metadata profiling, access checks, lineage connections, dataset-card
  generation, and export/download planning without automatic bulk downloads
- BlobRef metadata inspection and manifest writing under `.kiste/blobs/`
  without automatic large downloads
- Module registry, staged bootstrap, cloud readiness plans, Argo CD
  Application/AppProject generation, and safe self-improvement reports
- Standard `kiste.yaml`, `kiste.lock`, and `.kiste/` contract with static
  plugin metadata and split-ready architecture checks
- Release readiness reports, API/command freeze snapshots, package artifacts,
  release notes, and release candidate checklists
- CI workflow with tests, lint, type check, and CLI smoke test
- Manual GitHub Actions provider-key integration lane that exercises Kiste's
  GitHub, Hugging Face, AWS, and Roboflow workflows with real credentials

## Design Docs

Canonical design references:

- [Getting Started](docs/getting-started.md): install, first scan, remote SDK setup, tokens, workspace, data-safe workflow, GitOps, and release checks.
- [Technical Design](docs/technical-design.md): architecture, command groups, data flows, storage, security, bootstrap, modules, and release stabilization.
- [SDK Design](docs/sdk-design.md): Python API, CLI-to-SDK mapping, provider SDK behavior, token API, return types, and SDK extension rules.
- [Provider Key Integrations](docs/others/provider-key-integrations.md): manual GitHub Actions tests that drive Kiste workflows with real provider secrets.

## Python API

Scan a repo:

```python
from kiste.scanner import scan_repo

spec = scan_repo("tests/fixtures/fastapi_docker_repo")
print(spec.app.framework)
print(spec.app.port)
```

Work with a workspace:

```python
from kiste.workspace import load_workspace, scan_workspace, build_graph

workspace = load_workspace(".")
scan_workspace(".")
connections = build_graph(".")
```

Load and check token metadata:

```python
from kiste.tokens import add_token, check_token, list_tokens, load_token_store

add_token(
    "github",
    name="github-dev",
    scopes="repo:read,actions:read",
    expires_at="2026-08-01",
    token_value="ghp_example",
    repo_root=".",
)
for token in list_tokens("."):
    print(token.name, token.provider, token.secret_reference)

report = check_token("github-dev", target="scan")
print(report["status"])

store = load_token_store(".")
print(store["tokens"])
```
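
As a rough illustration of what a capability check involves, the sketch below compares a required scope against the scopes recorded for a token and treats an expired token as failing. It uses only the standard library. `TokenRecord`, `parse_scopes`, and `has_capability` are hypothetical names for this example, not part of Kiste's API, and the expiry rule is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TokenRecord:
    """Hypothetical metadata-only token record (no raw token value)."""

    name: str
    provider: str
    scopes: frozenset[str]
    expires: date


def parse_scopes(raw: str) -> frozenset[str]:
    """Split a comma-separated scope string such as 'repo:read,actions:read'."""
    return frozenset(part.strip() for part in raw.split(",") if part.strip())


def has_capability(token: TokenRecord, need: str, today: date) -> bool:
    """Return True when the token is unexpired and declares the needed scope."""
    return today <= token.expires and need in token.scopes


record = TokenRecord(
    name="github-dev",
    provider="github",
    scopes=parse_scopes("repo:read,actions:read,packages:write"),
    expires=date(2026, 8, 1),
)
print(has_capability(record, "repo:read", today=date(2026, 1, 1)))   # True
print(has_capability(record, "repo:write", today=date(2026, 1, 1)))  # False
```

Passing `today` explicitly keeps the check deterministic in tests.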

Work with datasets:

```python
from kiste.data import add_dataset, check_dataset, inspect_dataset

add_dataset("local", "./datasets/weld-seams", name="weld-seams")
dataset = inspect_dataset("weld-seams")
report = check_dataset("weld-seams")
print(dataset.task_type)
print(report["status"])
```

Generate and validate GitOps output:

```python
from kiste.gitops import generate_gitops_repo, validate_gitops_repo
from kiste.workspace import load_workspace

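# Load the workspace from the current directory.
workspace = load_workspace(".")
# Generate the GitOps tree for the dev environment, then validate the output.
generate_gitops_repo(workspace, "./robot-platform-gitops", env="dev")
report = validate_gitops_repo("./robot-platform-gitops")
print(report.ok)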
```

Encrypt a SOPS-style secret:

```python
from kiste.secrets import SopsSecretBackend

backend = SopsSecretBackend(config_file=".sops.yaml")
encrypted = backend.encrypt_secret(
    name="api-secret",
    data={"DATABASE_URL": "...", "SECRET_KEY": "..."},
)
```

## Generated Files

Single-repo scan:

```text
kiste.yaml
.kiste/
  spec.json
  scan-report.json
  tokens.json
```

Workspace scan:

```text
.kiste/
  workspace.json
  graph.json
  token-matrix.json
  workspace-report.json
  datasets/
    weld-detection.dataset.json
  repos/
    api.spec.json
    frontend.spec.json
```

GitOps/SOPS setup:

```text
.kiste/
  gitops.json
  secrets.json
  review/
    kiste-phase-4.5-review.md
  release/
    release-check.json
    release-readiness.md
    api-freeze.md
    command-freeze.md
    test-report.md
    security-report.md
    packaging-report.md
    docs-report.md
    release-notes.md
    release-candidate-checklist.md

robot-platform-gitops/
  README.md
  .sops.yaml
  clusters/
  apps/
  infrastructure/
    namespaces/
    network-policies/
    secrets/
      api-secret.enc.yaml
```

## Security Notes

Raw tokens should come from environment variables when possible:

PowerShell:

```powershell
$env:KISTE_TOKEN_GITHUB_DEV="..."
$env:KISTE_TOKEN_HF_DEV="..."
```

Bash/zsh:

```bash
export KISTE_TOKEN_GITHUB_DEV="..."
export KISTE_TOKEN_HF_DEV="..."
```

The local token store is metadata-only and writes `.kiste/tokens.json`. Raw token
values are not stored; records use `env://KISTE_TOKEN_NAME` references. Do not
commit token metadata that reveals private account, project, or expiry details.
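
The sketch below shows one way an `env://` reference could be resolved at runtime, so that only the reference is ever written to disk. The naming rule (token name `github-dev` maps to `KISTE_TOKEN_GITHUB_DEV`) is inferred from the examples above, and `env_var_for_token` and `resolve_reference` are hypothetical helpers, not Kiste functions.

```python
import os
import re


def env_var_for_token(name: str) -> str:
    """Map a token name like 'github-dev' to 'KISTE_TOKEN_GITHUB_DEV' (assumed rule)."""
    return "KISTE_TOKEN_" + re.sub(r"[^A-Za-z0-9]+", "_", name).upper()


def resolve_reference(reference: str, environ=None) -> str:
    """Resolve an 'env://VAR' reference to the value held in the environment."""
    environ = os.environ if environ is None else environ
    prefix = "env://"
    if not reference.startswith(prefix):
        raise ValueError(f"unsupported secret reference: {reference!r}")
    variable = reference[len(prefix):]
    try:
        return environ[variable]
    except KeyError:
        # Report the variable name only; never echo secret values.
        raise LookupError(f"environment variable {variable} is not set") from None


reference = "env://" + env_var_for_token("github-dev")
print(reference)  # env://KISTE_TOKEN_GITHUB_DEV
```

The optional `environ` argument lets tests supply a plain dictionary instead of the real process environment.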

GitOps secret files must use the `*.enc.yaml` convention. `kiste secrets scan`
fails on plaintext Kubernetes Secret manifests, `.env` files, `secret.yaml`,
`*-secret.yaml`, and unencrypted YAML under `infrastructure/secrets/`.
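
The filename rules above can be approximated in a few lines. This is a simplified sketch, not Kiste's scanner: `is_plaintext_secret_path` is a hypothetical helper, treating `.env.*` variants as `.env` files is an assumption, and it does not cover inspecting file contents for Kubernetes Secret manifests.

```python
from pathlib import PurePosixPath


def is_plaintext_secret_path(path: str) -> bool:
    """Flag paths that match the plaintext-secret filename rules listed above."""
    p = PurePosixPath(path)
    name = p.name
    if name.endswith(".enc.yaml"):
        return False  # encrypted-by-convention files are allowed
    if name == ".env" or name.startswith(".env."):
        return True  # the '.env.*' variants are an assumption of this sketch
    if name == "secret.yaml" or name.endswith("-secret.yaml"):
        return True
    # Any remaining YAML file somewhere under infrastructure/secrets/.
    parts = p.parts
    in_secrets_dir = any(
        parts[i:i + 2] == ("infrastructure", "secrets")
        for i in range(len(parts) - 1)
    )
    return in_secrets_dir and p.suffix in {".yaml", ".yml"}


for candidate in (
    "infrastructure/secrets/api-secret.enc.yaml",
    "infrastructure/secrets/db.yaml",
    "apps/api/api-secret.yaml",
    "apps/api/deployment.yaml",
):
    print(candidate, is_plaintext_secret_path(candidate))
```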

Image references are parsed and validated before registry inspection. Kiste warns
on mutable tags such as `latest` and recommends digest pinning for deployment
inputs. Phase 6 planning does not deploy or pull images automatically.
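
To make the mutable-tag warning concrete, here is a minimal sketch of that kind of check. The parsing is deliberately simplified, `image_warnings` is a hypothetical helper rather than a Kiste function, and it does not validate the digest format or contact a registry.

```python
def image_warnings(reference: str) -> list[str]:
    """Return warnings for a container image reference (simplified parsing)."""
    name, _, digest = reference.partition("@")
    if digest:
        return []  # digest-pinned references are immutable
    # Only a colon in the last path segment is a tag separator;
    # an earlier colon belongs to a registry port such as localhost:5000.
    last_segment = name.rsplit("/", 1)[-1]
    tag = last_segment.partition(":")[2]
    warnings = []
    if not tag:
        warnings.append("no tag given; container clients typically default to 'latest'")
    elif tag == "latest":
        warnings.append("tag 'latest' is mutable")
    warnings.append("consider pinning by digest (name@sha256:...)")
    return warnings


for ref in (
    "docker.io/owner/api:latest",
    "localhost:5000/api",
    "docker.io/owner/api@sha256:0123abcd",
):
    print(ref, image_warnings(ref))
```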

Dataset providers are metadata-first. Kiste records dataset refs, profiles local
and DVC files, checks declared token scopes, generates dataset cards, and plans
exports, but it does not download large datasets automatically.

Phase 8 cloud and Argo CD commands are also safe-by-default. They generate plans,
reports, and YAML only. They do not provision cloud resources, sync production
apps, enable pruning, delete clusters, rotate secrets, or auto-merge changes.
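
As an illustration of "YAML only, no automatic sync", the sketch below builds an Argo CD `Application` manifest that omits the `syncPolicy.automated` block, which leaves syncing and pruning as manual actions in Argo CD. This is not Kiste's generator: `application_manifest` is a hypothetical helper, and the `apps/api` path, `default` project, and `argocd` namespace are assumptions.

```python
import json


def application_manifest(name: str, repo_url: str, path: str, namespace: str) -> dict:
    """Build an Argo CD Application with manual sync (no 'automated' policy)."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {"repoURL": repo_url, "targetRevision": "HEAD", "path": path},
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            # No syncPolicy.automated block: syncing and pruning stay manual.
        },
    }


manifest = application_manifest(
    "api",
    "https://github.com/example/kiste-gitops.git",
    "apps/api",
    "robot-platform",
)
# JSON output is also valid YAML, so it can be reviewed or committed as-is.
print(json.dumps(manifest, indent=2))
```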

## Development

Install dev dependencies:

```bash
uv sync --group dev
```

Run checks:

```bash
uv run pytest -q
uv run ruff check .
uv run mypy src/kiste
```

Publish to PyPI:

1. Configure PyPI Trusted Publishing for this repository with workflow
   `.github/workflows/pypi-publish.yml` and environment `pypi`.
2. Push a version tag, for example `v0.9.2`.
3. The workflow runs checks, builds `dist/`, validates metadata, and publishes
   through GitHub OIDC. No PyPI API token is required.

Dependency updates:

Dependabot is configured in `.github/dependabot.yml` to update both PyPI
package versions and workflow/action versions weekly:

- PyPI dependencies: `pyproject.toml` and `uv.lock` through the `uv` ecosystem.
- Workflows: GitHub Actions used by CI and PyPI publishing.

Python dev tools, provider SDK extras, and GitHub Actions are grouped into
focused PRs with `pypi`, `uv`, `workflow`, and `github-actions` labels.

Phase 4.5 review reports:

```bash
kiste review
kiste review architecture
kiste review commands
kiste review data-model
kiste review security
kiste review gitops
kiste review roadmap
```

These write Markdown reports to `.kiste/review/`.

## Current Limits

Kiste still does not:

- Deploy to Kubernetes
- Apply Terraform
- Push to a container registry
- Download large datasets automatically
- Provision or delete cloud resources automatically
- Auto-sync Argo CD production apps with pruning
- Auto-merge self-improvement changes
- Commit or push the generated GitOps repo
- Integrate with real SOPS, age, KMS, or Vault cryptography
- Rotate secrets or fetch runtime credentials

Phase 4 intentionally stops at producing a GitOps source tree and validating
secret hygiene; deployment and production secret backends are deferred to later
phases.

## License

Kiste uses a split license model:

- Source code is licensed under the Apache License 2.0. See [LICENSE](LICENSE).
- Documentation, including this README and files under [docs/](docs/), is
  licensed under Creative Commons Attribution 4.0 International. See
  [docs/LICENSE.md](docs/LICENSE.md).
