Metadata-Version: 2.4
Name: stem-ai
Version: 1.8.1
Summary: STEM BIO-AI deterministic evidence-surface scanner for bio/medical AI repositories
Author: Flamehaven
License-Expression: Apache-2.0
Project-URL: Homepage, https://github.com/flamehaven01/STEM-BIO-AI
Project-URL: Issues, https://github.com/flamehaven01/STEM-BIO-AI/issues
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: bio
Requires-Dist: rdkit>=2023.3; extra == "bio"
Provides-Extra: pdf
Requires-Dist: reportlab>=4.0; extra == "pdf"
Provides-Extra: demo
Requires-Dist: gradio<6,>=5.0; extra == "demo"
Requires-Dist: reportlab>=4.0; extra == "demo"
Dynamic: license-file

# STEM BIO-AI

<p align="center">
  <img src="docs/assets/logo.png" alt="STEM BIO-AI logo" width="390">
</p>

<p align="center">
  <b>Deterministic evidence-surface scanner for bio/medical AI repositories.</b><br>
  No LLM. No API key. No model runtime. No secrets sent anywhere.
</p>

<p align="center">
  <a href="https://github.com/flamehaven01/STEM-BIO-AI/actions/workflows/python-package.yml"><img src="https://github.com/flamehaven01/STEM-BIO-AI/actions/workflows/python-package.yml/badge.svg" alt="CI"></a>
  <a href="CHANGELOG.md"><img src="https://img.shields.io/badge/stable-v1.8.1-informational.svg" alt="v1.8.1"></a>
  <a href="pyproject.toml"><img src="https://img.shields.io/badge/python-3.9%2B-blue.svg" alt="Python 3.9+"></a>
  <a href="https://pypi.org/project/stem-ai/"><img src="https://img.shields.io/pypi/v/stem-ai.svg" alt="PyPI"></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/license-Apache--2.0-blue.svg" alt="Apache 2.0"></a>
  <a href="https://huggingface.co/spaces/Flamehaven/stem-bio-ai"><img src="https://img.shields.io/badge/demo-Hugging%20Face%20Space-yellow.svg" alt="HF Space"></a>
  <a href="https://doi.org/10.5281/zenodo.20154479"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.20154479.svg" alt="DOI"></a>
</p>

---

**Navigation:**
[Why](#why-stem-bio-ai) •
[Quick Start](#quick-start) •
[Verification](#verification-path) •
[Architecture](docs/ARCHITECTURE.md) •
[Trust Boundary](#runtime--security--compliance-boundary) •
[CLI Reference](docs/CLI_REFERENCE.md) •
[Scoring Rationale](docs/SCORING_RATIONALE.md)

---

## Why STEM BIO-AI

Bio and medical AI repositories vary enormously in evidence quality — from rigorous academic tools to marketing-grade demos that carry clinical language with no data provenance, no reproducibility path, and no clinical-use disclaimer. Manual review is slow and inconsistent.

STEM BIO-AI scans the **observable repository surface** — README, docs, code structure, CI configuration, dependency manifests, changelogs — and maps detected signals to a structured evidence tier (T0–T4). The scan runs in seconds on a local clone, produces machine-readable JSON and PDF reports, and makes every scoring decision traceable to a specific file, line, and pattern.

> A T4 score means strong observable evidence signals. It does not mean the repository is safe for clinical deployment — that requires independent expert validation.

---

## Quick Start

```bash
# released package from PyPI
pip install stem-ai

# or clone the source for an editable install
git clone https://github.com/flamehaven01/STEM-BIO-AI.git
cd STEM-BIO-AI
```

```bash
# editable local install with PDF output support
pip install -e ".[pdf]"

# fastest path: scan a local repository
stem /path/to/bio-ai-repo

# 7-page full evidence packet with proof trace
stem scan /path/to/bio-ai-repo --level 3 --format all --explain
```

```bash
# workflow-oriented CLI
stem scan /path/to/bio-ai-repo --level 2
stem scan /path/to/bio-ai-repo --policy strict_clinical_adjacency
stem gate /path/to/bio-ai-repo --min-tier T2
stem policy list
stem policy explain strict_clinical_adjacency
stem policy derive --clinical-strictness 4 --code-integrity-priority 3 --reproducibility-priority 2 --structured-limitations-requirement 3
stem policy simulate /path/to/bio-ai-repo --clinical-strictness 4 --code-integrity-priority 3 --reproducibility-priority 2 --structured-limitations-requirement 3
stem policy simulate /path/to/bio-ai-repo --profile-file policy/drafts/scoring_profile.reproducibility_first.v1.json
stem advisory validate /path/to/bio-ai-repo
stem advisory packet /path/to/bio-ai-repo --output advisory_out
stem advisory check-response /path/to/bio-ai-repo --response provider_advisory.json
```

```bash
# backward-compatible shortcuts still work
stem /path/to/bio-ai-repo --level 3 --format all --explain
stem audit /path/to/bio-ai-repo --tier-gate T3 --quiet
```

Clone the target repository first; the CLI operates on local paths only.

Calibration profiles are implemented in `mirror_only` mode in `1.8.0`. `--policy` changes which profile is surfaced in artifacts, while `policy derive` and `policy simulate` provide governed preview lanes without mutating the authoritative deterministic score path. `policy simulate --profile-file <path>` allows local schema-valid profile experiments without registering a new named policy. In the current rule scope, `strict_clinical_adjacency` is the only release-grade named recommendation; stronger reproducibility postures still fall back to `preview_only` simulation deltas rather than a named profile.

Researchers and domain specialists are expected to influence calibration through `derive`, `simulate`, and documented preview/profile proposals. The intent interview uses a governed `1–5` posture scale, while official score-affecting policy changes still require profile promotion rather than direct ad hoc tuning.

Full CLI reference: [`docs/CLI_REFERENCE.md`](docs/CLI_REFERENCE.md)

## Verification Path

Use the same verification surface exposed in CI and package smoke tests:

```bash
pip install -e ".[pdf]"
python -m py_compile stem_ai/cli.py stem_ai/scanner.py stem_ai/render.py stem_ai/app.py
stem --help
python -m stem_ai --help
python -m pytest -q
python -m build
```

Primary references:

- [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md)
- [`docs/API_CONTRACT.md`](docs/API_CONTRACT.md)
- [`docs/SCORING_RATIONALE.md`](docs/SCORING_RATIONALE.md)
- [`docs/ADVISORY_RUNTIME.md`](docs/ADVISORY_RUNTIME.md)
- [`SECURITY.md`](SECURITY.md)

## Document Map

Use these docs by review purpose:

**Core operation**
- [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md)
- [`docs/CLI_REFERENCE.md`](docs/CLI_REFERENCE.md)
- [`docs/DETERMINISTIC_DIAGNOSTICS.md`](docs/DETERMINISTIC_DIAGNOSTICS.md)
- [`docs/UI_HTML_REPORT.md`](docs/UI_HTML_REPORT.md)

**Scoring and evidence**
- [`docs/SCORING_RATIONALE.md`](docs/SCORING_RATIONALE.md)
- [`docs/EXAMPLE_AUDITS.md`](docs/EXAMPLE_AUDITS.md)
- [`docs/CALIBRATION_PROFILE_DESIGN.md`](docs/CALIBRATION_PROFILE_DESIGN.md)
- [`docs/regulatory_basis_registry.v1.json`](docs/regulatory_basis_registry.v1.json)

**Trust boundary and governance**
- [`SECURITY.md`](SECURITY.md)
- [`docs/API_CONTRACT.md`](docs/API_CONTRACT.md)
- [`docs/ADVISORY_RUNTIME.md`](docs/ADVISORY_RUNTIME.md)
- [`docs/ADVISORY_SECRET_HANDLING.md`](docs/ADVISORY_SECRET_HANDLING.md)
- [`docs/REGULATORY_MAPPING.md`](docs/REGULATORY_MAPPING.md)
- [`docs/AIRI_DATA_GOVERNANCE.md`](docs/AIRI_DATA_GOVERNANCE.md)
- [`docs/THIRD_PARTY_DATA.md`](docs/THIRD_PARTY_DATA.md)

**Public proof surfaces**
- Demo: [Hugging Face Space](https://huggingface.co/spaces/Flamehaven/stem-bio-ai)
- Example audits: [`docs/EXAMPLE_AUDITS.md`](docs/EXAMPLE_AUDITS.md)
- Scoring rationale: [`docs/SCORING_RATIONALE.md`](docs/SCORING_RATIONALE.md)


---

## Triage Tiers

- **T0 Rejected (0–39):** insufficient evidence — do not rely on without independent expert validation
- **T1 Quarantine (40–54):** exploratory review only — expert validation required before any use
- **T2 Caution (55–69):** research reference and supervised non-clinical technical review only
- **T3 Supervised (70–84):** supervised institutional review candidate
- **T4 Candidate (85–100):** strong evidence posture — clinical deployment still requires independent validation

Clinical-adjacent repositories without an explicit disclaimer are **hard-capped at T2** (score ≤ 69).
Repositories with unbounded CA-DIRECT claims are **hard-capped at T0** (score ≤ 39).

Tier boundary derivation and calibration gap disclosures: [`docs/SCORING_RATIONALE.md`](docs/SCORING_RATIONALE.md).
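
The band and cap rules above can be sketched in a few lines of Python. This is an illustration only, not the scanner's implementation: the function names, the flag names, and the assumption that scores are compared against the lower bound of each band are all hypothetical.

```python
# Tier bands copied from the list above: (lower bound, tier, label).
TIER_BANDS = [
    (85, "T4", "Candidate"),
    (70, "T3", "Supervised"),
    (55, "T2", "Caution"),
    (40, "T1", "Quarantine"),
    (0, "T0", "Rejected"),
]

# Hard caps from the two rules above (maximum score allowed).
CAP_MISSING_DISCLAIMER = 69   # clinical-adjacent, no explicit disclaimer
CAP_UNBOUNDED_CA_DIRECT = 39  # unbounded CA-DIRECT claims


def apply_caps(score, missing_disclaimer=False, unbounded_ca_direct=False):
    """Clamp to 0..100, then apply whichever hard caps are triggered (hypothetical flags)."""
    capped = max(0, min(100, score))
    if missing_disclaimer:
        capped = min(capped, CAP_MISSING_DISCLAIMER)
    if unbounded_ca_direct:
        capped = min(capped, CAP_UNBOUNDED_CA_DIRECT)
    return capped


def tier_for(score):
    """Map a 0..100 score to its tier name using the lower bound of each band."""
    for lower, tier, label in TIER_BANDS:
        if score >= lower:
            return f"{tier} {label}"
    return "T0 Rejected"


print(tier_for(apply_caps(92)))
print(tier_for(apply_caps(92, missing_disclaimer=True)))
print(tier_for(apply_caps(92, unbounded_ca_direct=True)))
```

Applying the caps before the tier lookup keeps the two rules independent: a cap can lower the tier but never raise it.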

---

## Scoring Model

```
Final = (Stage 1 × 0.40) + (Stage 2R × 0.20) + (Stage 3 × 0.40) − C1 Penalty
```

| Stage | Weight | What Is Measured |
|-------|-------:|-----------------|
| **Stage 1** README Evidence | 40% | Bio-domain vocabulary; H1–H6 hype-claim penalties; R1–R5 responsibility signals (limitations, regulatory framing, clinical disclaimer, demographic-bias, reproducibility) |
| **Stage 2R** Repo-Local Consistency | 20% | Vocabulary overlap across README, docs, package metadata, CI, and tests; limitation repetition; contradiction, staleness, and unsupported-workflow deductions |
| **Stage 3** Code/Bio Responsibility | 40% | CI presence; domain test coverage; changelog hygiene (T3); data provenance and IRB/dataset citation (B1); bias/limitation measurement evidence (B2); conflict-of-interest disclosure (B3) |
| **Stage 4** Replication Evidence | Separate lane | Containers; reproducibility targets; dependency locks/pins; dataset and model artifact references; seed, CLI, and citation signals; license/use-scope restrictions |
| **C1–C6** Code Integrity | Penalty / advisory | Hardcoded credentials (C1, −10 pts); dependency pinning and external-service fragility (C2); deprecated patient-adjacent paths (C3); fail-open exception handlers (C4); compliance and clinical-boundary integrity (C5); mock-auth or no-auth local/self-host boundary warnings (C6) |

Stage 4 is reported as `replication_score` / `replication_tier` and does **not** affect `score.final_score`. Full scoring rationale and calibration gap disclosures are in [`docs/SCORING_RATIONALE.md`](docs/SCORING_RATIONALE.md).
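
As a worked example of the formula, the sketch below combines three hypothetical stage scores. The function name and the sample inputs are assumptions for illustration; the weights and the 10-point C1 penalty come from the table above, and any clamping or rounding the scanner applies is not modeled here.

```python
STAGE_WEIGHTS = {"stage1": 0.40, "stage2r": 0.20, "stage3": 0.40}
C1_PENALTY_POINTS = 10.0  # hardcoded-credential penalty from the table above


def final_score(stage1, stage2r, stage3, c1_triggered=False):
    """Weighted sum of Stage 1, 2R, and 3 minus the C1 penalty. Stage 4 is not an input."""
    weighted = (
        stage1 * STAGE_WEIGHTS["stage1"]
        + stage2r * STAGE_WEIGHTS["stage2r"]
        + stage3 * STAGE_WEIGHTS["stage3"]
    )
    penalty = C1_PENALTY_POINTS if c1_triggered else 0.0
    return weighted - penalty


# Hypothetical inputs: 80 * 0.40 + 70 * 0.20 + 75 * 0.40 = 32 + 14 + 30 = 76
print(round(final_score(80, 70, 75), 2))
# Same inputs with a C1 finding: 76 - 10 = 66
print(round(final_score(80, 70, 75, c1_triggered=True), 2))
```

Because the three weights sum to 1.0, a repository scoring 100 on every stage reaches 100 before penalties.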

---

## Architecture

```mermaid
flowchart LR
    A[Target repository] --> B[LOCAL_ANALYSIS scanner]
    B --> C[Stage 1\nREADME evidence]
    B --> D[Stage 2R\nRepo-local consistency]
    B --> E[Stage 3\nCode/bio responsibility]
    B --> F[Stage 4\nReplication lane]
    B --> K[C1–C6\nCode integrity]
    B --> CC[CC1–CC3\nAST contract detectors]
    C --> G[Weighted evidence score]
    D --> G
    E --> G
    K --> G
    CC --> R[code_contract + AIRI coverage]
    F --> H[replication_score / tier]
    G --> I[Canonical JSON result]
    H --> I
    R --> I
    I --> L[Evidence ledger]
    I --> M[Explain trace]
    I --> N[Markdown report]
    I --> O[PDF packets 1p / 5p / 7p]
    I --> P[Interactive HTML dashboard]
```

Core modules: `stem_ai/scanner.py`, `stem_ai/render.py`, `stem_ai/cli.py`, `stem_ai/detectors.py`, `stem_ai/detector_surface.py`, `stem_ai/detector_ast.py`, `stem_ai/detector_bio.py`, `stem_ai/detector_contract.py`, `stem_ai/detector_stage4.py`, `stem_ai/evidence.py`, `stem_ai/airi_risk_mapping.py`, `stem_ai/app.py`

---

## Output Artifacts

Each run writes to `--out DIR` (default: `stem_output/`).
The plain `stem <repo>` and `stem scan <repo>` paths now default to `--level 3`, which emits the full 7-page evidence packet unless you select a lower level explicitly.
`audits/` is retained only for historical benchmark and reference artifacts; routine CLI output should land in `stem_output/<repo_slug>/`.

| Level | Pages | Audience | Artifacts |
|-------|------:|---------|-----------|
| `--level 1` | 1 | Executive / triage (legacy) | Score, tier, stage cards, code integrity summary |
| `--level 2` | 5 | Standard audit review | Level 1 + Stage 1/2R/3/4 breakdown, AIRI summary, closeout page |
| `--level 3` | 7 | Full evidence packet | Level 2 + Stage 4 replication page, code integrity deep dive, remediation roadmap, metadata page |

```
<repo>_experiment_results.json   # machine-readable score + full evidence object
<repo>_report.html               # interactive 5-section HTML dashboard (v1.7.0+)
<repo>_report.md                 # human-readable audit report
<repo>_brief_1p.pdf              # Level 1 executive dashboard
<repo>_detailed_5p.pdf           # Level 2 standard review packet
<repo>_detailed_7p.pdf           # Level 3 full review packet
<repo>_explain.txt               # --explain: file/line/snippet proof trace
```
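
For downstream tooling, the JSON result can be read with the standard library alone. The sketch below is an assumption about the file layout, not a documented schema: it guesses that `final_score` and `formal_tier` sit under a top-level `score` object and that `replication_score` and `replication_tier` are top-level keys. Check [`docs/API_CONTRACT.md`](docs/API_CONTRACT.md) before relying on these paths.

```python
import json
from pathlib import Path


def load_summary(result_path):
    """Read a results JSON file and return headline fields, with None where a key is absent."""
    data = json.loads(Path(result_path).read_text(encoding="utf-8"))
    score = data.get("score") or {}  # assumed nesting: score.final_score, score.formal_tier
    return {
        "final_score": score.get("final_score"),
        "formal_tier": score.get("formal_tier"),
        "replication_score": data.get("replication_score"),  # assumed top-level key
        "replication_tier": data.get("replication_tier"),    # assumed top-level key
    }
```

A call such as `load_summary("stem_output/<repo_slug>/<repo>_experiment_results.json")` returns a small dictionary that a CI step could log or compare against a threshold.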

---

## HTML Report Dashboard

`--format html` generates a self-contained interactive dashboard (v1.7.0+). Single `.html` file — no network, no external dependencies.

<p align="center">
  <img src="docs/assets/html_report_preview.png" alt="STEM BIO-AI interactive HTML dashboard" width="760">
</p>

**Example interactive HTML audit**
- Open in browser: <https://htmlpreview.github.io/?https://raw.githubusercontent.com/flamehaven01/STEM-BIO-AI/main/docs/assets/report-preview/yorkeccak_bio_report.html>
- Raw HTML artifact: [`docs/assets/report-preview/yorkeccak_bio_report.html`](docs/assets/report-preview/yorkeccak_bio_report.html)

**5 sections:** Executive Summary · Decision Path · Code Integrity · AIRI Risk Triggers · Evidence Detail

Interactive features: sticky scroll-spy nav · repo hyperlink in the hero header · `?` tooltip icons on every metric · click-to-expand integrity cards · covered/gaps + domain filtering for AIRI risks · FAIL/WARN/PASS/INFO filter on the evidence ledger.

Current `1.8.0` HTML semantics:

- `Decision Path` explains score construction and policy posture with `Configured, Not Rewritten`
- `Code Integrity` surfaces the split between `C4` fail-open exceptions, `C5` compliance/boundary integrity, and `C6` mock-auth/no-auth trust boundaries
- `AIRI Risk Triggers` distinguishes the **full local AIRI registry**, the **curated runtime bundle**, and the **detector mapping registry**
- covered AIRI rows carry bounded `why mapped` reasoning derived from detector-trigger evidence plus the local detector-mapping registry

This is a review aid, not a claim that AIRI independently verified the repository.

---

## Report Preview

<p align="center">
  <img src="docs/assets/report-preview/7p-1.png" alt="STEM BIO-AI full 7-page packet — page 1" width="760">
</p>

**Sample PDF:** [Download the 7-page full packet preview](docs/assets/report-preview/yorkeccak_bio_detailed_7p.pdf)

<details>
<summary>View all 7 full-packet preview pages</summary>

| Page 1 | Page 2 |
|--------|--------|
| <img src="docs/assets/report-preview/7p-1.png" alt="Page 1"> | <img src="docs/assets/report-preview/7p-2.png" alt="Page 2"> |

| Page 3 | Page 4 |
|--------|--------|
| <img src="docs/assets/report-preview/7p-3.png" alt="Page 3"> | <img src="docs/assets/report-preview/7p-4.png" alt="Page 4"> |

| Page 5 | Page 6 |
|--------|--------|
| <img src="docs/assets/report-preview/7p-5.png" alt="Page 5"> | <img src="docs/assets/report-preview/7p-6.png" alt="Page 6"> |

| Page 7 |
|--------|
| <img src="docs/assets/report-preview/7p-7.png" alt="Page 7"> |

</details>

---

## Detection Methods

Every scored item maps to a concrete, inspectable detection method. No inference, no LLM judgment.

<details>
<summary>Full detection table</summary>

| Component | Detection Method |
|-----------|-----------------|
| Stage 1 baseline | Non-empty README present (+60 base) |
| Stage 1 domain signal | Bio-domain keyword regex in README and package metadata |
| Stage 1 hype penalties (H1–H6) | Regex: clinical certainty, regulatory approval, autonomous replacement, breakthrough marketing, universal generalization, perfect accuracy claims |
| Stage 1 responsibility signals (R1–R5) | Regex: limitations section, regulatory framework, clinical disclaimer (CA-severity-weighted), demographic-bias disclosure, reproducibility provisions |
| Stage 2R consistency | Vocabulary set intersection across README/docs/package/tests; limitation repetition; clinical-boundary contradiction, version-staleness, and workflow-support deductions |
| Stage 3 T1 CI | `.github/workflows/` contains at least one file |
| Stage 3 T2 domain tests | `tests/` directory text contains bio-domain vocabulary (regex) |
| Stage 3 T3 changelog | CHANGELOG file presence + bug-fix/patch/security entry detection (3-tier: 0/+5/+15) |
| Stage 3 B1 data provenance | Dependency manifest presence + IRB/dataset-citation language detection (3-tier: 0/+10/+15) |
| Stage 3 B2 bias measurement | Bias/limitations vocabulary + quantitative measurement evidence (subgroup analysis, AUROC, demographic parity) (3-tier: 0/+8/+15) |
| Stage 3 B3 COI/funding | Funding, grant, sponsor, conflict-of-interest language in README/docs/FUNDING.md |
| Stage 4 containers | Dockerfile or compose file present |
| Stage 4 reproducibility target | Makefile with reproduce/eval/benchmark/test targets |
| Stage 4 dependency lock | Environment/lock/requirements file; exact pins or hash evidence |
| Stage 4 artifact references | Dataset/model/checkpoint URLs or checksum files |
| Stage 4 citation/interface | CITATION.cff; argparse CLI entry points (AST) |
| Stage 4 license restriction | Non-commercial, research-only, academic-only, no-clinical-use restrictions in LICENSE/README |
| CA severity | Clinical/diagnostic phrase regex in README, docs, and package metadata |
| C1 credentials | AWS `AKIA*`, OpenAI `sk-*`, GitHub `ghp_*`, `api_key=...` patterns; obvious placeholders excluded from penalty |
| C2 dependency pinning | `==` or hash pin vs. loose `>=`, `~=`, `<`, `>` ranges |
| C3 deprecated paths | Patient-metadata patterns in `deprecated/`, `legacy/`, `archive/` directories |
| C4 fail-open | `except Exception: pass` or `except: pass` in Python source (AST) |
| C5 compliance boundary integrity | Unsupported legal/compliance claims or missing clinical-boundary integrity in reviewed sources |
| **CC1** clinical zero default | AST scan of function defaults: keyword-only and positional params named `confidence_threshold`, `score_threshold`, `min_confidence`, etc. defaulted to `0.0` |
| **CC2** API contract | README-declared names cross-checked against `__all__` exports; phantom APIs flagged |
| **CC3** shallow validator | `validate_*` / `check_*` functions using only `len()` (no regex structure check) flagged as insufficient for clinical/PII validation |

Stage 2R and Stage 3 rubric artifacts now surface additive `detector_id` and `decision_basis` fields so reviewers can see which bounded detector or contradiction rule produced a deduction or credit.

</details>
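
To make the AST-based rows concrete, here is a minimal sketch of how C4-style and CC1-style checks could be written with Python's `ast` module. It is not the code in `stem_ai/detector_ast.py`: the function names are hypothetical, the parameter-name set contains only the three names listed in the table, and the CC1 check matches a literal `0.0` only.

```python
import ast

# Parameter names taken from the CC1 row above; the real list may be longer.
THRESHOLD_NAMES = {"confidence_threshold", "score_threshold", "min_confidence"}


def find_fail_open_handlers(source):
    """Return line numbers of `except Exception: pass` and `except: pass` handlers."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        bare = node.type is None
        broad = isinstance(node.type, ast.Name) and node.type.id == "Exception"
        only_pass = len(node.body) == 1 and isinstance(node.body[0], ast.Pass)
        if (bare or broad) and only_pass:
            lines.append(node.lineno)
    return sorted(lines)


def find_zero_threshold_defaults(source):
    """Return (function, parameter) pairs whose threshold parameter defaults to 0.0."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        positional = args.posonlyargs + args.args
        # Positional defaults align with the tail of the positional parameters.
        offset = len(positional) - len(args.defaults)
        pairs = list(zip(positional[offset:], args.defaults))
        # Keyword-only defaults use None for parameters without a default.
        pairs += list(zip(args.kwonlyargs, args.kw_defaults))
        for arg, default in pairs:
            is_zero_float = (
                isinstance(default, ast.Constant)
                and isinstance(default.value, float)
                and default.value == 0.0
            )
            if arg.arg in THRESHOLD_NAMES and is_zero_float:
                findings.append((node.name, arg.arg))
    return findings


SAMPLE = '''
def predict(x, confidence_threshold=0.0):
    try:
        return x
    except Exception:
        pass

def rank(x, *, min_confidence=0.0, top_k=5):
    return x

def safe(x, score_threshold=0.5):
    return x
'''

print(find_fail_open_handlers(SAMPLE))
print(find_zero_threshold_defaults(SAMPLE))
```

Working on the parsed tree rather than raw text means comments and string literals that merely mention `except: pass` are not flagged.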

---

## AI Advisory Contract

The advisory system exports a sanitized, provider-neutral handoff packet and validates provider responses — without making any provider API call.

```bash
stem advisory validate /path/to/repo                # offline contract check
stem advisory packet /path/to/repo                  # export sanitized input packet
stem advisory check-response /path/to/repo --response FILE
```

**Non-negotiable rules (enforced by the validator):**
- Provider output cannot override `score.final_score` or `score.formal_tier`
- Every advisory item must cite exact `finding_id` strings from `allowed_finding_ids`
- Raw repository source text is not included in provider packets
- Responses containing clinical safety, efficacy, regulatory, or medical-advice claims are rejected
- `allowed_finding_ids` is capped at 40 entries per packet
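
Three of these rules are mechanical enough to sketch. The example below assumes a hypothetical response shape (an `items` list whose entries carry a `finding_ids` list, and an optional `score` object); the real contract is defined in `docs/API_CONTRACT.md`. The claim-rejection and source-text rules are not modeled.

```python
MAX_ALLOWED_FINDING_IDS = 40  # cap stated in the rules above


def check_response(response, allowed_finding_ids):
    """Return a list of rule violations; an empty list means the response passes."""
    errors = []
    if len(allowed_finding_ids) > MAX_ALLOWED_FINDING_IDS:
        errors.append("allowed_finding_ids exceeds the 40-entry cap")
    allowed = set(allowed_finding_ids)

    # Hypothetical guard: score fields in a provider response count as an override attempt.
    score = response.get("score") or {}
    for field in ("final_score", "formal_tier"):
        if field in score:
            errors.append(f"provider output must not set score.{field}")

    # Every advisory item must cite at least one ID, and only IDs from the allowlist.
    for index, item in enumerate(response.get("items", [])):
        cited = item.get("finding_ids", [])
        if not cited:
            errors.append(f"item {index} cites no finding_id")
        for finding_id in cited:
            if finding_id not in allowed:
                errors.append(f"item {index} cites unknown finding_id {finding_id!r}")
    return errors
```

Returning a list of violations rather than raising on the first one lets a reviewer see every problem in a rejected response at once.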

**Packet hardening added in v1.5.7:**
- `provider_request` now carries a secret-free request schema plus deterministic argument-validation status
- `contract_schemas` exports the advisory input/output contract shapes for downstream validators
- `packet_contract` confirms allowlist parity, snippet omission, and non-negative omission counts before handoff

**Secret boundary hardening added in v1.5.9:**
- provider-specific environment variables are recognized before the generic advisory key fallback
- provider handoff metadata exports endpoint-policy validation and the expected env-var name, never the key value
- embedded-credential URLs are rejected; cloud providers require `https`; plain `http` is limited to localhost
- `.env` files are ignored by default; `.env.example` documents supported variable names only
- `--advisory call` is now the explicit provider-call boundary, with centralized redaction, logging-policy export, child-env allowlist reporting, and artifact pre-write sanitization
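
The endpoint rules in the third bullet can be illustrated with `urllib.parse`. This is a sketch of the policy as described, not the shipped validator: the function name, the return strings, and the decision to treat `127.0.0.1` and `::1` as localhost are assumptions.

```python
from urllib.parse import urlsplit

# Assumption: loopback addresses are treated the same as the name "localhost".
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def check_endpoint(url):
    """Classify a provider endpoint URL under the policy described above."""
    parts = urlsplit(url)
    if parts.username is not None or parts.password is not None:
        return "rejected: embedded credentials"
    if parts.scheme == "https":
        return "ok"
    if parts.scheme == "http" and parts.hostname in LOCAL_HOSTS:
        return "ok"
    return "rejected: non-local endpoints require https"


print(check_endpoint("https://api.example.com/v1"))
print(check_endpoint("http://localhost:11434/api"))
print(check_endpoint("http://api.example.com/v1"))
print(check_endpoint("https://user:secret@api.example.com/v1"))
```

Checking for embedded credentials first means a URL such as `https://user:secret@host` is rejected even though its scheme would otherwise pass.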

Full contract: [`docs/API_CONTRACT.md`](docs/API_CONTRACT.md)
Secret policy: [`docs/ADVISORY_SECRET_HANDLING.md`](docs/ADVISORY_SECRET_HANDLING.md)
Runtime boundary: [`docs/ADVISORY_RUNTIME.md`](docs/ADVISORY_RUNTIME.md)

---

## The AI Risk Repository (AIRI)

STEM BIO-AI uses local derived data from the MIT **AI Risk Repository (AIRI)** as a broader risk-vocabulary layer around deterministic repository findings.

Upstream references:

- MIT AI Risk Repository: <https://airisk.mit.edu/>
- AI Incident Tracker: <https://airisk.mit.edu/ai-incident-tracker>

How AIRI is used here:

- AIRI does **not** replace the local scoring and audit system
- AIRI does **not** prove harm, causality, clinical safety, or regulatory status
- AIRI helps place local findings into a wider risk vocabulary for review

In the current `1.8.0` line, AIRI is used through three local governed layers:

1. full normalized local registry
2. curated runtime bundle used by deterministic scans
3. detector-to-risk mapping registry plus known-gap tracking

This allows STEM BIO-AI to keep scan behavior local and deterministic while still surfacing broader AI risk language, provenance, and bundle-scope boundaries in runtime artifacts.

License / provenance note:

- Upstream AIRI source license: `MIT`
- Local attribution and usage details: [`docs/AIRI_DATA_GOVERNANCE.md`](docs/AIRI_DATA_GOVERNANCE.md), [`docs/THIRD_PARTY_DATA.md`](docs/THIRD_PARTY_DATA.md)

---

## Runtime / Security / Compliance Boundary

STEM BIO-AI can help teams become more **audit-ready**, but it does not by itself create certification, attestation, or legal compliance.

What can be prepared internally:

- runtime and security evidence review
- control-matrix and evidence-room preparation
- validation-package assembly for electronic records / signature workflows
- gap assessment for logging, access control, change control, retention, and traceability
- independent third-party audit readiness and penetration-test readiness

What still requires external review or attestation:

- SOC 2 report issuance
- ISO 13485 certification
- strong `21 CFR Part 11 compliant` claims
- `independent audit passed` claims

In other words: internal teams can do substantial readiness work, but external claims still require external auditors, certification bodies, or independent assessors.

Related boundary guidance: [`docs/REGULATORY_MAPPING.md`](docs/REGULATORY_MAPPING.md)

---

## MICA Memory Layer

The repository keeps a versioned MICA memory layer under `memory/` for agent-session initialization,
drift control, and release provenance. Historical snapshots are retained as archive; the active layer
is selected by `memory/mica.yaml`.

The active package now follows the non-breaking `MICA v0.2.4` runtime contract:

- `memory/mica.yaml` is the composition contract
- `python tools/mica_pct.py .` validates package integrity
- `python tools/mica_runtime.py . --format text` emits a portable session summary
- `python tools/mica_runtime.py . --format session-report` emits an opening-state gate packet
- `python tools/mica_invoke.py . --mode guided --format json` compiles a host-consumable activation packet
- `mica_invoke.bat . --mode forced` is the Windows forced-preflight entry point
- DI binding remains progressive rather than speculative; critical invariants are not mass-rewritten just to satisfy schema formality

Operational reference: [`docs/MICA_MEMORY.md`](docs/MICA_MEMORY.md)

---

## Web Demo

Live demo: [huggingface.co/spaces/Flamehaven/stem-bio-ai](https://huggingface.co/spaces/Flamehaven/stem-bio-ai)

<p align="center">
  <img src="docs/assets/HF-STEM-BIO_AI.png" alt="STEM BIO-AI Hugging Face Space" width="760">
</p>

The Space runs the same deterministic local scanner on public GitHub repositories. No provider API call is made.

Run locally:

```bash
pip install -e ".[demo]"
python app.py
```

---

## Repository Structure

```
STEM-BIO-AI/
  stem_ai/              # Core Python package
  docs/                 # API contract, advisory runtime/secret policy, scoring rationale, MICA policy, report previews
  memory/               # Versioned MICA archive/playbook/lessons; active layer selected by mica.yaml
  audits/               # Historical benchmark/reference artifacts only
  stem_output/          # Default live CLI output root (generated, ignored)
  scripts/              # Benchmark and validation scripts
  tests/                # Regression test suite
  app.py                # HuggingFace Spaces / Gradio entry point
  pyproject.toml        # Package metadata and extras
  SKILL.md              # Universal agent skill definition
  CHANGELOG.md          # Version history
```

---

## Agent Skill Install

```bash
# Claude Code
git clone --depth 1 https://github.com/flamehaven01/STEM-BIO-AI.git ~/.claude/skills/stem-bio-ai

# Generic agent frameworks
git clone --depth 1 https://github.com/flamehaven01/STEM-BIO-AI.git ~/.agents/skills/stem-bio-ai
```

---

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md). High-value areas: rubric discrimination examples, clinical-adjacency trigger refinements, additional bio-domain benchmark repositories, report rendering improvements.

---

## Citation

Preferred citation metadata lives in [`CITATION.cff`](CITATION.cff).

Current concept DOI-backed archive for the `1.8.0` line:
- <https://doi.org/10.5281/zenodo.20154479>

```bibtex
@software{stem-bio-ai,
  author  = {Yun, Kwansub},
  title   = {STEM BIO-AI: Deterministic Evidence-Surface Scanner for Bio/Medical AI Repositories},
  version = {1.8.0},
  year    = {2026},
  doi     = {10.5281/zenodo.20154479},
  url     = {https://doi.org/10.5281/zenodo.20154479}
}
```

---

## License

Apache 2.0. See [LICENSE](LICENSE).

Maintained by [flamehaven01](https://github.com/flamehaven01)







