Metadata-Version: 2.4
Name: babappa
Version: 0.5.2a0
Summary: Simulation-trained branch-site selection support from user-supplied codon MSAs and trees
Author: Krishnendu Sinha
License-Expression: MIT
Project-URL: Homepage, https://github.com/sinhakrishnendu/BABAPPA
Project-URL: Repository, https://github.com/sinhakrishnendu/BABAPPA
Project-URL: Issues, https://github.com/sinhakrishnendu/BABAPPA/issues
Keywords: molecular-evolution,positive-selection,branch-site,codon,deep-learning
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: typer>=0.9
Requires-Dist: rich>=13
Requires-Dist: numpy>=1.23
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Requires-Dist: build>=1; extra == "dev"
Requires-Dist: twine>=5; extra == "dev"
Provides-Extra: neural
Requires-Dist: torch; extra == "neural"
Dynamic: license-file

# BABAPPA

BABAPPA is the **Branch-site Alignment-Bias-Aware Probabilistic Positive-selection Analyzer**.

Current source version: `0.5.2-alpha` (`0.5.2a0` in the package metadata)  
Release archive label: `0.5.0-alpha`  
Status: research-alpha, simulation-trained, guarded empirical diagnostic workflow

BABAPPA supports branch-site positive-selection investigation from a user-supplied codon MSA and treefile. The main user-facing command treats the supplied MSA as the authoritative alignment, scores requested foreground branches, and reports candidate branch-site episodic-selection support using a deployable simulation-trained model. Alignment ensembles, codeml/HyPhy comparison, and matched-null calibration are optional diagnostic layers for deeper evaluation.

BABAPPA is **not** a finalized empirical positive-selection discovery engine. Empirical positive-selection claims remain blocked until simulation-matched null calibration, reference-tool comparison, biological controls, and dataset-specific interpretation are complete.

## Contents

- Project status and scientific boundary
- What BABAPPA does
- What BABAPPA does not do
- Long-run handoff policy
- Installation
- External dependencies
- Apple Silicon / MPS
- Quick start
- Typical workflows
- Input requirements
- Aligners
- Output interpretation
- Reproducibility
- Storage cleanup and user maintenance
- Troubleshooting
- Citation and manuscript status
- PyPI release workflow
- Developer notes
- Scientific bottom line

## Project Status And Scientific Boundary

BABAPPA has completed conservative explicit branch-truth simulation validation at 100,000 families on Apple Silicon/MPS. It has a validated deployable simulation-trained model package:

```text
deployable_model_conservative_branch_site_100k_mps
```

The deployable package validates successfully:

- status: `ok`
- failures: `0`
- warnings: `0`

The empirical bridge can process small real empirical diagnostic pilots, but BABAPPA scores are not final discovery claims.

Historical validation note: Branch-conditioned 10K streamed validation completed before the final 100K MPS run. Branch-conditioned labels may be proxy-derived in older or non-explicit workflows, so BABAPPA now distinguishes those cases from explicit branch-site simulator truth. A previous gate stated, "Final 100K is deferred until explicit branch-truth validation passes"; that gate has now been satisfied with a conditional-pass 100K explicit-truth validation, while empirical discovery claims remain blocked.

The simulation phase is oracle-supervised because simulator truth is known during validation. That oracle-supervised evidence is never supplied as an empirical inference input.

> **Empirical interpretation warning**
>
> A BABAPPA diagnostic-positive result is not, by itself, a publishable empirical positive-selection claim. It must be interpreted with matched-null calibration, reference-tool comparison, biological controls, and dataset-specific justification.

## What BABAPPA Does

BABAPPA can:

- predict branch-site support directly from a user-provided aligned codon MSA and matching treefile;
- score one foreground tip, a comma-separated set of foreground tips, or all tree tips;
- validate empirical CDS FASTA and tree inputs;
- run optional alignment ensembles for diagnostic sensitivity analysis;
- construct site maps and method-policy reports;
- extract conservative empirical branch-site features;
- audit empirical feature tables for forbidden truth-derived columns;
- score branch-site rows using a packaged simulation-trained model;
- classify empirical inputs as `in_domain`, `borderline`, or `out_of_domain`;
- mark OOD cases as `diagnostic_only`;
- produce guarded diagnostic reports;
- prepare and parse codeml/HyPhy-style reference workflows;
- plan simulation-matched empirical calibration;
- audit storage and generate safe cleanup scripts for large reproducible outputs.

BABAPPA helps decide whether a dataset is suitable for deeper positive-selection analysis. It is a diagnostic decision-support framework, not an automatic discovery machine.

## What BABAPPA Does Not Do

BABAPPA does not:

- prove positive selection by itself;
- replace codeml, HyPhy, biological controls, or expert interpretation;
- make final empirical discovery claims without calibration and controls;
- use simulator truth during empirical inference;
- silently accept out-of-domain empirical inputs as positive-selection calls;
- serve as a clinical, agricultural, regulatory, or policy decision tool.

## Long-Run Handoff Policy

Codex and other assisted-maintenance sessions should not execute heavy empirical calibration, broad empirical scans, retraining, 10K/100K simulations, or long aligner/reference batches. The expected workflow is to generate reproducible USER-RUN scripts, validators, parsers, and reports; the user runs long jobs locally or offline and returns summaries/logs for interpretation.

## Installation

After PyPI release:

```bash
python -m pip install babappa
```

Clone and install from source:

```bash
git clone https://github.com/sinhakrishnendu/BABAPPA BABAPPA
cd BABAPPA
python -m pip install -e .
```

For neural scoring, install BABAPPA in an environment where PyTorch is available, for example the `molevo` conda environment used during development. The package also declares an optional `neural` extra that requires `torch` (`python -m pip install "babappa[neural]"`). The PyPI/source package includes the lightweight deployable model package used by the default predictor.

For development and tests:

```bash
python -m pip install -e ".[dev]"
```

Check the installed version:

```bash
babappa --version
```

Run tests:

```bash
python -m pytest -q
```

Expected test result for the current source version:

```text
351 passed, 58 skipped
```

## External Dependencies

Required Python dependencies are installed through the package. Empirical and reference workflows may also need external tools:

- MAFFT
- MUSCLE
- BABAPPAlign
- optional IQ-TREE2/IQ-TREE for tree building
- optional codeml from PAML
- optional HyPhy
- optional PyTorch (a Python package rather than a command-line tool) for deployable model scoring

Check aligners:

```bash
babappa check-aligners
```

BABAPPAlign requires the BABAPPAScore model cache:

```bash
mkdir -p "$HOME/.cache/babappalign/models"
curl -L "https://zenodo.org/record/18053201/files/babappascore.pt" -o "$HOME/.cache/babappalign/models/babappascore.pt"
```

The BABAPPAlign model file is small enough to keep. Generated BABAPPAlign embedding caches can be very large; they can be deleted safely because they can be regenerated.

## Apple Silicon / MPS

Apple Silicon/MPS support is research-alpha. It is useful for smoke tests, lightweight empirical scoring, and the completed 100K MPS validation.

Recommended shell settings:

```bash
export PYTORCH_ENABLE_MPS_FALLBACK=1
export OMP_NUM_THREADS=8
export MKL_NUM_THREADS=8
export OPENBLAS_NUM_THREADS=8
export NUMEXPR_NUM_THREADS=8
```

Check neural environment:

```bash
babappa check-neural-env
```

Run the MPS smoke test:

```bash
babappa smoke-mps-training --outdir mps_smoke --device auto --batch-size 32 --max-items 512
babappa validate-mps-smoke --smoke-dir mps_smoke
```

Light benchmark:

```bash
babappa benchmark-apple-silicon --outdir apple_silicon_benchmark --device auto --batch-sizes 32,64,128 --max-items 1024
```

If MPS fails, retry the relevant scoring stage with `--device cpu` or a smaller batch size.

## Quick Start

Inspect commands:

```bash
babappa --help
```

Launch the interactive predictor:

```bash
babappa
```

BABAPPA will ask for:

1. aligned codon MSA FASTA path
2. treefile path
3. foreground mode: `leaves`/`all`/`specific`

`leaves` is the default and scores every tree tip. `specific` asks for comma-separated tree-tip labels.

### Main End-User Command: MSA + Tree To Branch-Site Calls

If you already have a codon MSA and a tree whose tip labels match the MSA IDs, this is the intended front door:

```bash
babappa predict-branch-sites \
  --msa my_gene.codon_aligned.fasta \
  --tree my_gene.treefile \
  --foreground all \
  --model-package deployable_model_conservative_branch_site_100k_mps \
  --outdir my_gene_babappa_prediction \
  --device auto
```

To score only selected tree tips as foreground branches:

```bash
babappa predict-branch-sites \
  --msa my_gene.codon_aligned.fasta \
  --tree my_gene.treefile \
  --foreground Arabidopsis_thaliana,Arabidopsis_lyrata \
  --model-package deployable_model_conservative_branch_site_100k_mps \
  --outdir my_gene_babappa_prediction \
  --device mps
```

BABAPPA does not realign input for this command. The user-supplied MSA is the alignment used for prediction. The prediction table reports both `msa_codon_site`/`aligned_codon_site` and `branch_degapped_codon_site`, so users can locate a call in the alignment column and in the de-gapped sequence coordinate of the scored branch.
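
The relationship between the two coordinate systems can be illustrated with a minimal sketch. It assumes 1-based codon numbering and whole-codon gaps written as `---`; both are assumptions made for illustration, and the function name `degapped_codon_site` is hypothetical rather than part of the BABAPPA API.

```python
def degapped_codon_site(aligned_seq, msa_codon_site):
    """Map a 1-based alignment codon column to the 1-based codon position
    in the de-gapped sequence, or None if that column is a gap codon.

    Assumes gaps occur as whole codons written '---' (illustrative only).
    """
    if len(aligned_seq) % 3 != 0:
        raise ValueError("aligned codon sequence length must be a multiple of 3")
    codons = [aligned_seq[i:i + 3] for i in range(0, len(aligned_seq), 3)]
    if not 1 <= msa_codon_site <= len(codons):
        raise IndexError("msa_codon_site is outside the alignment")
    if codons[msa_codon_site - 1] == "---":
        return None
    # Count the non-gap codons up to and including the requested column.
    return sum(1 for codon in codons[:msa_codon_site] if codon != "---")


row = "ATG---GCCAAA"
print([degapped_codon_site(row, col) for col in (1, 2, 3, 4)])
# prints [1, None, 2, 3]
```

In this toy row, alignment column 3 corresponds to the second codon of the de-gapped sequence because column 2 is a gap in the scored branch.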

Main outputs:

- `branch_site_predictions.tsv`: site-by-branch scores and calls
- `branch_predictions.tsv`: branch-level support summary
- `gene_summary.tsv`: gene-level diagnostic summary
- `prediction_report.md`: human-readable report
- `qc_report.md`: input/applicability summary

Dry-run mode validates the MSA/tree and builds the feature table without model scoring:

```bash
babappa predict-branch-sites \
  --msa my_gene.codon_aligned.fasta \
  --tree my_gene.treefile \
  --foreground all \
  --outdir my_gene_babappa_dryrun \
  --dry-run
```
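
As a rough illustration of the kind of tree/MSA compatibility that matters here, the sketch below compares Newick tip labels with MSA IDs and checks a requested foreground list. It is not BABAPPA's validator: the helper names `newick_tip_labels` and `check_labels` are hypothetical, and the regular expression only handles simple unquoted Newick labels without bracketed comments.

```python
import re


def newick_tip_labels(newick):
    """Return tip labels from a simple Newick string.

    A tip label is taken to be a name that directly follows '(' or ','.
    Quoted labels and bracketed comments are not handled in this sketch.
    """
    return re.findall(r"[(,]\s*([^\s():,;]+)", newick)


def check_labels(newick, msa_ids, foreground="all"):
    """Report label mismatches between tree, MSA, and foreground request."""
    tips = set(newick_tip_labels(newick))
    ids = set(msa_ids)
    if foreground == "all":
        requested = set(tips)
    else:
        requested = {name.strip() for name in foreground.split(",") if name.strip()}
    return {
        "tips_missing_from_msa": sorted(tips - ids),
        "msa_ids_missing_from_tree": sorted(ids - tips),
        "unknown_foreground": sorted(requested - tips),
    }


report = check_labels("((A:0.1,B:0.2):0.3,C:0.4);", ["A", "B", "D"], "A,E")
print(report)
```

All three lists should be empty before a prediction is attempted; in the toy call above each list contains one offending label.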

### Internal Pipeline Commands

Validate the deployable package:

```bash
babappa validate-deployable-model-package --package-dir deployable_model_conservative_branch_site_100k_mps
```

Validate a tiny empirical input:

```bash
babappa validate-empirical-input \
  --cds-fasta tests/data/empirical_smoke/tiny_empirical.cds.fasta \
  --tree tests/data/empirical_smoke/tiny_empirical.treefile \
  --foreground taxon1 \
  --outdir empirical_input_smoke
```

Run a tiny empirical alignment ensemble:

```bash
babappa run-empirical-alignment-ensemble \
  --cds-fasta tests/data/empirical_smoke/tiny_empirical.cds.fasta \
  --tree tests/data/empirical_smoke/tiny_empirical.treefile \
  --foreground taxon1 \
  --outdir empirical_alignment_smoke \
  --methods identity,mafft,babappalign,muscle \
  --require-babappalign true \
  --threads 4
```

Extract empirical branch-site features:

```bash
babappa extract-empirical-branch-site-features \
  --empirical-validation-dir empirical_input_smoke \
  --alignment-dir empirical_alignment_smoke \
  --deployable-model-package deployable_model_conservative_branch_site_100k_mps \
  --outdir empirical_features_smoke \
  --foreground taxon1
```

Audit feature safety:

```bash
babappa audit-empirical-features \
  --features empirical_features_smoke/empirical_branch_site_features.tsv \
  --deployable-model-package deployable_model_conservative_branch_site_100k_mps \
  --outdir empirical_feature_audit_smoke
```

Run applicability/OOD gate:

```bash
babappa empirical-applicability \
  --empirical-validation-dir empirical_input_smoke \
  --empirical-feature-dir empirical_features_smoke \
  --deployable-model-package deployable_model_conservative_branch_site_100k_mps \
  --outdir empirical_applicability_smoke
```

Score only after validation, feature audit, and applicability have run:

```bash
babappa score-empirical-branch-sites \
  --features empirical_features_smoke/empirical_branch_site_features.tsv \
  --deployable-model-package deployable_model_conservative_branch_site_100k_mps \
  --applicability-dir empirical_applicability_smoke \
  --outdir empirical_scores_smoke \
  --device auto
```

Plan simulation-matched calibration before writing the final diagnostic report:

```bash
babappa plan-simulation-matched-calibration \
  --empirical-validation-dir empirical_input_smoke \
  --deployable-model-package deployable_model_conservative_branch_site_100k_mps \
  --outdir simulation_matched_calibration_plan_smoke
```

Generate report:

```bash
babappa make-empirical-branch-site-report \
  --outdir empirical_report_smoke \
  --empirical-validation-dir empirical_input_smoke \
  --alignment-dir empirical_alignment_smoke \
  --feature-dir empirical_features_smoke \
  --feature-audit-dir empirical_feature_audit_smoke \
  --applicability-dir empirical_applicability_smoke \
  --scoring-dir empirical_scores_smoke \
  --simulation-matched-calibration-plan simulation_matched_calibration_plan_smoke \
  --deployable-model-package deployable_model_conservative_branch_site_100k_mps
```

## Typical Workflows

### 1. Simulation Validation Workflow

Use simulation commands for development and validation, not empirical discovery.

Tiny simulation:

```bash
babappa simulate --outdir sim_smoke --n-families 3 --n-taxa 6 --n-codons 60 --seed 42 --positive-rate 0.5 --saturation-tier moderate
babappa validate-sim --sim-dir sim_smoke
babappa audit-sim --sim-dir sim_smoke --outdir sim_smoke/audit
```

Alignment and feature-building commands include:

```bash
babappa align-sim --sim-dir sim_smoke --outdir align_smoke
babappa validate-align --align-dir align_smoke
babappa build-site-map --sim-dir sim_smoke --align-dir align_smoke --outdir site_map_smoke
babappa validate-site-map --site-map-dir site_map_smoke
```

Heavy 10K/100K plans are user-run only and should not be launched casually.

### 2. Deployable Model Package Validation

The validated package is:

```text
deployable_model_conservative_branch_site_100k_mps
```

Validate package integrity:

```bash
babappa validate-deployable-model-package --package-dir deployable_model_conservative_branch_site_100k_mps
```

Smoke-load package:

```bash
babappa smoke-load-deployable-model \
  --package-dir deployable_model_conservative_branch_site_100k_mps \
  --device auto \
  --outdir deployable_model_load_smoke
```

The package includes:

- `model_manifest.json`
- `model_card.md`
- `feature_schema.json`
- `calibration_schema.json`
- `training_envelope.json`
- `tier_models/`
- `tier_calibrations/`
- `checksums.sha256`
- `validation_summary.json`
- `limitations.md`
- `README.md`

### 3. Real Empirical Input Staging

Prepare a real pilot workspace:

```bash
babappa prepare-real-empirical-pilot-workspace --workspace real_empirical_pilot --max-families 12
babappa prepare-real-pilot-inputs --workspace real_empirical_pilot --manifest real_empirical_pilot_panel.tsv --outdir real_empirical_pilot/input_staging
```

Canonical input paths:

```text
real_empirical_pilot/input/cds/<panel_id>.cds.fasta
real_empirical_pilot/input/trees/<panel_id>.treefile
```

Import one family:

```bash
babappa import-real-pilot-family \
  --workspace real_empirical_pilot \
  --panel-id FAMILY_ID \
  --gene-family "GENE_FAMILY" \
  --species-group "SPECIES_GROUP" \
  --cds-fasta /path/to/family.cds.fasta \
  --tree-file /path/to/family.treefile \
  --foreground TAXON_NAME \
  --expected-category likely_positive \
  --reference-status planned \
  --notes "real pilot candidate"
```

Batch import:

```bash
babappa import-real-pilot-batch --workspace real_empirical_pilot --batch-manifest real_empirical_pilot/import_batch.tsv
```

Validate readiness:

```bash
babappa validate-real-pilot-readiness \
  --workspace real_empirical_pilot \
  --manifest real_empirical_pilot_panel.tsv \
  --outdir real_empirical_pilot/readiness
```

Do not run the pilot until readiness says `ready_to_run: true`.

### 4. Empirical Diagnostic Workflow

Screen a family before scoring:

```bash
babappa prefilter-empirical-family \
  --cds-fasta real_empirical_pilot/input/cds/FAMILY_ID.cds.fasta \
  --tree-file real_empirical_pilot/input/trees/FAMILY_ID.treefile \
  --foreground TAXON_NAME \
  --outdir real_empirical_pilot/prefilter/FAMILY_ID \
  --max-mean-pdistance 0.35 \
  --min-taxa 6 \
  --min-codons 100
```
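
To make the three thresholds concrete, here is a minimal sketch of a mean pairwise p-distance screen. It assumes the sequences are already aligned to equal length, compares only columns where both sequences carry an unambiguous base, and measures codon count on the shortest ungapped sequence. These choices and the function names are assumptions for illustration; how `prefilter-empirical-family` computes its statistics is not specified here.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites among columns where both are A/C/G/T."""
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            differing += x != y
    return differing / compared if compared else None


def mean_pairwise_p_distance(aligned):
    """Average p-distance over all sequence pairs with comparable columns."""
    distances = [p_distance(a, b) for a, b in combinations(aligned.values(), 2)]
    distances = [d for d in distances if d is not None]
    return sum(distances) / len(distances) if distances else None


def passes_prefilter(aligned, max_mean_pdistance=0.35, min_taxa=6, min_codons=100):
    """Apply the three thresholds used in the command above (sketch only)."""
    if not aligned:
        return False
    n_codons = min(len(seq.replace("-", "")) for seq in aligned.values()) // 3
    mean_p = mean_pairwise_p_distance(aligned)
    return (
        len(aligned) >= min_taxa
        and n_codons >= min_codons
        and mean_p is not None
        and mean_p <= max_mean_pdistance
    )


toy = {"a": "ACGTAC", "b": "ACGTAA", "c": "AC-TAC"}
print(round(mean_pairwise_p_distance(toy), 4))
```

A family such as `WRKY_candidate_01`, with a reported mean p-distance of `0.725799`, would fail a `0.35` ceiling under any reasonable variant of this calculation.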

Run a small guarded panel:

```bash
babappa run-empirical-pilot-panel \
  --panel-manifest real_empirical_pilot/manifest/real_empirical_pilot_panel.tsv \
  --deployable-model-package deployable_model_conservative_branch_site_100k_mps \
  --outdir real_empirical_pilot/babappa_run \
  --methods identity,mafft,babappalign,muscle \
  --device auto \
  --max-families 12
```

Summarize and validate the panel:

```bash
babappa summarize-empirical-pilot-panel --panel-run real_empirical_pilot/babappa_run --outdir real_empirical_pilot/summary
babappa validate-empirical-pilot-summary --summary-dir real_empirical_pilot/summary
```

### 5. WRKY-Style Close-Taxa Pilot Workflow

For Arabidopsis-like WRKY families, do not mix very distant plant taxa at first. Start with closer Brassicaceae-heavy taxa:

```bash
babappa recommend-target-taxa --pilot-type plant_close --outdir real_empirical_pilot/target_taxa_recommendations
```

Plan an OOD-aware family build:

```bash
babappa plan-ood-aware-family-build \
  --family-id WRKY_candidate_02_close \
  --query-species Arabidopsis_thaliana \
  --query-gene-or-locus AT2G38470 \
  --target-taxa-file real_empirical_pilot/target_taxa_recommendations/recommended_target_taxa.tsv \
  --outdir real_empirical_pilot/acquisition_plans/WRKY_candidate_02_close \
  --max-mean-pdistance 0.35 \
  --min-taxa 6 \
  --min-codons 100
```

Current WRKY interpretation:

- `WRKY_candidate_01`: OOD stress test, mean p-distance `0.725799`, diagnostic-only, no positive call.
- `WRKY_candidate_02_close`: in-domain close-taxa WRKY33/AT2G38470 diagnostic pilot, BABAPPA diagnostic-positive, max gene support `0.177189`, called branch-site rows `6954`.
- codeml Model A vs null: LRT `0.0`, p-value `1.0`, negative.
- HyPhy aBSREL foreground p-value: `1.0`, negative.
- HyPhy MEME minimum p-value: `0.0641705`, negative at 0.05.
- Concordance: `BABAPPA_only`.
- Matched-null calibration: 100 feature-level matched nulls completed and validated with the deployable model package.
- Null result: called branch-site rows were unusual versus the feature-matched null (`p_empirical_called_rows=0.009900990099009901`), but max gene support was not unusual (`p_empirical_support=1.0`).

Correct interpretation: BABAPPA-only with mixed feature-level null support; still inconclusive as an empirical discovery claim because codeml and HyPhy are negative and the null calibration is feature-level rather than full raw sequence simulation/alignment replay.
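
The two reported null p-values are numerically consistent with the common add-one empirical estimator, `(r + 1) / (n + 1)`, where `r` is the number of null replicates at least as extreme as the observed value and `n = 100`. Whether BABAPPA computes them exactly this way is an assumption; the sketch below only shows that the arithmetic matches.

```python
def empirical_p_value(observed, null_values):
    """Add-one empirical p-value: (r + 1) / (n + 1).

    r counts null replicates whose statistic is >= the observed statistic.
    """
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# No null replicate reaches the observed statistic: 1 / 101.
print(empirical_p_value(10, [0] * 100))

# Every null replicate reaches the observed statistic: 101 / 101.
print(empirical_p_value(0.0, [1.0] * 100))
```

Under this estimator, 100 nulls cannot yield a p-value below 1/101 (about 0.0099), so the called-rows result sits at the resolution floor of the calibration rather than indicating an arbitrarily strong signal.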

### 6. Simulation-Matched Calibration Planning

Plan calibration from empirical QC:

```bash
babappa plan-simulation-matched-calibration \
  --empirical-validation-dir real_empirical_pilot/babappa_run/per_family/FAMILY_ID/empirical_input_validation \
  --deployable-model-package deployable_model_conservative_branch_site_100k_mps \
  --outdir real_empirical_pilot/babappa_run/per_family/FAMILY_ID/simulation_matched_calibration_plan
```

Summarize plan:

```bash
babappa summarize-simulation-matched-calibration-plan \
  --plan-dir real_empirical_pilot/babappa_run/per_family/FAMILY_ID/simulation_matched_calibration_plan \
  --outdir real_empirical_pilot/babappa_run/per_family/FAMILY_ID/simulation_matched_calibration_summary
```

The WRKY 100-null feature-level matched calibration has completed once under user control. It should be treated as diagnostic support only, not as a final empirical p-value system or discovery proof.

Dry-run the evidence-pack calibration command before launching anything long:

```bash
babappa run-simulation-matched-null-calibration \
  --evidence-pack real_empirical_pilot/evidence_packs/WRKY_candidate_02_close \
  --outdir real_empirical_pilot/calibration_runs/WRKY_candidate_02_close_null100_dryrun \
  --n-null 100 \
  --seed 20260530 \
  --device mps \
  --dry-run
```

Dry-run mode validates the evidence pack and writes:

- `calibration_run_plan.json`
- `calibration_run_plan.md`
- `calibration_input_validation.tsv`
- `calibration_status.json`
- `calibration_status.md`

It does not write null distributions, null percentiles, or discovery-supporting results.

To rerun the feature-level matched-null calibration:

```bash
babappa run-simulation-matched-null-calibration \
  --evidence-pack real_empirical_pilot/evidence_packs/WRKY_candidate_02_close \
  --outdir real_empirical_pilot/calibration_runs/WRKY_candidate_02_close_null100 \
  --n-null 100 \
  --seed 20260530 \
  --device mps
```

Current implementation note: the evidence-pack command is operational for safe dry-run/planning and for conservative feature-level matched-null scoring through the deployable model package. This is diagnostic calibration support, not a full raw sequence simulation plus alignment replay. Do not interpret staged or dry-run files as completed calibration, and do not treat feature-level null support as a standalone empirical discovery claim.

### 7. Classical Reference Workflow Planning

Plan codeml/HyPhy templates:

```bash
babappa plan-classical-reference-workflows \
  --panel-manifest real_empirical_pilot/manifest/real_empirical_pilot_panel.tsv \
  --outdir real_empirical_pilot/reference_plan \
  --tools codeml,hyphy
```

Check reference tools:

```bash
babappa check-reference-tools --outdir real_empirical_pilot/reference_runs/WRKY_candidate_02_close/tool_check
```

Parse prepared outputs:

```bash
babappa parse-codeml-reference \
  --codeml-dir real_empirical_pilot/reference_runs/WRKY_candidate_02_close/codeml \
  --outdir real_empirical_pilot/reference_runs/WRKY_candidate_02_close/codeml_parsed

babappa parse-hyphy-reference \
  --hyphy-dir real_empirical_pilot/reference_runs/WRKY_candidate_02_close/hyphy \
  --outdir real_empirical_pilot/reference_runs/WRKY_candidate_02_close/hyphy_parsed
```

Build reference results:

```bash
babappa build-reference-results-table \
  --panel-id WRKY_candidate_02_close \
  --codeml-parsed real_empirical_pilot/reference_runs/WRKY_candidate_02_close/codeml_parsed \
  --hyphy-parsed real_empirical_pilot/reference_runs/WRKY_candidate_02_close/hyphy_parsed \
  --outdir real_empirical_pilot/reference_results/WRKY_candidate_02_close
```

Compare:

```bash
babappa compare-empirical-reference-results \
  --babappa-panel-run real_empirical_pilot/babappa_run_wrky_close_raw_alignmentaware \
  --reference-results real_empirical_pilot/reference_results/WRKY_candidate_02_close/reference_results.tsv \
  --outdir real_empirical_pilot/comparison/WRKY_candidate_02_close
```

## Input Requirements

Empirical inputs should include:

- CDS FASTA with codon-valid sequences;
- tree file with tips matching FASTA IDs;
- foreground taxon or branch label;
- optional metadata describing expected category and reference status;
- close enough taxa for the current training envelope;
- at least 6 taxa preferred;
- at least 100 codons preferred.

Input checks include:

- duplicate sequence IDs;
- CDS length divisibility by 3;
- internal stop codons;
- ambiguous base fraction;
- gap fraction;
- pairwise p-distance;
- saturation proxy;
- foreground validity;
- tree-tip compatibility.
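
Several of these checks are simple enough to sketch directly. The example below covers duplicate IDs, length divisibility by 3, internal stop codons, and ambiguous/gap fractions. It assumes the standard genetic code and treats a stop in the final codon as acceptable; the function names are hypothetical and this is not the code behind `validate-empirical-input`.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def parse_fasta(text):
    """Parse FASTA text into a list of (id, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = (line[1:].split() or [""])[0], []
        elif line and name is not None:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_cds(records):
    """Return (id, problem) pairs for basic codon-validity problems."""
    problems, seen = [], set()
    for name, seq in records:
        bases = seq.replace("-", "")
        codons = [bases[i:i + 3] for i in range(0, len(bases) - 2, 3)]
        if name in seen:
            problems.append((name, "duplicate_id"))
        seen.add(name)
        if len(bases) % 3 != 0:
            problems.append((name, "length_not_divisible_by_3"))
        # A stop in the last codon is tolerated; earlier stops are flagged.
        if any(codon in STOP_CODONS for codon in codons[:-1]):
            problems.append((name, "internal_stop_codon"))
    return problems


def base_fractions(seq):
    """Return (ambiguous_fraction, gap_fraction) for one sequence."""
    if not seq:
        return 0.0, 0.0
    ambiguous = sum(1 for base in seq if base not in "ACGT-")
    return ambiguous / len(seq), seq.count("-") / len(seq)


fasta = ">a\nATGAAATAA\n>b\nATGTAAGCC\n>a\nATGNA\n"
print(check_cds(parse_fasta(fasta)))
```

The toy input flags `b` for an internal stop codon and the second `a` record for a duplicate ID and a length that is not a multiple of 3.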

Do not provide simulator truth or oracle labels during empirical inference. Forbidden empirical input columns include:

- `branch_site_truth`
- `selected_sites`
- `truth`
- `branch_truth`
- `oracle`
- `y_branch_site`
- `y_site`
- `gene_label`
- `positive_label`
- `simulated_label`
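
A header check against this list can be sketched in a few lines. The example assumes a tab-separated feature table and exact, case-insensitive column-name matching; the real `audit-empirical-features` command may apply broader rules, and the function name here is hypothetical.

```python
import csv
import io

# Column names copied from the forbidden list above.
FORBIDDEN_COLUMNS = {
    "branch_site_truth", "selected_sites", "truth", "branch_truth", "oracle",
    "y_branch_site", "y_site", "gene_label", "positive_label", "simulated_label",
}


def forbidden_columns_in_header(tsv_text):
    """Return the forbidden column names found in the TSV header row."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, [])
    return sorted(col for col in header if col.strip().lower() in FORBIDDEN_COLUMNS)


table = "family_id\tsite\tscore\tbranch_truth\nfam1\t12\t0.4\t1\n"
print(forbidden_columns_in_header(table))
# prints ['branch_truth']
```

An empty result means no exact forbidden name was found; it does not prove that a table is free of truth-derived information under other column names.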

## Aligners

For the main command, BABAPPA does not run aligners. The supplied codon MSA is the authoritative input:

```bash
babappa predict-branch-sites --msa aligned.codon.fasta --tree treefile --foreground all --outdir prediction
```

Optional diagnostic alignment/sensitivity workflows can use:

- `identity`
- `mafft`
- `babappalign`
- `muscle`

Diagnostic-only aligners:

- PRANK
- T-Coffee

Alignment ensemble robustness matters only when the user wants to test sensitivity to homology uncertainty. It is not required for the core user-supplied-MSA prediction workflow.

## Output Interpretation

Common terms:

- `diagnostic-positive`: BABAPPA scored support above its current diagnostic threshold. This is not a discovery claim.
- `diagnostic_only`: output may be useful for stress testing or triage but should not be interpreted as positive selection.
- `in_domain`: empirical input appears compatible with the training envelope.
- `borderline`: empirical input has warnings and should be interpreted cautiously.
- `out_of_domain`: empirical input falls outside the current training envelope; abstain from biological interpretation.
- `BABAPPA_only`: BABAPPA is positive but reference tools are negative or pending; treat as inconclusive until calibration and controls.
- `concordant_positive`: BABAPPA and at least one reference workflow support compatible evidence, subject to calibration and controls.
- `reference_only`: reference tool positive but BABAPPA not supportive; inspect alignment, OOD, and model limitations.
- `calibration_pending`: matched-null empirical calibration has not completed; do not report calibrated empirical significance.
- `feature_matched_calibration_complete`: feature-level matched null scoring has completed; interpret as diagnostic calibration support, not as a full raw sequence simulation/alignment replay.
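
The three concordance labels can be read as a small decision table. The sketch below encodes the definitions above under the assumption that each reference tool reports `True` (positive), `False` (negative), or `None` (pending). The function name and the `no_support` label are hypothetical; the latter is not defined in this README.

```python
def concordance(babappa_positive, reference_calls):
    """Assign a concordance label from BABAPPA and reference-tool calls.

    reference_calls maps a tool name to True, False, or None (pending).
    """
    any_reference_positive = any(call is True for call in reference_calls.values())
    if babappa_positive and any_reference_positive:
        return "concordant_positive"
    if babappa_positive:
        # Reference tools are all negative or still pending.
        return "BABAPPA_only"
    if any_reference_positive:
        return "reference_only"
    return "no_support"  # hypothetical label; not defined in this README


wrky_references = {"codeml": False, "hyphy_absrel": False, "hyphy_meme": False}
print(concordance(True, wrky_references))
# prints BABAPPA_only
```

The example reproduces the `BABAPPA_only` label reported for `WRKY_candidate_02_close`, where codeml, aBSREL, and MEME were all negative at 0.05.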

Responsible reporting language:

- use "diagnostic support" or "guarded empirical score";
- report applicability/OOD status;
- report aligner/method-policy status;
- report codeml/HyPhy comparison;
- state whether simulation-matched calibration is pending or complete;
- avoid "BABAPPA discovered positive selection" unless future calibration/reference/control criteria are met.

## Reproducibility

Important retained artifacts:

- deployable package: `deployable_model_conservative_branch_site_100k_mps`
- final 100K validation report: `explicit_branch_truth_100k_mps_final_validation_report.md/json/tsv`
- cross-tier summary: `explicit_branch_truth_100k_mps_cross_tier_summary/`
- truth audit: `branch_truth_status_audit_explicit_branch_truth_100k_mps/`
- WRKY evidence pack: `real_empirical_pilot/evidence_packs/WRKY_candidate_02_close/`
- Git readiness report: `GIT_PUSH_READINESS_REPORT.md`

Existing Zenodo-ready archive:

```text
BABAPPA_0.5.0-alpha_release_zenodo_20260530.tar.xz
```

Checksum:

```text
cc259617f19d9634fd6e11906903910498ab78d3797a10df1bb24b7db014dc30
```
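
The checksum has the length of a SHA-256 digest (64 hexadecimal characters). Assuming it is SHA-256 and that the archive sits in the current directory, a downloaded copy could be verified with a sketch like the following; the function name is illustrative.

```python
import hashlib
import os

ARCHIVE = "BABAPPA_0.5.0-alpha_release_zenodo_20260530.tar.xz"
EXPECTED = "cc259617f19d9634fd6e11906903910498ab78d3797a10df1bb24b7db014dc30"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Only attempt the comparison if the archive is actually present.
if os.path.exists(ARCHIVE):
    print("checksum ok" if sha256_of_file(ARCHIVE) == EXPECTED else "checksum MISMATCH")
else:
    print("archive not found; nothing verified")
```

Reading in chunks keeps memory use flat for a large archive.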

Validate package:

```bash
babappa validate-deployable-model-package --package-dir deployable_model_conservative_branch_site_100k_mps
```

Validate WRKY evidence pack:

```bash
babappa validate-empirical-evidence-pack --evidence-pack real_empirical_pilot/evidence_packs/WRKY_candidate_02_close
```

Run tests:

```bash
python -m pytest -q
```

## Storage Cleanup And User Maintenance

BABAPPA simulations can generate very large reproducible outputs. Audit before deleting anything:

```bash
babappa audit-storage --root . --outdir storage_cleanup_audit --target-size-gb 10
```

Outputs include:

- `storage_inventory.tsv`
- `storage_inventory.json`
- `storage_summary.md`
- `keep_list.tsv`
- `remove_candidates.tsv`
- `archive_candidates.tsv`
- `cleanup_dry_run.md`
- `du_top_100.txt`
- `quarantine_large_reproducible_outputs.sh`
- `delete_quarantine_after_review.sh`
- `archive_key_reports.sh`
- `validate_after_cleanup.sh`

Move candidates to quarantine only:

```bash
bash storage_cleanup_audit/quarantine_large_reproducible_outputs.sh
```

Validate after cleanup:

```bash
bash storage_cleanup_audit/validate_after_cleanup.sh
```

Do not run the permanent delete script until the quarantine has been manually reviewed. The delete script requires `CONFIRM_DELETE=YES`.

Recent storage note: the large system storage issue was caused by a generated BABAPPAlign embeddings cache at `$HOME/.cache/babappalign/embeddings`, not by the BABAPPA Git checkout. The required model file `$HOME/.cache/babappalign/models/babappascore.pt` should be preserved.

## Troubleshooting

### Missing aligners

Run:

```bash
babappa check-aligners
```

If BABAPPAlign reports a missing model, install `babappascore.pt` into `$HOME/.cache/babappalign/models/`.

### MPS/CUDA/CPU device problems

Run:

```bash
babappa check-neural-env
```

Use `--device cpu` if MPS/CUDA fails or if a tensor operation is unsupported.

### Very high p-distance or OOD input

Use closer taxa. For plant WRKY pilots, start with close Brassicaceae panels rather than broad monocot/dicot/legume mixtures.

### codeml/HyPhy disagreement

Treat disagreement conservatively. BABAPPA-only positive signals require matched-null calibration, controls, and biological review.

### Pruned intermediates

Some raw 100K intermediates were intentionally pruned after validation. Use retained summaries, audits, stage markers, model artifacts, checksums, and cleanup manifests for reproducibility.

### Package validation failure

Check that `model_manifest.json`, schemas, checksums, tier models, tier calibrations, and validation summary are present.

### Git cleanup confusion

Generated heavy outputs should not be committed. Use:

```bash
git status --short
git diff --stat
git diff --cached --stat
```

## Citation And Manuscript Status

BABAPPA is currently described by a research-alpha software/methods manuscript in:

```text
Manuscript/BABAPPA_method_paper_auxiliary_saturation.tex
```

No final publication DOI is available yet. Use the repository and release archive metadata until a formal citation is assigned.

Citation placeholder:

```text
Sinha K. BABAPPA: a research-alpha, simulation-trained framework for guarded branch-site positive-selection support under alignment uncertainty. Manuscript in preparation.
```

## PyPI Release Workflow

The package metadata lives in `pyproject.toml`, and the console entry point is:

```text
babappa = "babappa.cli:main"
```

Build locally:

```bash
python -m pip install -e ".[dev]"
python -m build
python -m twine check dist/*
```

Upload to TestPyPI first:

```bash
python -m twine upload --repository testpypi dist/*
```

Then test installation in a fresh environment. Upload to PyPI only after the TestPyPI package installs and `babappa --version` plus `babappa --help` work.

## Developer Notes

Check version:

```bash
babappa --version
```

Run tests:

```bash
python -m pytest -q
```

Inspect Git state:

```bash
git status --short
git diff --stat
git diff --cached --stat
```

Do not commit:

- raw 10K/100K simulations;
- raw alignments;
- tensor shards;
- branch-site datasets;
- prediction tables from heavy runs;
- logs;
- temporary work directories;
- generated BABAPPAlign embeddings caches;
- raw empirical downloads;
- BLAST databases or downloaded genomes/proteomes.

Commit and archive:

- source code;
- tests;
- docs;
- examples;
- manuscript source/PDF;
- deployable package metadata and selected lightweight model artifacts;
- final validation reports;
- evidence-pack manifests and summaries;
- checksums;
- cleanup manifests.

## Scientific Bottom Line

BABAPPA is ready for guarded research-alpha software/methods communication and reproducible evaluation. It is not ready for unsupported empirical positive-selection discovery claims. The next empirical step is to add close-taxa negative controls and interpret BABAPPA outputs jointly with codeml/HyPhy references, biological controls, and any future full raw sequence matched-null calibration.
