Metadata-Version: 2.4
Name: daylily-ephemeral-cluster
Version: 5.1.28
Summary: Infrastructure-as-code for ephemeral AWS ParallelCluster environments for bioinformatics
Author-email: Daylily Informatics <daylily@daylilyinformatics.com>
License-Expression: GPL-3.0-only
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: boto3>=1.26.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: ruamel.yaml>=0.18.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pydantic-settings>=2.0.0
Requires-Dist: typer<1,>=0.12
Requires-Dist: cli-core-yo==2.1.1
Requires-Dist: daylily-omics-analysis==2.0.41
Requires-Dist: setuptools<81
Requires-Dist: requests>=2.31.0
Requires-Dist: tabulate>=0.8.10
Requires-Dist: python-dateutil>=2.8.2
Requires-Dist: aws-parallelcluster==3.13.2
Requires-Dist: typeguard~=2.13.3
Requires-Dist: pytest>=7.4.0
Requires-Dist: pytest-cov>=4.1.0
Requires-Dist: moto>=4.2.0
Requires-Dist: black>=23.0.0
Requires-Dist: ruff>=0.1.0
Requires-Dist: mypy>=1.5.0
Requires-Dist: boto3-stubs[ec2,s3]>=1.28.0
Requires-Dist: ipython>=8.0.0
Requires-Dist: prompt-toolkit<3.0.52,>=3.0.41
Requires-Dist: yamllint>=1.35.0
Dynamic: license-file

# Daylily Ephemeral Cluster

[![Latest release](https://img.shields.io/badge/dynamic/yaml?url=https%3A%2F%2Fraw.githubusercontent.com%2FDaylily-Informatics%2Fdaylily-ephemeral-cluster%2Fmain%2Fconfig%2Fdaylily_cli_global.yaml&query=%24.daylily.git_ephemeral_cluster_repo_release_tag&label=latest%20release&cacheSeconds=300&color=teal)](https://github.com/Daylily-Informatics/daylily-ephemeral-cluster/releases) [![Latest tag](https://img.shields.io/badge/dynamic/yaml?url=https%3A%2F%2Fraw.githubusercontent.com%2FDaylily-Informatics%2Fdaylily-ephemeral-cluster%2Fmain%2Fconfig%2Fdaylily_cli_global.yaml&query=%24.daylily.git_ephemeral_cluster_repo_tag&label=latest%20tag&color=pink&cacheSeconds=300)](https://github.com/Daylily-Informatics/daylily-ephemeral-cluster/tags)

Daylily Ephemeral Cluster, usually called DYEC or DayEC, is the Daylily control plane for short-lived AWS ParallelCluster environments. It renders cluster configuration, validates AWS prerequisites, creates FSx for Lustre storage, connects to headnodes through AWS Systems Manager, stages inputs, launches workflow repositories, exports completed analysis directories, and optionally registers exported evidence with Dewey for downstream QEO ingestion.

The cluster is disposable. The S3 inputs, reference bucket, analysis-export bucket, command catalog, and evidence receipts are durable. Do not delete a cluster until the export receipt and expected S3 outputs are verified.

## Philosophy

DYEC is deliberately not a dogma-locked workflow manager. It provisions and exports the execution environment. The checked-out repository owns its workflow engine, command syntax, containers, profile, and final file layout below the analysis root. DayOA/Snakemake is the first-class Daylily workflow repository, and nf-core/Nextflow repositories such as `daylily-sarek` can also run on the same cluster when they honor the same FSx analysis-root and export contract.

The operating contract is strict. Missing config, credentials, references, run mounts, licenses, or runtime assets should fail hard, as should invalid sample identity, unsafe path segments, non-empty export destinations, and malformed command catalog rows. DYEC should not guess a bucket, invent a credential, choose a replacement reference, or silently fall back to a legacy launch path.
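
To illustrate the fail-hard stance on unsafe path segments, here is a minimal sketch. It is not DYEC's actual validation: the function name `is_safe_segment` and the allowed character set are assumptions made for this example.

```bash
# Hypothetical helper, not part of the dyec CLI.
# Accepts one path segment (for example an analysis ID) and rejects anything
# that is empty, is "." or "..", or contains a character outside an assumed
# conservative set: letters, digits, dot, underscore, and hyphen.
is_safe_segment() {
  case "$1" in
    ""|"."|"..") return 1 ;;
    *[!A-Za-z0-9._-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Report the problem instead of guessing a corrected value.
segment="demo-analysis_01"
if is_safe_segment "$segment"; then
  echo "ok: $segment"
else
  echo "unsafe path segment: $segment" >&2
fi
```

The point of the design is that a rejected value is reported and never rewritten, so the operator has to supply a corrected one.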

## Architecture

```mermaid
flowchart LR
  Operator["operator or service<br/>dyec CLI"] --> Config["explicit config<br/>AWS profile, region, buckets"]
  Config --> Pcluster["AWS ParallelCluster"]
  Pcluster --> Headnode["headnode<br/>ubuntu via SSM"]
  Pcluster --> FSx["FSx for Lustre"]
  RefBucket["reference S3 bucket"] -->|reference DRA| References["/fsx/references"]
  RunBucket["run S3 prefix"] -->|optional run DRA| RunMount["/fsx/run_dir_mounts/<mount_id>"]
  Headnode --> Repo["workflow repository checkout"]
  References --> Repo
  RunMount --> Repo
  Repo --> Results["/fsx/analysis_results/<entity>/<analysis_id>"]
  Results -->|temporary export DRA| AnalysisBucket["analysis S3 bucket"]
  AnalysisBucket --> Receipt["fsx_export.yaml"]
  Receipt --> Dewey["Dewey registration"]
  Dewey --> QEO["QEO ingestion"]
```

## Filesystem Contract

| Path | Owner | Purpose |
|---|---|---|
| `/fsx/references` | DYEC cluster config | Reference and runtime assets mounted from the configured reference bucket. |
| `/fsx/control_data` | optional cluster config | Repeated-test or control assets when configured. |
| `/fsx/run_dir_mounts/<mount_id>` | `dyec mounts` | Read-oriented S3 run-folder Data Repository Associations (DRAs). |
| `/fsx/analysis_results/<executing_entity>/<analysis_id>` | workflow repository | Repository checkout, logs, work state, outputs, reports, and benchmarks. |
| `s3://<analysis-bucket>/<prefix>/<executing_entity>/<analysis_id>/` | `dyec export` | Durable export destination for one completed analysis directory. |

Run mounts and references are inputs. They are not export sources. The export source is exactly one completed analysis directory under `/fsx/analysis_results/<executing_entity>/<analysis_id>`.
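
The rule above can be sketched as a path-shape check. This is an illustration only: `is_export_source` is a hypothetical helper, not something the dyec CLI provides, and it checks only the shape of the path, not whether the analysis has completed.

```bash
# Hypothetical helper, not part of the dyec CLI.
# Succeeds only when the path has the shape
# /fsx/analysis_results/<executing_entity>/<analysis_id>, with exactly two
# non-empty segments below the analysis root.
is_export_source() {
  local rest entity analysis_id
  case "$1" in
    /fsx/analysis_results/*) rest=${1#/fsx/analysis_results/} ;;
    *) return 1 ;;            # references and run mounts are rejected here
  esac
  rest=${rest%/}              # tolerate a single trailing slash
  entity=${rest%%/*}
  # With no slash left, the analysis_id segment is missing.
  [ "$rest" != "$entity" ] || return 1
  analysis_id=${rest#*/}
  case "$entity" in
    ""|"."|"..") return 1 ;;
  esac
  case "$analysis_id" in
    ""|"."|".."|*/*) return 1 ;;   # missing, relative, or nested too deep
  esac
  return 0
}

for candidate in /fsx/analysis_results/ubuntu/demo-analysis /fsx/references; do
  if is_export_source "$candidate"; then
    echo "export source: $candidate"
  else
    echo "not an export source: $candidate"
  fi
done
```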

## Setup

Prerequisites:

- AWS credentials for a non-default profile with ParallelCluster, EC2, IAM, CloudFormation, S3, FSx, SSM, CloudWatch, and related read/write permissions.
- AWS region and availability zone selected for the cluster.
- AWS Session Manager plugin installed locally.
- AWS ParallelCluster CLI available through this repo environment.
- Configured S3 buckets for references, optional control data, staging, and analysis exports.
- A Daylily config file, normally `~/.config/daylily/daylily_ephemeral_cluster.yaml`, with explicit bucket and cluster settings.

Activate the checkout and inspect the live CLI:

```bash
cd /path/to/daylily-ephemeral-cluster
source ./activate
dyec --json version
dyec --help
dyec runtime status
dyec --json repositories commands
```

The examples in this document use the following environment variables. Replace each `<placeholder>` with a real value from your environment before running them:

```bash
export AWS_PROFILE=<non-default-profile>
export REGION=us-west-2
export REGION_AZ=us-west-2d
export CLUSTER_NAME=<cluster-name>
export DAY_EX_CFG="$HOME/.config/daylily/daylily_ephemeral_cluster.yaml"
export REF_S3_URI=s3://<reference-bucket>
export CONTROL_DATA_S3_URI=s3://<control-data-bucket>
export STAGE_S3_URI=s3://<staging-bucket>/<prefix>
export ANALYSIS_RESULTS_S3_URI=s3://<analysis-results-bucket>/<prefix>
export EXECUTING_ENTITY=ubuntu
export ANALYSIS_ID=<analysis-id>
export EXPORT_S3_URI="$ANALYSIS_RESULTS_S3_URI/$EXECUTING_ENTITY/$ANALYSIS_ID/"
```
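
Before running any lifecycle command, it can help to confirm that no variable is still unset or still holds a `<placeholder>`. The following is a minimal sketch under that assumption. `require_vars` is a hypothetical helper, not a dyec command.

```bash
# Hypothetical helper, not part of the dyec CLI.
# Checks every named variable and reports all problems before returning,
# so a single run shows everything that still needs a real value.
# Arguments must be plain variable names because they are passed to eval.
require_vars() {
  local name value status
  status=0
  for name in "$@"; do
    eval "value=\${$name-}"
    case "$value" in
      "")
        echo "unset or empty: $name" >&2
        status=1
        ;;
      *"<"*">"*)
        echo "placeholder not replaced: $name=$value" >&2
        status=1
        ;;
    esac
  done
  return "$status"
}

# Example guard to place before the lifecycle commands:
#   require_vars AWS_PROFILE REGION REGION_AZ CLUSTER_NAME DAY_EX_CFG || exit 1
```

The helper only reports. It never substitutes a default, which keeps it in line with the rule that DYEC should not guess a bucket or invent a credential.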

## Lifecycle

```bash
dyec preflight \
  --profile "$AWS_PROFILE" \
  --region-az "$REGION_AZ" \
  --config "$DAY_EX_CFG"

dyec create \
  --profile "$AWS_PROFILE" \
  --region-az "$REGION_AZ" \
  --config "$DAY_EX_CFG"

dyec headnode connect \
  --profile "$AWS_PROFILE" \
  --region "$REGION" \
  --cluster "$CLUSTER_NAME"
```

After connection, the supported headnode user is `ubuntu` in an interactive bash login shell. Run manual DayOA workflows inside a persistent `tmux` session, issuing each of the following as a separate command:

```bash
source dyoainit
dy-a slurm hg38_broad
dy-r help -p -k -j 1 -n
```

For catalog-backed sample analysis, prefer `dyec samples run`:

```bash
dyec samples run ./analysis_samples.tsv \
  --command-id illumina_snv_alignstats \
  --profile "$AWS_PROFILE" \
  --region "$REGION" \
  --cluster "$CLUSTER_NAME" \
  --reference-s3-uri "$REF_S3_URI" \
  --control-data-s3-uri "$CONTROL_DATA_S3_URI" \
  --stage-s3-uri "$STAGE_S3_URI" \
  --analysis-id "$ANALYSIS_ID" \
  --executing-entity "$EXECUTING_ENTITY" \
  --export-destination-s3-uri "$EXPORT_S3_URI" \
  --export-trigger on-success \
  --dry-run
```

For run-folder analysis, attach a read-only run mount before launching a run-context command:

```bash
dyec --json mounts create "s3://<sequencing-run-bucket>/<run-prefix>/" \
  --profile "$AWS_PROFILE" \
  --region "$REGION" \
  --cluster "$CLUSTER_NAME" \
  --platform ILMN \
  --read-only \
  --wait \
  --timeout-seconds 3600

dyec --json mounts verify \
  --profile "$AWS_PROFILE" \
  --region "$REGION" \
  --cluster "$CLUSTER_NAME" \
  --mount-id <mount_id>
```

Export exactly one completed analysis directory:

```bash
dyec export \
  --profile "$AWS_PROFILE" \
  --region "$REGION" \
  --cluster "$CLUSTER_NAME" \
  --source-path "/fsx/analysis_results/$EXECUTING_ENTITY/$ANALYSIS_ID" \
  --destination-s3-uri "$EXPORT_S3_URI" \
  --output-dir "./tmp-export/$ANALYSIS_ID"
```

Inspect `fsx_export.yaml` before cleanup. Deletion is destructive: run `dyec delete --dry-run` first, and perform the live deletion only after the intended effect is understood and approved.
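
One way to check the expected S3 outputs before deletion is to compare a list of expected object keys against a listing of the export destination. The sketch below assumes both lists are plain text files with one key per line. The helper `missing_keys` and the file names are hypothetical, and the layout of `fsx_export.yaml` is deliberately not assumed here.

```bash
# Hypothetical helper, not part of the dyec CLI.
# Prints every key from the expected file that is absent from the listing
# file and returns non-zero if any key is missing.
missing_keys() {
  local expected listing missing
  expected=$1
  listing=$2
  # grep exits 1 when it selects no lines, which here means nothing is
  # missing, so that status is deliberately ignored.
  missing=$(grep -Fxv -f "$listing" "$expected" || true)
  if [ -n "$missing" ]; then
    printf '%s\n' "$missing"
    return 1
  fi
  return 0
}

# One possible way to produce the listing with the AWS CLI:
#   aws s3api list-objects-v2 --bucket <analysis-results-bucket> \
#     --prefix <prefix>/ --query 'Contents[].Key' --output text \
#     | tr '\t' '\n' > listing.txt
#   missing_keys expected_keys.txt listing.txt || echo "do not delete yet" >&2
```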

## CLI Surface

Use `dyec --help` for the current root command list. Current major groups include:

- `preflight`, `create`, `delete`, `drift`, `cluster-info`
- `cluster`, `headnode`, `samples`, `workflow`
- `repositories`, `mounts`, `mount`, `export`, `exports`
- `slurm-accounting`, `aws`, `pricing`, `runtime`, `env`, `state`, `resources-dir`

Important inspection commands:

```bash
dyec --json version
dyec --json cluster describe --profile "$AWS_PROFILE" --region "$REGION" --cluster "$CLUSTER_NAME"
dyec --json repositories commands
dyec repositories commands --command-id illumina_snv_alignstats
dyec workflow status --profile "$AWS_PROFILE" --region "$REGION" --cluster "$CLUSTER_NAME" --session <session>
dyec workflow logs --profile "$AWS_PROFILE" --region "$REGION" --cluster "$CLUSTER_NAME" --session <session> --lines 100
```

The Slurm accounting helper manages external accounting infrastructure when configured. A running cluster can have the `sacct` binary installed while accounting storage is disabled; in that state `sacct` cannot provide job accounting records even though the command exists.
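
A read-only way to tell the two states apart is to look at the `AccountingStorageType` line in the output of `scontrol show config`. The sketch below is an assumption-laden illustration. `accounting_enabled` is a hypothetical helper, and the values it treats as disabled (`accounting_storage/none`, `(null)`, or a missing line) are assumptions about how Slurm reports this setting and may differ between Slurm versions.

```bash
# Hypothetical helper, not part of the dyec CLI.
# Reads `scontrol show config` output on stdin and succeeds only when
# AccountingStorageType names something other than the assumed
# "disabled" values.
accounting_enabled() {
  local storage_type
  storage_type=$(awk -F'= *' '/^AccountingStorageType/ { print $2; exit }' \
    | tr -d '[:space:]')
  case "$storage_type" in
    ""|"(null)"|"accounting_storage/none") return 1 ;;
    *) return 0 ;;
  esac
}

# Example use on the headnode:
#   scontrol show config | accounting_enabled \
#     || echo "accounting storage is disabled; sacct has no records" >&2
```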

## Repository Catalog

`config/daylily_pipeline_command_catalog.yaml` is the source of truth for blessed repositories and commands. The packaged copy under `daylily_ec/resources/payload/config/` must match it. The current catalog default for DayOA is `2.0.41`; `daylily-sarek` is also present as a Nextflow/nf-core Sarek repository entry.

Catalog command classes:

- `utility`: no sample or run inputs, usually used for smoke tests.
- `sample_analysis`: consumes `analysis_samples.tsv`, stages sample/unit manifests, and launches a repository command.
- `run_analysis`: consumes `runs.tsv` and requires a matching `/fsx/run_dir_mounts/<mount_id>` input mount.
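
The three classes can be summarized as a small lookup. This is a sketch for illustration only. `required_input_for_class` is a hypothetical helper and does not read the real catalog.

```bash
# Hypothetical helper, not part of the dyec CLI.
# Maps a catalog command class to the input file it consumes, and fails
# on any class name it does not recognize instead of guessing.
required_input_for_class() {
  case "$1" in
    utility)         echo "none" ;;
    sample_analysis) echo "analysis_samples.tsv" ;;
    run_analysis)    echo "runs.tsv" ;;
    *)
      echo "unknown command class: $1" >&2
      return 1
      ;;
  esac
}

for class in utility sample_analysis run_analysis; do
  echo "$class -> $(required_input_for_class "$class")"
done
```

A `run_analysis` command additionally needs its matching run mount, which this lookup does not check.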

## Reference Bucket Contract

The reference bucket is mounted to `/fsx/references` at cluster creation. It should contain:

- organism references and indexes for supported genome builds
- GIAB truth resources and high-confidence BEDs where concordance targets need them
- slim sample read fixtures used by catalog validation
- runtime assets that must be present before workflow activation, such as pinned tool installs, container caches, and licensed commercial tool assets
- tool-specific resource directories for annotation, STR, contamination, metagenomics, or other optional targets

DYEC does not choose alternate references at runtime. If a command catalog row points to a missing path, the launch should fail during staging, profile activation, or workflow execution with a clear missing-asset error.
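
An operator can approximate the same missing-asset behavior with a quick existence check on the headnode. The following is a minimal sketch. `check_required_paths` is a hypothetical helper, and the relative paths passed to it would come from your own catalog rows, not from anything assumed here.

```bash
# Hypothetical helper, not part of the dyec CLI.
# Takes a root directory followed by relative paths, reports every path
# that does not exist under the root, and returns non-zero if any are
# missing. It never substitutes an alternate path.
check_required_paths() {
  local root rel status
  root=$1
  shift
  status=0
  for rel in "$@"; do
    if [ ! -e "$root/$rel" ]; then
      echo "missing asset: $root/$rel" >&2
      status=1
    fi
  done
  return "$status"
}

# Example use on the headnode, with your own relative paths:
#   check_required_paths /fsx/references <relative/path> <relative/path> \
#     || echo "fix the reference bucket before launching" >&2
```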

## Supporting Services

- **Dewey**: DYEC can register exported DayOA evidence after a successful export when the command catalog declares an explicit `artifact_registration` policy.
- **QEO**: QEO loading is requested through Dewey/outbox events. DayOA emits local evidence; DYEC maps that evidence to exported S3 artifacts.
- **Ursa**: Ursa can own operator worksets and launch UX above DYEC. DYEC remains the cluster and export control plane.
- **PCUI**: PCUI-style interfaces should call the same catalog and CLI/API surfaces rather than duplicating launch policy.
- **Slurm**: Slurm is cluster infrastructure. Monitoring with `squeue`, `sacct` (when accounting is configured), logs, and DYEC status commands is allowed. Scheduler, node, job, drain/resume, requeue, cancel, or service interventions require explicit operator approval.

## Contributing

When adding a runnable pipeline repository:

- add a repository row to the command catalog with a pinned `default_ref`
- add explicit command rows for supported launch profiles
- declare input contract, required columns, genome build, targets, jobs, and runtime parameters
- make the repository write all durable outputs below `/fsx/analysis_results/<executing_entity>/<analysis_id>`
- document export-relevant reports, logs, benchmarks, and manifests
- add tests for catalog rendering, command validation, and dry-run behavior where possible
- avoid compatibility aliases, inferred defaults, or fallback command paths

Historical plans and terminal working docs live under `docs/jem_working_docs/`. Active ledgers remain in `docs/plans/`.

## Further Reading

- [Quickest Start](docs/quickest_start.md)
- [CLI Reference](docs/cli_reference.md)
- [Pipeline Manager Launches](docs/pipeline_manager_launches.md)
- [DRA and FSx Strategy](docs/dra_fsx_strategy.md)
- [HG003 Benchmarking And Costs](docs/hg003_benchmarking_and_costs.md)
- [Working Docs Archive Index](docs/jem_working_docs/INDEX.md)
