Metadata-Version: 2.4
Name: aws-waste-radar
Version: 0.2.4
Summary: Scan AWS accounts for storage and EC2 cost-waste findings (snapshot lifecycle, io2->gp3, Spot/Graviton candidates) with tag-based remediation scoping.
Project-URL: Homepage, https://github.com/sarimor/aws-waste-radar
Author: Mor Michaeli
License: MIT
Keywords: aws,cost-optimization,ebs,ec2,finops,graviton,karpenter,spot
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: System :: Systems Administration
Requires-Python: >=3.9
Requires-Dist: boto3>=1.28
Description-Content-Type: text/markdown

# aws-waste-radar

Scan an AWS account for **storage** and **EC2** cost-waste findings using purely
technical signals (EC2/EBS APIs + CloudWatch). Tag taxonomy is used **only to
scope remediation** — never for detection — and untagged resources are flagged
`NEEDS_CLASSIFICATION` instead of being silently excluded.

Companion to the Hyperscaler Radar suite (`aws-radar`, `gcp-radar`, `azure-radar`, `oci-radar`).

## Install

```bash
pip install aws-waste-radar
```

## Usage

Every result set is written as **both CSV and JSON** (same basename) plus a `summary.json`.
All findings from whichever checks ran are also merged into a single consolidated
**`findings.csv`** / **`findings.json`** (long format: `category, source, resource_type,
resource_id, name, finding, scope, details`).

```bash
# Everything
aws-waste-radar --all --profile prod --region us-east-1

# Category flags
aws-waste-radar --storage          # all storage checks
aws-waste-radar --ec2              # all EC2 checks
aws-waste-radar --billing          # daily account cost, by service (Cost Explorer)

# Per-discovery flags (mix and match freely)
aws-waste-radar --snapshots --io2-gp3            # storage: just these two
aws-waste-radar --graviton --spot                # ec2: just these two
aws-waste-radar --unattached --prev-gen --json   # cross-category + JSON summary to stdout
```

| Flag | Category | Output files |
|---|---|---|
| `--snapshots` | storage | `storage_snapshots.{csv,json}` + DLM presence in summary |
| `--io2-gp3` | storage | `storage_io_to_gp3.{csv,json}` |
| `--pvc` | storage | `storage_k8s_pvc_volumes.{csv,json}` |
| `--unattached` | storage | `storage_unattached_volumes.{csv,json}` |
| `--spot` | ec2 | `ec2_spot_candidates.{csv,json}` |
| `--graviton` | ec2 | `ec2_graviton_candidates.{csv,json}` |
| `--low-cpu` | ec2 | `ec2_low_cpu_candidates.{csv,json}` |
| `--prev-gen` | ec2 | `ec2_prev_gen_candidates.{csv,json}` |
| `--schedule` | ec2 | `ec2_schedule_candidates.{csv,json}` |
| `--billing` | billing | `billing_daily.{csv,json}` + `billing_by_service.{csv,json}` |

Any EC2 check also produces the full `ec2_inventory.{csv,json}` and combined `ec2_findings.{csv,json}`.
`--billing` defaults to a 30-day window; override with `--billing-days N`.
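
Because the consolidated file is long-format, a per-scope tally needs nothing beyond the standard library. The sketch below assumes only the column names listed above; the sample rows, the finding name `UNATTACHED`, and the `source` values are made up for illustration. For a real run, pass `open(path, newline="")` for wherever your `findings.csv` landed.

```python
import csv
import io
from collections import Counter

def count_by_scope(csv_file):
    """Count findings per (scope, finding) pair from an open findings.csv."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[(row["scope"], row["finding"])] += 1
    return counts

# Inline sample standing in for a real findings.csv (rows are illustrative only).
sample = io.StringIO(
    "category,source,resource_type,resource_id,name,finding,scope,details\n"
    "ec2,ec2_findings,instance,i-0aaa,web-1,PREV_GEN,NEEDS_CLASSIFICATION,m4.large\n"
    "ec2,ec2_findings,instance,i-0bbb,web-2,PREV_GEN,NEEDS_CLASSIFICATION,c4.xlarge\n"
    "storage,storage_unattached_volumes,volume,vol-0ccc,,UNATTACHED,"
    "NEEDS_CLASSIFICATION,100 GiB\n"
)

for (scope, finding), n in sorted(count_by_scope(sample).items()):
    print(f"{scope:24} {finding:12} {n}")
```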

## What it detects

### `--storage`

| CSV | Finding | Signal |
|---|---|---|
| `storage_snapshots.csv` | Snapshots older than N days (default 7) and orphaned snapshots | snapshot `StartTime`, missing source volume |
| (summary) | Missing DLM lifecycle policy | `dlm get-lifecycle-policies` empty |
| `storage_io_to_gp3.csv` | io1/io2 volumes eligible for gp3 | peak observed IOPS < 16,000 and throughput < 1,000 MB/s over the lookback window |
| `storage_k8s_pvc_volumes.csv` | PVC-backed EBS volumes (incl. legacy gp2) | CSI `kubernetes.io/created-for/*` tags |
| `storage_unattached_volumes.csv` | Unattached (`available`) volumes | volume state |

> PVC filesystem utilization needs kubelet stats from inside the cluster; the AWS
> API only shows capacity. The CSV maps volumes → PVC/namespace so the in-cluster
> check is a join, not a hunt.
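
The snapshot and io1/io2 rules in the table reduce to a few comparisons. The following is a minimal sketch of those rules, not the tool's actual code: the function names and the `OLD_SNAPSHOT` / `ORPHANED_SNAPSHOT` labels are hypothetical, while the thresholds (7 days, 16,000 IOPS, 1,000 MB/s) come from the table. The `StartTime` and `VolumeId` fields match what `ec2:DescribeSnapshots` returns.

```python
from datetime import datetime, timedelta, timezone

GP3_MAX_IOPS = 16_000            # from the table above
GP3_MAX_THROUGHPUT_MBPS = 1_000  # from the table above

def snapshot_findings(snapshot, existing_volume_ids, now, max_age_days=7):
    """Return hypothetical finding labels for one DescribeSnapshots entry."""
    findings = []
    if now - snapshot["StartTime"] > timedelta(days=max_age_days):
        findings.append("OLD_SNAPSHOT")
    # Orphaned: the source volume no longer exists in the account/region.
    if snapshot.get("VolumeId") not in existing_volume_ids:
        findings.append("ORPHANED_SNAPSHOT")
    return findings

def is_gp3_candidate(volume_type, peak_iops, peak_throughput_mbps):
    """True when an io1/io2 volume's observed peaks fit under both thresholds."""
    if volume_type not in ("io1", "io2"):
        return False
    return (peak_iops < GP3_MAX_IOPS
            and peak_throughput_mbps < GP3_MAX_THROUGHPUT_MBPS)

now = datetime(2024, 1, 31, tzinfo=timezone.utc)
snap = {"SnapshotId": "snap-0example", "VolumeId": "vol-0gone",
        "StartTime": datetime(2024, 1, 1, tzinfo=timezone.utc)}
print(snapshot_findings(snap, existing_volume_ids={"vol-0live"}, now=now))
print(is_gp3_candidate("io2", peak_iops=3_200, peak_throughput_mbps=140))
```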

### `--ec2`

| CSV | Finding | Signal |
|---|---|---|
| `ec2_findings.csv` | `SPOT_CANDIDATE` | on-demand instance with Karpenter/ASG tags |
| | `GRAVITON_CANDIDATE` | x86 family with an arm64 equivalent (m5→m7g, c5a→c7g, …) |
| | `LOW_CPU` | CloudWatch avg < 10% and peak < 40% over lookback |
| | `PREV_GEN` | m4/c4/r4/i3-era families with modern successors |
| `ec2_schedule_candidates.csv` | `SCHEDULE_CANDIDATE` | daily busy/idle pattern: ≥6 idle hours/day with a clear active window (start/stop candidate) |
| | `NEEDS_MONITORING` | too little CPU history to judge (new/short-lived instance) |
| `ec2_inventory.csv` | Full running-instance inventory with all signals | |
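
As a rough illustration of two of the signals above, the sketch below applies the `LOW_CPU` thresholds and a family-to-family Graviton lookup. It is an assumption-laden sketch rather than the tool's implementation: the function names are hypothetical, and the map holds only the two pairs the table names (the real list is evidently longer). It also does not verify that the same size exists in the target family.

```python
# Only the two pairs named in the table; treat as a partial, illustrative map.
GRAVITON_EQUIVALENTS = {"m5": "m7g", "c5a": "c7g"}

def graviton_target(instance_type):
    """Map e.g. 'm5.xlarge' to 'm7g.xlarge'; None if no known arm64 family."""
    family, _, size = instance_type.partition(".")
    target = GRAVITON_EQUIVALENTS.get(family)
    return f"{target}.{size}" if target and size else None

def is_low_cpu(avg_pct, peak_pct):
    """LOW_CPU rule from the table: average < 10% and peak < 40%."""
    return avg_pct < 10.0 and peak_pct < 40.0

print(graviton_target("m5.xlarge"))
print(graviton_target("t2.micro"))
print(is_low_cpu(avg_pct=4.2, peak_pct=31.0))
```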

> **`--schedule` costs nothing and needs no setup.** It reads `CPUUtilization` from
> **basic monitoring** — free, automatic (hypervisor-level, no agent), retained 63 days —
> aggregated to an hour-of-day profile. So it works on *existing* history immediately;
> there is no 48-hour wait or customer opt-in. The active window is reported in local
> time via `--schedule-tz-offset` (e.g. `-5` for US Eastern Standard Time, UTC-5). Memory- or
> disk-based scheduling is the only case that would need the CloudWatch agent; CPU does not.
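
One way to picture the hour-of-day aggregation is sketched below. It assumes hourly `(UTC timestamp, CPU %)` pairs such as `cloudwatch:GetMetricStatistics` would yield with a 3600-second period; the fetch itself is left out. The 5% idle cutoff, the "fewer than 24 distinct hours" rule for `NEEDS_MONITORING`, and both function names are assumptions for illustration, and "a clear active window" is simplified to "at least one non-idle hour". Only the 6-idle-hours threshold comes from the table. The offset is handled as whole hours.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

IDLE_CPU_PCT = 5.0   # assumption: an hour averaging below this counts as idle
MIN_IDLE_HOURS = 6   # from the table above

def hour_of_day_profile(datapoints, tz_offset_hours=0):
    """Average CPU per local hour-of-day from (utc_datetime, cpu_pct) pairs."""
    buckets = defaultdict(list)
    for ts, cpu in datapoints:
        buckets[(ts.hour + tz_offset_hours) % 24].append(cpu)
    return {hour: sum(vals) / len(vals) for hour, vals in buckets.items()}

def classify_schedule(profile):
    """Return (label, active_local_hours) for one instance's profile."""
    if len(profile) < 24:  # assumption: not every hour of the day was observed
        return "NEEDS_MONITORING", []
    active = [h for h in range(24) if profile[h] >= IDLE_CPU_PCT]
    if active and 24 - len(active) >= MIN_IDLE_HOURS:
        return "SCHEDULE_CANDIDATE", active
    return None, active

# Synthetic week: busy 14:00-22:00 UTC, near-idle otherwise.
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
synthetic = [(start + timedelta(hours=i), 60.0 if 14 <= i % 24 < 22 else 1.0)
             for i in range(7 * 24)]

profile = hour_of_day_profile(synthetic, tz_offset_hours=-5)
print(classify_schedule(profile))
```

With the `-5` offset, the busy block in the synthetic data shifts from 14:00-22:00 UTC to a 09:00-17:00 local window.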

### `--billing`

| CSV | Content | Source |
|---|---|---|
| `billing_daily.csv` | Daily unblended cost over the window (default 30 days) | Cost Explorer `GetCostAndUsage`, `DAILY` granularity |
| `billing_by_service.csv` | Per-service cost and % of total over the window | same call, grouped by `SERVICE` |
| (summary) | `window_total`, `daily_avg`, `top_services`, `currency` | aggregated |

> Cost Explorer is a global, opt-in service billed at ~$0.01 per request; `--billing`
> makes a single call. This view is account-level **daily** spend: use the daily
> trend to spot cost creep and the per-service split to see where it lands. Hourly
> (24h) cost granularity is intentionally not included in this build, and `--billing`
> does no per-instance analysis; start/stop schedule detection lives in `--schedule`.
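
The summary fields can be derived from a single grouped response. The sketch below assumes input shaped like the `ResultsByTime` list that `GetCostAndUsage` returns when grouped by `SERVICE` with the `UnblendedCost` metric. The function name and the inner layout of `top_services` are hypothetical; only the four top-level key names come from the table above. The amounts are invented.

```python
from collections import defaultdict

def summarize_costs(results_by_time, top_n=3):
    """Aggregate daily, per-service costs into the summary fields."""
    per_service = defaultdict(float)
    daily_totals = []
    currency = None
    for day in results_by_time:
        day_total = 0.0
        for group in day["Groups"]:
            cost = group["Metrics"]["UnblendedCost"]
            amount = float(cost["Amount"])  # the API returns amounts as strings
            currency = currency or cost["Unit"]
            per_service[group["Keys"][0]] += amount
            day_total += amount
        daily_totals.append(day_total)
    total = sum(daily_totals)
    top = sorted(per_service.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    return {
        "window_total": round(total, 2),
        "daily_avg": round(total / len(daily_totals), 2) if daily_totals else 0.0,
        "top_services": [
            {"service": name, "cost": round(cost, 2),
             "pct": round(100 * cost / total, 1) if total else 0.0}
            for name, cost in top
        ],
        "currency": currency,
    }

def _group(service, amount):
    return {"Keys": [service],
            "Metrics": {"UnblendedCost": {"Amount": amount, "Unit": "USD"}}}

sample = [
    {"TimePeriod": {"Start": "2024-01-01", "End": "2024-01-02"},
     "Groups": [_group("EC2", "10.0"), _group("S3", "2.0")]},
    {"TimePeriod": {"Start": "2024-01-02", "End": "2024-01-03"},
     "Groups": [_group("EC2", "6.0"), _group("S3", "2.0")]},
]
print(summarize_costs(sample))
```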

## Scope classification

- `SERVICES/LOGS (safe to remediate)` — tagged, no customer-data hints
- `CUSTOMER_DB (do not touch)` — tag values hint at customer/tenant data plane
- `NEEDS_CLASSIFICATION` — no matching tags: this is itself a tagging-gap finding
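
The three-way split can be read as a small decision function. The sketch below illustrates the idea only: the tag keys and hint words are hypothetical placeholders, since the tool's real taxonomy is not listed in this README. Only the three scope labels are taken from the list above.

```python
# Hypothetical taxonomy, for illustration only.
SCOPE_TAG_KEYS = ("service", "component", "workload")
CUSTOMER_HINTS = ("customer", "tenant")

def classify_scope(tags):
    """Classify a resource's remediation scope from its tag dict."""
    values = [value.lower() for key, value in tags.items()
              if key.lower() in SCOPE_TAG_KEYS]
    if not values:
        # No matching tags: reported as a tagging gap, never silently skipped.
        return "NEEDS_CLASSIFICATION"
    if any(hint in value for value in values for hint in CUSTOMER_HINTS):
        return "CUSTOMER_DB (do not touch)"
    return "SERVICES/LOGS (safe to remediate)"

print(classify_scope({"Service": "log-shipper"}))
print(classify_scope({"service": "tenant-db"}))
print(classify_scope({"Name": "scratch-box"}))
```

Checking for the no-tags case first keeps detection independent of tagging: an untagged resource still surfaces, it just cannot be scoped for remediation yet.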

## IAM permissions

The tool is **strictly read-only**: it only describes resources and reads metrics,
lifecycle policies, and cost data, and makes no mutating API calls. The
least-privilege policy matching exactly what the code calls is:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "ec2:DescribeInstances",
      "ec2:DescribeVolumes",
      "ec2:DescribeSnapshots",
      "cloudwatch:GetMetricStatistics",
      "dlm:GetLifecyclePolicies",
      "ce:GetCostAndUsage"
    ],
    "Resource": "*"
  }]
}
```
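
If you adapt the policy, a quick sanity check is to confirm that every allowed action still begins with a read-style verb. The helper below is a hypothetical convenience, not part of the tool. It is a prefix heuristic only: it flags wildcards and anything not starting with `Describe`, `Get`, or `List`, and it does not replace a proper IAM review.

```python
import json

READ_ONLY_VERBS = ("Describe", "Get", "List")

def non_read_only_actions(policy):
    """Return allowed actions whose verb is not Describe*/Get*/List*."""
    statements = policy["Statement"]
    if isinstance(statements, dict):  # IAM permits a single statement object
        statements = [statements]
    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            verb = action.split(":", 1)[-1]
            if not verb.startswith(READ_ONLY_VERBS):
                flagged.append(action)
    return flagged

policy = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "ec2:DescribeInstances", "ec2:DescribeVolumes", "ec2:DescribeSnapshots",
      "cloudwatch:GetMetricStatistics", "dlm:GetLifecyclePolicies",
      "ce:GetCostAndUsage"
    ],
    "Resource": "*"
  }]
}
""")
print(non_read_only_actions(policy))
```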

## License

MIT
