Metadata-Version: 2.4
Name: crunr
Version: 0.2.0
Summary: Run any compute job on AWS with a single command
License-Expression: MIT
Keywords: aws,ec2,cloud,gpu,machine-learning,spot
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: System :: Distributed Computing
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: boto3>=1.34
Requires-Dist: botocore>=1.34
Requires-Dist: rich>=13.7
Provides-Extra: dev
Requires-Dist: pytest>=7.4; extra == "dev"
Requires-Dist: pytest-mock>=3.12; extra == "dev"
Requires-Dist: moto[ec2,sts]>=5.0; extra == "dev"
Requires-Dist: ruff>=0.3; extra == "dev"
Requires-Dist: mypy>=1.8; extra == "dev"
Requires-Dist: boto3-stubs[ec2,sts]>=1.34; extra == "dev"

# crunr

Run any compute job on AWS EC2 with a single command — no DevOps required.

```bash
crunr run train.py --gpu
```

crunr provisions an instance, uploads your code, streams live output, downloads results, and terminates the instance automatically, so you never pay for idle time. With S3 persistence enabled, outputs also survive network failures.

## Install

```bash
pip install crunr
```

Requires Python 3.10+ and an AWS account.

## Quick start

```bash
# 1. Configure AWS credentials (one-time)
crunr auth

# 2. Run a script on the cheapest CPU instance
crunr run script.py

# 3. Run on a GPU instance
crunr run train.py --gpu

# 4. Specify minimum VRAM
crunr run train.py --gpu --memory 24

# 5. Pass environment variables
crunr run train.py --env EPOCHS=50 --env LR=0.001
```
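
Inside the job, values passed with `--env` can be read like any other environment variable. The sketch below shows one way `train.py` might pick up `EPOCHS` and `LR` from the example above. The helper name `read_hyperparams` and the fallback defaults are illustrative assumptions, not part of crunr.

```python
import os


def read_hyperparams(environ=os.environ):
    """Read job settings from environment variables.

    The defaults (10 epochs, lr 0.01) are arbitrary placeholders that apply
    when the script runs without --env, for example on your own machine.
    """
    epochs = int(environ.get("EPOCHS", "10"))
    lr = float(environ.get("LR", "0.01"))
    return epochs, lr


if __name__ == "__main__":
    epochs, lr = read_hyperparams()
    print(f"training for {epochs} epochs at lr={lr}")
```

Environment variables always arrive as strings, so the script converts them explicitly.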

## How it works

1. **Provision** — selects the cheapest matching spot instance, falls back to on-demand
2. **Sync** — uploads your local directory to the instance (rsync or scp+tar)
3. **Execute** — runs your command with live log streaming
4. **Collect** — downloads any `outputs/` directory back to your machine
5. **Terminate** — instance is always destroyed, even on Ctrl+C or crash
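
The five steps above can be sketched as a context manager whose `finally` clause performs the termination. This is a minimal illustration of the "always destroyed" guarantee, not crunr's actual implementation. The callables `provision`, `sync`, `execute`, `collect`, and `terminate` are hypothetical stand-ins for the real steps.

```python
from contextlib import contextmanager


@contextmanager
def ephemeral_instance(provision, terminate):
    """Yield a freshly provisioned instance and always terminate it afterwards."""
    instance = provision()
    try:
        yield instance
    finally:
        # Runs on normal exit, on exceptions, and on KeyboardInterrupt (Ctrl+C).
        terminate(instance)


def run_job(provision, sync, execute, collect, terminate):
    """Run sync -> execute -> collect on an instance that cannot outlive the job."""
    with ephemeral_instance(provision, terminate) as instance:
        sync(instance)
        exit_code = execute(instance)
        collect(instance)
        return exit_code
```

A `finally` block only covers failures inside the client process. Cleaning up after a client that was killed outright needs a separate mechanism, such as `crunr clean`.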

## Commands

| Command | Description |
|---|---|
| `crunr auth` | Configure AWS credentials |
| `crunr run <script>` | Run a job on EC2 |
| `crunr jobs` | Show local job history |
| `crunr ps` | List running instances |
| `crunr clean` | Terminate all orphaned instances |
| `crunr s3 setup` | Create an S3 bucket for output persistence |
| `crunr s3 list` | List jobs stored in S3 |
| `crunr s3 pull <JOB_ID>` | Download a job's outputs from S3 |
| `crunr s3 status` | Show bucket usage and saved config |
| `crunr s3 rm <JOB_ID>` | Delete a job from S3 |

## `crunr run` options

```
--gpu               Request a GPU instance (cheapest available)
--memory GB         Minimum GPU VRAM or RAM in GB
--instance TYPE     Exact EC2 instance type (e.g. g5.xlarge)
--disk GB           Root EBS volume size (default: 8 GB CPU, 100 GB GPU)
--env KEY=VALUE     Environment variable passed to the job (repeatable)
--dir PATH          Local directory to sync (default: current directory)
--on-demand         Use on-demand pricing instead of spot
--profile NAME      AWS credential profile
--region REGION     Override AWS region

# S3 output persistence
--s3                Back up outputs to S3 using saved config
--s3-bucket NAME    S3 bucket name (auto-created if needed)
--s3-prefix PREFIX  Key prefix inside the bucket (default: crunr-jobs)
--s3-no-local       Skip local download — outputs in S3 only
--s3-ttl DAYS       Auto-delete this job's S3 data after N days
```
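
To illustrate how `--gpu` and `--memory` might interact, here is a small sketch that picks the cheapest GPU instance meeting a VRAM floor. It is not crunr's real selection logic. The `Candidate` class and `pick_cheapest_gpu` function are hypothetical, and the prices are the approximate spot figures from the Cost table in this README rather than live data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    instance_type: str
    vram_gb: int          # 0 means the instance has no GPU
    spot_per_hour: float  # approximate USD, copied from the Cost table


CANDIDATES = [
    Candidate("t3.micro", 0, 0.003),
    Candidate("g4dn.xlarge", 16, 0.16),
    Candidate("g5.xlarge", 24, 0.34),
    Candidate("p3.2xlarge", 16, 0.92),
]


def pick_cheapest_gpu(candidates, min_vram_gb: int = 0) -> Optional[Candidate]:
    """Return the cheapest GPU candidate with at least min_vram_gb of VRAM."""
    matching = [
        c for c in candidates
        if c.vram_gb > 0 and c.vram_gb >= min_vram_gb
    ]
    # default=None covers the case where nothing satisfies the constraint.
    return min(matching, key=lambda c: c.spot_per_hour, default=None)
```

With these figures, `pick_cheapest_gpu(CANDIDATES)` selects `g4dn.xlarge`, and raising the floor to 24 GB selects `g5.xlarge`.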

## Saving outputs

Have your script write its result files to an `outputs/` directory. crunr downloads that directory automatically after the job finishes:

```python
import os
os.makedirs("outputs", exist_ok=True)
with open("outputs/result.txt", "w") as f:
    f.write("done")
```

## S3 output persistence

Without S3, outputs are lost if your network drops during a job or download, because they disappear when the instance terminates. With S3 enabled, the EC2 instance pushes outputs directly to S3 (using its own IAM role) before the local download begins, so your results are safe regardless of client connectivity.

```bash
# One-time setup
crunr s3 setup --bucket crunr-yourname-outputs

# Run with S3 backup
crunr run train.py --gpu --s3

# Recover outputs after a network failure
crunr s3 list
crunr s3 pull <JOB_ID>
```

S3 key layout:
```
s3://your-bucket/crunr-jobs/<job-id>/outputs/        ← your output files
s3://your-bucket/crunr-jobs/<job-id>/stdout.log      ← full job log
s3://your-bucket/crunr-jobs/<job-id>/metadata.json   ← cost, duration, exit code
```
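
If you want to read these objects with your own tooling, the keys can be derived from the job ID. The helper below is a sketch that assumes the layout shown above, with the log and metadata files sitting beside `outputs/` and the default `crunr-jobs` prefix. The function name `job_keys` and the job ID `abc123` are hypothetical.

```python
def job_keys(job_id: str, prefix: str = "crunr-jobs") -> dict[str, str]:
    """Build the S3 keys for one job, following the layout shown above."""
    # Trim stray slashes so a prefix such as "crunr-jobs/" does not produce "//".
    base = f"{prefix.strip('/')}/{job_id}"
    return {
        "outputs": f"{base}/outputs/",
        "log": f"{base}/stdout.log",
        "metadata": f"{base}/metadata.json",
    }


if __name__ == "__main__":
    for name, key in job_keys("abc123").items():
        print(f"{name}: {key}")
```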

## AWS IAM permissions

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CrunrVerify",
      "Effect": "Allow",
      "Action": ["sts:GetCallerIdentity"],
      "Resource": "*"
    },
    {
      "Sid": "CrunrDescribe",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances", "ec2:DescribeImages", "ec2:DescribeKeyPairs",
        "ec2:DescribeSecurityGroups", "ec2:DescribeSpotPriceHistory",
        "ec2:DescribeAvailabilityZones", "ec2:DescribeVpcs",
        "ec2:DescribeSubnets", "ec2:DescribeInstanceTypes", "ec2:DescribeInstanceStatus"
      ],
      "Resource": "*"
    },
    {
      "Sid": "CrunrInstances",
      "Effect": "Allow",
      "Action": [
        "ec2:CreateKeyPair", "ec2:DeleteKeyPair",
        "ec2:CreateSecurityGroup", "ec2:AuthorizeSecurityGroupIngress",
        "ec2:RunInstances", "ec2:TerminateInstances", "ec2:CreateTags",
        "ec2:RequestSpotInstances", "ec2:DescribeSpotInstanceRequests",
        "ec2:CancelSpotInstanceRequests"
      ],
      "Resource": "*"
    },
    {
      "Sid": "CrunrS3Bucket",
      "Effect": "Allow",
      "Action": [
        "s3:CreateBucket", "s3:ListBucket", "s3:GetBucketLocation",
        "s3:PutBucketPublicAccessBlock", "s3:PutBucketPolicy",
        "s3:PutEncryptionConfiguration", "s3:PutBucketOwnershipControls",
        "s3:PutLifecycleConfiguration", "s3:GetLifecycleConfiguration"
      ],
      "Resource": "arn:aws:s3:::crunr-*"
    },
    {
      "Sid": "CrunrS3Objects",
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::crunr-*/*"
    },
    {
      "Sid": "CrunrIAMRole",
      "Effect": "Allow",
      "Action": [
        "iam:CreateRole", "iam:GetRole", "iam:PutRolePolicy",
        "iam:GetRolePolicy", "iam:DeleteRolePolicy", "iam:DeleteRole", "iam:TagRole"
      ],
      "Resource": "arn:aws:iam::*:role/crunr-s3-writer"
    },
    {
      "Sid": "CrunrIAMProfile",
      "Effect": "Allow",
      "Action": [
        "iam:CreateInstanceProfile", "iam:GetInstanceProfile",
        "iam:AddRoleToInstanceProfile", "iam:RemoveRoleFromInstanceProfile",
        "iam:DeleteInstanceProfile", "iam:TagInstanceProfile"
      ],
      "Resource": "arn:aws:iam::*:instance-profile/crunr-instance-profile"
    },
    {
      "Sid": "CrunrPassRole",
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::*:role/crunr-s3-writer",
      "Condition": {
        "StringEquals": {"iam:PassedToService": "ec2.amazonaws.com"}
      }
    }
  ]
}
```

The S3, IAM, and PassRole statements are only needed if you use `--s3`. The STS and EC2 statements are the minimum for `crunr run`.
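
If you do not plan to use `--s3`, you can trim the policy down before attaching it. The sketch below keeps only statements whose actions are all in the `sts:` or `ec2:` namespaces. The function name `core_policy` and the file name `policy.json` are assumptions for illustration, and you should review the result before using it.

```python
import json


def core_policy(policy: dict) -> dict:
    """Return a copy of the policy holding only its STS and EC2 statements."""

    def is_core(statement: dict) -> bool:
        actions = statement["Action"]
        # IAM allows "Action" to be a single string or a list of strings.
        if isinstance(actions, str):
            actions = [actions]
        return all(a.startswith(("sts:", "ec2:")) for a in actions)

    return {**policy, "Statement": [s for s in policy["Statement"] if is_core(s)]}


if __name__ == "__main__":
    # Assumes the JSON above was saved as policy.json (hypothetical file name).
    with open("policy.json") as f:
        print(json.dumps(core_policy(json.load(f)), indent=2))
```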

## Cost

You only pay for the time the instance runs. Spot instances are typically **60–90% cheaper** than on-demand.

| Instance | GPU | Spot price (approx. USD/hr) |
|---|---|---|
| `t3.micro` | — | ~$0.003 |
| `g4dn.xlarge` | T4 16GB | ~$0.16 |
| `g5.xlarge` | A10G 24GB | ~$0.34 |
| `p3.2xlarge` | V100 16GB | ~$0.92 |

S3 storage costs ~$0.023/GB/month. A typical job's outputs (log + artifacts) cost fractions of a cent per month.
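
As a rough worked example using the approximate figures above: a 2-hour job on `g5.xlarge` at ~$0.34/hr costs about $0.68 in compute, and keeping 0.5 GB of outputs in S3 costs about $0.0115 per month. The helper names below are hypothetical, and the arithmetic ignores other charges such as EBS volumes and data transfer.

```python
def compute_cost(hours: float, rate_per_hour: float) -> float:
    """Estimate compute cost in USD as runtime multiplied by the hourly rate."""
    return hours * rate_per_hour


def s3_monthly_cost(size_gb: float, rate_per_gb_month: float = 0.023) -> float:
    """Estimate monthly S3 storage cost in USD at the README's ~$0.023/GB/month."""
    return size_gb * rate_per_gb_month


if __name__ == "__main__":
    print(f"compute: ${compute_cost(2, 0.34):.2f}")        # 2 h on g5.xlarge
    print(f"storage: ${s3_monthly_cost(0.5):.4f}/month")   # 0.5 GB of outputs
```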

**crunr is free.** You pay AWS directly for compute and storage — no subscriptions, no markup.

## License

MIT
