Metadata-Version: 2.4
Name: seedforge
Version: 0.1.0
Summary: AI-powered test data generator that reads your database schema and fills it with realistic, FK-valid data in seconds
Author-email: SilkHorizon <hello@silkhorizon.uz>
License: MIT
Project-URL: Homepage, https://github.com/silkhorizonstudios/seedforge
Project-URL: Documentation, https://seedforge.dev
Project-URL: Repository, https://github.com/silkhorizonstudios/seedforge
Keywords: database,testing,faker,seed,synthetic-data,postgresql
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: typer>=0.12.0
Requires-Dist: rich>=13.0.0
Requires-Dist: psycopg2-binary>=2.9.0
Requires-Dist: faker>=25.0.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: mysql
Requires-Dist: pymysql>=1.1.0; extra == "mysql"
Provides-Extra: ai
Requires-Dist: anthropic>=0.40.0; extra == "ai"
Provides-Extra: all
Requires-Dist: pymysql>=1.1.0; extra == "all"
Requires-Dist: anthropic>=0.40.0; extra == "all"

# SeedForge

**One command to fill your database with realistic test data.**

SeedForge connects to your database, reads the schema (tables, columns, foreign keys, constraints), and generates realistic, FK-valid data — no code, no config, no seed scripts.

```bash
pip install seedforge
seedforge connect postgresql://user:pass@localhost/mydb
seedforge generate --rows 1000
# Done. 40 tables filled in 3 seconds.
```

## Features

- **Zero-config** — reads your DB schema automatically, no setup needed
- **FK integrity** — resolves foreign keys via topological sort, inserts in correct order
- **Smart heuristics** — 80+ column name patterns for realistic data (`email` → real email, `price` → decimal, `role` → admin/user/editor)
- **Deterministic** — use `--seed` to get the same data every time
- **AI-powered** — optional Claude AI integration for maximum realism
- **Export** — SQL or JSON file output
- **Privacy-first** — runs entirely locally, your data never leaves your machine

## Installation

```bash
pip install seedforge

# With MySQL support
pip install "seedforge[mysql]"

# With AI support (Claude API)
pip install "seedforge[ai]"

# Everything
pip install "seedforge[all]"
```

## Quick Start

### 1. Connect

```bash
seedforge connect postgresql://user:pass@localhost:5432/mydb
```

Saves the connection to `.seedforge.yaml` so you don't have to type it again.

### 2. Inspect

```bash
seedforge inspect
```

Shows all tables, columns, types, foreign keys, and insertion order:

```text
Found 18 tables (insertion order):

         1. users
┏━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┓
┃ Column     ┃ Type      ┃ Nullable ┃ FK →  ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━┩
│ id         │ serial    │ NO       │       │
│ email      │ varchar   │ NO       │       │
│ name       │ varchar   │ YES      │       │
└────────────┴───────────┴──────────┴───────┘

             2. orders
┏━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Column     ┃ Type      ┃ Nullable ┃ FK →       ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━┩
│ id         │ serial    │ NO       │            │
│ user_id    │ integer   │ NO       │ users.id   │
│ total      │ numeric   │ NO       │            │
└────────────┴───────────┴──────────┴────────────┘
```

### 3. Generate

```bash
# Generate and insert 100 rows per table
seedforge generate --rows 100

# Preview without inserting
seedforge generate --rows 10 --dry-run

# Export to SQL file
seedforge generate --rows 1000 --export sql

# Export to JSON
seedforge generate --rows 1000 --export json

# Deterministic (same data every time)
seedforge generate --rows 100 --seed 42

# Only specific tables (auto-includes FK parents)
seedforge generate --tables orders,payments --rows 50

# Clean tables before generating
seedforge generate --rows 100 --clean
```
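When `--tables` selects only some tables, their FK parents have to be filled too, otherwise the child rows would have nothing to reference. The following is a minimal sketch of that idea, not SeedForge's actual code: the `with_fk_parents` helper and the `parents` mapping are hypothetical names used for illustration.

```python
from collections import deque


def with_fk_parents(selected, parents):
    """Return the selected tables plus every table they depend on via FKs.

    `parents` maps a table name to the set of tables its foreign keys point at.
    """
    needed = set()
    queue = deque(selected)
    while queue:
        table = queue.popleft()
        if table in needed:
            continue
        needed.add(table)
        # Walk up to the parent tables and, transitively, their parents.
        queue.extend(parents.get(table, ()))
    return needed


# Hypothetical schema: orders -> users, payments -> orders, reviews -> users.
parents = {
    "users": set(),
    "orders": {"users"},
    "payments": {"orders"},
    "reviews": {"users"},
}

print(sorted(with_fk_parents(["orders", "payments"], parents)))
# ['orders', 'payments', 'users']
```

In this sketch `reviews` is left out because nothing in the selection depends on it, while `users` is pulled in because `orders` references it.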

### 4. AI Generate (optional)

```bash
export ANTHROPIC_API_KEY=sk-...
seedforge ai-generate --rows 20
```

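Uses Claude to generate context-aware data. This command requires the `ai` extra (`pip install "seedforge[ai]"`) and an Anthropic API key, and unlike `generate` it makes network requests to the Anthropic API.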

## How It Works

1. **Schema introspection** — connects to your database, reads `information_schema` to get tables, columns, types, FK relationships, constraints, ENUMs
2. **Dependency graph** — builds a directed graph from FK relationships, runs topological sort to determine insertion order (parents first)
3. **Smart heuristics** — maps column names to appropriate generators (`email` → realistic email, `phone` → phone number, `created_at` → recent datetime)
4. **FK resolution** — child rows automatically reference real IDs from already-generated parent rows
5. **Batch insert** — fast bulk insertion with proper transaction handling
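Steps 2 and 4 above can be sketched with the standard library alone. This is an illustrative approximation under stated assumptions, not SeedForge's implementation: the `foreign_keys` mapping, `insertion_order`, and `generate` are hypothetical names, and every table is assumed to have an integer `id` primary key.

```python
import random
from graphlib import TopologicalSorter

# Hypothetical FK map: table -> {fk_column: (parent_table, parent_column)}
foreign_keys = {
    "users": {},
    "orders": {"user_id": ("users", "id")},
    "payments": {"order_id": ("orders", "id")},
}


def insertion_order(fks):
    """Order tables so that every parent comes before its children."""
    graph = {
        table: {parent for parent, _ in columns.values()}
        for table, columns in fks.items()
    }
    # TopologicalSorter expects node -> predecessors and raises
    # graphlib.CycleError if the FK graph contains a cycle.
    return list(TopologicalSorter(graph).static_order())


def generate(fks, rows_per_table, seed=None):
    """Generate rows table by table, pointing FKs at existing parent rows."""
    rng = random.Random(seed)
    data = {}
    for table in insertion_order(fks):
        rows = []
        for i in range(1, rows_per_table + 1):
            row = {"id": i}
            for column, (parent, parent_col) in fks[table].items():
                # The parent table was generated earlier, so pick a real value.
                row[column] = rng.choice(data[parent])[parent_col]
            rows.append(row)
        data[table] = rows
    return data


order = insertion_order(foreign_keys)
data = generate(foreign_keys, rows_per_table=5, seed=42)
print(order)  # ['users', 'orders', 'payments']
```

A plain topological sort fails on cyclic or self-referencing foreign keys (for example a `parent_id` column pointing at its own table), so a real tool needs a strategy for those cases; this sketch does not attempt one.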

## Column Name Heuristics

SeedForge automatically detects what kind of data to generate based on column names:

| Column name | Generated data |
|---|---|
| `email` | `john.smith@example.com` |
| `phone`, `mobile` | `+1-555-0123` |
| `first_name` | `John` |
| `last_name` | `Smith` |
| `username` | `jsmith42` |
| `address`, `street` | `123 Main St, Apt 4` |
| `city` | `San Francisco` |
| `country` | `United States` |
| `price`, `amount`, `total` | `49.99` |
| `url`, `website` | `https://example.com` |
| `avatar`, `image_url` | `https://picsum.photos/seed/123/400/300` |
| `role` | `admin`, `user`, `moderator` |
| `status` | `active`, `pending`, `completed` |
| `plan` | `free`, `pro`, `enterprise` |
| `created_at`, `updated_at` | Recent datetime |
| `is_active`, `verified` | Boolean, `true` about 85% of the time |
| `is_deleted`, `archived` | Boolean, `false` about 90% of the time |
| `password` | SHA-256 hash |
| `token`, `api_key` | Random hex string |
| `uuid`, `guid` | Valid UUID v4 |
| ...and 60+ more patterns | |
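One plausible way to implement this kind of lookup is an ordered list of name patterns, each paired with a generator that draws from a seeded random source. The sketch below is an assumption about the general technique, not SeedForge's pattern table: `make_generators`, `generate_value`, and `generate_row` are hypothetical, and it uses the standard library's `random` module in place of Faker to stay self-contained.

```python
import random
import re
import uuid


def make_generators(rng):
    """Hypothetical pattern table: (regex on the column name, generator)."""
    return [
        (re.compile(r"email"), lambda: f"user{rng.randint(1, 9999)}@example.com"),
        (re.compile(r"^(price|amount|total)$"), lambda: round(rng.uniform(1, 500), 2)),
        # Biased booleans: true ~85% of the time, or true only ~10% of the time.
        (re.compile(r"^(is_active|verified)$"), lambda: rng.random() < 0.85),
        (re.compile(r"^(is_deleted|archived)$"), lambda: rng.random() < 0.10),
        (re.compile(r"^(uuid|guid)$"),
         lambda: str(uuid.UUID(int=rng.getrandbits(128), version=4))),
        (re.compile(r"^status$"),
         lambda: rng.choice(["active", "pending", "completed"])),
    ]


def generate_value(column, generators, rng):
    """Return a value from the first generator whose pattern matches."""
    for pattern, generator in generators:
        if pattern.search(column.lower()):
            return generator()
    # Fallback when no pattern matches; a real tool would use the column type.
    return f"value_{rng.randint(0, 9999)}"


def generate_row(columns, seed):
    """Build one row; the same seed always yields the same row."""
    rng = random.Random(seed)
    generators = make_generators(rng)
    return {column: generate_value(column, generators, rng) for column in columns}


columns = ["email", "price", "is_active", "uuid", "status", "notes"]
row_a = generate_row(columns, seed=42)
row_b = generate_row(columns, seed=42)
print(row_a == row_b)  # True
```

Routing every random draw through one seeded `random.Random` instance is what makes a `--seed`-style option reproducible in this sketch; any generator that reached for an unseeded source would break that guarantee.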

## Configuration

`.seedforge.yaml` (auto-created by `seedforge connect`):

```yaml
db_url: postgresql://user:pass@localhost:5432/mydb
default_rows: 100
default_schema: public
seed: 42  # optional, for deterministic generation
exclude_tables:
  - _prisma_migrations
  - django_migrations
```

## Supported Databases

- [x] PostgreSQL
- [x] MySQL / MariaDB (requires the `mysql` extra)
- [ ] SQLite (planned)

## Data Privacy

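**Your data never leaves your machine.** SeedForge runs entirely locally: it connects directly to your database, generates data in memory, and inserts it. No cloud, no telemetry, no data collection. The optional `ai-generate` command is the one exception to fully local operation, since it calls the Claude API over the network.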

## License

MIT

## Contributing

Issues and PRs welcome at [github.com/silkhorizonstudios/seedforge](https://github.com/silkhorizonstudios/seedforge).
