Metadata-Version: 2.4
Name: job-scrapper
Version: 0.1.0
Summary: Scrape job offers and extract structured data using AI
License-Expression: MIT
Requires-Python: >=3.14
Requires-Dist: anthropic>=0.50.0
Requires-Dist: selenium>=4.40.0
Provides-Extra: dev
Requires-Dist: pre-commit>=4.2.0; extra == 'dev'
Provides-Extra: lint
Requires-Dist: ruff>=0.11.0; extra == 'lint'
Description-Content-Type: text/markdown

# job-scrapper

Scrape job offers and extract structured data using AI (Claude).

## Features

- Scrapes job listing pages using Selenium with the standard Chrome WebDriver
- Extracts structured data (title, company, skills, stack, process…) via the Claude LLM
- Outputs a formatted Markdown job description sheet
- Caches results by URL hash under `var/jobs/`
- Opens a live browser by default for manual interaction (Cloudflare, login walls)
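
The URL-keyed cache mentioned above could be sketched roughly as follows. This is an illustration, not the project's actual code: only the `var/jobs/` directory comes from the feature list, while the choice of SHA-256, the 16-character truncation, the `.md` extension, and the helper names `cache_path` and `load_or_build` are all assumptions.

```python
import hashlib
from pathlib import Path
from typing import Callable

# Cache root taken from the feature list; everything else here is hypothetical.
CACHE_DIR = Path("var/jobs")


def cache_path(url: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Map a URL to a stable file path derived from its hash."""
    # Assumption: SHA-256, truncated to 16 hex characters for short file names.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return cache_dir / f"{digest}.md"


def load_or_build(
    url: str, build: Callable[[str], str], cache_dir: Path = CACHE_DIR
) -> str:
    """Return the cached result for `url`, calling `build` only on a cache miss."""
    path = cache_path(url, cache_dir)
    if path.exists():
        return path.read_text(encoding="utf-8")
    content = build(url)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return content
```

Hashing the URL gives a fixed-length, filesystem-safe key, so the same job page is scraped and sent to the model only once.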

## Install

```bash
uv sync
```

## Usage

```bash
uv run job-scrapper <URL>                            # opens browser (default)
uv run job-scrapper <URL> --no-live                  # headless mode
uv run job-scrapper <URL> --output-dir ~/out         # custom output directory
uv run job-scrapper <URL> --model claude-haiku-4-5   # cheaper/faster model
```
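
For readers curious how these flags might be parsed, here is a hedged `argparse` sketch. The flag names come from the commands above; the defaults (`var/jobs` as the output directory, `None` as the model, meaning "use the tool's own default"), the `dest` names, and the function `build_parser` are assumptions rather than the project's actual implementation.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented command-line flags."""
    parser = argparse.ArgumentParser(prog="job-scrapper")
    parser.add_argument("url", help="URL of the job offer to scrape")
    # The live browser is the default; --no-live switches to headless mode.
    parser.add_argument(
        "--no-live", dest="live", action="store_false",
        help="run the browser headless",
    )
    # Assumption: the default output directory matches the cache location.
    parser.add_argument("--output-dir", type=Path, default=Path("var/jobs"))
    # Assumption: None means "fall back to the tool's own default model".
    parser.add_argument("--model", default=None)
    return parser


args = build_parser().parse_args(
    ["https://example.com/job", "--no-live", "--model", "claude-haiku-4-5"]
)
print(args.live, args.output_dir, args.model)
```

Using `store_false` with `dest="live"` keeps the live browser as the default and lets the rest of the code branch on a single positive boolean.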

## Development

```bash
uv sync --extra dev --extra lint
pre-commit install
```

Lint and format:

```bash
ruff check src/
ruff format src/
```

## Environment

| Variable            | Required | Description    |
|---------------------|----------|----------------|
| `ANTHROPIC_API_KEY` | Yes      | Claude API key |
