Metadata-Version: 2.4
Name: grabberlib2
Version: 0.15.1
Requires-Python: <4.0,>=3.10
Requires-Dist: ada-url>=1.15.3
Requires-Dist: aerich==0.7.2
Requires-Dist: aiogram==3.22.0
Requires-Dist: aiograph>=0.2
Requires-Dist: aiosqlite>=0.20.0
Requires-Dist: alive-progress==3.1.5
Requires-Dist: beautifulsoup4==4.12.3
Requires-Dist: boltons>=24.1.0
Requires-Dist: boto3==1.35.38
Requires-Dist: browser-cookie3>=0.20.1
Requires-Dist: casefy==0.1.7
Requires-Dist: cement==3.0.12
Requires-Dist: cloudscraper>=1.2.71
Requires-Dist: colorama==0.4.6
Requires-Dist: colorlog==6.8.2
Requires-Dist: crawlee[all,beautifulsoup,parsel,playwright]==0.6.9
Requires-Dist: defusedxml==0.7.1
Requires-Dist: environs==14.1.1
Requires-Dist: fastapi>=0.116.0
Requires-Dist: ftfy>=6.3.1
Requires-Dist: httpx[http2]==0.28.1
Requires-Dist: jinja2==3.1.3
Requires-Dist: lxml==5.2.1
Requires-Dist: nest-asyncio>=1.6.0
Requires-Dist: olefile==0.47
Requires-Dist: patool==2.1.1
Requires-Dist: pgcli>=4.1.0
Requires-Dist: pillow==10.3.0
Requires-Dist: pybypass==2.0.1
Requires-Dist: pydoll-python==2.9.0
Requires-Dist: python-telegram-bot[callback-data,job-queue,rate-limiter,webhooks]>=21.7
Requires-Dist: requests-html==0.10.0
Requires-Dist: requests==2.31.0
Requires-Dist: rich==13.7.1
Requires-Dist: selenium-stealth>=1.0.6
Requires-Dist: selenium==4.24.0
Requires-Dist: streamlit-extras==0.4.3
Requires-Dist: streamlit==1.35.0
Requires-Dist: telegraph==2.2.0
Requires-Dist: telethon>=1.39.0
Requires-Dist: tenacity==8.2.3
Requires-Dist: tortoise-orm[asyncpg,psycopg]>=0.22.1
Requires-Dist: tqdm==4.66.4
Requires-Dist: tweepy>=4.16.0
Requires-Dist: undetected-chromedriver>=3.5.5
Requires-Dist: unidecode>=1.3.8
Requires-Dist: unipath==1.1
Requires-Dist: url-parser>=3.0.3
Requires-Dist: uvicorn>=0.34.0
Requires-Dist: webdriver-manager==4.0.2
Requires-Dist: you-get>=0.4.1730
Requires-Dist: yt-dlp>=2025.8.11
Description-Content-Type: text/markdown

# Grabber

Utilities to extract media from supported sources and publish to Telegram/Telegraph.

## Quick Start

1. Install dependencies:

```bash
uv sync --dev
```

2. Configure the environment in `.env`. At minimum, set:
- `BOT_TOKEN`
- `TELEGRAPH_TOKEN`
- `SHORT_NAME`
- `AUTHOR_NAME`
- `AUTHOR_URL`

3. Show supported entities:

```bash
uv run grabber -a
```

## Run (CLI)

Example:

```bash
uv run grabber --publish --channel @your_channel --sources "https://example.com/post"
```

Send plain text to a channel:

```bash
uv run grabber --channel @your_channel --text "Some text"
```

Post text into a thread (discussion group):

```bash
uv run grabber --channel @your_channel --thread-id 123 --text "Some text in thread"
```

If `--thread-id` is omitted, `grabber` prints the resolved thread id so you can reuse it:

```bash
uv run grabber --channel @your_channel --text "Some text"
```

Output:

```text
Text sent to @your_channel
Resolved thread id for reuse: --thread-id 123
```
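When scripting around the CLI, the thread id can be picked out of that output for the next call. The sketch below assumes the hint line has exactly the wording shown in the sample output above; `extract_thread_id` is a hypothetical helper, not something `grabber` provides.

```python
import re

# Matches the hint line from the sample output above (assumed wording).
THREAD_ID_RE = re.compile(r"Resolved thread id for reuse: --thread-id (-?\d+)")


def extract_thread_id(output):
    """Return the thread id found in `output` as an int, or None if absent."""
    match = THREAD_ID_RE.search(output)
    return int(match.group(1)) if match else None


sample_output = (
    "Text sent to @your_channel\n"
    "Resolved thread id for reuse: --thread-id 123\n"
)
print(extract_thread_id(sample_output))  # 123
```

The returned value can then be passed back as `--thread-id` on later invocations.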

## Run Interactive Bot

```bash
uv run python -m grabber.core.bot.interactive_bot
```

Then send `/post`, select a channel, and send the source URLs.

## Run Web UI

The web interface runs as two services:

- `web-api`: FastAPI backend (session-protected endpoints)
- `web-ui`: React dashboard

### Setup

1. Add web environment variables:

```bash
WEB_UI_PASSWORD=your-strong-password
WEB_SESSION_SECRET=a-long-random-string
WEB_SESSION_TTL_SECONDS=86400
```

2. Start the stack:

```bash
docker compose up --build web-api web-ui
```

3. Open `http://<your-vps-ip>:8888` (or map another public port in `docker-compose.yml`).

4. Log in with `WEB_UI_PASSWORD`. After login:

- Enter one or more URLs and a target channel
- Optionally enable tag mode or video mode
- Send plain text by checking **Post text only**
- Optionally provide a thread id (the equivalent of the CLI `--thread-id` flag) for text replies/comments

Notes:

- The channel list is pulled from the existing bot `CHANNELS` mapping (`src/grabber/core/settings.py`).
- The entity list comes from `QUERY_MAPPING`.
- API routes are exposed under `/api/*` and proxied by the UI.
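For the `WEB_SESSION_SECRET` value in the Setup step, one way to produce a long random string is Python's standard `secrets` module. This is only a sketch: the 32-byte length is an assumption, since the README does not state a required length, and `make_session_secret` is a hypothetical helper.

```python
import secrets


def make_session_secret(num_bytes=32):
    """Return a URL-safe random string usable as WEB_SESSION_SECRET.

    The 32-byte default is an assumption, not a documented requirement.
    """
    return secrets.token_urlsafe(num_bytes)


# Print lines that can be pasted into `.env`.
print(f"WEB_SESSION_SECRET={make_session_secret()}")
print("WEB_SESSION_TTL_SECONDS=86400")  # 24 hours, as in the Setup example
```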

## Pornhub Video-to-Channel

Single-video Pornhub URLs are supported for direct Telegram delivery.

### Requirements

- A target channel is mandatory (`--channel` in CLI or bot channel picker).
- `yt_dlp` must be available in the runtime environment.
- `PORNHUB_COOKIES_FILE_PATH` must point to a valid Netscape-format cookies file.

Add to `.env`:

```bash
PORNHUB_COOKIES_FILE_PATH=/absolute/path/to/pornhub_cookies.txt
```
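Before running `grabber`, the cookies file can be sanity-checked with the standard library, which reads the same Netscape format. This is a sketch under assumptions: `cookies_file_problem` is a hypothetical helper, and the checks `grabber` itself performs may differ.

```python
from http.cookiejar import LoadError, MozillaCookieJar
from pathlib import Path


def cookies_file_problem(path):
    """Return a description of what is wrong with `path`, or None if it loads."""
    if not Path(path).is_file():
        return f"{path} does not exist or is not a file"
    try:
        # Keep session and expired cookies: only the file format is checked here.
        MozillaCookieJar().load(path, ignore_discard=True, ignore_expires=True)
    except LoadError as exc:
        return f"{path} is not a valid Netscape cookies file: {exc}"
    return None
```

A return value of `None` means the file exists and parses as Netscape format; it does not prove the cookies are still accepted by the site.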

### Supported Pornhub hosts

- `pornhub.com`
- `www.pornhub.com`
- `m.pornhub.com`
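A host check against this list could look like the sketch below. It is illustrative only: `is_supported_host` is a hypothetical name, and the adapter's real matching logic (for example, how it recognizes single-video paths) is not shown in this README.

```python
from urllib.parse import urlsplit

# Hosts listed above.
SUPPORTED_HOSTS = {"pornhub.com", "www.pornhub.com", "m.pornhub.com"}


def is_supported_host(url):
    """Return True if the URL's host is one of the listed hosts."""
    # `hostname` is lowercased by urlsplit and is None when no host is present.
    return urlsplit(url).hostname in SUPPORTED_HOSTS


print(is_supported_host("https://www.pornhub.com/view_video.php?viewkey=ph5abc123"))  # True
print(is_supported_host("https://example.com/post"))  # False
```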

### CLI example

```bash
uv run grabber \
  --publish \
  --channel -1003835153909 \
  --sources "https://www.pornhub.com/view_video.php?viewkey=ph5abc123"
```

Note: CLI `--channel` must be a Telegram username/ID (for example `@channel_name` or `-100...`), not the interactive bot display label.

### Delivery behavior

- Chooses the best format expected to fit Telegram Bot API upload limits (`<= 50MB`).
- If the selected media is oversized, it falls back to sending a link message to the channel.
- If `--channel` is missing, Pornhub execution fails fast with an explicit error.
- If `PORNHUB_COOKIES_FILE_PATH` is missing/invalid, execution fails fast.
- Checkpoint dedupe is applied, and successful sends are committed.
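The size-based selection and link fallback described above can be sketched roughly as follows. This is not the adapter's actual code: `pick_format` and `plan_delivery` are hypothetical names, the format dictionaries borrow the `format_id`, `height`, `filesize`, and `filesize_approx` keys that `yt-dlp` reports, and treating 50MB as `50 * 1024 * 1024` bytes is an assumption.

```python
MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumed byte value of the 50MB limit


def _size(fmt):
    """Return the known or approximate size of a format in bytes, or None."""
    return fmt.get("filesize") or fmt.get("filesize_approx")


def pick_format(formats, limit=MAX_UPLOAD_BYTES):
    """Return the tallest format whose size is known and fits `limit`, else None."""
    fitting = [f for f in formats if _size(f) is not None and _size(f) <= limit]
    if not fitting:
        return None
    # Prefer higher resolution; break ties by larger file size.
    return max(fitting, key=lambda f: (f.get("height") or 0, _size(f)))


def plan_delivery(formats, page_url):
    """Return ("upload", format_id) if something fits, else ("link", page_url)."""
    chosen = pick_format(formats)
    if chosen is None:
        return ("link", page_url)
    return ("upload", chosen["format_id"])


sample_formats = [
    {"format_id": "240p", "height": 240, "filesize": 10_000_000},
    {"format_id": "720p", "height": 720, "filesize": 45_000_000},
    {"format_id": "1080p", "height": 1080, "filesize": 120_000_000},
]
print(plan_delivery(sample_formats, "https://www.pornhub.com/view_video.php?viewkey=ph5abc123"))
# ('upload', '720p')
```

Formats with unknown size are skipped in this sketch, which is a conservative choice; the real adapter may estimate sizes differently.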

## Troubleshooting

- `PORNHUB_COOKIES_FILE_PATH is required...`
  - Set `PORNHUB_COOKIES_FILE_PATH` in `.env`.
- `...does not exist or is not a file`
  - Fix the path and permissions for the cookies file.
- `yt_dlp is required to process Pornhub sources`
  - Install dependencies in the same runtime where `grabber` executes.
- `ERROR: [PornHub] ... Unable to extract title`
  - The adapter automatically falls back to HTML media parsing when `yt_dlp` metadata extraction fails.
  - If it still fails, refresh/export cookies again and update `yt-dlp` in your runtime.
- Oversized media warnings
  - Expected behavior; the source falls back to link delivery.
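To confirm that `yt_dlp` is importable in the runtime where `grabber` executes, a small check like the one below can be run with that environment's interpreter (for example via `uv run python`). `module_available` is a hypothetical helper built on the standard `importlib`.

```python
import importlib.util


def module_available(name):
    """Return True if the top-level module `name` can be found by the interpreter."""
    return importlib.util.find_spec(name) is not None


print("yt_dlp available:", module_available("yt_dlp"))
```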
