Metadata-Version: 2.4
Name: TeLLMgramBot
Version: 3.14.1
Summary: LLM-powered Telegram bot (OpenAI + Anthropic)
Home-page: https://github.com/Digital-Heresy/TeLLMgramBot
Author: Digital Heresy
Author-email: ronin.atx@gmail.com
License: MIT
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: openai>2.0
Requires-Dist: anthropic>=0.40
Requires-Dist: PyYAML
Requires-Dist: httpx
Requires-Dist: beautifulsoup4
Requires-Dist: validators
Requires-Dist: tiktoken>=0.12
Requires-Dist: python-telegram-bot>=20.8
Requires-Dist: aiosqlite>=0.19
Requires-Dist: tzdata>=2025.2
Dynamic: author
Dynamic: author-email
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# TeLLMgramBot
The basic goal of this project is to create a bridge between a Telegram Bot and a Large Language Model (LLM), supporting both OpenAI's GPT models and Anthropic's Claude models.
* To use this library, you must have a Telegram account **with a username**, not just a phone number. If you don't have one, [create one online](https://telegram.org/).
* If added to a Telegram group, the bot must be an [administrator](https://www.alphr.com/add-admin-telegram/) in order to respond to a user calling out its name, initials, or nickname.
<img src="assets/TeLLMgramBot_Logo.png" width=200 align=center />

## Telegram Bot + LLM Encapsulation
* The Telegram interface handles special commands and basic "chatty" responses that don't require an LLM, like "Hello". Dynamic conversations are handed off to the LLM while Telegram acts as the interaction broker.
* Pass URLs in [square brackets] and mention how the bot should interpret them.
  * Example: "What do you think of this article? [https://some_site/article]"
  * Uses a separate model (configurable via `url_model`) to handle larger URL content.
* Ask questions about message history across all your chats using natural language; the bot will search, attribute messages to speakers, and include messages from other bots.
  * Example: "Who said thanks for the breakdown?" or "What did George say about the project?" or "Show me the last few messages."
  * All search filters (speaker, chat, date) are optional. Results are ordered most-recent-first. Configure `search_limit` to control how many results to return (default: 30).
  * Search automatically finds users and chats by their current or past names, so you can reference them however you remember them.
* Token limits measure conversation length and determine when to prune oldest messages to stay within model limits.
  * The bot loads the user's full history across all chats up to 50% of the token budget. In private chats, shared group context fills the remaining budget, enabling the bot to reference group conversations from a private context.
  * This eliminates amnesia when switching between private and group chats.
* Conversation archive preserves long-term context without consuming token budget.
  * Older messages are automatically distilled into concise daily summaries (Tier 1), then progressively compressed into monthly digests (Tier 2). Raw messages are never deleted; archive rows surface seamlessly in search results and context loading.
  * Configurable via `archive_days` (default 60 days before Tier 1 triggers; Tier 2 triggers at 2x this value).
* Users can manage privacy via two commands:
  * `/forget` - In private chats, clears your full conversation and resets all active sessions. In group chats, removes only your messages and cleans up paired bot replies.
  * `/private` - Toggle private mode (private chats only). When ON, your messages in private chats are excluded from group conversation contexts, enabling selective privacy even in shared groups.
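
The token-budget behavior described above (prune oldest messages first, give the user's own history up to 50% of the budget, then fill the rest with shared group context) can be sketched roughly as follows. This is an illustration only, not TeLLMgramBot's actual code: `count_tokens`, `fit_to_budget`, and `build_context` are hypothetical names, and the word-count tokenizer is a stand-in for a real one such as tiktoken.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fit_to_budget(
    messages: list[str],
    budget: int,
    counter: Callable[[str], int] = count_tokens,
) -> list[str]:
    """Keep the newest messages whose combined token count fits within `budget`."""
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so the oldest messages are the ones dropped.
    for message in reversed(messages):
        cost = counter(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


def build_context(
    user_history: list[str],
    group_context: list[str],
    token_limit: int,
) -> tuple[list[str], list[str]]:
    """Give user history up to half the budget, then fill the rest with group context."""
    user_part = fit_to_budget(user_history, token_limit // 2)
    used = sum(count_tokens(m) for m in user_part)
    group_part = fit_to_budget(group_context, token_limit - used)
    return user_part, group_part
```

Both input lists are assumed to be in chronological order (oldest first), so pruning always discards from the front.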

## Why Telegram?
Using Telegram as the interface not only solves the problem of "exposing" the bot to users, but also gives you boatloads of interactivity compared to a standard command-line interface or a hand-built website with input boxes and submit buttons:
1. Telegram already lets you paste in verbose, multiline messages.
2. Telegram already lets you paste in pictures, videos, links, etc.
3. Telegram already lets you react with emojis, stickers, etc.
4. Telegram message reactions (👀) provide a lightweight read receipt without breaking conversation flow.

## Supported LLM Providers
TeLLMgramBot selects the LLM provider automatically based on the model name:

| Model prefix | Provider | Example models |
|---|---|---|
| `gpt-` | OpenAI | `gpt-4o`, `gpt-4o-mini`, `gpt-5-mini` |
| `claude-` | Anthropic | `claude-sonnet-4-6`, `claude-haiku-4-5` |

Simply set `chat_model` (and optionally `url_model`) in your `config.yaml` to any supported model and supply the corresponding API key - no other changes needed.
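
As a rough illustration of prefix-based selection, the table above could be expressed as a small lookup. The function name `provider_for` and the returned provider labels are assumptions made for this sketch; the library's internal names may differ.

```python
# Prefix-to-provider mapping taken from the table above.
PROVIDER_PREFIXES = {
    "gpt-": "openai",
    "claude-": "anthropic",
}


def provider_for(model: str) -> str:
    """Return the provider label for a model name, based on its prefix."""
    for prefix, provider in PROVIDER_PREFIXES.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"Unsupported model name: {model!r}")
```

For example, `provider_for("claude-haiku-4-5")` yields `"anthropic"`, while an unrecognized prefix raises `ValueError` in this sketch.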

## Directories
TeLLMgramBot creates the following directories:

- **`configs`** - Bot configuration and model parameters (path configurable via `TELLMGRAMBOT_CONFIGS_PATH`)
  - `config.yaml` - Default bot configuration file (filename used throughout this README); can be changed by passing `config_file` to `TelegramBot.set()`
  - `models.yaml` - Token limits for each LLM model (pre-populated on first run)
- **`prompts`** - Bot personas (path configurable via `TELLMGRAMBOT_PROMPTS_PATH`)
  - `test_personality.prmpt` - Default bot persona file (filename used throughout this README); can be changed by passing `prompt_file` to `TelegramBot.set()`
  - A system appendix is automatically appended to every persona at runtime, teaching the LLM about cross-chat memory and search behavior. User messages include speaker annotations with chat context and timestamps so the LLM always knows who is speaking, in which chat, and when.
- **`logs`** - Bot instance logs (one per startup, named after the bot's Telegram username or `instance_name` config, e.g. `my_bot_2026-03-29_10-30-45.log`)
  - Logs include anonymized Telegram IDs for privacy. Console shows INFO-level TeLLMgramBot messages only, prefixed with an `[identity label]` (the bot's Telegram username by default, or `instance_name` when configured).
  - Log file timestamps are UTC in `[YYYY-MM-DD HH:MM:SS.mmm]` format.
  - Bot keeps the 10 most recent logs per bot instance, automatically pruning older ones.
  - Pass `-v` or `--verbose` on startup for DEBUG-level logging.
- **`data`** - SQLite database (default `conversations.db`, customizable via `instance_name` config) storing all messages, users, and chats
  - Users manage their data via `/forget` and `/private` commands.
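
The log naming and retention rules above (one file per startup, the 10 most recent kept per bot instance) might look like the following sketch. `log_filename` and `stale_logs` are hypothetical helpers, and the filename pattern is inferred from the `my_bot_2026-03-29_10-30-45.log` example rather than taken from the library's source.

```python
import re
from datetime import datetime


def log_filename(label: str, started: datetime) -> str:
    """Build a log name such as my_bot_2026-03-29_10-30-45.log."""
    return f"{label}_{started.strftime('%Y-%m-%d_%H-%M-%S')}.log"


def stale_logs(filenames: list[str], label: str, keep: int = 10) -> list[str]:
    """Return the log names for `label` that fall outside the `keep` newest."""
    pattern = re.compile(
        rf"^{re.escape(label)}_\d{{4}}-\d{{2}}-\d{{2}}_\d{{2}}-\d{{2}}-\d{{2}}\.log$"
    )
    # The timestamp format sorts lexicographically, so a reverse sort is newest-first.
    own = sorted((name for name in filenames if pattern.match(name)), reverse=True)
    return own[keep:]
```

Matching on the full pattern, rather than a bare prefix, keeps one instance from pruning the logs of another whose label merely starts with the same characters.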

### Environment Variables for Paths
Override default directory locations by setting these environment variables (useful for containerized deployments):

| Variable | Purpose | Default |
|----------|---------|---------|
| `TELLMGRAMBOT_CONFIGS_PATH` | Directory containing `config.yaml` and `models.yaml` | `{exec_dir}/configs` |
| `TELLMGRAMBOT_PROMPTS_PATH` | Directory containing prompt files | `{exec_dir}/prompts` |
| `TELLMGRAMBOT_LOGS_PATH` | Directory for log files | `{exec_dir}/logs` |
| `TELLMGRAMBOT_DATA_PATH` | Directory containing `conversations.db` | `{exec_dir}/data` |

If unset, all paths default to subdirectories of the execution directory (the directory containing your entry-point script).
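
A minimal sketch of this override rule, assuming a hypothetical `resolve_dir` helper (not part of the documented API): use the environment variable when it is set, otherwise fall back to the matching subdirectory of the execution directory.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default subdirectory for each variable, per the table above.
DEFAULT_SUBDIRS = {
    "TELLMGRAMBOT_CONFIGS_PATH": "configs",
    "TELLMGRAMBOT_PROMPTS_PATH": "prompts",
    "TELLMGRAMBOT_LOGS_PATH": "logs",
    "TELLMGRAMBOT_DATA_PATH": "data",
}


def resolve_dir(
    variable: str,
    exec_dir: Path,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Return the override from `env` if set, else `exec_dir/<default subdir>`."""
    env = os.environ if env is None else env
    override = env.get(variable)
    if override:
        return Path(override)
    return exec_dir / DEFAULT_SUBDIRS[variable]
```

In a container you would typically export the variables in the image or compose file and leave the bot code unchanged.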

## API Keys
TeLLMgramBot supports four API keys. OpenAI, Anthropic, and VirusTotal keys load from environment variables or `.key` files. The Telegram key loads from the bot config field `telegram_api_key` or its env var (no `.key` file); config wins when explicitly set:

| Key | Env Var | File/Config | When required |
|-----|---------|-------------|---------------|
| [OpenAI](https://platform.openai.com/api-keys) | `TELLMGRAMBOT_OPENAI_API_KEY` | `openai.key` | For `gpt-*` models |
| [Anthropic](https://console.anthropic.com/settings/keys) | `TELLMGRAMBOT_ANTHROPIC_API_KEY` | `anthropic.key` | For `claude-*` models |
| [Telegram](https://t.me/BotFather) | `TELLMGRAMBOT_TELEGRAM_API_KEY` | bot config `telegram_api_key` in `config.yaml` | Always required |
| [VirusTotal](https://www.virustotal.com/gui/my-apikey) | `TELLMGRAMBOT_VIRUSTOTAL_API_KEY` | `virustotal.key` | For URL analysis |

Missing provider keys (OpenAI or Anthropic) disable chat and URL analysis but still allow the bot to start. A missing VirusTotal key disables URL analysis. The Telegram key is required: the bot will not start without it.

Key files are created in the execution directory (or `TELLMGRAMBOT_KEYS_PATH` for legacy deployments). Alternatively, set environment variables before launching, e.g.:
```python
import os

# `my_vault` is a placeholder for your own secret store; it is not part of TeLLMgramBot.
os.environ['TELLMGRAMBOT_OPENAI_API_KEY'] = my_vault.get('openai_key')
os.environ['TELLMGRAMBOT_ANTHROPIC_API_KEY'] = my_vault.get('anthropic_key')
os.environ['TELLMGRAMBOT_TELEGRAM_API_KEY'] = my_vault.get('telegram_key')
os.environ['TELLMGRAMBOT_VIRUSTOTAL_API_KEY'] = my_vault.get('virustotal_key')
```
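
To make the lookup rules concrete, here is a hedged sketch of how the keys might be resolved. `load_provider_key` and `load_telegram_key` are hypothetical names. Checking the environment variable before the `.key` file is an assumption of this sketch, since the text above does not state which of the two takes precedence; the Telegram order (config first, then env var) follows the description above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_provider_key(
    name: str,
    key_dir: Path,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Look up an OpenAI, Anthropic, or VirusTotal key by provider name."""
    env = os.environ if env is None else env
    # Assumption: the env var is tried first, then `<name>.key` in `key_dir`.
    value = env.get(f"TELLMGRAMBOT_{name.upper()}_API_KEY", "").strip()
    if value:
        return value
    key_file = key_dir / f"{name.lower()}.key"
    if key_file.is_file():
        return key_file.read_text(encoding="utf-8").strip() or None
    return None


def load_telegram_key(
    config: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Config value wins when explicitly set; otherwise fall back to the env var."""
    env = os.environ if env is None else env
    configured = str(config.get("telegram_api_key") or "").strip()
    if configured:
        return configured
    return env.get("TELLMGRAMBOT_TELEGRAM_API_KEY", "").strip() or None
```

A `None` result corresponds to a missing key; placeholder and malformed-key detection are left out of this sketch.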

## Commands and Interactions

### Available Commands
- `/nick <name>` - Set your nickname (for bot use in group chats).
- `/forget` - Clear your conversation history. Shows a confirmation prompt before deletion. In private chats, clears everything and resets all active sessions. In group chats, removes only your messages.
- `/private` - Toggle private mode (private chats only). When ON, your messages are excluded from group context loading.
- `/tools` - List all registered tools available to this bot instance (admin-only, private chat only). Shows the built-in search_messages tool and any webhook or MCP tools defined in config.yaml.
- `/help` - Display available commands and usage information. In private chats, if you are a bot owner, also shows administrator-only commands (`/start`, `/stop`, `/wipe`, `/tools`).

### Group Chat Triggers
The bot responds in groups when you:
- Mention the bot by username (e.g., `@botname`)
- Mention the bot by nickname or initials (configured via `config.yaml`)
- Reply directly to one of the bot's messages

When multiple bots are @mentioned in the same message, the bot coexists with them. If you mention this bot's nickname or initials, or reply to its message, the bot engages (you may be intentionally addressing both bots). The one exception: if the only trigger is a reply to the bot's message and the message exclusively addresses a different bot via @mention (with no mention of this bot), the bot yields silently. This supports threaded context without redundant responses.
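
One way to read these trigger rules is as a small decision function. The sketch below is an interpretation, not the bot's actual logic: `should_respond` is a hypothetical name, nickname and initials are matched as whole words (an assumption), and another @mention is treated as a bot when its username ends in `bot`, which is only a heuristic based on Telegram's bot-username convention.

```python
import re


def should_respond(
    text: str,
    bot_username: str,
    nickname: str,
    initials: str,
    is_reply_to_bot: bool,
) -> bool:
    """Decide whether the bot should answer a group message."""
    mentions = {m.lower() for m in re.findall(r"@(\w+)", text)}

    # Trigger 1: explicit @username mention.
    named = bot_username.lower() in mentions
    # Trigger 2: nickname or initials appearing as a whole word (case-insensitive).
    for alias in (nickname, initials):
        if alias and re.search(rf"\b{re.escape(alias)}\b", text, re.IGNORECASE):
            named = True
    if named:
        return True

    # Trigger 3: a reply to one of the bot's own messages.
    if not is_reply_to_bot:
        return False

    # Reply-only case: yield when the message @mentions some other bot instead.
    # Heuristic (assumption): usernames ending in "bot" belong to bots.
    other_bot_mentioned = any(m.endswith("bot") for m in mentions)
    return not other_bot_mentioned
```

Under this reading, a reply that also names the bot still gets a response, while a reply that only @mentions another bot is left for that bot to answer.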

### Private Chat Behavior
In private chats, the bot responds to all your messages. If you reply to an earlier message in the conversation that is not already in the bot's context window, that message is automatically surfaced as inline context so the bot can understand the full conversation thread.

### Read Receipt (Group Chats Only)
When the bot is triggered in a group and is about to respond (not deferring to another bot), it immediately reacts to your message with a 👀 emoji as a read receipt (falling back to a "Got it!" text reply on older Telegram clients). This confirmation arrives before the full LLM response, providing quick feedback that the bot received your message.

## Bot Setup
1. Ensure API keys are set up and your Telegram bot is created via BotFather.
2. Install TeLLMgramBot: `pip install TeLLMgramBot`
3. Configure the bot via `config.yaml` (created on first run):
   - `bot_owner`: Telegram username(s) with admin access (required, no `@`). Accepts a single string or a YAML list of usernames.
   - `chat_model`: LLM model for conversation (e.g. `gpt-4o-mini` or `claude-sonnet-4-6`)
   - `url_model`: LLM model for URL analysis (e.g. `gpt-4o` or `claude-haiku-4-5`)
   - `telegram_api_key`: Telegram bot API key (required). Lookup order: (1) `telegram_api_key` in `config.yaml`, (2) `TELLMGRAMBOT_TELEGRAM_API_KEY` env var. Config wins when explicitly set. Exits on placeholder, missing, or malformed key.
   - `bot_nickname` / `bot_initials`: Names the bot responds to in groups
   - `instance_name`: Optional label for console prefix, log filename, and database name (e.g. `MyBot` produces `[MyBot] INFO: ...` on console, `MyBot_{timestamp}.log` logs, and `MyBot.db` database); omit to use bot's Telegram username for logging and `conversations.db` for database. Use distinct names when running multiple bot instances in the same directory.
   - `token_limit`: Max tokens (optional; defaults to model's maximum)
   - `search_limit`: Max search results (optional; defaults to 30)
   - `archive_days`: Days before messages are eligible for archival (optional; default 60, minimum 1). Older messages are distilled into daily summaries, then progressively compressed into monthly digests. Once archived, the raw messages no longer load into the LLM context; they surface only through message search.
   - `allow_local_webhooks`: Set to `true` to permit webhook/MCP URLs targeting loopback or link-local addresses (optional; default `false`). Useful when tools like Home Assistant run on the same host.
   - `tools`: Optional list of webhook and MCP tool definitions (admin-only, private chat only). See [docs/tools.md](docs/tools.md) for schema and examples.
4. **Disable group privacy mode in BotFather:**
   ```
   /setprivacy -> select your bot -> Disable
   ```
   With privacy mode enabled (default), the bot won't receive group messages that don't mention it, so it can't index other bots or load cross-chat context.
5. Run the bot:
   ```python
   from TeLLMgramBot import TelegramBot
   telegram_bot = TelegramBot.set()
   telegram_bot.poll()
   ```
   Once you see `TeLLMgramBot polling started`, the bot is online.
6. Type `/help` in Telegram to see all available commands.
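
Two of the settings from step 3 lend themselves to a short illustration: `bot_owner` accepts either a single string or a list, and `archive_days` sets the Tier 1 threshold, with Tier 2 at twice that value. The helpers below are hypothetical and only sketch those rules. Stripping a stray leading `@` and clamping values below the minimum of 1 are assumptions of this sketch; the library may reject such input instead.

```python
from typing import Iterable, Union


def normalize_owners(bot_owner: Union[str, Iterable[str]]) -> list[str]:
    """Accept a single username or a list of usernames and return a clean list."""
    names = [bot_owner] if isinstance(bot_owner, str) else list(bot_owner)
    # Assumption: tolerate a leading "@" even though the config expects none.
    return [name.strip().lstrip("@") for name in names if name and name.strip()]


def archive_thresholds(archive_days: int = 60) -> tuple[int, int]:
    """Return (tier1_days, tier2_days); Tier 2 triggers at twice Tier 1."""
    days = max(1, int(archive_days))  # assumption: clamp to the documented minimum
    return days, 2 * days
```

With the default of 60 days, daily summaries would begin at 60 days and monthly digests at 120 days.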

## Resources
* GitHub repository [python-telegram-bot](https://github.com/python-telegram-bot/python-telegram-bot) has guides to create a Telegram bot.
* For more information on OpenAI models and token limits:
  * [OpenAI model overview and maximum tokens](https://platform.openai.com/docs/models)
  * [OpenAI message conversion to tokens](https://github.com/openai/openai-python)
  * [OpenAI custom fine-tuning](https://platform.openai.com/docs/guides/model-optimization)
  * [OpenAI's tiktoken library](https://github.com/openai/tiktoken/tree/main)
* For more information on Anthropic Claude models:
  * [Anthropic model overview and context windows](https://docs.anthropic.com/en/docs/about-claude/models)
  * [Anthropic Python SDK](https://github.com/anthropics/anthropic-sdk-python)
