Metadata-Version: 2.4
Name: achem
Version: 1.2.0
Summary: Deep Web Research Tool - Auto-classifies topics, prioritizes relevant sources, generates 300+ word AI conclusions
Author: Sarok
License-Expression: MIT
Project-URL: Homepage, https://github.com/sarok-exe/achem
Project-URL: Documentation, https://github.com/sarok-exe/achem#readme
Project-URL: Repository, https://github.com/sarok-exe/achem
Project-URL: Issues, https://github.com/sarok-exe/achem/issues
Keywords: research,deep-web,web-scraping,summarization,ai,cli,tool,ordinal-distance,positional-search,ollama
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Text Processing :: General
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: wikipedia-api>=0.5.4
Requires-Dist: rich>=13.0.0
Requires-Dist: psutil>=5.9.0
Requires-Dist: prompt_toolkit>=3.0.0
Requires-Dist: pyfiglet>=0.8.0
Requires-Dist: openai>=1.0.0
Requires-Dist: groq>=1.0.0
Requires-Dist: google-genai>=1.0.0
Requires-Dist: ddgs>=3.0.0
Requires-Dist: beautifulsoup4>=4.12.0
Requires-Dist: requests>=2.31.0
Requires-Dist: trafilatura>=1.6.0
Requires-Dist: textual>=0.50.0
Requires-Dist: duckduckgo-search>=4.0.0
Provides-Extra: arabic
Requires-Dist: arabic-reshaper>=3.0.0; extra == "arabic"
Requires-Dist: python-bidi>=0.14.0; extra == "arabic"
Provides-Extra: ollama
Requires-Dist: ollama>=0.1.0; extra == "ollama"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Provides-Extra: all
Requires-Dist: arabic-reshaper>=3.0.0; extra == "all"
Requires-Dist: python-bidi>=0.14.0; extra == "all"
Requires-Dist: ollama>=0.1.0; extra == "all"
Dynamic: license-file

# ACHEM - Deep Web Research Tool

![ACHEM Banner](https://img.shields.io/badge/ACHEM-v1.2.0-blue?style=for-the-badge)

> **ACHEM** (Arabic: آشم) is a research tool that automatically classifies your query, prioritizes specialized sources for the detected topic, and generates an AI conclusion of 300 words or more.

## Features

### Smart Topic Classification
Automatically detects your query type and prioritizes relevant sources:

| Topic | Priority Sites |
|-------|----------------|
| Anime & Manga | MyAnimeList, Crunchyroll, Anime News Network |
| Football/Soccer | The Analyst, ESPN, Sky Sports, Goal.com |
| MMA/Fighting | Tapology, Sherdog, MMA Junkie |
| Gaming | IGN, GameSpot, Metacritic, Steam |
| Movies/TV | IMDb, Rotten Tomatoes, Metacritic |
| Health/Medicine | Mayo Clinic, WebMD, WHO |
| Science | Scientific American, Nature, NASA |
| Technology | Stack Overflow, GitHub, TechCrunch |
| History | Britannica, History.com |
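
The mapping above can be pictured as a keyword lookup. The following is a minimal sketch of that idea, not ACHEM's actual classifier: the keyword sets, the domain names, and the `classify` function are all assumptions made for illustration.

```python
import re

# Hypothetical keyword sets; ACHEM's real classifier may use different signals.
TOPIC_KEYWORDS = {
    "anime": {"anime", "manga", "chapter", "episode"},
    "football": {"football", "soccer", "league", "bayern", "madrid"},
    "health": {"health", "risks", "smoking", "disease", "symptoms"},
}

# Assumed domains for a few of the priority sites named in the table.
PRIORITY_SITES = {
    "anime": ["myanimelist.net", "crunchyroll.com", "animenewsnetwork.com"],
    "football": ["theanalyst.com", "espn.com", "skysports.com", "goal.com"],
    "health": ["mayoclinic.org", "webmd.com", "who.int"],
}


def classify(query: str) -> str:
    """Return the topic whose keywords overlap the query most, else 'general'."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    scores = {topic: len(words & kws) for topic, kws in TOPIC_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


def priority_sites(query: str) -> list[str]:
    """Look up the priority domains for the detected topic (empty if general)."""
    return PRIORITY_SITES.get(classify(query), [])
```

A query with no matching keyword falls back to `"general"`, which in this sketch means no site receives priority.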

### Comprehensive Research
- **100+ Sources**: Searches DuckDuckGo and ranks results by topic priority (the source count is set with `--ddg-limit`, default 100)
- **Full Content Extraction**: Scrapes complete articles from specialized sites
- **Smart Filtering**: Removes ads/boilerplate, keeps relevant content
- **300+ Word Conclusions**: Detailed AI-generated analysis

### How Results Are Sorted
Results are ordered by relevance to the detected topic:
1. Sources most relevant to your topic
2. Specialized sites for your query type
3. General sources
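
One way to read the three tiers above is as a stable sort on a tier number. The sketch below assumes that interpretation; the `tier` and `sort_results` helpers and the split into "priority" and "specialized" domain lists are hypothetical and not taken from ACHEM's code.

```python
from urllib.parse import urlparse


def _matches(host: str, domains: list[str]) -> bool:
    """True if host equals one of the domains or is a subdomain of it."""
    return any(host == d or host.endswith("." + d) for d in domains)


def tier(url: str, priority: list[str], specialized: list[str]) -> int:
    """0 = priority site for the topic, 1 = other specialized site, 2 = general."""
    host = urlparse(url).netloc.lower().removeprefix("www.")
    if _matches(host, priority):
        return 0
    if _matches(host, specialized):
        return 1
    return 2


def sort_results(urls: list[str], priority: list[str], specialized: list[str]) -> list[str]:
    # sorted() is stable, so the search engine's order is kept within each tier.
    return sorted(urls, key=lambda u: tier(u, priority, specialized))
```

Because the sort is stable, two general sources keep the relative order in which the search returned them.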

## Installation

```bash
git clone https://github.com/sarok-exe/achem.git
cd achem
uv venv .venv && source .venv/bin/activate
uv pip install -e .
```

### API Configuration

Create `~/.ACHEM/api.env`:

```bash
OPENROUTER_API_KEY=your_openrouter_key_here
OPENROUTER_MODEL=google/gemma-4-31b-it:free
```

Get an API key from OpenRouter: https://openrouter.ai/settings
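
For readers curious how a `KEY=value` file like this is typically consumed, here is a minimal parsing sketch using only the standard library. The `load_env_file` function is hypothetical; ACHEM may load the file differently.

```python
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and '#' comments."""
    values: dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    # Path taken from the instructions above.
    config = load_env_file(Path.home() / ".ACHEM" / "api.env")
    print("model:", config.get("OPENROUTER_MODEL", "<not set>"))
```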

## Usage

```bash
achem "What are the health risks of smoking?" --ddg-limit 50
achem "Who will win Bayern vs Real Madrid?" --ddg-limit 100
achem "Latest One Piece chapter summary" --ddg-limit 50
```

### Options

```bash
--ddg-limit N     Number of sources (default: 100)
--mode ai         AI conclusions (default)
--mode local      Local TF-IDF (no API)
--lang en/fr/ar   Response language
```
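
The `--mode local` option refers to TF-IDF, which scores text by how distinctive its terms are. The sketch below shows one common way to build an extractive summary from that idea, treating each sentence as a document. It is an illustration under that assumption, not ACHEM's implementation; `tfidf_summary` and its helpers are hypothetical names.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text: str) -> list[str]:
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tfidf_summary(text: str, top_n: int = 2) -> list[str]:
    """Return the top_n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    # Document frequency: in how many sentences each term appears.
    df = Counter(term for doc in docs for term in set(doc))

    def score(doc: list[str]) -> float:
        if not doc:
            return 0.0
        tf = Counter(doc)
        # Length-normalized term frequency times smoothed inverse document frequency.
        return sum(
            (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        )

    ranked = sorted(range(n), key=lambda i: score(docs[i]), reverse=True)[:top_n]
    return [sentences[i] for i in sorted(ranked)]
```

Sentences built from terms that appear everywhere in the text score lower than sentences carrying rarer terms, which is why no external API is needed for this mode.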

## How It Works

```
┌──────────────────────────────────────────────────────┐
│ 1. CLASSIFY                                          │
│    Detects topic: anime, football, health, etc.      │
│    Identifies priority sites for your topic          │
├──────────────────────────────────────────────────────┤
│ 2. SEARCH (100+ sources)                             │
│    Prioritizes specialized sites                     │
│    Sorts by topic relevance                          │
├──────────────────────────────────────────────────────┤
│ 3. SCRAPE                                            │
│    Extracts full article text                        │
│    Removes ads and boilerplate                       │
├──────────────────────────────────────────────────────┤
│ 4. ANALYZE & CONCLUDE                                │
│    Generates 300+ word comprehensive analysis        │
│    Synthesizes all sources into detailed paragraphs  │
└──────────────────────────────────────────────────────┘
```
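
To make the scrape step above more concrete, the following sketch strips common boilerplate containers from HTML using only Python's standard library. ACHEM credits Trafilatura for content extraction; this is a simplified stand-in to show the idea, not Trafilatura's algorithm, and the `SKIP_TAGS` set and `extract_text` function are assumptions.

```python
from html.parser import HTMLParser

# Assumed list of tags whose contents are treated as boilerplate.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside", "form"}


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped container.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A dedicated extractor handles far more cases (comment sections, cookie banners, link-dense blocks), so this sketch should be read as the principle only.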

## Output

Reports are saved to `~/Documents/ACHEM/` and include:
- **AI Conclusion**: 300+ word detailed analysis
- **All Articles**: Full extracted content
- **Topic Classification**: Shows detected category

## License

MIT License

## Acknowledgments

- [OpenRouter](https://openrouter.ai/) - Free AI models
- [DuckDuckGo](https://duckduckgo.com/) - Web search
- [Trafilatura](https://trafilatura.readthedocs.io/) - Content extraction
