Metadata-Version: 2.4
Name: contentintelpy
Version: 0.1.5
Summary: Production-grade NLP library for unified content intelligence.
Author-email: Ronit Fulari <ronitfulari31@gmail.com>
License: MIT
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24.0
Requires-Dist: tqdm>=4.66.0
Provides-Extra: core
Requires-Dist: transformers<5.0.0,>=4.30.0; extra == "core"
Requires-Dist: torch<3.0.0,>=2.0.0; extra == "core"
Requires-Dist: sentence-transformers>=2.2.0; extra == "core"
Requires-Dist: scikit-learn>=1.0.0; extra == "core"
Provides-Extra: ner
Requires-Dist: spacy>=3.7.0; extra == "ner"
Requires-Dist: gliner>=0.1.0; extra == "ner"
Provides-Extra: translation
Requires-Dist: argostranslate>=1.9.0; extra == "translation"
Provides-Extra: summarization
Requires-Dist: sumy>=0.11.0; extra == "summarization"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: isort; extra == "dev"
Dynamic: license-file

# contentintelpy

**Production-grade NLP library for unified content intelligence.**

`contentintelpy` provides a unified, DAG-based engine for multilingual sentiment analysis, NER, translation, and summarization using real transformer models (RoBERTa, GLiNER, NLLB).

## Features

- **Real Models**: No heuristics. Uses state-of-the-art transformer models.
    - Sentiment: RoBERTa
    - NER: GLiNER
    - Translation: NLLB (GPU) + ArgosTranslate (Offline CPU)
- **Hybrid Execution**: Models download on first run (lazy-loaded). Offline fallback available.
- **Deterministic Pipelines**: DAG-based execution guarantees order.
- **Dual API**: 
    - **Pipeline-first** for complex workflows.
    - **Service-first** for quick scripts.
- **Production Ready**: thread-safe execution, structured error collection, sparse outputs.
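
The "deterministic, DAG-based" ordering can be illustrated with a small topological sort using only the standard library. This is an independent sketch, not the library's internal implementation, and the node names here are hypothetical:

```python
from graphlib import TopologicalSorter  # stdlib since Python 3.9

# Hypothetical node dependencies: each node maps to the nodes it depends on.
dag = {
    "LanguageNode": [],
    "TranslationNode": ["LanguageNode"],
    "NERNode": ["TranslationNode"],
    "SentimentNode": ["TranslationNode"],
    "SummaryNode": ["TranslationNode"],
}

# static_order() yields a valid execution order: every node appears
# after all of its dependencies.
order = list(TopologicalSorter(dag).static_order())
print(order)  # LanguageNode always precedes TranslationNode, and so on
```

A fixed dependency graph always yields an order consistent with the edges, which is what guarantees, for example, that translation never runs before language detection.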

## Installation

Install the base library:
```bash
pip install contentintelpy
```

### 🧠 Capability Extras (Recommended)
`contentintelpy` uses optional "extras" to keep the base installation lightweight. Depending on which features you need, use the following commands:

| Feature | Target Extras | Install Command |
| :--- | :--- | :--- |
| **All Features** | `core,ner,translation,summarization` | `pip install "contentintelpy[core,ner,translation,summarization]"` |
| **Search & Keywords** | `core` | `pip install "contentintelpy[core]"` |
| **Entity Extraction** | `ner` | `pip install "contentintelpy[ner]"` |
| **Translation** | `translation` | `pip install "contentintelpy[translation]"` |
| **Summarization** | `summarization` | `pip install "contentintelpy[summarization]"` |

> [!TIP]
> **Minimal Install**: If you only need language detection and simple text processing, you only need `pip install contentintelpy`.

> [!IMPORTANT]
> **GPU Support**: If you have an NVIDIA GPU, install `torch` with CUDA support before installing the extras to significantly speed up translation and classification.

> [!IMPORTANT]
> **spaCy Model Requirement**
> If you use NER or language features, you must install a spaCy model manually:
> ```bash
> python -m spacy download en_core_web_sm
> ```

---

## Quick Start

Ideal for simple tasks in notebooks or scripts.

```python
from contentintelpy import SentimentService, TranslationService

# Sentiment
service = SentimentService()
result = service.analyze("This library is amazing!")
print(result) 
# {'value': 'positive', 'confidence': 0.99, ...}

# Translation
translator = TranslationService()
text = translator.translate("Hola mundo", target="en")
print(text)
# "Hello world"
```

## Production Usage (Pipeline-First)

Recommended for backends, APIs, and data pipelines.

```python
import contentintelpy as ci

# 1. Create the canonical pipeline
pipeline = ci.create_default_pipeline()

# 2. Run it (Thread-safe)
result = pipeline.run({
    "text": "गूगल ने बेंगलुरु में नया कार्यालय खोला"
})

# 3. Access Sparse Output
print(result)
```

**Output Example:**
```json
{
  "text": "...",
  "text_translated": "Google opened a new office in Bengaluru",
  "language": "hi",
  "entities": [
    {"text": "Google", "label": "ORG"},
    {"text": "Bengaluru", "label": "LOC"}
  ],
  "sentiment": {
    "value": "neutral",
    "value_en": "neutral",
    "confidence": 0.95
  },
  "summary": "..."
}
```
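
Because the output is sparse, downstream code should read fields defensively. A minimal sketch, using a payload that mirrors the example above (values abbreviated):

```python
# Example payload mirroring the output shown above.
result = {
    "text_translated": "Google opened a new office in Bengaluru",
    "language": "hi",
    "entities": [
        {"text": "Google", "label": "ORG"},
        {"text": "Bengaluru", "label": "LOC"},
    ],
    "sentiment": {"value": "neutral", "confidence": 0.95},
}

# Sparse output means absent keys are simply missing, so use .get() with defaults.
orgs = [e["text"] for e in result.get("entities", []) if e["label"] == "ORG"]
summary = result.get("summary", "")

print(orgs)  # ['Google']
```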

## Error Handling

Nodes **never crash** the pipeline. Errors are collected in the `errors` dict of the result.

```python
{
    "text": "...",
    "errors": {
        "TranslationNode": "Model download failed: Connection error"
    }
}
```
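
Downstream code can branch on that dict. A small helper sketch (the result shape follows the example above; the helper itself is illustrative, not part of the library):

```python
def split_errors(result: dict) -> tuple[dict, dict]:
    """Separate successful fields from per-node error messages."""
    errors = result.get("errors", {})
    data = {k: v for k, v in result.items() if k != "errors"}
    return data, errors

# Result shaped like the error-handling example above.
result = {
    "text": "...",
    "errors": {"TranslationNode": "Model download failed: Connection error"},
}

data, errors = split_errors(result)
for node, message in errors.items():
    print(f"{node} failed: {message}")
```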

## Architecture

This library is pure logic. It does **NOT** contain:
- Flask / FastAPI routes
- Database models
- Authentication

It is designed to be **consumed** by your backend application.
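
A common consumption pattern is to build the pipeline once at startup and share it across request handlers, which the thread-safety claim above makes safe. Sketched here with a plain function standing in for a real web framework, and a hypothetical `StubPipeline` in place of the library's pipeline:

```python
from functools import lru_cache

class StubPipeline:
    """Stand-in for the real pipeline; returns a fixed result."""
    def run(self, payload: dict) -> dict:
        return {"text": payload["text"], "sentiment": {"value": "neutral"}}

@lru_cache(maxsize=1)
def get_pipeline() -> StubPipeline:
    # Build once, reuse across handlers; swap StubPipeline
    # for your real pipeline factory here.
    return StubPipeline()

def handle_request(text: str) -> dict:
    # Your route handler (Flask, FastAPI, worker, etc.) calls this.
    return get_pipeline().run({"text": text})

print(handle_request("hello")["sentiment"]["value"])  # neutral
```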
