Metadata-Version: 2.4
Name: cognitor
Version: 0.0.0
Summary: Python SDK to extract relevant metrics from Small Language Model inference calls.
Author-email: Riccardo <riccardo@tanaos.com>
Project-URL: Homepage, https://github.com/riccardo/cognitor-py
Project-URL: Bug Tracker, https://github.com/riccardo/cognitor-py/issues
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: transformers
Requires-Dist: psutil
Requires-Dist: torch
Requires-Dist: pydantic
Requires-Dist: psycopg2-binary
Requires-Dist: sqlalchemy
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: isort; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: python-dotenv>=1.0.1; extra == "dev"

# cognitor-py

`cognitor-py` is a Python SDK that wraps `transformers` inference calls to extract useful metadata and performance metrics.

## Features

- **Model Information**: Automatically captures the model name.
- **Performance Metrics**: Tracks CPU and RAM usage during inference.
- **GPU Monitoring**: Captures peak GPU memory usage (if CUDA is available).
- **Token Counting**: Calculates input and output token counts for common pipeline tasks.
- **Latency Tracking**: Measures inference duration.
- **Error Handling**: Captures and reports errors raised during inference.
- **Flexible Logging Targets**: Automatically saves every inference log to either a PostgreSQL database or a local JSON Lines file.
- **Graceful Degradation**: Keeps your program running even when the logging database is unreachable.

## Installation

```bash
pip install cognitor-py
```
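
To work on the SDK itself, the package also declares a `dev` extra (pytest, black, isort, build, twine, python-dotenv):

```bash
# installs the SDK plus its development tooling
pip install "cognitor-py[dev]"
```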

## Usage

### Using the Inference Monitor

```python
from transformers import pipeline, AutoTokenizer
from cognitor import Cognitor

# Initialize your model and tokenizer
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
pipe = pipeline("text-generation", model=model_name, tokenizer=tokenizer)

# Initialize Cognitor with the PostgreSQL configuration (the default).
# File logging (log_type="file", log_path="logs.jsonl") is sketched below.
cognitor = Cognitor(
    model_name=model_name,
    tokenizer=tokenizer,
    log_type="database", # or "file"
    host="localhost",
    port=5432,
    user="postgres",
    password="postgres",
    dbname="cognitor"
)

# Run inference within the monitor context
with cognitor.monitor() as m:
    input_text = "Once upon a time,"
    # Use track() to capture only the inference duration
    with m.track():
        output = pipe(input_text, max_new_tokens=50)  # cap generated tokens (preferred over max_length)
    m.capture(input_data=input_text, output=output)

# The metadata is now available via the cognitor instance
metadata = cognitor.get_last_metadata()
print(output)
print(metadata)
```
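
### Logging to a File

If you'd rather skip the database, the same monitoring flow works with file logging. A minimal sketch, assuming the constructor parameters named in the comment above (`log_type="file"`, `log_path="logs.jsonl"`):

```python
from transformers import pipeline, AutoTokenizer
from cognitor import Cognitor

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
pipe = pipeline("text-generation", model=model_name, tokenizer=tokenizer)

# Append each inference record to logs.jsonl as one JSON object per line
cognitor = Cognitor(
    model_name=model_name,
    tokenizer=tokenizer,
    log_type="file",
    log_path="logs.jsonl",
)

with cognitor.monitor() as m:
    input_text = "Once upon a time,"
    with m.track():
        output = pipe(input_text, max_new_tokens=50)
    m.capture(input_data=input_text, output=output)
```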

### Metadata Structure

The extracted metadata follows this structure:

```python
{
    "model_name": "gpt2",
    "timestamp": "2026-04-01T14:34:14+0200",
    "input_tokens": 5,
    "output_tokens": 45,
    "cpu_percent": 12.5,
    "ram_usage_percent": 1.2,
    "gpu_usage_percent": 5.5, # Optional
    "duration": 0.45, # Inference-only duration
    "input": "Once upon a time,",
    "output": [...],
    "error": None
}
```
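
Because the file target writes JSON Lines (one record per line), logs can be read back with nothing but the standard library. A minimal sketch, assuming a `logs.jsonl` produced by the file-logging example above:

```python
import json

# Each non-empty line is a standalone metadata record
with open("logs.jsonl") as f:
    records = [json.loads(line) for line in f if line.strip()]

# Example: average inference duration across successful calls
durations = [r["duration"] for r in records if r.get("error") is None]
if durations:
    print(f"avg duration: {sum(durations) / len(durations):.3f}s")
```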

## License

MIT
