Metadata-Version: 2.4
Name: artifex
Version: 0.10.0
Summary: Small Language Model Inference, Fine-Tuning, Evaluation and Observability.
Author-email: Riccardo Lucato <riccardo@tanaos.com>
License: # License
        
        Copyright (c) 2025-present Riccardo Lucato.
        
        This software is available under the "Sustainable Use License" as defined below.
        
        ## Sustainable Use License
        
        Version 1.0
        
        ### Acceptance
        
        By using the software, you agree to all of the terms and conditions below.
        
        ### Copyright License
        
        The licensor grants you a non-exclusive, royalty-free, worldwide, non-sublicensable, non-transferable license
        to use, copy, distribute, make available, and prepare derivative works of the software, in each case subject
        to the limitations below.
        
        ### Limitations
        
        You may use or modify the software only for your own internal business purposes or for non-commercial or
        personal use. You may distribute the software or provide it to others only if you do so free of charge for non-commercial purposes. You may not alter, remove, or obscure any licensing, copyright, or other notices of the licensor in the software. Any use of the licensor’s trademarks is subject to applicable law. Commercial use of this software is only permitted under the [Tanaos Enterprise License](LICENSE_EE.md).
        
        ### Patents
        
        The licensor grants you a license, under any patent claims the licensor can license, or becomes able to
        license, to make, have made, use, sell, offer for sale, import and have imported the software, in each case
        subject to the limitations and conditions in this license. This license does not cover any patent claims that
        you cause to be infringed by modifications or additions to the software. If you or your company make any
        written claim that the software infringes or contributes to infringement of any patent, your patent license
        for the software granted under these terms ends immediately. If your company makes such a claim, your patent
        license ends immediately for work on behalf of your company.
        
        ### Notices
        
        You must ensure that anyone who gets a copy of any part of the software from you also gets a copy of these
        terms. If you modify the software, you must include in any modified copies of the software a prominent notice
        stating that you have modified the software.
        
        ### No Other Rights
        
        These terms do not imply any licenses other than those expressly granted in these terms.
        
        ### Termination
        
        If you use the software in violation of these terms, such use is not licensed, and your license will
        automatically terminate. If the licensor provides you with a notice of your violation, and you cease all
        violation of this license no later than 30 days after you receive that notice, your license will be reinstated
        retroactively. However, if you violate these terms after such reinstatement, any additional violation of these
        terms will cause your license to terminate automatically and permanently.
        
        ### No Liability
        
        As far as the law allows, the software comes as is, without any warranty or condition, and the licensor will
        not be liable to you for any damages arising out of these terms or the use or nature of the software, under
        any kind of legal claim.
        
        ### Definitions
        
        The “licensor” is the entity offering these terms.
        
        The “software” is the software the licensor makes available under these terms, including any portion of it.
        
        “You” refers to the individual or entity agreeing to these terms.
        
        “Your company” is any legal entity, sole proprietorship, or other kind of organization that you work for, plus
        all organizations that have control over, are under the control of, or are under common control with that
        organization. Control means ownership of substantially all the assets of an entity, or the power to direct its
        management and policies by vote, contract, or otherwise. Control can be direct or indirect.
        
        “Your license” is the license granted to you for the software under these terms.
        
        “Use” means anything you do with the software requiring your license.
        
        “Trademark” means trademarks, service marks, and similar rights.
Project-URL: homepage, https://github.com/tanaos/artifex
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE.md
License-File: LICENSE_EE.md
Requires-Dist: aiohttp>=3.13.5
Requires-Dist: cognitor>=0.1.0
Requires-Dist: datasets>=3.6.0
Requires-Dist: filelock>=3.20.3
Requires-Dist: jupyterlab>=4.4.8
Requires-Dist: nbconvert>=7.17.0
Requires-Dist: protobuf>=6.33.5
Requires-Dist: pygments>=2.20.0
Requires-Dist: rich>=14.1.0
Requires-Dist: sentencepiece>=0.2.1
Requires-Dist: synthex>=0.4.2
Requires-Dist: tiktoken>=0.12.0
Requires-Dist: torch>=2.8.0
Requires-Dist: tornado>=6.5.5
Requires-Dist: transformers[torch]==5.0.0
Requires-Dist: tzlocal>=5.3.1
Requires-Dist: urllib3>=2.6.3
Provides-Extra: evaluation
Requires-Dist: scikit-learn>=1.7.2; extra == "evaluation"
Dynamic: license-file

# Artifex

<p align="center">
    <a href="https://github.com/tanaos/artifex">
        <img src="https://raw.githubusercontent.com/tanaos/artifex/master/assets/hero.png" width="400px" alt="Artifex – Train task specific Small Language Models without training data, for offline NLP and Text Classification">
    </a>
</p>

<p align="center">
    <a href="https://pypi.org/project/artifex/">
        <img src="https://img.shields.io/pypi/dm/artifex" alt="Artifex - Monthly downloads">
    </a>
    <a href="https://pypi.org/project/artifex/">
        <img src="https://img.shields.io/pypi/v/artifex?logo=pypi&logoColor=%23fff&color=%23006dad&label=Pypi"
        alt="Artifex - Latest PyPI package version">
    </a>
    <a href="https://github.com/tanaos/artifex/actions/workflows/python-publish.yml">
        <img src="https://img.shields.io/github/actions/workflow/status/tanaos/artifex/python-publish.yml?logo=github&logoColor=%23fff&label=Tests"
        alt="Artifex - Tests status">
    </a>
    <a href="https://github.com/tanaos/artifex/commits/">
        <img src="https://img.shields.io/github/commit-activity/m/tanaos/artifex?style=flat&color=purple&label=Commit%20Activity" alt="Artifex – GitHub commit activity">
    </a>
    <a href="https://docs.tanaos.com/artifex/intro">
        <img src="https://img.shields.io/badge/%20Docs-Read%20the%20docs-orange?logo=docusaurus&logoColor=white"
        alt="Artifex - Documentation">
    </a>
</p>

<p align="center">
    <strong>Small Language Model Inference, Fine-Tuning and Observability. No GPU, no labeled data needed.</strong>
</p>

---

Artifex is a Python library for:
1. Using **pre-trained task-specific Small Language Models on CPU**.
2. **Fine-tuning them on CPU without any training data**, based only on your instructions for the task at hand.
    <details>
        <summary>How is it possible?</summary>
        Artifex generates synthetic training data on the fly from your instructions and uses it to fine-tune Small Language Models for your specific task, so you can build effective models without large labeled datasets.
    </details>
3. **Tracking model performance locally** with built-in evaluation and monitoring tools.

## Why Artifex?

Modern AI workflows are often:

- Expensive (API usage, GPUs)
- Dependent on third parties
- Data-hungry (require large labeled datasets)

Artifex changes that:

- Run tiny models locally on CPU (100M parameters, 500 MB)
- Keep all data private (no API required)
- Generate synthetic data automatically for fine-tuning
- Fine-tune models for specific tasks (moderation, NER, classification, etc.)

## Available Models & Tasks

At this time, Artifex supports the following tasks, each with a default model:

| Task | Available Languages | Description | Default Model | How to use |
|------|----------|-------------|---------------|------------|
| Guardrail | English, German, Spanish | Flags unsafe, harmful, or off-topic messages. | [tanaos/tanaos-guardrail-v2](https://huggingface.co/tanaos/tanaos-guardrail-v2) (English version, see the page for the other languages) | [Examples](https://docs.tanaos.com/artifex/guardrail/code-examples/) |
| Intent Classification | English | Classifies user messages into predefined intent categories. | [tanaos/tanaos-intent-classifier-v1](https://huggingface.co/tanaos/tanaos-intent-classifier-v1) | [Examples](https://docs.tanaos.com/artifex/intent-classifier/code-examples/) |
| Reranker | English | Ranks a list of items or search results based on relevance to a query. | [cross-encoder/mmarco-mMiniLMv2-L12-H384-v1](https://huggingface.co/cross-encoder/mmarco-mMiniLMv2-L12-H384-v1) | [Examples](https://docs.tanaos.com/artifex/reranker/code-examples/) |
| Sentiment Analysis | English | Determines the sentiment (positive, negative, neutral) of a given text. | [tanaos/tanaos-sentiment-analysis-v1](https://huggingface.co/tanaos/tanaos-sentiment-analysis-v1) | [Examples](https://docs.tanaos.com/artifex/sentiment-analysis/code-examples/) |
| Emotion Detection | English | Identifies the emotion expressed in a given text. | [tanaos/tanaos-emotion-detection-v1](https://huggingface.co/tanaos/tanaos-emotion-detection-v1) | [Examples](https://docs.tanaos.com/artifex/emotion-detection/code-examples/) |
| Named Entity Recognition | English | Detects and classifies named entities in text (e.g., persons, organizations, locations). | [tanaos/tanaos-NER-v1](https://huggingface.co/tanaos/tanaos-NER-v1) | [Examples](https://docs.tanaos.com/artifex/named-entity-recognition/code-examples/) |
| Text Anonymization | English | Removes personally identifiable information (PII) from text. | [tanaos/tanaos-text-anonymizer-v1](https://huggingface.co/tanaos/tanaos-text-anonymizer-v1) | [Examples](https://docs.tanaos.com/artifex/text-anonymization/code-examples/) |
| Spam Detection | English, German, Spanish, Italian | Identifies whether a message is spam or not. | [tanaos/tanaos-spam-detection-v1](https://huggingface.co/tanaos/tanaos-spam-detection-v1) (English version, see the page for the other languages) | [Examples](https://docs.tanaos.com/artifex/spam-detection/code-examples/) |
| Topic Classification | English | Classifies text into predefined topics. | [tanaos/tanaos-topic-classification-v1](https://huggingface.co/tanaos/tanaos-topic-classification-v1) | [Examples](https://docs.tanaos.com/artifex/topic-classification/code-examples/) |
| Text Summarization | English | Rewrites text in a more concise way. | [tanaos/tanaos-text-summarization-v1](https://huggingface.co/tanaos/tanaos-text-summarization-v1) | [Examples](https://docs.tanaos.com/artifex/text-summarization/code-examples/) |

Looking for models in other languages? Our [Enterprise License](#license-paid--enterprise-solutions) includes models in any language. Reach out at [info@tanaos.com](mailto:info@tanaos.com) for more details.

For each model, Artifex provides:
1. **Inference API** to use a default, pre-trained Small Language Model to perform that task out-of-the-box locally on CPU.
2. **Fine-tune API** to fine-tune the default model based on your requirements, without any training data and on CPU.
3. **Load API** to load your fine-tuned model locally on CPU, and use it for inference or further fine-tuning.
4. **Built-in, automatic evaluation** and monitoring tools to track model performance over time, locally on your machine.

## Quick Start

Install Artifex with:

```bash
pip install artifex
```
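
The package metadata declares `Requires-Python: >=3.10` and an optional `evaluation` extra that depends on scikit-learn. The following is a minimal sketch for checking an environment before first use. The helper name `check_environment` is hypothetical and not part of Artifex; it only uses the Python standard library.

```python
import sys
from importlib import metadata, util


def check_environment(package: str = "artifex") -> dict:
    """Report whether the interpreter and installed packages look usable.

    Hypothetical helper, not part of Artifex.
    """
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # the package is not installed in this environment

    return {
        # The package metadata requires Python 3.10 or newer.
        "python_ok": sys.version_info >= (3, 10),
        "package_version": version,
        # scikit-learn is imported as "sklearn"; only the "evaluation" extra needs it.
        "evaluation_extra_available": util.find_spec("sklearn") is not None,
    }


print(check_environment())
```

If `evaluation_extra_available` is `False` and you need the extra, standard pip syntax for installing it is `pip install "artifex[evaluation]"`.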

### Guardrail Model

#### Use the default Guardrail model

Use Artifex's default guardrail model, which is trained to flag unsafe or harmful messages out-of-the-box:

```python
from artifex import Artifex

guardrail = Artifex().guardrail()
print(guardrail("How do I make a bomb?"))

# >>> [{'is_safe': False, 'scores': {'violence': 0.625, 'non_violent_unethical': 0.0066, 'hate_speech': 0.0082, 'financial_crime': 0.0072, 'discrimination': 0.0029, 'drug_weapons': 0.6633, 'self_harm': 0.0109, 'privacy': 0.003, 'sexual_content': 0.0029, 'child_abuse': 0.005, 'terrorism_organized_crime': 0.1278, 'hacking': 0.0096, 'animal_abuse': 0.009, 'jailbreak_prompt_inj': 0.0131}}]
```

Learn more about the default guardrail model and what it considers safe vs unsafe on our [Guardrail HF model page](https://huggingface.co/tanaos/tanaos-guardrail-v2).
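
The result shown above is a list with one dictionary per message, each holding an `is_safe` flag and per-category `scores`. As a rough sketch, the categories that drove a decision can be pulled out like this. The helper `flagged_categories` and the `0.5` threshold are assumptions for illustration; the source does not state how `is_safe` is derived from the scores.

```python
def flagged_categories(result: dict, threshold: float = 0.5) -> list[tuple[str, float]]:
    """Return (category, score) pairs at or above `threshold`, highest score first.

    Hypothetical helper, not part of Artifex; the threshold is an assumption.
    """
    hits = [(name, score) for name, score in result["scores"].items() if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Sample output copied from the example above, shortened to a few categories.
sample = [{
    "is_safe": False,
    "scores": {
        "violence": 0.625,
        "hate_speech": 0.0082,
        "drug_weapons": 0.6633,
        "terrorism_organized_crime": 0.1278,
    },
}]

for result in sample:
    print(result["is_safe"], flagged_categories(result))
# prints: False [('drug_weapons', 0.6633), ('violence', 0.625)]
```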

#### Create & use a custom Guardrail model

Need more control over what is considered safe vs unsafe? Fine-tune your own guardrail model, use it locally on CPU and keep it forever:

```python
from artifex import Artifex

guardrail = Artifex().guardrail()

model_output_path = "./output_model/"

guardrail.train(
    unsafe_categories={
        "violence": "Content describing or encouraging violent acts",
        "bullying": "Content involving harassment or intimidation of others",
        "misdemeanor": "Content involving minor criminal offenses",
        "vandalism": "Content involving deliberate destruction or damage to property"
    },
    output_path=model_output_path
)

guardrail.load(model_output_path)
print(guardrail("I want to destroy public property."))

# >>> [{'is_safe': False, 'scores': {'violence': 0.592, 'bullying': 0.0066, 'misdemeanor': 0.672, 'vandalism': 0.772}}]
```

### Reranker Model

#### Use the default Reranker model

Use Artifex's default reranker model, which is trained to rank items based on relevance out-of-the-box:

```python
from artifex import Artifex

reranker = Artifex().reranker()

print(reranker(
    query="Best programming language for data science",
    documents=[
        "Java is a versatile language typically used for building large-scale applications.",
        "Python is widely used for data science due to its simplicity and extensive libraries.",
        "JavaScript is primarily used for web development.",
    ]
))

# >>> [('Python is widely used for data science due to its simplicity and extensive libraries.', 3.8346), ('Java is a versatile language typically used for building large-scale applications.', -0.8301), ('JavaScript is primarily used for web development.', -1.3784)]
```

#### Create & use a custom Reranker model

Need better accuracy on a specific domain? Fine-tune your own reranker model, use it locally on CPU and keep it forever:

```python
from artifex import Artifex

reranker = Artifex().reranker()

model_output_path = "./output_model/"

reranker.train(
    domain="e-commerce product search",
    output_path=model_output_path
)

reranker.load(model_output_path)
print(reranker(
    query="Laptop with long battery life",
    documents=[
        "A powerful gaming laptop with high-end graphics and performance.",
        "An affordable laptop suitable for basic tasks and web browsing.",
        "This laptop features a battery life of up to 12 hours, perfect for all-day use.",
    ]
))

# >>> [('This laptop features a battery life of up to 12 hours, perfect for all-day use.', 4.7381), ('A powerful gaming laptop with high-end graphics and performance.', -1.8824), ('An affordable laptop suitable for basic tasks and web browsing.', -2.7585)]
```
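
Fine-tuning on CPU takes time, so it can be useful to skip training when a model has already been saved. The sketch below wraps the `train(...)` and `load(...)` calls shown in the guardrail and reranker examples. The helper `train_or_load` is hypothetical, and it assumes that a non-empty output directory means a completed training run, which will not hold if an earlier run was interrupted.

```python
from pathlib import Path


def train_or_load(model, output_path: str, **train_kwargs):
    """Fine-tune `model` only if `output_path` is missing or empty, then load it.

    Hypothetical helper, not part of Artifex. Assumes `model` exposes
    `train(output_path=..., **kwargs)` and `load(path)` as in the examples above.
    """
    path = Path(output_path)
    # Assumption: any content in the directory is a finished model.
    already_trained = path.is_dir() and any(path.iterdir())
    if not already_trained:
        model.train(output_path=output_path, **train_kwargs)
    model.load(output_path)
    return model


# Intended usage, with Artifex installed:
#   from artifex import Artifex
#   reranker = train_or_load(
#       Artifex().reranker(),
#       "./output_model/",
#       domain="e-commerce product search",
#   )
```

Use a separate output directory per model, since the guardrail and reranker examples above both write to `./output_model/`.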

### Other Tasks

For more details and examples on how to use Artifex for the other available tasks, check out our [Documentation](https://docs.tanaos.com/artifex).

## License, Paid & Enterprise solutions

Artifex is [fair-code](https://faircode.io/) software distributed under the [Sustainable Use License](LICENSE.md) and the [Tanaos Enterprise License](LICENSE_EE.md).

- **Source available**: source code is always visible
- **Extensible**: you can add your own models and functionalities

[Enterprise licenses](LICENSE_EE.md) are available for additional features and support. Contact us at [info@tanaos.com](mailto:info@tanaos.com) for more details. Enterprise features include:

- **Higher-Performance Models**
    - Improved accuracy
    - Better handling of edge cases
    - Reduced false positives/negatives
- **Custom Models**
    - Models fine-tuned on your specific data and requirements
    - Support for any language, domain or task
- **Production-Ready Models**
    - Models trained on 1000x more data
    - 10x lower inference latency
- **Dedicated Support**
    - Priority support
    - Custom feature requests
    - Dedicated onboarding and training

Additional information about licensing can be found [in the docs](https://docs.tanaos.com/licenses).

## Contributing

Contributions are welcome! Whether it's a bug fix or a new feature you want to add, we'd love your help. Check out our [Contribution Guidelines](CONTRIBUTING.md) to get started.

## Documentation & Support

- Full documentation: https://docs.tanaos.com/artifex
- Get in touch: info@tanaos.com
