Metadata-Version: 2.4
Name: sinapsis-vllm
Version: 0.2.0
Summary: Sinapsis templates for text, batch, streaming, and multimodal generation with vLLM.
Author-email: SinapsisAI <dev@sinapsis.tech>
Project-URL: Homepage, https://sinapsis.tech
Project-URL: Documentation, https://docs.sinapsis.tech/docs
Project-URL: Tutorials, https://docs.sinapsis.tech/tutorials
Project-URL: Repository, https://github.com/Sinapsis-AI/sinapsis-chatbots.git
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: sinapsis>=0.2.26
Requires-Dist: sinapsis-chatbots-base
Requires-Dist: vllm>=0.15.0
Provides-Extra: bitsandbytes
Requires-Dist: bitsandbytes>=0.49.2; extra == "bitsandbytes"
Provides-Extra: multimodal
Requires-Dist: pillow>=11.3.0; extra == "multimodal"
Provides-Extra: sinapsis-data-readers
Requires-Dist: sinapsis-data-readers[opencv]>=0.1.28; extra == "sinapsis-data-readers"
Provides-Extra: all
Requires-Dist: sinapsis-vllm[bitsandbytes]; extra == "all"
Requires-Dist: sinapsis-vllm[multimodal]; extra == "all"
Requires-Dist: sinapsis-vllm[sinapsis-data-readers]; extra == "all"
Dynamic: license-file

<h1 align="center">
<br>
<a href="https://sinapsis.tech/">
  <img
    src="https://github.com/Sinapsis-AI/brand-resources/blob/main/sinapsis_logo/4x/logo.png?raw=true"
    alt="" width="300">
</a>
<br>
Sinapsis vLLM
<br>
</h1>

<h4 align="center">Sinapsis templates for text, batch, streaming, and multimodal generation with vLLM.</h4>

<p align="center">
<a href="#installation">🐍 Installation</a> •
<a href="#features">🚀 Features</a> •
<a href="#example">📚 Usage example</a> •
<a href="#documentation">📙 Documentation</a> •
<a href="#license">🔍 License</a>
</p>

The `sinapsis-vllm` package provides Sinapsis templates built on top of [vLLM](https://github.com/vllm-project/vllm) for high-throughput text generation, batch inference, streaming, and vision-language completion.

<h2 id="installation">🐍 Installation</h2>

Install using your preferred package manager. We strongly recommend using <code>uv</code>.

Install the base package:

```bash
uv pip install sinapsis-vllm --extra-index-url https://pypi.sinapsis.tech
```

Or with raw <code>pip</code>:

```bash
pip install sinapsis-vllm --extra-index-url https://pypi.sinapsis.tech
```

> [!IMPORTANT]
> `vLLMMultiModal` requires the optional `multimodal` extra:
>
> ```bash
> uv pip install sinapsis-vllm[multimodal] --extra-index-url https://pypi.sinapsis.tech
> ```
>
> Or install all optional dependencies:
>
> ```bash
> uv pip install sinapsis-vllm[all] --extra-index-url https://pypi.sinapsis.tech
> ```

<h2 id="features">🚀 Features</h2>

<h3>Templates Supported</h3>

- **vLLMTextCompletion**: Standard packet-native text generation with vLLM.
- **vLLMBatchTextCompletion**: Batched generation for multiple conversations in one execution pass.
- **vLLMStreamingTextCompletion**: Async streaming variant that yields partial packets during generation.
- **vLLMMultiModal**: Vision-language completion using image packets from `container.images`.

<details>
<summary><strong><span style="font-size: 1.25em;">🧩 Common Attributes</span></strong></summary>

- `init_args` (`vLLMInitArgs`, required): Engine configuration such as `llm_model_name`, `tensor_parallel_size`,
  `dtype`, `quantization`, `trust_remote_code`, `max_model_len`, and memory settings.
- `completion_args` (`vLLMCompletionArgs`, required): Request-time generation parameters such as `temperature`,
  `top_p`, `top_k`, `min_p`, `max_tokens`, penalties, and structured-output settings.
- `cleanup_response` (`bool`, optional): Whether to clean leaked transport or template markers from the final response.
- `reasoning_start_tag` / `reasoning_end_tag` (`str | None`, optional): Optional tags used to extract reasoning into the
  packet `reasoning` field and keep the final answer in `response`.

</details>

> [!TIP]
> Use CLI command ``` sinapsis info --all-template-names``` to show a list with all the available Template names installed with Sinapsis vLLM.

<h2 id="example">📚 Usage example</h2>

The following agent runs one vLLM text-completion step from an `LLMConversationPacket`.

<details id='usage'><summary><strong><span style="font-size: 1.0em;"> Config</span></strong></summary>

```yaml
agent:
  name: my_test_agent

templates:
  - template_name: InputTemplate
    class_name: InputTemplate
    attributes: {}

  - template_name: LLMConversationInput
    class_name: LLMConversationInput
    template_input: InputTemplate
    attributes:
      prompt: "Give three short tips for staying organized during a busy week."
      system_prompt: "You are a helpful assistant."

  - template_name: vLLMTextCompletion
    class_name: vLLMTextCompletion
    template_input: LLMConversationInput
    attributes:
      init_args:
        llm_model_name: "Qwen/Qwen3-1.7B"
        max_model_len: 4096
        dtype: auto
        seed: 42
        gpu_memory_utilization: 0.9
        cpu_offload_gb: 0
        max_num_seqs: 8
        disable_log_stats: true
      completion_args:
        temperature: 0.7
        top_p: 0.8
        top_k: 20
        min_p: 0
        max_tokens: 2048
        seed: 42

```

</details>

<h2 id="documentation">📙 Documentation</h2>

Documentation for this and other sinapsis packages is available on the [sinapsis website](https://docs.sinapsis.tech/docs)

Tutorials for different projects within sinapsis are available at [sinapsis tutorials page](https://docs.sinapsis.tech/tutorials)

<h2 id="license">🔍 License</h2>

This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the [LICENSE](LICENSE) file.

For commercial use, please refer to our [official Sinapsis website](https://sinapsis.tech) for information on obtaining a commercial license.
