Metadata-Version: 2.4
Name: llm_summary
Version: 0.6.0
Summary: Use an LLM to summarize paragraphs. 
Author: Paul Eger
Project-URL: homepage, https://github.com/egerpaulj/LlmSummary
Project-URL: repository, https://github.com/egerpaulj/LlmSummary.git
Requires-Python: >=3.10.16
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: typing
Requires-Dist: pydantic
Requires-Dist: ollama
Requires-Dist: pika
Requires-Dist: prometheus_client
Requires-Dist: psutil
Requires-Dist: pymongo
Requires-Dist: mlflow
Requires-Dist: openai
Requires-Dist: google-genai
Requires-Dist: tenacity
Dynamic: license-file

# LLM Summary

Use an LLM to summarize text.

The purpose of the library is to summarize large texts so that the core information is captured in the summary.

**Input**: Paragraphs of text

**Output**: Summary

## Install and use library

```bash
pip install llm_summary
``` 

```python
from llm_summary.inference_api.summary_inference import SummaryInferenceProvider

text = "us business leaders ...."

inference_provider = SummaryInferenceProvider(model="llama3.2") 
summary = inference_provider.summarize(text)

print(summary)

```

Output:
```
Summary inference completed in 13.31 seconds

summary="Business leaders are donating large sums to Donald Trump's second inaugural fund, with predicted total donations exceeding $107m"
``` 


## Environment setup

### Local LLM - Ollama

Run a local LLM using Ollama.

- Download and install OLLAMA:
https://ollama.com/download

- Download the LLM model (pulled automatically on first run):

```bash
ollama run gemma3:12b
```


## Example
Model: **gemma3:27b**


```python
from llm_summary.inference_api.summary_inference import SummaryInferenceProvider

text = "us business leaders ...."

summarizer = SummaryInferenceProvider(
    model="gemma3:27b",
    strategy="ollama",
    ollama_host="http://localhost:11434",
)

result = summarizer.summarize(text=text)
summary_text = result.summary
```

## Prompt management

Prompts can vary between models and providers. Version-controlling and annotating the prompts enables faster evaluation.

MLflow provides a prompt repository and is used to fetch the user prompt and the system prompt.

E.g. prompts for news article summaries

```bash
-e MLFLOW_SYSTEM_PROMPT_ID=SUMMARY_System/@article \
-e MLFLOW_USER_PROMPT_ID=SUMMARY_User/@article \
```
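
The prompt IDs above follow a `NAME/@alias` convention. A minimal helper to split such an ID into its name and alias parts (a hypothetical sketch; the library may resolve these IDs differently internally):

```python
import os


def parse_prompt_id(prompt_id: str) -> tuple[str, str]:
    """Split a 'NAME/@alias' prompt ID into (name, alias)."""
    name, _, alias = prompt_id.partition("/@")
    return name, alias


# Read the IDs from the environment, falling back to the article prompts.
system_id = os.environ.get("MLFLOW_SYSTEM_PROMPT_ID", "SUMMARY_System/@article")
user_id = os.environ.get("MLFLOW_USER_PROMPT_ID", "SUMMARY_User/@article")

print(parse_prompt_id(system_id))  # ('SUMMARY_System', 'article')
```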



## Data pipelines - Consumers

The following data pipeline example shows how existing MongoDB documents can be decorated with a summary.

### MongoDB Summary Decorator

In this example, text is read from a MongoDB collection, a summary is generated, and the MongoDB entry is updated with the summary.

The container reads the collection in batches and generates summaries.

```bash
docker run -e OLLAMA_MODEL=gemma3:12b \
           -e OLLAMA_HOST=http://ollama:11434 \
           -e MONGODB_CONNECTION_STRING=mongodb://mongodb:27017 \
           -e MLFLOW_SYSTEM_PROMPT_ID=SUMMARY_System/@article \
           -e MLFLOW_USER_PROMPT_ID=SUMMARY_User/@article \
           -e MLFLOW_TRACKING_HOST=http://mlflow:5000 \
           --network development_network \
           --rm \
           --name summarizer \
           --hostname summarizer \
           mongodb-summarizer
```
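
The batch loop inside such a container might look like the following sketch. The field names (`text`, `summary`), the batch size, and the `summarize` callable are illustrative assumptions, not the library's actual implementation; the functions are duck-typed so any pymongo `Collection` can be passed in:

```python
BATCH_SIZE = 50  # assumed batch size


def build_update(summary: str) -> dict:
    """MongoDB update document that attaches a summary to an entry."""
    return {"$set": {"summary": summary}}


def decorate_batch(collection, summarize) -> int:
    """Summarize up to BATCH_SIZE documents that lack a 'summary' field.

    `collection` is expected to behave like a pymongo Collection;
    `summarize` is any callable mapping text to a summary string.
    Returns the number of documents updated.
    """
    updated = 0
    query = {"summary": {"$exists": False}}  # only docs without a summary yet
    for doc in collection.find(query).limit(BATCH_SIZE):
        collection.update_one({"_id": doc["_id"]}, build_update(summarize(doc["text"])))
        updated += 1
    return updated
```

In the real pipeline, `summarize` would wrap `SummaryInferenceProvider.summarize`, and `collection` would come from a `pymongo.MongoClient` built with `MONGODB_CONNECTION_STRING`.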



### Dependencies

Ollama should be running in a Docker container on the same network, or a network route to an Ollama instance should be available.

Run the following for a CPU-only Ollama instance:

```bash
docker run -d --rm \
    -v /usr/share/ollama/.ollama:/root/.ollama \
    -p 11434:11434 \
    --network development_network \
    --name ollama \
    --hostname ollama  \
    ollama/ollama

```

Note: replace the path **/usr/share/ollama/.ollama** with the host's Ollama model path.


## Evaluation

Given the sheer number of models and inference providers, evaluation is necessary.

Evaluation allows methodical selection of hyperparameters:
- User Prompt (based on model, text source and model provider)
- System Prompt (based on model, text source and model provider)
- Model

### Evaluation criteria
The summary should contain the 'key information' extracted from the larger text; what counts as key information can be subjective due to the following:

- The context (i.e. the subject matter, and how the summary is used)
- The definition of 'key information'
- The writing style of the summary

### LLM as a judge

Create an LLM judge that scores the generated summaries.

### Evaluate the LLM as a judge
The LLM judge is central to evaluating models and prompts, so the judge itself should achieve a strong score.

The LLM judge is evaluated using the following metrics:

**LLM Judge F1-Score**:
- True positives (**TP**): the summary is "similar" to the original
- False positives (**FP**): the summary contains additional irrelevant information
- True negatives (**TN**): the summary does not include trivial information
- False negatives (**FN**): the summary is missing significant information



### Precision

Proportion of the summarized content that is correct.

High precision means that when the model summarizes, the summary is likely to be correct.


$$
\begin{aligned}
\text{Precision} &= \frac{TP}{TP + FP}
\end{aligned}
$$

Where:
- $TP$ = True Positives
- $FP$ = False Positives

### Recall 

Proportion of the significant information that the summary captures.

High recall means the model has captured the significant information from the text.

$$
\begin{aligned}
\text{Recall} &= \frac{TP}{TP + FN}
\end{aligned}
$$

Where:
- $TP$ = True Positives
- $FN$ = False Negatives

### F1 Score
$$
\begin{aligned}
F1 &= 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
\end{aligned}
$$
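
As a quick sanity check, the three scores can be computed directly from the judge's counts (the example counts are illustrative only):

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of positive judgments that were correct."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of significant information that was captured."""
    return tp / (tp + fn)


def f1(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)


# Illustrative judge counts
print(f1(tp=80, fp=10, fn=20))  # ≈ 0.842
```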


## Experiments

Publish the evaluation metrics for each "experiment".

MLflow is a good tool for the analysis.


## Metrics

Below are some evaluations; they are based on the following dimensions:
- LLM model
- Quality of summarization
- Local vs. cloud
- Estimated costs



### Conclude the evaluation

Pick the most significant metric, short-list the models, and then do a cost analysis.

To evaluate, a weight can be given to each metric.

E.g. load your metrics and use a ranker to evaluate:

**(e.g. Accuracy: 1, Percentage: 2, Distance: 3 and Latency: 1)**
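
A weighted ranker along those lines might look like the following sketch. The metric names and weights are the illustrative ones above; the per-model values are hypothetical and assumed to be pre-normalized to [0, 1] with higher meaning better. Real values would come from the MLflow experiments:

```python
# Illustrative weights from the example above
WEIGHTS = {"accuracy": 1, "percentage": 2, "distance": 3, "latency": 1}


def score(metrics: dict[str, float]) -> float:
    """Weighted sum of normalized metrics; higher is better."""
    return sum(WEIGHTS[name] * value for name, value in metrics.items())


def rank(candidates: dict[str, dict[str, float]]) -> list[str]:
    """Return model names, best first, by weighted score."""
    return sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)


# Hypothetical per-model metrics, each pre-normalized to [0, 1]
candidates = {
    "gemma3:27b": {"accuracy": 0.9, "percentage": 0.8, "distance": 0.7, "latency": 0.4},
    "llama3.2": {"accuracy": 0.8, "percentage": 0.7, "distance": 0.6, "latency": 0.9},
}
print(rank(candidates))  # ['gemma3:27b', 'llama3.2']
```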





## License

Copyright (C) 2026  Paul Eger

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>.
