Metadata-Version: 2.4
Name: discourse-sim
Version: 0.1.1
Summary: LLM-augmented Agent-Based Social Simulation for modelling public discourse dynamics following a critical real-world event.
Author: dreji18
License: MIT
Project-URL: Homepage, https://github.com/dreji18/discourse_sim
Project-URL: Repository, https://github.com/dreji18/discourse_sim
Project-URL: Bug Tracker, https://github.com/dreji18/discourse_sim/issues
Keywords: agent-based-modelling,social-simulation,LLM,public-discourse,opinion-dynamics,ABM,polarisation,NLP
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Sociology
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Requires-Dist: networkx>=3.0
Requires-Dist: tqdm>=4.64
Requires-Dist: pandas>=2.0
Requires-Dist: langchain-core>=0.1
Requires-Dist: langchain-community>=0.0.20
Requires-Dist: duckduckgo-search>=4.0
Provides-Extra: ollama
Requires-Dist: langchain-ollama>=0.1; extra == "ollama"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Dynamic: license-file
Dynamic: requires-python

# Discourse Simulator

**LLM-augmented Agent-Based Social Simulation** for modelling public discourse dynamics following a critical real-world event.

Agents observe live news, reason about it, post to a simulated social network, and update multidimensional beliefs through peer influence and news salience, all powered by a local Ollama LLM.

---

## Installation

### 1. Install discourse-sim

```bash
pip install discourse-sim
```

### 2. Install Ollama

discourse_sim runs LLMs locally via [Ollama](https://ollama.com). You need to install it separately.

**macOS**
```bash
brew install ollama
```
Or download the desktop app from [ollama.com/download](https://ollama.com/download).

**Linux**
```bash
curl -fsSL https://ollama.com/install.sh | sh
```

**Windows**

Download and run the installer from [ollama.com/download](https://ollama.com/download).

### 3. Start the Ollama server

```bash
ollama serve
```

> With the macOS desktop app, the server starts automatically. On Linux or Windows, you may need to run `ollama serve` in a separate terminal or set it up as a service.

### 4. Pull the default model

```bash
ollama pull mistral:7b-instruct-q4_0
```

This is the default model (~4 GB). You can use any other Ollama model by passing `ollama_model="..."` to `DiscourseSimulation`.

**Alternative models (the ~2 GB options suit limited disk space):**
```bash
ollama pull mistral          # ~4 GB  - good balance
ollama pull llama3.2         # ~2 GB  - faster, slightly less capable
ollama pull phi3             # ~2 GB  - Microsoft, very efficient
```

### 5. Verify everything works

```bash
ollama run mistral:7b-instruct-q4_0 "Hello"
```

You should see a response. If you do, you are ready to run simulations.

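**All dependencies:** if an import fails, install everything manually. `langchain-ollama` (the `ollama` extra) and `openpyxl` (needed for `df.to_excel` in the Quick Start) are not core requirements, so a plain `pip install discourse-sim` does not pull them in.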
```bash
pip install langchain-core langchain-community langchain-ollama \
            duckduckgo-search networkx numpy tqdm pandas openpyxl
```

---

## Quick Start

```python
from discourse_sim import DiscourseSimulation

sim = DiscourseSimulation(
    critical_event=(
        "In late April 2025, Dublin saw a large anti-immigration march "
        "from the Garden of Remembrance to the Custom House, with thousands "
        "protesting immigration levels and housing pressure..."
    ),
    event_date="2025-04-26",
    topic="immigration in Ireland",
    n_agents=100,
    n_days=15,
)

sim.run()
df = sim.to_dataframe()   # one row per agent per day
df.to_excel("output.xlsx", index=False)
```

---

## All Parameters

### Required

| Parameter | Type | Description |
|---|---|---|
| `critical_event` | `str` | Plain-text description of the event injected into every agent prompt on every day |
| `event_date` | `str` | ISO date `"YYYY-MM-DD"` - this becomes Day 0 |
| `topic` | `str` | Subject domain, e.g. `"immigration in Ireland"`. Anchors search queries and scoring prompts |

### Agent Population

| Parameter | Type | Default | Description |
|---|---|---|---|
| `n_agents` | `int` | `100` | Number of synthetic agents |
| `n_days` | `int` | `15` | Simulation duration in days |
| `agent_distribution` | `dict` | Ireland 2025 empirical split | Proportions of each agent kind - must sum to 1.0 |
| `kind_priors` | `dict` | `DEFAULT_KIND_PRIORS` | Prior distributions for belief/attitude initialisation per kind |

**Default distribution (Ireland 2025):**
```python
{"centrist": 0.45, "far_right": 0.20, "pro_imm": 0.25, "media": 0.10}
```
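
Because the proportions must sum to 1.0, it can help to check a custom split before passing it in. The following is a minimal sketch of such a check, plus one possible way to turn proportions into whole agent counts (largest-remainder rounding). `counts_from_distribution` is a hypothetical helper; the package's own validation and rounding may differ.

```python
import math


def counts_from_distribution(distribution: dict[str, float], n_agents: int) -> dict[str, int]:
    """Turn proportions into whole agent counts that add up to n_agents."""
    total = sum(distribution.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"proportions sum to {total}, expected 1.0")

    exact = {kind: share * n_agents for kind, share in distribution.items()}
    counts = {kind: math.floor(value) for kind, value in exact.items()}

    # Hand out the agents lost to rounding down, largest fractional part first.
    leftover = n_agents - sum(counts.values())
    by_remainder = sorted(exact, key=lambda kind: exact[kind] - counts[kind], reverse=True)
    for kind in by_remainder[:leftover]:
        counts[kind] += 1
    return counts


default_split = {"centrist": 0.45, "far_right": 0.20, "pro_imm": 0.25, "media": 0.10}
print(counts_from_distribution(default_split, n_agents=100))
```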

### Timeline

| Parameter | Type | Default | Description |
|---|---|---|---|
| `timeline` | `dict[int, str]` | `None` | Optional explicit daily news entries keyed by 0-based day index. Missing days are auto-generated. If `None`, all days use live search. |

**Example:**
```python
timeline={
    0: "[VERIFIED] The march took place at the Garden of Remembrance...",
    4: "[VERIFIED] Government announced a deportation flight...",
}
```

Any day not in the dict falls back to an auto-generated entry. Agents search live news on every day regardless of timeline.

### LLM & Network

| Parameter | Type | Default | Description |
|---|---|---|---|
| `ollama_model` | `str` | `"mistral:7b-instruct-q4_0"` | Any locally pulled Ollama model |
| `temperature` | `float` | `0.75` | Post generation temperature |
| `use_llm_scoring` | `bool` | `True` | If `False`, uses keyword heuristic for attitude scoring |
| `network_k` | `int` | `6` | Watts-Strogatz nearest-neighbour count |
| `network_p` | `float` | `0.3` | Watts-Strogatz rewiring probability |
| `network_seed` | `int` | `42` | Reproducibility seed for network and agents |
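
To get a feel for what `network_k` and `network_p` control, a small-world graph with the default values can be built directly with `networkx`. This is a sketch only: it assumes the plain `nx.watts_strogatz_graph` generator, which may not be the exact call the package makes internally.

```python
import networkx as nx

# Mirrors the documented defaults: n_agents=100, network_k=6, network_p=0.3, network_seed=42.
graph = nx.watts_strogatz_graph(n=100, k=6, p=0.3, seed=42)

# Rewiring moves edges but does not add or remove any, so the edge count stays n * k / 2.
print(graph.number_of_nodes(), graph.number_of_edges())

# Higher p lowers clustering and shortens paths; p=0 gives a regular ring lattice.
print(round(nx.average_clustering(graph), 3))
print(nx.is_connected(graph))
```

A larger `network_k` gives each agent more peers to be influenced by, while a larger `network_p` replaces local ties with long-range ones.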

### Belief Update Keywords

| Parameter | Type | Default | Description |
|---|---|---|---|
| `threat_keywords` | `list[str]` | See `config.py` | Words in daily news that trigger `security_threat_belief` increase and `exposure` accumulation |
| `humanitarian_keywords` | `list[str]` | See `config.py` | Words that trigger `humanitarian_belief` increase |

Override these when simulating non-immigration topics:
```python
threat_keywords=["eviction", "homeless", "unaffordable", "crisis"],
humanitarian_keywords=["social housing", "rights", "family", "support"],
```
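
Before committing to a keyword list, it can be useful to test it against a sample news entry. The sketch below counts distinct keywords using case-insensitive whole-word matching. Both the `keyword_hits` helper and the matching rule are assumptions for illustration; the package may match keywords differently (for example by substring).

```python
import re


def keyword_hits(news: str, keywords: list[str]) -> int:
    """Count how many distinct keywords occur in the news text, ignoring case."""
    text = news.lower()
    return sum(
        1
        for keyword in keywords
        if re.search(rf"\b{re.escape(keyword.lower())}\b", text)
    )


news = "Eviction notices rise as housing crisis deepens; families seek support."
threat = ["eviction", "homeless", "unaffordable", "crisis"]
humanitarian = ["social housing", "rights", "family", "support"]

print(keyword_hits(news, threat), keyword_hits(news, humanitarian))
```

Note that whole-word matching does not count "families" as a hit for "family", which is one reason to try a list on real headlines first.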

---

## Outputs

```python
sim.run()

# Single DataFrame - one row per agent per day (recommended for analysis)
df = sim.to_dataframe()

# All three DataFrames
dfs = sim.to_dataframes()
dfs["history"]   # one row per timestep - aggregate metrics
dfs["agents"]    # one row per agent per timestep - full belief state
dfs["messages"]  # one row per agent per timestep - posts + beliefs merged
```
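
A typical first analysis is the mean attitude per agent kind over time. The sketch below uses a small hand-made DataFrame as a stand-in for `sim.to_dataframe()`, assuming only the `t`, `kind`, and `attitude` columns from the column reference; the values are made up.

```python
import pandas as pd

# Toy stand-in for sim.to_dataframe(); only the columns used below are included.
df = pd.DataFrame(
    {
        "t": [0, 0, 0, 1, 1, 1],
        "kind": ["centrist", "far_right", "centrist", "centrist", "far_right", "centrist"],
        "attitude": [0.1, 0.8, -0.1, 0.2, 0.9, 0.0],
    }
)

# Mean attitude per day and agent kind: days as rows, kinds as columns.
trajectory = df.pivot_table(index="t", columns="kind", values="attitude", aggfunc="mean")
print(trajectory)
```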

### Column reference - `dfs["messages"]`

| Column | Description |
|---|---|
| `t` / `date` | Day index and calendar date |
| `agent_id` / `kind` / `quirk` | Agent identity |
| `message` | The generated social media post |
| `interpreted_score` | LLM-scored stance [-1, +1] |
| `attitude` | Agent attitude at end of day t |
| `mood` | Agent mood at end of day t |
| `exposure` | Cumulative threat narrative exposure |
| `economic_threat_belief` | Economic threat sub-belief |
| `cultural_threat_belief` | Cultural threat sub-belief |
| `security_threat_belief` | Security threat sub-belief |
| `humanitarian_belief` | Humanitarian weight sub-belief |
| `avg_attitude` | Population mean attitude that day |
| `polarization` | Mean edge-level attitude divergence |
| `news_summary` | First 120 chars of daily news entry |

---

## Custom Agent Kinds

Add any agent kind by extending `DEFAULT_KIND_PRIORS`:

```python
from discourse_sim import DiscourseSimulation
from discourse_sim.config import DEFAULT_KIND_PRIORS

custom_priors = {
    **DEFAULT_KIND_PRIORS,
    "nationalist": {
        "attitude":        (0.55, 0.95),
        "economic_threat": (0.2,  0.6),
        "cultural_threat": (0.7,  1.0),
        "humanitarian":    (-0.6, 0.0),
        "openness":        (0.1,  0.35),
        "emotional_react": (0.5,  0.85),
    },
}

sim = DiscourseSimulation(
    ...,
    agent_distribution={"centrist": 0.35, "nationalist": 0.30, "pro_imm": 0.35},
    kind_priors=custom_priors,
)
```
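
Each prior above is a pair of numbers. One plausible reading, assumed here and not confirmed by the package, is that the pair is a `(low, high)` range from which each agent's starting value is drawn uniformly. `sample_initial_state` below is a hypothetical helper that shows what such a draw would look like.

```python
import random

nationalist_priors = {
    "attitude": (0.55, 0.95),
    "economic_threat": (0.2, 0.6),
    "cultural_threat": (0.7, 1.0),
    "humanitarian": (-0.6, 0.0),
    "openness": (0.1, 0.35),
    "emotional_react": (0.5, 0.85),
}


def sample_initial_state(
    priors: dict[str, tuple[float, float]], rng: random.Random
) -> dict[str, float]:
    """Draw one value per trait, uniformly within its (low, high) range (assumed semantics)."""
    return {trait: rng.uniform(low, high) for trait, (low, high) in priors.items()}


rng = random.Random(42)  # fixed seed so repeated runs give the same agent
state = sample_initial_state(nationalist_priors, rng)
print({trait: round(value, 3) for trait, value in state.items()})
```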

---

## Architecture

```
discourse_sim/
├── __init__.py          # Public API: DiscourseSimulation
├── core.py              # DiscourseSimulation class - orchestrator
├── config.py            # SimConfig dataclass - all parameters + validation
├── agents/
│   └── agent.py         # Agent dataclass + make_agents() factory
├── tools/
│   └── search_tools.py  # make_tools() - search, memory, sentiment
├── timeline/
│   └── timeline.py      # Timeline class - user-supplied + auto-generated
├── simulation/
│   └── engine.py        # generate_message(), score_message(), update_beliefs(), run_loop()
└── utils/
    └── dataframes.py    # build_dataframes() - history/agents/messages
```

**Per-day flow for each agent:**

```
OBSERVE  →  search_event_news(query)          # live DuckDuckGo
         →  recall_agent_memory(posts_json)   # last 5 posts

THINK    →  build prompt: event + profile + memory + search result + today's news

ACT      →  LLM call → post text (max 40 words)

SCORE    →  LLM call (temp=0.0) → float in [-1, +1]

UPDATE   →  news salience → sub-beliefs
         →  peer pull     → attitude pressure
         →  mood shock    → affective state
         →  inertia       → resistance to change
         →  composite     → attitude(t)
```
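
The SCORE step has to turn free-form model output into a bounded float. A minimal sketch of one way to do that, assuming the model's reply contains a number somewhere in its text; `parse_score` is a hypothetical helper, not the package's `score_message()`.

```python
import re

# First signed integer or decimal found in the reply.
_NUMBER = re.compile(r"[-+]?\d*\.?\d+")


def parse_score(reply: str, default: float = 0.0) -> float:
    """Extract the first number from an LLM reply and clamp it to [-1, +1]."""
    match = _NUMBER.search(reply)
    if match is None:
        return default  # no number at all: fall back to a neutral stance
    return max(-1.0, min(1.0, float(match.group())))


for reply in ["Score: -0.7", "+1.5", "0.25 because the post is mildly supportive", "no idea"]:
    print(repr(reply), parse_score(reply))
```

Clamping guards against replies outside the expected range, and the neutral default keeps a single malformed reply from stopping a multi-day run.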

---

## Examples

| File | Description |
|---|---|
| `examples/dublin_march.py` | Full reproduction of the original Dublin April 2025 experiment |
| `examples/minimal_usage.py` | Minimal usage - no timeline, live search only |
| `examples/custom_agent_kinds.py` | Adding a custom `nationalist` kind with its own priors |

---

## Citation

If you use this package in research, please cite:

```bibtex
@software{discourse_sim,
  title  = {discourse\_sim: LLM-Augmented Agent-Based Social Simulation of Discourse Dynamics},
  author = {dreji18},
  year   = {2026},
  note   = {Python package}
}
```
