Metadata-Version: 2.4
Name: deal-intel-mcp
Version: 0.2.3
Summary: Self-owned MCP deal intelligence layer for early B2B teams.
Author: JrJuni
License-Expression: MIT
Project-URL: Homepage, https://github.com/JrJuni/deal-intel-mcp
Project-URL: Repository, https://github.com/JrJuni/deal-intel-mcp
Project-URL: Issues, https://github.com/JrJuni/deal-intel-mcp/issues
Keywords: mcp,sales,crm,b2b,deal-intelligence,mongodb,meddpicc
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Office/Business
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastmcp>=2.0
Requires-Dist: pymongo>=4.7
Requires-Dist: dnspython>=2.0
Requires-Dist: anthropic>=0.28
Requires-Dist: httpx>=0.27
Requires-Dist: python-dotenv>=1.0
Requires-Dist: pydantic>=2.7
Requires-Dist: pypdf>=4.2
Requires-Dist: pyyaml>=6.0
Requires-Dist: tzdata>=2024.1
Requires-Dist: typer>=0.12
Provides-Extra: embedding
Requires-Dist: sentence-transformers>=2.7; extra == "embedding"
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Dynamic: license-file

﻿# deal-intel-mcp

**English** | [Korean](README.ko.md)

Deal Intelligence MCP is a self-owned deal memory layer for solo founders,
early AI teams, and lightweight BD teams.

It turns meeting notes, customer email replies, interviews, call summaries, and
internal notes into structured deal signals: what is known, what is missing,
which accounts need attention, and what should be reviewed before the next
conversation.

For mature sales organizations, it is not a Salesforce replacement. For early
teams whose current system is scattered notes, spreadsheets, email threads, and
memory, it can act as a lightweight first CRM-like layer: your data, your
MongoDB or local storage, your LLM provider, queried from Claude Desktop,
Codex, or another MCP-capable host.

The default operating path is MongoDB Atlas-backed `full` mode, including the
free/M0 tier. A bundled no-MongoDB `sample` mode exists for AI agents, quick
evaluation, and demos, but real team use should start from `full`.

Start here:

- Setup with an AI assistant: [`AI_START_HERE.md`](AI_START_HERE.md).
- Longer full-mode walkthrough: [`AI_FULL_INSTALL_GUIDE.md`](AI_FULL_INSTALL_GUIDE.md).
- Public demo prompts: [`docs/public-demo-script.md`](docs/public-demo-script.md).
- First external tester handoff: [`AI_USER_TEST_GUIDE.md`](AI_USER_TEST_GUIDE.md).
- Fork/customize the project: [`docs/extending.md`](docs/extending.md),
  [`docs/customization-recipes.md`](docs/customization-recipes.md), then
  [`docs/architecture.md`](docs/architecture.md).
- Korean full-mode walkthrough: [`AI_FULL_INSTALL_GUIDE.ko.md`](AI_FULL_INSTALL_GUIDE.ko.md).

---

## What it does

- Stores deal records and customer evidence in your MongoDB Atlas database, or
  in local sample/personal storage for zero-config trials.
- Converts messy customer evidence into structured deal fields, health signals,
  follow-up gaps, customer themes, and weekly review artifacts.
- Lets you add seller-side product/solution context, such as ICP notes,
  positioning, pricing notes, security claims, integrations, and competitor
  notes, so interaction extraction can understand your product better.
- Lets an AI host answer normal questions such as "which deal needs attention
  first?", "what are customers worried about?", or "make this week's pipeline
  report".
- Keeps read-only BI, review, reporting, and export paths deterministic and
  LLM-free, so the host app can narrate results without extra server-side model
  calls.
- Exposes diagnostics (`config_doctor`, storage checks, usage/cost summaries)
  instead of hiding setup and data-quality problems.

## What it is not

- It is not an autonomous closer. It structures evidence; you still make the
  sales judgment.
- It is not yet a mature CRM suite with permissions, contact ownership,
  calendar/email sync, workflow automation, or enterprise integrations.
- It is not a hosted SaaS that owns your deal data. The normal full path uses
  your MongoDB Atlas project and your selected LLM provider.
- It does not claim revenue lift numbers. If the data is incomplete, it says so.

## Architecture At A Glance

This is not just a prompt wrapper. It is a small deal-intelligence backend
exposed through MCP:

```text
[AI host: Claude / Codex / ChatGPT]
        |
        v
[MCP tool surface]
        |
        v
[Domain service layer]
  |-- deal and interaction intake
  |-- qualification extraction
  |-- product context retrieval
  |-- metrics, gaps, themes, reports
  `-- export and diagnostics
        |
        v
[Storage and retrieval]
  |-- MongoDB Atlas full mode
  |-- local sample/personal mode
  `-- Atlas Vector Search pro mode
```

A normal write path looks like this:

```text
1. A meeting note, email reply, interview, or call summary enters through add_interaction.
2. The server-side LLM extracts structured deal signals from customer evidence.
3. Raw evidence, source metadata, and derived qualification fields are stored.
4. Deal summaries and product-context chunks can be embedded for retrieval.
5. Read-only tools compute metrics, gaps, reports, themes, and exports without extra LLM calls.
6. The MCP host narrates the final answer to the user.
```

Important boundaries:

- MCP is the interface layer; Claude, Codex, or ChatGPT can be the UI.
- Customer evidence is separated from derived intelligence such as health,
  gaps, themes, and qualification snapshots.
- LLM-heavy paths are mostly write/enrichment paths. Read/report/export paths
  are deterministic where possible.
- Product context is seller-side RAG context. It helps interpretation but is
  not counted as customer-stated evidence.
- `sample -> full -> pro` is the scaling path: zero-config trial, MongoDB-backed
  real data, then Atlas Vector Search when paid infrastructure is intentional.
- Tool surfaces are profile-filtered so normal users, sample users, and
  maintainers do not need the same visible tool set.

For the deeper module map, read [`docs/architecture.md`](docs/architecture.md).
For fork/customization entry points, read [`docs/extending.md`](docs/extending.md).

## Live demo

Two ways to look at the deal data you've accumulated.

### 1. MongoDB Atlas Charts - Weekly Pipeline Review

![Atlas Charts Weekly Pipeline Review dashboard](docs/images/atlas-dashboard.png)

Active/Attention deal counts, pipeline value by stage, qualification health-band distribution, gap distribution, and open pipeline value - all on one screen. Each chart's aggregation pipeline is generated with the `render-atlas-dashboard` CLI and pasted into Atlas Charts (see the "Atlas Charts Dashboard" section below).

### 2. Claude / Codex in-chat rendered analysis

![Claude in-chat rendered dashboard](docs/images/chat-dashboard.png)

It takes the raw MCP tool output and renders win rate, the stage funnel, qualification gap patterns, data-quality coverage, and attention items right inside the conversation. It all starts from pasting in a single meeting note - no extra app.

### Cost-aware LLM boundary

The MCP server should calculate, store, and retrieve structured deal data. The
host app - Claude Desktop, Codex, or ChatGPT - should usually explain that data
to the user.

Server-side LLM calls are reserved for workflows that create persistent
structured intelligence, especially `add_interaction`, `analyze_deal`, and
theme backfills. Read-only BI/review/reporting tools are designed to be
LLM-free so the host app can do the final narration without extra API cost.

> Company names and figures in these screens are all fictional demo data.

---

## How deal health works

Internally, the project uses an active qualification framework to turn messy
evidence into comparable deal signals. MEDDPICC is the bundled default B2B
framework, and QF-v2 adds guarded custom framework support so teams can copy a
preset and adapt the criteria to their own sales motion. You do not need to be
a sales expert to use the tool; the user-facing output is framed as health,
missing information, risk signals, customer themes, and next questions.

| Dimension | What it measures |
|---|---|
| **M**etrics | The quantified impact the customer expects (ROI, cost-reduction %) |
| **E**conomic Buyer | Who actually holds budget authority |
| **D**ecision Criteria | What the vendor-selection criteria are |
| **D**ecision Process | How the internal approval process runs |
| **I**dentify Pain | The customer's core problem and its urgency |
| **C**hampion | Whether you have an internal advocate |
| **C**ompetition | How you compare with competitors and the status quo |

When you add customer evidence, the server-side LLM extracts these signals and
stores the structured result. In `full` mode, real deal data persists in
MongoDB Atlas and powers pattern analysis. In optional `sample` mode, the same
read/review surfaces run against bundled fictional data so AI agents can
evaluate the tool without setup.

---

## Product / solution context

Customer evidence and product knowledge are intentionally separate.

- Customer evidence is what prospects said in meetings, emails, interviews, or
  calls. It can affect qualification, customer themes, and deal health.
- Product context is seller-side knowledge: your ICP notes, product facts,
  pricing/packaging notes, security posture, integrations, differentiators,
  competitive notes, or disqualifiers. It helps the extraction prompt interpret
  customer evidence, but it is not counted as customer-stated evidence.

There are two normal ways to add product context.

1. Put files in a folder and tell the server where that folder is:

```text
Use update_config(product_context_source_dirs="C:\path\to\product-docs")
Then run index_product_context(dry_run=true)
If the preview looks right, run index_product_context(dry_run=false)
Finally, run get_product_context(query="security posture for healthcare")
```

2. Paste product/solution text into the host app and save it as a managed note:

```text
Use add_product_context_note(title="Healthcare security positioning",
content="...", dry_run=true)
If the preview looks right, call it again with dry_run=false and
confirmed_by_user=true.
Then run index_product_context and verify with get_product_context.
```

The first parser set supports `txt`, `md`, `json`, `csv`, `pdf`, and `docx`.
Presentation and spreadsheet files (`pptx`, `xlsx`) currently return warnings
and are planned for a later parser pass.

Product context is stored and cached locally under
`~/.deal-intel/product-context` by default. Tool responses return snippets and
source metadata, not full raw documents. Large catalog PDFs are supported through
configurable source-file and chunk limits; check `index_product_context`
warnings for `partial_indexed` when a very large file was only partly indexed.
Files or pasted notes with
secret-shaped content are rejected or skipped.

---

## Product profiles

One repo, one package, three operating profiles:

| Profile | Use it for | Requires |
|---|---|---|
| `full` | Real team data on MongoDB Atlas | `MONGODB_URI`, plus ChatGPT OAuth or an API key for LLM tools |
| `sample` | Zero-config AI evaluation, demos, and lightweight local personal use | Python package only |
| `pro` | Paid-infra upgrade with Atlas Vector Search and API-key LLMs | Atlas M10+, `deal_summary_vector` index, `OPENAI_API_KEY` by default |

Start humans in `full`. Use `sample` only when the user explicitly wants a
zero-config trial, or when an AI agent needs to confirm the basic workflow
before asking for MongoDB. It begins with bundled fictional data; once you
create your own local deal, the bundled fixture is archived from the working
view and your local personal dataset becomes the active dataset. Some search
and LLM-heavy paths remain limited in sample mode. Move to `pro` only when paid
infrastructure is intentional.

`pro` defaults to `openai_api` with `gpt-5.4-mini` for lower API cost pressure.
You can still override `llm.openai_api_model` or switch `llm.provider` to
`anthropic` in user config.

MCP tools are profile-filtered by default:

- `sample`: zero-config/local personal tools
- `standard`: normal real-data tools
- `developer`: all registered tools, including demo seed/cleanup helpers

Use `get_tool_catalog` or `config_doctor` to inspect the current visible count
for your profile. Use `tools.surface: developer` or
`DEAL_INTEL_TOOLS_SURFACE=developer` only when you intentionally want the full
maintainer/debug surface.

---

## Fork And Customize

Fork this if your sales process is too specific for a generic CRM, but too
important to live only in notes, spreadsheets, and memory.

The repo is designed as a customizable MCP deal intelligence engine, not only a
fixed demo app. Useful fork paths include:

- early B2B SaaS or AI teams that need structure before adopting a heavy CRM;
- RevOps-minded developers who want BANT, SPICED, or their own qualification
  framework instead of the bundled MEDDPICC default;
- MCP workflow builders experimenting with chat-first deal operations;
- consulting, SI, or agency teams that want meeting, proposal, and risk memory
  across accounts.

Common extension seams:

- qualification frameworks and scoring criteria;
- profile and tool-surface visibility;
- storage backends and MongoDB operational contracts;
- server-side LLM providers and cost tracking;
- report/export formats;
- product/solution context parsers;
- new MCP tools for team-specific workflows.

This project is MIT-licensed. You may use, copy, modify, merge, publish,
distribute, sublicense, and sell modified versions, subject to the license
terms. Keep the license and attribution notices when redistributing a fork.

For implementation entry points, read:

- [`docs/extending.md`](docs/extending.md) - extension seams and contracts;
- [`docs/customization-recipes.md`](docs/customization-recipes.md) - practical
  fork recipes;
- [`docs/architecture.md`](docs/architecture.md) - module ownership, tool
  ownership, and change playbooks.

---

## Install Overview

The normal product path is `full`: MongoDB Atlas-backed real deal data, an MCP
host, and one LLM credential path for extraction.

If you are a non-developer, the easiest route is to ask your AI assistant to
read [`AI_START_HERE.md`](AI_START_HERE.md). It will walk you through Python,
Node.js, MongoDB Atlas, and Claude/Codex/ChatGPT setup.

### Prerequisites

- Python 3.11+ in a conda environment
- Node.js 18+ for the no-git-clone `npx` bootstrapper
- One MCP chat client: Claude Desktop, or Codex/ChatGPT with MCP support
- For `full`: MongoDB Atlas account, Free/M0 cluster, and `MONGODB_URI`
- For LLM extraction/scoring: ChatGPT OAuth from a compatible subscription,
  `ANTHROPIC_API_KEY`, or `OPENAI_API_KEY`

Short version:

```text
Install Python and Node.js, prepare a MongoDB Atlas M0 connection string,
connect an MCP host, and choose one LLM path. Then use npx or an editable
Python install.
```

### npx install path

For most users, start here:

```bash
npx deal-intel-mcp setup --python /path/to/python
npx deal-intel-mcp mcp-config
```

Use the Python path printed by `mcp-config` in Claude Desktop MCPB. Set
`MONGODB_URI` through the MCPB install form, `.env`, or your shell environment.
On Windows, use `npx.cmd` if PowerShell blocks `npx`.

Detailed install guides:

- [`AI_START_HERE.md`](AI_START_HERE.md) - canonical AI-agent setup path,
  including npx, editable install, full/sample choice, and first checks
- [`AI_FULL_INSTALL_GUIDE.md`](AI_FULL_INSTALL_GUIDE.md) - longer full-mode
  walkthrough for non-developer users
- [`mcpb/README.md`](mcpb/README.md) - Claude Desktop MCP bundle

### Git clone / customization path

Clone or download this repository when you want to inspect or modify prompts,
reports, storage, qualification frameworks, or MCP tools. From the repository
root:

```bash
~/miniconda3/envs/deal-intel/python.exe -m pip install -e ".[embedding]"
```

Replace the Python path with the interpreter where you want the package
installed. Adding `[embedding]` installs local semantic-search dependencies.

After install, check the effective config:

```bash
deal-intel config profiles
deal-intel config show
```

### Readiness check

Run these before troubleshooting deeper issues:

```bash
deal-intel config doctor --offline
deal-intel smoke-profile --profile full --offline
```

When network access to Atlas is available, run a live storage ping:

```bash
deal-intel storage-status
```

Use `config_doctor` from the MCP host after installing the bundle. It is the
first recovery tool when paths, profiles, MongoDB, or LLM readiness are unclear.

### Optional zero-config smoke

```bash
$env:DEAL_INTEL_STORAGE_BACKEND='local_sample'
deal-intel smoke-profile --profile sample
deal-intel storage-status
deal-intel smoke-natural-questions --as-of 2026-06-10
```

Use this only for zero-config evaluation. It starts with bundled fictional data
and does not require MongoDB, paid APIs, or Atlas Vector Search.

`full` starts with the user's own MongoDB data and does not auto-seed sample
records. If you want a richer Atlas-backed demo, switch to the `developer` tool
surface and run `create_sample_data`; it writes 22 fictional generated deals to
the configured demo database, not to the primary real-data database.

---

## Zero-config sample mode (no MongoDB)

If you only want to test the BI and deal-review flows, you can run the bundled
fictional sample dataset without MongoDB Atlas, API keys, or Atlas Vector
Search.

Temporary PowerShell session:

```powershell
$env:DEAL_INTEL_STORAGE_BACKEND='local_sample'
& "$HOME\miniconda3\envs\deal-intel\python.exe" -m deal_intel.cli storage-status
& "$HOME\miniconda3\envs\deal-intel\python.exe" -m deal_intel.cli smoke-natural-questions --as-of 2026-06-10
```

Persistent sample profile:

```bash
deal-intel config init --profile sample --dry-run
deal-intel config init --profile sample
```

Sample mode is intentionally limited, but it is no longer purely read-only.
Core dashboard, reporting, customer-theme, deal-review, create/update/stage,
interaction-ingestion, and lifecycle flows can run against local personal data.
`add_interaction` requires a ready LLM provider and works on user-created local
deals; use `interaction_type: meeting` for meeting notes. Local sample mode
skips embedding storage and does not expose raw content in list/BI/report
paths. Semantic `search_deals`, Atlas Charts, and shared team operation still
belong to MongoDB-backed `full` or `pro` mode.

Local personal data defaults to `~/.deal-intel/local-data` and can be changed
with `storage.local_data_dir`.

Useful local-data commands:

```bash
deal-intel local-data status
deal-intel local-data export
deal-intel local-data reset          # dry-run
deal-intel local-data reset --force  # clears local deals, preserves delete audit logs
deal-intel local-data migrate-to-mongo          # dry-run
deal-intel local-data migrate-to-mongo --apply  # writes local deals to MongoDB
```

The bundled fictional fixture is immutable. After local personal deals exist,
the fixture is hidden from active reads instead of being mixed with your data.
The dry-run-first local-to-MongoDB migration command lets users graduate from
sample/local mode to `full` without retyping deals. It migrates only
user-created local personal deals, never bundled fixture records. If no local
personal deals exist yet, the dry-run returns immediately and skips MongoDB
target readiness checks.

---

## Tool guide

The detailed guide below focuses on the core user-facing workflow. For the
complete current tool contract, read [`docs/baseline.md`](docs/baseline.md).

> **Tip**: In Claude Desktop, type the example sentences below verbatim or say something similar. You can find a `deal_id` with `create_deal` or `list_deals`.

---

### 1. `create_deal` - create a new deal

**When to use**: Run this first when you start engaging a new prospect.

**Example**:
```
Create a new deal for Hyundai Precision. Manufacturing industry, deal size 200M KRW.
```

**Parameters**:
| Parameter | Required | Description |
|---|---|---|
| `company` | required | Customer company name |
| `industry` | optional | True business vertical (e.g., "Manufacturing", "Finance", "Retail") |
| `industry_tags` | optional | Additional vertical tags for cross-industry accounts. The primary `industry` is automatically included |
| `customer_segment` | optional | Customer segment or maturity label (e.g., "startup", "enterprise", "public_sector", "Series B", "Pre-IPO") |
| `deal_size_amount` | optional | Median expected contract size in `deal_size_currency` units (e.g., 200000000) |
| `deal_size_currency` | optional | ISO-style 3-letter currency code. Defaults to `deal_value.default_currency` (`KRW` by default) |
| `deal_size_status` | required when an amount is given | Amount status: `unknown`, `rough_estimate`, `customer_budget`, `quoted`, `strategic_zero` |
| `deal_size_low_amount` / `deal_size_high_amount` | optional | Estimate range. Omitted -> treated as equal to the median in metrics |
| `deal_size_note` | optional | Rationale for the amount classification, or a user memo |
| `expected_close_date` | optional | Expected close date. Omitted -> config default applies |

**Example result**:
```json
{
  "ok": true,
  "deal_id": "a3f9...",
  "company": "Hyundai Precision",
  "industry": "Manufacturing",
  "industry_tags": ["Manufacturing"],
  "customer_segment": "enterprise",
  "deal_size_amount": 200000000,
  "deal_size_currency": "KRW",
  "deal_size_status": "rough_estimate",
  "expected_close_date": "2026-06-15",
  "expected_close_date_source": "config_default"
}
```

Remember this `deal_id`, or look it up later with `list_deals`.

If you omit the expected close date, a default of 7 days after creation is filled in. This is an operational default, not a confirmed schedule. A date you provide always takes precedence over config.

When you enter a deal amount, you must also set its status. If unknown, leave `deal_size_status="unknown"` and leave the amount blank. If only a positive amount is given, the tool asks which basis applies - sales estimate / customer budget / quote sent. If only 0 is given, it doesn't save immediately and asks whether it's a strategic free/reference deal or an undecided amount. If undecided, it's saved as `unknown` with the amount blanked. An intentional zero-value deal (free sample, reference win) is saved with `deal_size_amount=0` and `deal_size_status="strategic_zero"`. If you heard a customer budget or sent a quote, use `customer_budget` or `quoted` so it counts as a validated pipeline value in metrics. Mixed currencies are not silently summed; metric and report outputs expose currency fields or per-currency breakdowns.

```yaml
deal_value:
  default_currency: KRW

pipeline:
  expected_close:
    default_days: 7
    days_by_segment:
      public_sector: 60
      enterprise: 28
    days_by_industry:
      Government: 60
      Manufacturing: 28

reporting:
  timezone: Asia/Seoul
```

Keep `industry` as the single primary business vertical. If an account is
cross-industry, put the other verticals in `industry_tags`; the primary
`industry` is always included in that tag list. Put account maturity,
ownership, buying segment, or funding stage in `customer_segment` instead.
Industry input is normalized against the built-in taxonomy when possible, so
values such as `제조`, `핀테크`, or `보험·금융·대기업` are stored as canonical
metadata such as `industry=Insurance`, `industry_tags=["Insurance", "Finance"]`,
and `customer_segment=enterprise`. Segment overrides apply first; industry
overrides apply second, both on a case-insensitive exact match. Auto-dates use
the business date in the reporting timezone, while stored audit timestamps stay
in UTC.

For existing data, use the taxonomy cleanup CLIs:

```bash
deal-intel audit-taxonomy
deal-intel apply-taxonomy-cleanup
deal-intel apply-taxonomy-cleanup --apply --confirmed-by-user
deal-intel backfill-industry-tags
deal-intel backfill-industry-tags --apply --confirmed-by-user
```

`audit-taxonomy` is read-only. `apply-taxonomy-cleanup` and
`backfill-industry-tags` are dry-run by default. They automatically normalize
recognizable mixed labels into primary industry, industry tags, and customer
segment. If an industry is missing, the tool treats it as an enrichment task:
it either drafts a medium-confidence industry from the company name or returns a
web research query so the AI client can look it up and call `update_deal`. The
default UX is draft-first and correction-friendly; only impossible rows stay out
of writes.

---

### 2. `add_interaction` - add a customer interaction

**When to use**: Right after a customer meeting, email reply, user interview, call summary, or internal note. Paste the content as-is and the server-side LLM extracts active-framework qualification signals and customer themes with source-aware scoring. MEDDPICC is the default built-in framework.

Cost note: this is intentionally one of the few places where the MCP server
uses its own LLM provider, because the extracted result is persisted as product
data. For explanation-only questions, prefer read tools such as
`get_deal_review`, `get_deal_gaps`, and `get_metrics`; let Claude/Codex explain
their deterministic output.

Meeting notes are just `interaction_type: meeting`. Older `meetings` records
are still read as legacy fallback, but new integrations should write canonical
`interactions` records through this tool.

**Example**:
```
Add today's (2026-06-08) meeting note to Hyundai Precision, deal_id: a3f9...
Use interaction_type=meeting and direction=inbound.

Notes:
Met Director Kim (purchasing decision-maker). Current production-line defect rate
is 3.2%, causing ~1.5B KRW/yr loss. Our solution targets <=1.5%. Manager Park is in
favor internally. Competitor A is under review but costs 2x. Internal approval due
end of June.
```

**Parameters**:
| Parameter | Required | Description |
|---|---|---|
| `deal_id` | required | Target deal ID |
| `date` | required | Interaction date (YYYY-MM-DD) |
| `interaction_type` | required | `meeting`, `email_thread`, `user_interview`, `call_summary`, `internal_note`, or a configured custom type |
| `direction` | required | `inbound`, `outbound`, `mixed`, or `internal` |
| `content` | required | Raw interaction content (Korean or English both fine) |
| `participants` | optional | Names/emails/roles if known |
| `subject` | optional | Email/call/meeting subject |
| `source_confidence` | optional | Override source confidence when needed |

**What the result includes**:
- `qualification` - active-framework scores + evidence extracted from this interaction
- `qualification_latest` - the deal's cumulative health_pct + per-dimension trend
- `meddpicc` / `meddpicc_latest` - compatibility aliases used when the active framework is MEDDPICC or legacy records are read
- `summary` - a 2-3 sentence LLM-generated summary
- `customer_themes` - customer concerns / selection criteria extracted from this interaction
- `scoring_applied` - whether this source updated qualification health/customer themes
- `source_policy` - why this source was treated as confirmed evidence or stored as unconfirmed context
- `stage_suggestion` - filled only when the content explicitly implies a stage transition (e.g., contract signed -> won, lost deal -> lost); otherwise `null`
- `embedding_stored` - whether the similar-deal-search embedding was stored

Source-aware scoring is deliberately conservative:

- `direction=inbound` defaults to `source_confidence=customer_stated`, so explicit customer replies, interviews, and meeting notes can update qualification/customer themes.
- `direction=outbound` defaults to `source_confidence=outbound_unconfirmed`, so seller-only emails are stored but do not improve health scores.
- `interaction_type=internal_note` or `direction=internal` defaults to `source_confidence=internal`, so internal hypotheses stay out of confirmed scoring.
- `direction=mixed` is allowed for email threads or calls with both sides represented; only explicit customer statements should be treated as evidence.

> **The stage never changes automatically.** Even if the content says "contract signed," `add_interaction` does not change the stage directly - it only **suggests** via `stage_suggestion`. When Claude asks "shall I move this deal to won?", `update_stage` makes the actual change after you confirm. This is a deliberate separation to prevent wrong auto-closing.

---

### 3. `get_deal` - view deal details

**When to use**: To check a specific deal's full history, qualification scores, and interaction records.

**Example**:
```
Show me the full Hyundai Precision deal. deal_id is a3f9...
```

You get stored interactions, any legacy meeting records, per-interaction
qualification extraction, and the cumulative health_pct. Legacy
`meddpicc_latest` fields are still returned for compatibility when present.

---

### 4. `update_stage` - change the pipeline stage

**When to use**: When a deal moves to the next stage or the outcome is finalized.

```text
update_stage(deal_id, new_stage, actual_close_date="")
```

**Example**:
```
Move the Hyundai Precision deal to the proposal stage.
```

When moving to `won` or `lost`, you can specify the actual close date as `YYYY-MM-DD`. Omitted -> the processing day is stored. `expected_close_date` stays as the forecast, and `stage_history.entered_at` is the system audit time, kept distinct from the actual close date. Moving a closed deal back to an open stage clears `actual_close_date`.

**Stages** (in order):
```
discovery -> qualification -> proposal -> negotiation -> won / lost / stalled
```

**What the result includes**:
- `actual_close_date` - the real close date for Won/Lost
- `days_in_previous_stage` - how long it spent in the previous stage
- `stuck_threshold_days` - the stuck threshold for a new Active stage; otherwise `null`
- Qualification gaps are recomputed per stage (e.g., under the default MEDDPICC framework, a drop in Identify Pain at the proposal stage is not a gap - it's a positive signal that the pain is being resolved)

---

### 5. `update_deal` - fix an existing deal's amount or confirmed metadata

**When to use**: When an existing deal's `deal_size_status` is missing, to save after the user confirms customer-budget / quote / strategic-zero, or to correct confirmed metadata such as company, industry, industry tags, customer segment, and close dates.

This tool stays intentionally narrow. It can update deal-value fields and selected metadata, but it does not change pipeline stage, interactions, meetings, contacts, or raw notes. Stage transitions still go through `update_stage`.

**Example**:
```
The existing ArcanaGames deal has evidence of a closed contract, so save it as quoted.
Note the rationale as "CEO said let's sign today and paid same-day."
```

**Required conditions**:
- `confirmed_by_user=true`
- Value updates require `deal_size_note` with the user's confirmation rationale or meeting evidence
- Metadata updates require `update_note` or a fallback `deal_size_note`

Value edits are logged to `deal_value_history`; metadata edits are logged to
`deal_metadata_history`. Recognizable mixed industry labels are normalized into
primary industry, `industry_tags`, and `customer_segment`; unmapped labels should
be corrected with an explicit confirmed update.

---

### 6. `list_deals` - see all deals at a glance

**When to use**: When you want the whole pipeline at a glance. Good for a weekly review.

**Example**:
```
Show me all deals, stuck ones first.
```

Or a specific stage only:
```
Show me only the deals in the proposal stage.
```

**Result**:
- `health_pct` - overall qualification score (0-100; MEDDPICC by default)
- `gaps` - list of weak, low-scoring dimensions
- `is_stuck` - whether the Active-stage dwell time exceeds the per-stage threshold
- `is_overdue` / `overdue_days` - whether an Open deal passed its expected close date
- `attention_reasons` - multiple reasons: `stalled`, `overdue`, `stuck`, `at_risk`
- `days_in_stage` - days spent in the current stage
- `data_quality` - per-deal missing/invalid/estimated fields and overall coverage
- `as_of`, `timezone`, `generated_at` - reporting base date and generation time

Specify `as_of="YYYY-MM-DD"` to re-run date-based calculations against the same base date. Stuck deals sort to the top.

---

### 7. `analyze_deal` - optional generated BD strategy

**When to use**: Only when you explicitly want the server-side LLM to write a BD strategy memo or persist `bd_strategy` back onto the deal.

For routine deal status, risk, uncertainty, and next-question review, prefer
`get_deal_review`. For "what information are we missing?" use
`get_deal_gaps`. Those read paths are deterministic, LLM-free, and cheaper.

**Example**:
```
Generate a BD strategy memo for the Hyundai Precision deal.
```

The result includes:
- a summary of current qualification health
- concrete responses per weak dimension
- a recommended agenda for the next meeting

When product context has been indexed, `analyze_deal` may use bounded
seller-side snippets to improve product-fit and positioning advice. It stores
only product-context reference metadata with the generated strategy, not raw
product documents.
- persisted `bd_strategy` when the tool succeeds

---

### 8. `get_metrics` - current pipeline-health KPIs

**When to use**: For instant BI questions in Claude/Codex like "how's pipeline health right now'", "how many at-risk deals'", "show pipeline value and health by stage."

This is the default read tool for numeric pipeline answers. Do not use
`list_deals` to hand-calculate KPIs, and do not use `get_insights` unless the
question is about a legacy/special BI pattern such as win/loss comparison or
stage velocity.

Supported metric types are `pipeline_health` and `pipeline_trend`.

**Parameters**:
| Parameter | Required | Description |
|---|---|---|
| `metric_type` | optional | `pipeline_health` or `pipeline_trend` |
| `stage` | optional | Exact match against the stored stage |
| `industry` | optional | Exact match against the stored industry |
| `as_of` | optional | Base date for stuck/overdue calculation, `YYYY-MM-DD` |

**What the result includes**:
- `kpis`: active/open/stalled/terminal count, open value, avg health, coverage, stuck/overdue, attention count
- `stage_breakdown`: count/value/health/stuck/overdue in canonical stage order
- `health_bands`: healthy/watch/at_risk/unassessed counts
- `attention_reasons`: stalled/overdue/stuck/at_risk reason counts and unique attention-deal count
- `pipeline_values`, `win_rate`, `data_quality`, `warnings`

The BI path uses no LLM and no embedding. Raw notes, contacts, and vectors are excluded from the metric read path.

**Example**:
```
Tell me the current pipeline health
```
```
Show pipeline health for the proposal stage only
```
```
Show stuck/overdue status for IT-industry deals
```

---

### 9. `get_deal_gaps` - surface the customer intel you're still missing

**When to use**: When you want to know what you still need to confirm before pursuing, forecasting, or reviewing a deal.

This is not a table-completeness checker. It prioritizes missing or weak information by practical sales impact and forecast trust. It is read-only, uses no LLM, uses no embedding, and excludes raw notes, raw interaction content, contacts, and vectors.

Use this for missing-information questions across the pipeline or for a single
deal. Use `get_deal_review` when the user wants a broader one-deal status/risk
review.

**Parameters**:
| Parameter | Required | Description |
|---|---|---|
| `as_of` | optional | Base date for stuck/overdue calculation, `YYYY-MM-DD` |
| `stage` | optional | Exact match against the stored stage |
| `industry` | optional | Exact match against the stored industry |
| `deal_id` | optional | Exact deal id. Returns that deal regardless of `min_priority` and `limit` |
| `min_priority` | optional | `low`, `medium`, or `high`. Defaults to `medium` |
| `limit` | optional | 1 to 50. Defaults to 10 |

**What the result includes**:
- `summary`: deal count, gap-deal count, priority counts, gap status/type counts
- `deals`: prioritized deals with health band, attention reasons, priority score, and gaps
- each gap includes reason, suggested question, and recommended action

**Example**:
```
What important customer information are we missing for active deals'
```
```
Show high-priority gaps for negotiation deals
```
```
For this deal_id, what should I confirm next'
```

---

### 10. `export_report` - generate a pipeline report

**When to use**: When you need a file to share or for a meeting, like "make this week's pipeline report."

Use `export_report` for manager/team meeting reports and narrative pipeline
briefings. For chat-only KPI answers, use `get_metrics` instead. For
spreadsheet-ready CSV ledgers, use `export_data` instead.

**Parameters**:
| Parameter | Required | Description |
|---|---|---|
| `report_type` | optional | `weekly_pipeline` or `pipeline_trend`; defaults to `weekly_pipeline` |
| `output_dir` | optional | Save path. Omitted -> `reporting.output_dir` or `~/.deal-intel/reports`; relative paths are scoped under `~/.deal-intel/` |
| `stage` | optional | Exact match against the stored stage |
| `industry` | optional | Exact match against the stored industry |
| `as_of` | optional | Base date for stuck/overdue calculation, `YYYY-MM-DD` |

**What the result includes**:
- `csv_path`, `markdown_path`: absolute paths of the generated files
- `artifacts`: report artifact filename, path, encoding
- `metrics`, `warnings`, `row_count`
- `briefing`, `briefing_sections`: compact meeting-ready summary text
- `host_report_prompt`: a safe prompt the host app can use to polish the
  deterministic data pack into a more natural manager/team report

The deterministic report data pack uses no LLM and no embedding. Host apps may
use that data pack to produce more polished meeting prose, but should not
change numbers, company names, stages, amounts, health scores, or warning codes.

**Example**:
```
Make this week's pipeline report
```
```
Export the proposal stage only as a weekly pipeline report
```

---

### 11. `export_data` - export Excel/CSV-ready deal ledgers

**When to use**: When the user asks for raw-but-safe CSV data, Excel records,
open deal tables, all-deal ledgers, or won/lost postmortem rows.

Use `export_data` for spreadsheet work. It is not a narrative report tool.

**Datasets**:
| Dataset | Purpose |
|---|---|
| `open_deals` | Active/stalled pipeline ledger with health, timing, attention, gaps, pain, and decision criteria |
| `all_deals` | Full safe deal ledger for filtering and recordkeeping |
| `closed_deals` | Won/lost ledger with close metadata and postmortem fields |

**Parameters**:
| Parameter | Required | Description |
|---|---|---|
| `dataset` | optional | `open_deals`, `all_deals`, or `closed_deals`; defaults to `open_deals` |
| `output_dir` | optional | Save path. Omitted -> `reporting.data_output_dir`, `reporting.output_dir`, or `~/.deal-intel/reports`; relative paths are scoped under `~/.deal-intel/` |
| `stage` | optional | Exact match against the stored stage |
| `industry` | optional | Exact match against the stored primary industry |
| `as_of` | optional | Base date for stuck/overdue calculation, `YYYY-MM-DD` |

`export_data` excludes raw notes, raw email/interview/call content, contacts,
and embeddings. It writes UTF-8 BOM CSV and guards spreadsheet formula
injection.

**Example**:
```
Export the open deal ledger as CSV
```
```
Create a won/lost CSV for postmortem review
```

---

### 12. `get_usage` - inspect server-side LLM usage

**When to use**: When you want to know how much server-side LLM work this MCP
has performed, such as token counts, call counts, and safe cost estimates.

This is read-only. It never returns prompts, raw notes, raw emails, API keys,
OAuth tokens, or MongoDB URIs. ChatGPT OAuth is shown as subscription-backed
with zero incremental API estimate. API-provider costs are estimated only when
you configure `usage.pricing`.

**Example**:
```
Show my Deal Intelligence MCP usage this month.
```
```
Show usage since 2026-06-01.
```

---

### Atlas Charts Dashboard - `Weekly Pipeline Review`

When you'd rather see it on screen than as CSV/Markdown, use the Atlas Charts dashboard. The dashboard aggregation spec and setup runbook are in [`docs/atlas-charts.md`](docs/atlas-charts.md).

Render command:

```bash
~/miniconda3/envs/deal-intel/python.exe -m deal_intel.cli render-atlas-dashboard --as-of 2026-06-09 --output outputs/atlas_charts/weekly_pipeline_review_20260609.json
```

To paste a single chart into the Atlas Query bar:

```bash
~/miniconda3/envs/deal-intel/python.exe -m deal_intel.cli render-atlas-dashboard --as-of 2026-06-09 --chart-id pipeline_kpis
```

The six managed chart ids are `pipeline_kpis`, `stage_breakdown`, `health_bands`, `attention_deals`, `qualification_gap_distribution`, and legacy-compatible `meddpicc_gap_distribution`.

Cross-check the dashboard numbers:

```bash
~/miniconda3/envs/deal-intel/python.exe -m deal_intel.cli crosscheck-weekly-dashboard --as-of 2026-06-09 --output-dir outputs/m3_3_crosscheck
```

---

### 13. `get_insights` - legacy/special BI analysis

**When to use**: To aggregate all deal data and spot patterns. Good for monthly reviews and learning win/loss patterns.

Prefer `get_metrics` for current pipeline-health KPIs. Prefer customer theme
tools for customer concerns, decision criteria, and evidence. `get_insights`
remains useful for special BI variants such as win/loss comparison, gap
frequency, industry benchmark, and stage velocity.

You can specify `as_of`; the response includes `timezone` and a UTC `generated_at`. These label the current collection snapshot - they don't reconstruct historical document state.

**Seven analysis types**:

| query_type | What it tells you |
|---|---|
| `pipeline_overview` | Deal count / avg health / total size by stage |
| `win_patterns` | Average legacy/default-framework scores of Won deals |
| `loss_patterns` | Average legacy/default-framework scores of Lost deals |
| `compare_won_lost` | Per-dimension score gap between Won and Lost |
| `gap_frequency` | The dimensions most often missing in active deals |
| `industry_benchmark` | Avg health / win rate / deal size by industry |
| `stage_velocity` | Average dwell days per stage |

**Example**:
```
Show me the whole pipeline overview.
```
```
What's the qualification pattern difference between deals we win and deals we lose'
```
```
Which dimension is most often missing'
```

---

### 14. `search_deals` - semantic similar-deal search

**When to use**: When you want to reference how past deals in similar situations played out. Search in natural language.

Do not use semantic search for frequency/ranking questions such as "what do
customers worry about most?" Use `get_customer_themes` for that. `search_deals`
is for similar-case retrieval in Mongo-backed mode.

**Example**:
```
Find deals where the customer struggled with cost reduction.
```
```
Show deals with a strong champion and a clear decision structure.
```
```
Any deals with a pattern similar to Hyundai Precision'
```

**How it works**:
1. Convert the query into a 384-dim vector
2. Compute cosine similarity against every deal's meeting-summary vector
3. Return them sorted by similarity, highest first

**What the result includes**:
- `score` - similarity (0-1, higher = more similar)
- `deal_stage`, `health_pct`, `gaps` - the deal's current state

> The local embedding model warms up in the background at server start. While it's loading, `warming_up: true` is returned, so retry after 5 seconds. After 30+ seconds it switches to a stalled error.

---

### 15. `get_customer_themes` - frequency of customer concerns / selection criteria

**When to use**: To group meeting evidence across deals and see the topics customers worry about most. It counts by unique deal (not by meeting) and returns representative companies and evidence.

Customer themes are intentionally a 3-step workflow:

1. `get_customer_themes` ranks recurring concerns or decision criteria.
2. `get_customer_theme_breakdown` compares those themes by stage, primary
   industry, industry tag, or theme dimension.
3. `get_customer_theme_evidence` shows privacy-safe snippets for one known
   `theme_key`.

For "show me examples" follow up with `get_customer_theme_evidence`. For
stage/industry/tag comparison, use `get_customer_theme_breakdown`.

**Example**:
```
Show the top 5 things customers worried about most across active deals.
```
```
Tell me the most frequent themes and evidence in Decision Criteria.
```

**Filters**:
- `dimension`: `all`, `identify_pain`, `decision_criteria`, `metrics`
- `stage`: `active`, `all`, or an individual deal stage
- `industry`: primary industry or `industry_tags` match
- `top_k`: up to 20

For cross-industry accounts, keep pipeline and forecast metrics on the single
primary `industry`, then use Customer Themes with the `industry` filter or
`get_customer_theme_breakdown(group_by="industry_tag")` to see semantic
industry-tag groupings.

To backfill themes onto existing data, run this first:

```bash
~/miniconda3/envs/deal-intel/python.exe -m deal_intel.cli backfill-customer-themes --apply
```

`backfill-customer-themes` is a maintenance/migration command for historical
meeting records. It may call the configured server-side LLM once per processed
meeting, so run the dry-run first, consider `--limit`, and avoid treating it as
normal daily intake. New emails, interviews, calls, and meetings should go
through `add_interaction`.

The versioned Atlas Charts spec is in
`atlas/charts/customer_themes.v1.json`. See `docs/atlas-charts.md` for the
Customer Themes dashboard setup, including the optional
`pain_by_industry_tag` chart.

---

## Recommended workflow

```
1. Right after customer evidence -> add_interaction (meeting/email/interview/call)
2. On stage change           -> update_stage
3. One-deal status/risk      -> get_deal_review
4. Before pursuing/forecast  -> get_deal_gaps (what's still missing)
5. Optional strategy memo    -> analyze_deal (LLM-written BD strategy)
6. Weekly review             -> list_deals (find stuck deals)
7. Pipeline KPIs             -> get_metrics pipeline_health
8. Usage / cost check        -> get_usage
9. Monthly retro             -> get_insights compare_won_lost / stage_velocity
10. Reference similar cases  -> search_deals
11. Customer-concern analysis -> get_customer_themes
12. Dashboard                -> Atlas Charts Weekly Pipeline Review
```

---

## Architecture

Current source of truth:

- MCP server: `src/deal_intel/mcp_server.py`
- Current registered tool count: see `get_tool_catalog` or `config_doctor`
- Detailed contract: [`docs/baseline.md`](docs/baseline.md)
- Documentation map: [`docs/README.md`](docs/README.md)
- User memory samples: [`user_docs/README.md`](user_docs/README.md)

```
[Claude Desktop / Codex - natural-language input]
         | stdio JSON-RPC
         v
[deal-intel-mcp  FastMCP server]
         |
         |-- LLM Provider
         |     |-- ChatGPT OAuth (default, Plus/Pro subscription)
         |     |-- Anthropic API (optional)
         |     `-- OpenAI API (optional)
         |
         |-- Embedding Provider
         |     `-- sentence-transformers all-MiniLM-L6-v2
         |          -> runs locally / no API key / 384 dims
         |
         `-- Storage
               |-- local_sample  : bundled fixture + local personal deals
               `-- MongoDB Atlas : real deals collection and analytics snapshots

search_deals
  |-- M0 default : reads summary_embedding, computes cosine in Python
  `-- M10+ option : uses the Atlas Vector Search index
```

### Deal document schema (key fields)

```json
{
  "deal_id": "uuid",
  "company": "Hyundai Precision",
  "industry": "Manufacturing",
  "deal_size_amount": 200000000,
  "deal_size_currency": "KRW",
  "deal_stage": "proposal",
  "expected_close_date": "2026-09-30",
  "expected_close_date_source": "user_provided",
  "actual_close_date": null,
  "stage_history": [
    {"stage": "discovery",     "entered_at": "2026-05-01T..."},
    {"stage": "qualification", "entered_at": "2026-05-15T..."},
    {"stage": "proposal",      "entered_at": "2026-06-01T..."}
  ],
  "interactions": [
    {
      "interaction_id": "uuid",
      "meeting_id": "uuid",
      "date": "2026-06-08",
      "interaction_type": "meeting",
      "direction": "inbound",
      "source_confidence": "customer_stated",
      "raw_content": "Met Director Kim. Defect rate 3.2% -> target 1.5%...",
      "summary": "2-3 sentence LLM-generated summary",
      "meddpicc": {
        "metrics":      {"score": 4, "evidence": "~1.5B KRW/yr loss"},
        "identify_pain": {"score": 5, "evidence": "defect rate 3.2%, line urgent"},
        "champion":     {"score": 3, "evidence": "Manager Park in favor"}
      }
    }
  ],
  "meddpicc_latest": {
    "health_pct": 72.4,
    "gaps": ["economic_buyer", "decision_criteria"],
    "metrics":       {"score": 4.0, "trend": "up"},
    "identify_pain": {"score": 5.0, "trend": "flat"},
    "champion":      {"score": 3.0, "trend": "up"}
  },
  "summary_embedding": [0.012, -0.034, ...],
  "created_at": "2026-05-01T...",
  "updated_at": "2026-06-08T..."
}
```

### Module structure

```
src/deal_intel/
  mcp_server.py         FastMCP entry point
  cli.py                typer CLI (login-chatgpt, backfill-customer-themes,
                        render-atlas-dashboard, crosscheck-weekly-dashboard,
                        smoke-deal-review, smoke-deal-review-audit,
                        smoke-natural-questions)
  _env.py               dotenv + 3-tier config merge
  _context.py           LLM / MongoDB / Embedding process singletons
  providers/
    llm.py              LLMProvider ABC + Anthropic + ChatGPTOAuth + factory
    embedding.py        EmbeddingProvider + SentenceTransformerProvider + factory
  schema/
    meddpicc.py         compute_meddpicc_latest, Deal/Meeting Pydantic models
    customer_themes.py  customer-theme taxonomy, parser, stage-signal validation
  storage/
    mongodb.py          MongoDBClient - CRUD + aggregation + semantic-search storage
  tools/
    create_deal.py
    add_interaction.py  canonical interaction intake + qualification extraction
    add_meeting.py      deprecated compatibility alias for meeting interactions
    get_deal.py
    update_stage.py     stage_history logging + qualification recompute
    update_deal.py      edit deal value and limited metadata after user confirmation
    list_deals.py       health_pct / gaps / stuck-flag aggregation
    get_metrics.py      pipeline_health KPIs / stage aggregation / warnings
    get_deal_gaps.py    read-only prioritized sales follow-up gaps
    export_report.py    human-facing pipeline report export
    export_data.py      spreadsheet-ready CSV data export
    get_user_memory.py  constrained user-memory read context
    record_user_memory.py
                        constrained user-memory append tool
    get_insights.py     7 BI queries plus legacy insight query
    get_customer_themes.py
                        aggregates customer concerns by unique deal count
    analyze_deal.py     qualification gap analysis + BD strategy via LLM
    search_deals.py     Python cosine by default / Atlas semantic search optional
```

### How qualification health_pct is computed

MEDDPICC is the default active framework, so the example below uses the
bundled MEDDPICC dimensions and weights. Custom frameworks use the same
weighted formula with their own enabled dimensions.

```
health_pct = sum(dim_avg x weight) / sum(5 x weight) x 100
```

Weights (tunable in `config/defaults.yaml`):

| Dimension | Weight | Why |
|---|---|---|
| champion | 2.0 | No internal momentum -> no deal |
| identify_pain / economic_buyer | 1.5 | Confirming the pain and reaching the budget holder are core |
| metrics / decision_criteria / decision_process | 1.0 | Standard |
| competition | 0.5 | Competition surfacing late is normal |

**Stage-aware gap adjustment** (applied automatically on `update_stage`):
- A drop in Identify Pain at the `proposal` / `negotiation` stage -> not a gap (signals the pain is being resolved)
- `won` deals -> no gaps

**Health-band configuration**:

Defaults are Healthy >=70, Watch >=40, At Risk <40. These classify the level of qualification validation, not win probability, and can be changed in `~/.deal-intel/config.yaml` once you've accumulated operational data.

```yaml
metrics:
  health_bands:
    healthy_min: 75
    watch_min: 45
  overdue:
    grace_days: 0
  win_rate:
    minimum_closed_sample: 10
```

The formal definitions of Active/Open/Stalled and unassessed handling are in [`docs/metrics.md`](docs/metrics.md).

---

## FAQ

**Q. Can I fork or customize this?**
Yes. This project is MIT-licensed. You can fork it, customize the storage,
model, scoring, reporting, and workflow behavior, and adapt it to your own deal
process. Keep the license and attribution notices when redistributing modified
versions.

If you are an AI assistant helping customize this repo, prefer small explicit
changes, avoid storing secrets in docs, and record meaningful local
modifications in docs or user memory so future agents can understand the fork.

**Q. Do my meeting notes have to be perfect?**
No. A rough memo of the essentials is fine. The LLM just skips dimensions with no evidence.

**Q. Do Korean meeting notes work?**
Yes. Mixed English/Korean works too.

**Q. Do I need MongoDB at all?**
Not for `sample`. You need MongoDB only when you want persistent real deal data
or Atlas Charts against your own database.

**Q. Do I need a paid MongoDB Atlas plan?**
No. The default `full` profile works on the free M0 plan today. `sample` does
not need MongoDB at all, and `pro` is the paid-infra path. `search_deals`
computes with Python cosine on M0. As deal volume grows you can switch to Atlas
Vector Search on M10+.

**Q. What does Pro add?**
`pro` keeps the same MCP tools but switches semantic search to Atlas Vector
Search and API-key LLM operation. The vector index spec is versioned at
`atlas/vector_indexes/deal_summary_vector.v1.json`. If Atlas search fails, the
server returns a structured error instead of silently falling back; use
`docs/pro-fallback-errors.md` to record repeatable setup failures.

**Q. search_deals returns nothing.**
Right after the server first starts, the local model may still be warming up - retry after 5 seconds. You also need at least one deal with a stored `summary_embedding` (run `add_interaction` with scoring-eligible content).

**Q. Should I use ChatGPT OAuth, Anthropic, or OpenAI API?**
For quick personal use, ChatGPT OAuth is attractive if you already subscribe to
ChatGPT Plus/Pro. Anthropic and OpenAI API modes are better when you want
explicit API-key operation, team billing, or production-style deployment.
