Metadata-Version: 2.4
Name: brasil-mcp-leads
Version: 0.1.0
Summary: MCP server for Brazilian B2B lead generation with WhatsApp and AI summaries.
Project-URL: Homepage, https://github.com/brasil-mcp/leads
Project-URL: Repository, https://github.com/brasil-mcp/leads
Project-URL: Issues, https://github.com/brasil-mcp/leads/issues
Author: Brasil MCP
License: AGPL-3.0-or-later
License-File: LICENSE
Keywords: ai,b2b,brasil,cnpj,leads,mcp,whatsapp
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
Classifier: Natural Language :: Portuguese (Brazilian)
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.11
Requires-Dist: anthropic>=0.40
Requires-Dist: asyncpg>=0.29
Requires-Dist: beautifulsoup4>=4.12
Requires-Dist: brasil-mcp-essentials>=0.3.0
Requires-Dist: brasil-mcp-match>=0.2.0
Requires-Dist: fastapi>=0.115
Requires-Dist: httpx>=0.27
Requires-Dist: lxml>=5.0
Requires-Dist: mcp>=1.0.0
Requires-Dist: pydantic>=2.9
Requires-Dist: python-multipart>=0.0.12
Requires-Dist: slowapi>=0.1.9
Requires-Dist: trafilatura>=1.12
Requires-Dist: uvicorn[standard]>=0.30
Description-Content-Type: text/markdown

# brasil-mcp-leads

[![PyPI version](https://img.shields.io/pypi/v/brasil-mcp-leads.svg)](https://pypi.org/project/brasil-mcp-leads/)
[![Python](https://img.shields.io/pypi/pyversions/brasil-mcp-leads.svg)](https://pypi.org/project/brasil-mcp-leads/)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
[![CI](https://github.com/brasil-mcp/leads/actions/workflows/ci.yml/badge.svg)](https://github.com/brasil-mcp/leads/actions/workflows/ci.yml)

> **B2B lead generation for Brazil, with credit-priced enrichment.** RFB-grade
> base data, opt-in WhatsApp/email scraping, and AI-generated lead summaries —
> exposed as MCP tools and a REST API.

**brasil-mcp-leads** is Phase 3 of the [Brasil MCP](https://github.com/brasil-mcp)
family. It sits on top of [brasil-mcp-match](https://github.com/brasil-mcp/match)
(which owns the Receita Federal ingest) and adds three things `match` deliberately
doesn't do:

1. **Reveal RFB data** for outbound work (legal name, i.e. razão social;
   address; CNAE) once the caller is willing to pay for it and the company
   has not opted out.
2. **Enrich** with on-the-spot site scraping for WhatsApp links, emails,
   phones, and AI-chat-channel detection.
3. **Summarise** the lead with a structured JSON brief (industry, target,
   pitch angles) generated either through MCP Sampling (the caller's LLM) or
   via our Anthropic fallback.

Pricing is per credit: 1 credit = 1 lead with cheap delivery, 2 credits =
lead + our-LLM summary. Plans range from a free 50-credit lifetime quota to a
1000-credit/month Pro plan. Overage is capped at 2× the monthly fee and
billed via Asaas.
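
The per-lead credit rule can be sketched as below. This is an illustration only, not the package's `compute_charge`: the function names and the `summary_source` labels (`"none"`, `"sampling"`, `"fallback"`) are assumptions made for the example, and it assumes that "cheap delivery" covers both no summary and a summary produced by the caller's own LLM through MCP Sampling.

```python
from typing import Literal

# Hypothetical labels for where a summary came from (not confirmed by the source).
SummarySource = Literal["none", "sampling", "fallback"]


def credits_for_lead(summary_source: SummarySource) -> int:
    """Credits charged for one lead under the rule described above."""
    # Assumption: only a summary generated by the server-side LLM fallback
    # costs the second credit; everything else is the 1-credit delivery.
    return 2 if summary_source == "fallback" else 1


def credits_for_batch(sources: list[SummarySource]) -> int:
    """Total credits for a batch, charged per lead."""
    return sum(credits_for_lead(source) for source in sources)
```

Under these assumptions, a batch of three leads where only one used the fallback summary would cost 4 credits.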

---

## Install

```bash
pip install brasil-mcp-leads
# or
uv add brasil-mcp-leads
```

This package depends on `brasil-mcp-match>=0.2.0` and on a Postgres 16+
database shared with the Match ingest. If the Match base is not loaded, every
`lookup_lead` returns `CNPJ_NOT_COVERED`.

**License note.** `brasil-mcp-leads` is **AGPL-3.0-or-later**. Calling the
hosted API requires no license. Self-hosting a derived service in production
without releasing source requires a separate commercial license (planned for
v0.2+).

---

## Quick start

### 1. Start the dependencies

`brasil-mcp-leads` shares the Postgres database with `brasil-mcp-match`. Run
the Match ingest first (see its README), then apply the leads schema:

```bash
git clone https://github.com/brasil-mcp/leads.git
cd leads
docker compose up -d postgres
uv run psql "$BRASIL_MCP_LEADS_DATABASE_URL" \
    -f src/brasil_mcp_leads/db/migrations/001_initial.sql
```

### 2. Start the API

```bash
# REST (FastAPI + slowapi rate limiter)
uv run brasil-mcp-leads serve 8000

# Or run the MCP SSE server (FastMCP)
uv run brasil-mcp-leads-server
```

### 3. Make a call

```bash
curl -X POST http://localhost:8000/v1/leads/lookup \
  -H "Authorization: Bearer brasilmcp_leads_yourkey" \
  -H "Content-Type: application/json" \
  -d '{
    "cnpj": "33000167000101",
    "deliver": {
      "base": true,
      "scrape": true,
      "summary_max_tier": 2,
      "summary_available_tier": 2
    }
  }'
```

Sample MCP tool invocation (JSON payload sent by the client):

```json
{
  "name": "lookup_lead",
  "arguments": {
    "cnpj": "33000167000101",
    "deliver": { "base": true, "scrape": true, "summary_max_tier": 1 }
  }
}
```

The response is a [`LeadResponse`](src/brasil_mcp_leads/core/models.py) with
`charged_credits`, `delivered_skus`, `summary_source`, and the resolved data.
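
The same REST call can be made from Python. The sketch below uses only the standard library so it runs without extra dependencies; the helper names `build_lookup_request` and `lookup_lead` are hypothetical, and the URL, header, and payload shape are copied from the cURL example above rather than from the package's client code.

```python
import json
import urllib.request


def build_lookup_request(base_url: str, api_key: str, cnpj: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    payload = {
        "cnpj": cnpj,
        "deliver": {"base": True, "scrape": True, "summary_max_tier": 1},
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/leads/lookup",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def lookup_lead(base_url: str, api_key: str, cnpj: str) -> dict:
    """Send the request and decode the JSON body of the response."""
    request = build_lookup_request(base_url, api_key, cnpj)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx (e.g. the 402 for blocked plans).
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With a local server running, `lookup_lead("http://localhost:8000", "brasilmcp_leads_yourkey", "33000167000101")` should return a dictionary containing the `charged_credits`, `delivered_skus`, and `summary_source` fields.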

---

## v0.1.0 tools

| Tool | Description | Credits |
|---|---|---|
| `lookup_lead` | One CNPJ → base + optional scrape + optional summary. | 1 or 2 (see pricing). |
| `bulk_lookup_leads` | Up to 1000 CNPJs in parallel; partial failures contained per-item. | N × per-lead. |
| `sample_leads` | Filter-based discovery (CNAE / UF / porte / etc.) excluding already-sold. | N × per-lead. |
| `refresh_lead` | Force re-scrape + summary regeneration; charged at standard rate. | 1 or 2. |
| `extract_from_site` | Debug: contact extraction from a URL without DB write or charge. | 0 (10/min cap). |
| `health` | Server version + DB ping. | 0. |

Full input/output schemas, error codes, and per-tool cURL examples are in
[`docs/tools.md`](docs/tools.md).
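
To illustrate what "partial failures contained per-item" can look like, here is a minimal sketch of a bounded-concurrency fan-out with `asyncio`. It is not the implementation behind `bulk_lookup_leads`: the function name `bulk_lookup`, the `lookup_one` callback, the result dictionary shape, and the concurrency limit are all assumptions for the example.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def bulk_lookup(
    cnpjs: list[str],
    lookup_one: Callable[[str], Awaitable[Any]],  # hypothetical single-lead coroutine
    max_concurrency: int = 10,                     # assumed limit, not from the source
) -> list[dict[str, Any]]:
    """Look up many CNPJs concurrently; one failure never aborts the batch."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(cnpj: str) -> dict[str, Any]:
        async with semaphore:
            try:
                return {"cnpj": cnpj, "ok": True, "data": await lookup_one(cnpj)}
            except Exception as exc:
                # Convert the failure into a per-item result instead of propagating it.
                return {"cnpj": cnpj, "ok": False, "error": str(exc)}

    # gather() returns results in the same order as the input list.
    return await asyncio.gather(*(guarded(cnpj) for cnpj in cnpjs))
```

Catching the exception inside each task, rather than around the whole `gather`, is what keeps a single bad CNPJ from discarding the results of the others.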

---

## Pricing

| Plan | Monthly credits | Monthly BRL | Overage / credit | Overage cap | Max summary tier |
|---|---|---|---|---|---|
| `free` | 50 (lifetime) | R$ 0.00 | — | — | 1 |
| `hobby` | 100 | R$ 50.00 | R$ 0.50 | R$ 100.00 | 1 |
| `starter` | 500 | R$ 125.00 | R$ 0.30 | R$ 250.00 | 2 |
| `pro` | 1000 | R$ 187.50 | R$ 0.20 | R$ 375.00 | 3 |

- Unused credits **expire at cycle end** (standard SaaS practice).
- When `overage_brl_accumulated >= overage_cap_brl`, the caller's tokenised
  card is charged immediately via Asaas and the accumulator resets.
- A failed Asaas charge moves the subscription to `blocked` and every
  subsequent call returns HTTP 402.
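
The overage rules above can be sketched as a small accounting routine. This is a simplified model, not the package's `usage_tracker`: the `Subscription` dataclass and `apply_usage` function are hypothetical, the free plan (which has no overage) is not modelled, and the Asaas charge itself is represented only by the returned amount.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Subscription:
    """Hypothetical in-memory view of a paid plan's usage state."""
    credits_remaining: int
    overage_per_credit_brl: Decimal
    overage_cap_brl: Decimal
    overage_brl_accumulated: Decimal = Decimal("0")


def apply_usage(sub: Subscription, credits: int) -> Decimal:
    """Consume credits and return the BRL amount to charge now (0 if below the cap)."""
    # Spend the monthly quota first; anything beyond it is overage.
    from_quota = min(credits, sub.credits_remaining)
    sub.credits_remaining -= from_quota
    overage_credits = credits - from_quota
    sub.overage_brl_accumulated += overage_credits * sub.overage_per_credit_brl

    # Once the accumulator reaches the cap, bill it and reset.
    if sub.overage_brl_accumulated >= sub.overage_cap_brl:
        due = sub.overage_brl_accumulated
        sub.overage_brl_accumulated = Decimal("0")
        return due
    return Decimal("0")
```

Using the `hobby` row as an example (100 credits, R$ 0.50 per overage credit, R$ 100.00 cap), the cap is reached after 200 overage credits, at which point the sketch returns the accumulated amount and resets the accumulator.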

Summary tiers map to Anthropic models when the caller doesn't supply
sampling: 1 → Haiku, 2 → Sonnet, 3 → Opus.
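
A minimal sketch of how a requested tier could be resolved to a model family on the fallback path. The plan limits come from the table above; clamping the request to the plan's maximum (instead of rejecting it) is an assumption, the function name is hypothetical, and the values are family labels rather than concrete Anthropic model IDs.

```python
# "Max summary tier" column from the pricing table.
PLAN_MAX_TIER = {"free": 1, "hobby": 1, "starter": 2, "pro": 3}

# Tier-to-family mapping stated above; labels only, not real model identifiers.
TIER_TO_MODEL_FAMILY = {1: "haiku", 2: "sonnet", 3: "opus"}


def fallback_model_family(plan: str, summary_max_tier: int) -> str:
    """Pick the model family for the fallback summary path."""
    if summary_max_tier < 1:
        raise ValueError("summary_max_tier must be at least 1")
    # Assumption: a request above the plan's limit is clamped down, not refused.
    tier = min(summary_max_tier, PLAN_MAX_TIER[plan])
    return TIER_TO_MODEL_FAMILY[tier]
```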

---

## Environment variables

| Variable | Purpose |
|---|---|
| `BRASIL_MCP_LEADS_DATABASE_URL` | asyncpg DSN for the Match+Leads Postgres. |
| `LEADS_ANTHROPIC_API_KEY` | Anthropic key for the fallback summary path. |
| `LEADS_API_KEYS` | `key1:client_id1,key2:client_id2` mapping for REST auth. |
| `LEADS_CORS_ORIGINS` | Comma-separated CORS allowlist (REST only). |
| `ASAAS_API_KEY` | Asaas API key for subscription + payments. |
| `ASAAS_ENV` | `production` or `sandbox` (default: sandbox). |
| `ASAAS_WEBHOOK_SECRET` | Shared HMAC secret for verifying webhook deliveries. |
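
As an illustration of the `LEADS_API_KEYS` format, the sketch below parses a `key1:client_id1,key2:client_id2` string into a lookup table. The function name `parse_api_keys` and its error handling are assumptions; the package may parse this variable differently.

```python
def parse_api_keys(raw: str) -> dict[str, str]:
    """Parse 'key1:client_id1,key2:client_id2' into {key: client_id}."""
    mapping: dict[str, str] = {}
    for position, entry in enumerate(raw.split(","), start=1):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty input and trailing commas
        key, separator, client_id = entry.partition(":")
        if not separator or not key or not client_id:
            # Report the position only, so secrets never end up in logs.
            raise ValueError(f"malformed LEADS_API_KEYS entry at position {position}")
        mapping[key] = client_id
    return mapping
```

A typical call would be `parse_api_keys(os.environ.get("LEADS_API_KEYS", ""))` at startup, after importing `os`.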

---

## MCP client setup (Claude Desktop)

```json
{
  "mcpServers": {
    "brasil-mcp-leads": {
      "url": "https://leads.brasil-mcp.com/sse",
      "transport": "sse",
      "headers": {
        "Authorization": "Bearer brasilmcp_leads_yourkey"
      }
    }
  }
}
```

For local stdio runs (development), use the REST API with the same key.

---

## Privacy & LGPD

The Match repo carries the legal stance for the family (LIA + DPA templates
in `brasil-mcp-match/docs/lgpd/`). `leads` adds three guarantees on top:

1. **Opt-out propagation.** Any CNPJ that has invoked Art. 18 LGPD via Match
   is automatically blocked here too (the `opt_out_request` table is shared).
2. **No sócio PII.** Responses are strictly the `LeadResponse` schema: no CPF,
   no partner (sócio) names, no personal (PF) addresses. This is asserted
   across the security test suite.
3. **Audit log is hash-only.** Every call writes a row in the shared
   `audit_log` with `sha256(input)`, never the input itself.
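
The hash-only audit idea can be sketched as follows. This is an assumption-laden illustration, not the package's audit code: the function name `audit_digest` is hypothetical, and serialising the input as canonical JSON before hashing is one possible choice that the source does not specify.

```python
import hashlib
import json


def audit_digest(tool_input: dict) -> str:
    """Return the SHA-256 hex digest of a tool input, for storage in place of the input."""
    # Canonical form (sorted keys, no extra whitespace) so that equal inputs
    # always hash to the same value regardless of key order.
    canonical = json.dumps(
        tool_input, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Only the returned 64-character digest would be written to `audit_log`; the CNPJ and delivery options themselves are never stored.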

Cold outreach in volume, cross-referencing with third-party PF bases without
consent, and bulk resale are explicitly forbidden by the Terms of Service.

---

## Architecture

```
src/brasil_mcp_leads/
  core/
    pricing.py            # plan tables + compute_charge
    usage_tracker.py      # charge_credits + threshold cap
    scraper.py            # httpx + trafilatura + BeautifulSoup
    extractors/           # whatsapp / emails / phones / ai_channels
    sampling.py           # MCP sampling path
    fallback_llm.py       # Anthropic fallback (Haiku/Sonnet/Opus)
    pipeline.py           # execute_lookup orchestration
    billing/              # Asaas client + webhooks + cron_overage
    repository/           # asyncpg repos (leads, summary, usage, search, sold_attrs)
  adapters/
    mcp/                  # FastMCP server + tool registration
    rest/                 # FastAPI app + dependencies + routes (lookup, billing, health)
```

Same pattern as `brasil-mcp-essentials` and `brasil-mcp-match`: pure-Python
core, thin adapters. asyncpg pool, FastAPI + FastMCP SSE.

---

## Roadmap

- **v0.1.0 (now)** — 6 MCP tools, REST API, credit-based billing on Asaas,
  AGPLv3, 378 tests across unit/adapter/security suites, 99% coverage.
- **v0.2.0** — Dashboard (balance + history), update webhooks, saved
  searches, commercial license terms.
- **v0.3.0** — Hunter/Apollo email enrichment, SMTP validation, lookalike
  search.

Spec §14 enumerates the explicit out-of-scope items for v0.1.0.

---

## Family

- **Phase 1** — [`brasil-mcp-essentials`](https://github.com/brasil-mcp/essentials).
  Offline validators (CNPJ, CPF, phone number, etc.). MIT.
- **Phase 2** — [`brasil-mcp-match`](https://github.com/brasil-mcp/match).
  Privacy-preserving CNPJ verification (match, don't reveal). AGPLv3.
- **Phase 3 — this repo.** B2B lead generation. AGPLv3 + planned commercial license.

---

## License

[AGPL-3.0-or-later](LICENSE). Commercial self-hosting without source-release
obligations requires a separate commercial license (planned for v0.2+).

---

## Contributing

Issues and PRs welcome. Before opening a PR, run:

```bash
uv run ruff check
uv run pyright src
uv run pytest --cov=brasil_mcp_leads --cov-branch --cov-fail-under=95 -q
```

CI runs the same gates across Python 3.11, 3.12, and 3.13.
