Metadata-Version: 2.4
Name: chktm
Version: 0.1.1
Summary: Screen proposed marks against the full US trademark corpus
Project-URL: Homepage, https://github.com/nickschuetz/chktm
Project-URL: Documentation, https://github.com/nickschuetz/chktm/blob/main/docs/usage-guide.md
Project-URL: Repository, https://github.com/nickschuetz/chktm
Project-URL: Bug Tracker, https://github.com/nickschuetz/chktm/issues
Project-URL: Changelog, https://github.com/nickschuetz/chktm/blob/main/CHANGELOG.md
Project-URL: MCP Testing, https://github.com/nickschuetz/chktm/blob/main/docs/testing-mcp.md
Project-URL: Deployment Guide, https://github.com/nickschuetz/chktm/blob/main/docs/deployment.md
Author: Nick Schuetz
License-Expression: Apache-2.0
License-File: LICENSE
Requires-Python: >=3.11
Requires-Dist: defusedxml>=0.7
Requires-Dist: fastapi>=0.115
Requires-Dist: fpdf2>=2.8
Requires-Dist: jellyfish>=1.0
Requires-Dist: mcp>=1.0
Requires-Dist: rich>=13
Requires-Dist: typer>=0.12
Requires-Dist: uvicorn>=0.30
Description-Content-Type: text/markdown

# chktm

Screen proposed marks against the full US trademark corpus.

**chktm is a research aid, not legal clearance.** Trademark conflicts require a
likelihood-of-confusion analysis that only a qualified attorney can perform.
Do not rely on this tool for any legal decision. Always consult an attorney
before adopting a mark.

## What it does

Given one or more search terms, chktm screens them against the full US
trademark corpus (USPTO bulk data) and produces a report ranking matches by
risk:

- **HIGH** — Live mark, name matches, and shares at least one international class
- **MEDIUM** — Live mark, name matches, but no class overlap
- **LOW** — Dead/abandoned mark (shown only with `--include-dead`)
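
The tiers above can be read as a small decision function. The following is a minimal sketch, not chktm's actual code: the `Risk` enum, the `classify` function, and its parameter names are hypothetical, and it assumes the name match has already been established before the tier is assigned.

```python
from enum import Enum


class Risk(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


def classify(is_live: bool, mark_classes: set[int], query_classes: set[int]) -> Risk:
    """Assign a tier to a mark whose name already matched a search term."""
    if not is_live:
        # Dead/abandoned marks are LOW regardless of class overlap.
        return Risk.LOW
    if mark_classes & query_classes:
        # Live mark sharing at least one international class.
        return Risk.HIGH
    # Live mark with no class overlap.
    return Risk.MEDIUM


if __name__ == "__main__":
    print(classify(True, {9, 25}, {9, 41, 42}).value)
    print(classify(True, {25}, {9, 41, 42}).value)
    print(classify(False, {9}, {9, 41, 42}).value)
```

A single shared class is enough for HIGH in this sketch, matching the "at least one international class" wording.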

chktm runs as a CLI, a web application, and an MCP server for AI agent access.
It works on **Linux**, **macOS**, and **Windows**.

## Quickstart

### Prerequisites

- Python 3.11+
- A free [USPTO ODP API key](https://data.uspto.gov/apis/getting-started)

### Install

**All platforms:**

```bash
pip install chktm
```

Or clone and install in editable mode:

```bash
git clone https://github.com/nickschuetz/chktm.git
cd chktm
pip install -e .
```

### Set your API key

**Linux / macOS (bash/zsh):**

```bash
export CHKTM_USPTO_API_KEY="your-key-here"
```

To persist across sessions, add it to `~/.bashrc`, `~/.zshrc`, or `~/.profile`.

**Windows (PowerShell):**

```powershell
$env:CHKTM_USPTO_API_KEY = "your-key-here"
```

To persist, set it permanently:

```powershell
[System.Environment]::SetEnvironmentVariable("CHKTM_USPTO_API_KEY", "your-key-here", "User")
```

**Windows (Command Prompt):**

```cmd
set CHKTM_USPTO_API_KEY=your-key-here
```

To persist, use `setx`:

```cmd
setx CHKTM_USPTO_API_KEY "your-key-here"
```

### Initialize the corpus (one-time)

```bash
chktm init
```

This downloads the full USPTO trademark dataset (~22 GB compressed) and ingests
it into a local SQLite database. Expect 2–4 hours depending on your connection
and USPTO rate limits. The download is resumable — if interrupted, re-run
`chktm init` and it will pick up where it left off.

By default, data is stored in a `data/` subdirectory in the current working
directory. To use a different location:

**Linux / macOS:**

```bash
chktm init --data-dir ~/chktm-data
```

**Windows:**

```powershell
chktm init --data-dir $HOME\chktm-data
```

The data directory is **saved automatically** to your config file during init,
so all subsequent commands (`update`, `search`, `serve`, `status`) find it
without needing `--data-dir` again.

Config file location:
- **Linux / macOS:** `~/.config/chktm/config.toml`
- **Windows:** `%APPDATA%\chktm\config.toml`

Resolution order (first wins): `--data-dir` flag, then the `CHKTM_DATA_DIR`
environment variable, then the config file, then the `./data` default.
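
The resolution order can be sketched as a first-match lookup. This is an illustration under stated assumptions rather than chktm's implementation: `resolve_data_dir` and its parameters are hypothetical names, and the config value is assumed to have been read from `config.toml` already.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_data_dir(
    flag: Optional[str],
    env: Mapping[str, str],
    config_value: Optional[str],
) -> Path:
    """Return the first data directory that is set, in documented priority order."""
    candidates = (
        flag,                       # 1. --data-dir flag
        env.get("CHKTM_DATA_DIR"),  # 2. environment variable
        config_value,               # 3. value saved in the config file
    )
    for candidate in candidates:
        if candidate:
            return Path(candidate).expanduser()
    return Path("data")             # 4. ./data default


if __name__ == "__main__":
    import os

    print(resolve_data_dir(None, os.environ, None))
```

Passing the environment in as a mapping keeps the function easy to test without touching `os.environ`.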

### Search

These commands work identically on all platforms:

```bash
# Basic search (default classes: 9, 41, 42)
chktm search thundercorp

# Multiple terms with custom classes
chktm search thundercorp "thunder corp" thunder-corp --classes 9,41,42

# Include dead/abandoned marks
chktm search thundercorp --include-dead

# JSON output for scripting
chktm search thundercorp --json

# Write report to file
chktm search thundercorp --out report.md
```

### Keep the corpus up to date

```bash
chktm update
```

Downloads and ingests daily files published since the last update.

### Check corpus status

```bash
chktm status
```

## Web UI and API

Start the web server:

```bash
chktm serve
```

This starts a lightweight web interface at `http://localhost:8000/` with:

- **`GET /`** — Search form
- **`GET /api/search?q=<terms>&classes=9,41,42`** — JSON search API
- **`GET /api/status`** — Corpus status
- **`GET /docs`** — Interactive API reference (Swagger UI)
- **`GET /redoc`** — API reference (ReDoc)

Works on all platforms. On Windows, ensure your firewall allows connections on
port 8000 if you want to access it from other machines.
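
For scripting against the JSON search endpoint, a minimal Python sketch using only the standard library is shown here. The helper names `search_url` and `fetch_results` are hypothetical, the sketch passes a single term in `q` (how multiple terms are encoded is not covered here), and the response is returned as parsed JSON without assuming any particular schema.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(term: str, classes=(9, 41, 42), base: str = "http://localhost:8000") -> str:
    """Build a GET /api/search URL for one term and a list of classes."""
    params = {"q": term, "classes": ",".join(str(c) for c in classes)}
    # safe="," keeps the class list readable instead of percent-encoding commas.
    return f"{base}/api/search?{urlencode(params, safe=',')}"


def fetch_results(term: str, classes=(9, 41, 42)):
    """Call the endpoint (requires a running `chktm serve`) and parse the JSON body."""
    with urlopen(search_url(term, classes)) as response:
        return json.load(response)


if __name__ == "__main__":
    print(search_url("thundercorp"))
```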

## MCP Server (AI Agent Access)

chktm exposes an [MCP](https://modelcontextprotocol.io/) server so AI agents
can search trademarks programmatically.

### Remote (via web server)

Start `chktm serve`, then configure your MCP client:

```json
{
  "mcpServers": {
    "chktm": {
      "type": "url",
      "url": "http://localhost:8000/mcp/"
    }
  }
}
```

### Local (stdio)

**Linux / macOS:**

```json
{
  "mcpServers": {
    "chktm": {
      "command": "python3",
      "args": ["-m", "chktm.mcp_server"],
      "env": {
        "CHKTM_DATA_DIR": "/home/youruser/chktm-data"
      }
    }
  }
}
```

**Windows:**

```json
{
  "mcpServers": {
    "chktm": {
      "command": "python",
      "args": ["-m", "chktm.mcp_server"],
      "env": {
        "CHKTM_DATA_DIR": "C:\\Users\\youruser\\chktm-data"
      }
    }
  }
}
```

### Available tools

| Tool | Description |
|------|-------------|
| `search_trademarks` | Search for potential conflicts. Args: `terms`, `classes`, `include_dead` |
| `generate_legal_report` | Attorney-ready report with component analysis. Args: `terms`, `classes` |
| `corpus_status` | Check database status, record counts, last update date |

### Testing with MCP Inspector

See [docs/testing-mcp.md](docs/testing-mcp.md) for step-by-step instructions
on testing the MCP server with
[MCP Inspector](https://github.com/modelcontextprotocol/inspector) — covers
Streamable HTTP, SSE, stdio, and headless CLI modes on all platforms.

## Usage Guide and Search Best Practices

See [docs/usage-guide.md](docs/usage-guide.md) for detailed guidance on:

- Constructing effective searches (multiple variants, component words)
- Understanding and interpreting risk tiers
- Choosing the right international classes for your industry
- Common search patterns (LLC formation, product naming, brand refresh)
- Agent efficiency: minimizing tokens and round-trips via MCP
- What chktm does NOT catch (advanced phonetic variants, visual similarity, common-law marks, foreign marks)

## OpenShift Deployment

See [docs/deployment.md](docs/deployment.md) for full instructions on deploying
to OpenShift with quay.io, including PVC setup, init job, and CronJob for
automatic daily updates.

## CLI Reference

```
chktm init [--data-dir ./data]                    # one-time corpus download + ingest
              [--off-peak]                        # 3x faster during 10pm-5am EST
              [--stream-ingest]                   # stream XML from ZIP (no disk write)
              [--keep-zips]                       # retain ZIPs after ingestion
chktm update [--data-dir ./data]                  # pull new daily files
              [--off-peak] [--stream-ingest]
chktm search <term> [<term> ...]                  # screen terms
              [--classes 9,41,42]
              [--out report.md]
              [--data-dir ./data]
              [--include-dead]
              [--json]
              [--report standard|legal]           # attorney-ready report (PDF if --out *.pdf)
chktm status [--data-dir ./data] [--json]         # corpus stats
chktm serve [--host 0.0.0.0] [--port 8000]        # web UI + MCP server
chktm version [--json]
```

The `search`, `status`, and `version` commands accept `--json` for
machine-readable output. Exit codes: `0` = success, `1` = error,
`2` = database not initialized.
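
A wrapper script can branch on these exit codes. The sketch below is one way to do that from Python; `describe_exit` and `run_status` are hypothetical helper names, and the `run` parameter exists only so the subprocess call can be swapped out in tests.

```python
import subprocess

# Meanings taken from the exit-code list above.
EXIT_MEANINGS = {
    0: "success",
    1: "error",
    2: "database not initialized",
}


def describe_exit(code: int) -> str:
    """Translate a chktm exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_status(run=subprocess.run) -> str:
    """Run `chktm status --json` and describe how it exited."""
    result = run(["chktm", "status", "--json"], capture_output=True, text=True)
    return describe_exit(result.returncode)


if __name__ == "__main__":
    print(run_status())
```

An exit code of `2` is a natural cue to run `chktm init` before retrying.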

## Known Limitations

1. **Wordmark only.** Design marks, stylized marks, and image components are
   not searched.
2. **Substring + phonetic matching.** Searches match on text (substring) and
   sound (Soundex). "thundrcorp" will match "THUNDERCORP" phonetically. However,
   individual components ("thunder", "corp") are only searched automatically in
   `--report legal` mode. Advanced phonetic variants may still be missed.
3. **US only.** No EU, UK, WIPO, or other jurisdictions.
4. **Bulk data lag.** USPTO bulk data trails the live tmsearch UI by a few days.
   For very recent filings, also check
   [tmsearch.uspto.gov](https://tmsearch.uspto.gov) manually.
5. **Not legal advice.** Consult an attorney for any real decision.
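
To illustrate the phonetic matching in item 2, here is a self-contained sketch of the standard American Soundex algorithm. It is written from the published algorithm, not taken from chktm, so the function name `soundex` and any differences from the codes chktm actually computes are assumptions.

```python
_GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in _GROUPS.items() for letter in letters}


def soundex(word: str) -> str:
    """Return the 4-character American Soundex code, or "" if there are no letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate consonants with the same code
        code = CODES.get(c, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # vowels reset prev, so a repeated code after a vowel counts
    return (first + "".join(digits) + "000")[:4]


if __name__ == "__main__":
    print(soundex("thundrcorp"), soundex("THUNDERCORP"))
```

Both spellings reduce to the same code because Soundex keeps only the first letter and the first three consonant codes. For the same reason, names that differ only later in the word can share a code too.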

## Data Source

USPTO **Trademark Full Text XML Data** from the
[Open Data Portal](https://data.uspto.gov). The full corpus contains ~12.7
million trademark records dating back to 1884.

## License

[Apache-2.0](LICENSE)

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md). Contributions use the
[Developer Certificate of Origin](https://developercertificate.org/) (DCO)
sign-off.
