Metadata-Version: 2.4
Name: leadhunter
Version: 0.1.0
Summary: Minimal LinkedIn 1st/2nd-degree lead-gen CLI with triage TUI and browser dashboard
Author-email: Durgesh Chandrakar <dchandrakar0980@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/durgeshchandrakar/leadhunter
Project-URL: Repository, https://github.com/durgeshchandrakar/leadhunter
Project-URL: Issues, https://github.com/durgeshchandrakar/leadhunter/issues
Keywords: linkedin,scraper,leads,cli,lead-generation,triage,b2b
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: End Users/Desktop
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Office/Business
Classifier: Topic :: Utilities
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: typer>=0.12
Requires-Dist: rich>=13
Requires-Dist: textual>=0.60
Requires-Dist: linkedin-api>=2.0
Requires-Dist: pydantic>=2.5
Requires-Dist: tomli-w>=1.0
Requires-Dist: browser-cookie3>=0.20
Requires-Dist: bottle>=0.13
Dynamic: license-file

# leads

Minimal LinkedIn lead-gen CLI: find 1st- and 2nd-degree connections at one or more companies, triage them, export.

> **Use a burner LinkedIn account.** This violates LinkedIn's ToS. Don't run it on an account you care about.

## Install

```bash
cd leads
uv pip install -e .          # or: pip install -e .
```

Installs a `leads` CLI command. Requires Python 3.11+.

## The one-shot

Log into LinkedIn (burner!) in any supported browser, then:

```bash
leads find "demtech.ai"
```

That's it. On first run it autodetects your cookies. Then it resolves the company, scrapes 1st+2nd-degree connections, and prints a table.

Batch mode:

```bash
leads find "demtech.ai" "stripe" "linear" --yes
```

Multiple positional names = multiple scrapes in one invocation. `--yes` skips the disambiguation prompt.

Other useful flags:

| Flag             | Effect                                                  |
| ---------------- | ------------------------------------------------------- |
| `--triage`       | Open the interactive triage TUI after scraping          |
| `--out f.csv`    | Write to a file instead of printing                     |
| `--format jsonl` | Pair with `--out` for JSONL output                      |
| `--auto-split`   | Split by region when LinkedIn's ~1000-result cap is hit |
| `--quiet`        | Summary line only                                       |
| `--degree 2`     | Only 2nd-degree connections (default: both)             |

## The 8 verbs

```
leads auth import [--browser X]   # autodetect cookies (one-time, usually unneeded)
leads auth login                   # manual cookie paste fallback
leads company <name>               # interactively resolve a name → URN, cache it
leads scrape --company-urn ...     # power-user discovery (URN-driven, scriptable)
leads find <names...>              # one-shot: resolve → scrape → render
leads show [filters]               # universal viewer (terminal table / CSV / JSONL)
leads triage [filters]             # interactive TUI for flagging interesting leads
leads web [filters]                # browser dashboard (mirrors triage TUI)
leads status                       # accounts, jobs, companies, today's budget
```

### Filters (work everywhere)

```
-d, --distance 1 [-d 2]            # repeatable; 1st-degree, 2nd-degree
--interesting / --not-interesting  # the ⭐ flag
--location "bengaluru"             # substring match
--at "demtech"                     # filter to leads found at this company
--where "raw SQL"                  # power-user escape hatch
```
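
As a rough illustration of how structured flags like these could compile to a parameterized SQL `WHERE` clause, here is a minimal sketch. The `compile_filters` helper and the column names (`distance`, `interesting`, `location`, `company`) are assumptions made for the example, not the actual schema or the contents of `filters.py`. It also assumes `--at` is a case-insensitive substring match.

```python
from typing import Optional, Sequence


def compile_filters(
    distances: Sequence[int] = (),
    interesting: Optional[bool] = None,
    location: Optional[str] = None,
    at: Optional[str] = None,
) -> tuple[str, list]:
    """Build a parameterized WHERE clause from structured filter values.

    Column names are hypothetical; the real schema may differ.
    """
    clauses: list[str] = []
    params: list = []

    if distances:
        # One placeholder per repeated -d value.
        placeholders = ", ".join("?" for _ in distances)
        clauses.append(f"distance IN ({placeholders})")
        params.extend(distances)
    if interesting is not None:
        clauses.append("interesting = ?")
        params.append(1 if interesting else 0)
    if location:
        # Case-insensitive substring match.
        clauses.append("LOWER(location) LIKE ?")
        params.append(f"%{location.lower()}%")
    if at:
        # Assumed to be a substring match as well.
        clauses.append("LOWER(company) LIKE ?")
        params.append(f"%{at.lower()}%")

    where = " AND ".join(clauses) if clauses else "1 = 1"
    return where, params


where, params = compile_filters(distances=[1, 2], interesting=True, location="Bengaluru")
print(where)   # distance IN (?, ?) AND interesting = ? AND LOWER(location) LIKE ?
print(params)  # [1, 2, 1, '%bengaluru%']
```

Keeping user values in `params` rather than interpolating them into the string is what separates the structured flags from the raw `--where` escape hatch.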

### Output (on `leads show`)

```
leads show                                       # Rich table in terminal (default)
leads show --format csv --out leads.csv          # write CSV (full columns)
leads show --format jsonl --out leads.jsonl      # write JSONL
leads show --format csv --columns triage --out triage.csv
                                                 # CSV with the triage subset
leads show --compact                             # drop the LinkedIn URL column
```
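
For readers curious how one set of rows can feed both file formats, here is a minimal sketch using only the standard library. The `render` helper and the column names are hypothetical and are not taken from `show.py`.

```python
import csv
import io
import json


def render(rows: list[dict], fmt: str, columns: list[str]) -> str:
    """Serialize rows as CSV or JSONL, keeping only the requested columns."""
    buf = io.StringIO()
    if fmt == "csv":
        writer = csv.DictWriter(
            buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
    elif fmt == "jsonl":
        for row in rows:
            # One JSON object per line.
            record = {col: row.get(col) for col in columns}
            buf.write(json.dumps(record, ensure_ascii=False) + "\n")
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return buf.getvalue()


rows = [{"name": "Ada", "headline": "Engineer, ML", "url": "https://example.com/ada"}]
print(render(rows, "csv", ["name", "headline"]), end="")
print(render(rows, "jsonl", ["name", "headline"]), end="")
```

Selecting columns at render time is one way a subset such as `--columns triage` could reuse the same code path as the full export.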

### Triage TUI

```
leads triage                                     # interactive, default
leads triage --at "demtech" --not-interesting    # pre-filter what shows up
leads triage --export triage.csv                 # spreadsheet round-trip
leads triage --import triage.csv                 # bring back the marked-up CSV
```

Keys: `j`/`k` or arrows to navigate · `space` toggle ⭐ · `n` edit note · `o` open LinkedIn URL · `/` filter · `esc` clear filter · `q` save+quit.

### Web dashboard

A minimal browser UI with the same actions as the triage TUI — toggle ⭐, edit notes inline, multi-select + bulk delete, filter, export CSV.

```bash
leads web                                 # binds 127.0.0.1:8765, opens browser
leads web --port 9000                     # different port
leads web --no-open                       # don't auto-open
leads web --at "demtech" --interesting    # initial filters (still editable in-page)
```

The server binds to `127.0.0.1` only and rejects any non-loopback `Host` header (DNS-rebinding defense). No authentication — anything that can reach loopback on your machine is treated as you.
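
The `Host` check can be pictured roughly as follows. This is a sketch under assumptions, not the code in `server.py`: the `is_loopback_host` name and the exact allow-list (`127.0.0.1`, `localhost`, `[::1]`) are illustrative.

```python
def is_loopback_host(host_header: str, port: int) -> bool:
    """Return True if the Host header names a loopback address on our port."""
    allowed = {"127.0.0.1", "localhost", "[::1]"}
    host = host_header.strip().lower()
    # Split off an optional ":port" suffix; bare IPv6 literals end with "]".
    name, sep, port_text = host.rpartition(":")
    if sep and port_text.isdigit() and not host.endswith("]"):
        return name in allowed and int(port_text) == port
    return host in allowed


print(is_loopback_host("127.0.0.1:8765", 8765))        # True
print(is_loopback_host("evil.example.com:8765", 8765))  # False
```

A DNS-rebinding page reaches the server over loopback but still sends its own domain in `Host`, which is why checking the header helps even though the socket is bound to `127.0.0.1`.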

Filters live in the URL query string, so refresh works and you can bookmark a scoped view. The `Export CSV` link always reflects the current filter, updating as you change it.

Shortcuts: `/` focus search · `Esc` clear search / clear selection · `Enter` save inline edit · `Cmd/Ctrl+A` select-all visible.

### Scrape (URN-based, scripty)

```
leads scrape --company-urn urn:li:fsd_company:98873360 --degree 1,2
leads scrape --company-urn A --company-urn B --auto-split --quiet
```

Re-running on the same company auto-resumes from the last `next_offset` if a previous job paused.

## Cookies

Auto-import works across `chrome`, `chromium`, `firefox`, `safari`, `edge`, `brave`, `opera`, `vivaldi`, `arc`, `librewolf`, `zen`.

Caveats:
- **Chrome on macOS** triggers a Keychain prompt — enter your Mac password.
- **Safari** needs Full Disk Access for your terminal (System Settings → Privacy & Security).
- **Browser may need to be closed** if its cookie DB is locked.

If auto-import fails, copy these two cookies from DevTools → Cookies → `https://www.linkedin.com` and paste them into `leads auth login`:

- `li_at` — long alphanumeric
- `JSESSIONID` — `"ajax:1234..."`, **keep the literal quotes**

```bash
leads auth login
```

Cookies are stored in `~/.leads/config.toml` (chmod 600). The database lives in `~/.leads/leads.db`.

> **Never paste cookies into chat or shared docs.** They're bearer tokens equivalent to your password.

## Pacing & budgets

The tool deliberately scrapes slowly to avoid getting your burner account flagged. Between every API call it waits 5–12 seconds (randomized), and every ~20 calls it takes a longer 45–90 second pause. So scraping 1000 leads takes ~30–60 minutes, not seconds. This is on purpose — fast scraping gets you banned.
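
The pacing rule described above can be sketched like this. The `Pacer` class is hypothetical, and two details are assumptions: the long pause replaces the short one on every 20th call, and "every ~20 calls" is treated as exactly 20. The real `linkedin.py` may jitter the interval or stack the two delays.

```python
import random
import time
from typing import Callable


class Pacer:
    """Sleep a short random interval per call, with a longer break every N calls."""

    def __init__(
        self,
        short: tuple[float, float] = (5.0, 12.0),
        long: tuple[float, float] = (45.0, 90.0),
        long_every: int = 20,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.short = short
        self.long = long
        self.long_every = long_every
        self.sleep = sleep
        self.calls = 0

    def wait(self) -> float:
        """Block before the next API call and return the delay that was used."""
        self.calls += 1
        bounds = self.long if self.calls % self.long_every == 0 else self.short
        delay = random.uniform(*bounds)
        self.sleep(delay)
        return delay


# Demo with a recording stand-in for time.sleep so it finishes instantly.
recorded: list[float] = []
pacer = Pacer(sleep=recorded.append)
for _ in range(40):
    pacer.wait()

print(f"simulated wait: {sum(recorded) / 60:.1f} minutes over {len(recorded)} calls")
```

Injecting `sleep` keeps the pacing logic testable without actually waiting.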

There are two budgets:

- **Per-invocation: 50 search calls.** One `leads find` run won't use more than 50 search API calls.
- **Per-day: 150 search calls.** Across all runs in a day. Tracked in the `api_calls` table.

When you run `leads find "a" "b" "c"`, those three companies **share** the same 50-call budget — it doesn't reset between them.
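
A minimal sketch of how the two caps could interact, using an in-memory SQLite database. The `Budget` class and the `api_calls` columns (`day`, `kind`) are assumptions for illustration; only the table name and the 50/150 limits come from the text above.

```python
import sqlite3
from datetime import date
from typing import Optional

PER_INVOCATION_CAP = 50
PER_DAY_CAP = 150


class Budget:
    """Track search calls against a per-run cap and a per-day cap."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.used_this_run = 0  # shared by every company in one invocation
        conn.execute(
            "CREATE TABLE IF NOT EXISTS api_calls (day TEXT NOT NULL, kind TEXT NOT NULL)"
        )

    def used_today(self, today: Optional[date] = None) -> int:
        day = (today or date.today()).isoformat()
        row = self.conn.execute(
            "SELECT COUNT(*) FROM api_calls WHERE day = ? AND kind = 'search'", (day,)
        ).fetchone()
        return row[0]

    def try_spend(self, today: Optional[date] = None) -> bool:
        """Record one search call if both caps allow it; otherwise return False."""
        if self.used_this_run >= PER_INVOCATION_CAP:
            return False
        if self.used_today(today) >= PER_DAY_CAP:
            return False
        day = (today or date.today()).isoformat()
        self.conn.execute("INSERT INTO api_calls (day, kind) VALUES (?, 'search')", (day,))
        self.used_this_run += 1
        return True


conn = sqlite3.connect(":memory:")
day = date(2025, 1, 1)  # fixed date keeps the demo deterministic
for run in range(1, 5):
    budget = Budget(conn)  # a new Budget stands in for a new invocation
    spent = sum(budget.try_spend(day) for _ in range(60))
    print(f"run {run}: {spent} calls allowed")
```

In this sketch the first three runs are each cut off at 50 calls, and the fourth gets none because the daily total has already reached 150.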

### What happens when the budget runs out?

The scrape **pauses** mid-company. It doesn't lose progress — everything fetched so far is already in the DB. The job row is marked `paused` and remembers the exact offset.

Just re-run the same command later (e.g. tomorrow when the daily budget resets):

```bash
leads find "demtech.ai"   # picks up from where it left off, no flag needed
```

You'll see a dim line like `Resuming job #7 for demtech.ai at offset 150`.
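
The pause-and-resume behavior might look roughly like this in SQLite terms. The `jobs` table layout and the helper names are hypothetical; only the `paused` status and the `next_offset` field are taken from this README.

```python
import sqlite3


def init(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY, company TEXT NOT NULL, "
        "status TEXT NOT NULL, next_offset INTEGER NOT NULL DEFAULT 0)"
    )


def start_or_resume(conn: sqlite3.Connection, company: str) -> tuple[int, int]:
    """Return (job_id, offset), reusing the latest paused job for this company."""
    row = conn.execute(
        "SELECT id, next_offset FROM jobs "
        "WHERE company = ? AND status = 'paused' ORDER BY id DESC LIMIT 1",
        (company,),
    ).fetchone()
    if row:
        conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
        return row[0], row[1]
    cur = conn.execute(
        "INSERT INTO jobs (company, status, next_offset) VALUES (?, 'running', 0)",
        (company,),
    )
    return cur.lastrowid, 0


def pause(conn: sqlite3.Connection, job_id: int, next_offset: int) -> None:
    """Mark the job paused and remember where to pick up."""
    conn.execute(
        "UPDATE jobs SET status = 'paused', next_offset = ? WHERE id = ?",
        (next_offset, job_id),
    )


conn = sqlite3.connect(":memory:")
init(conn)
job_id, offset = start_or_resume(conn, "demtech.ai")  # fresh job at offset 0
pause(conn, job_id, 150)                              # budget ran out mid-company
job_id, offset = start_or_resume(conn, "demtech.ai")  # later re-run
print(f"Resuming job #{job_id} for demtech.ai at offset {offset}")
```

Because the offset lives in the job row rather than in process memory, no flag is needed on the second run.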

### What if LinkedIn locks the account?

If you get a 429 / challenge / 401 response, the account is marked `locked` and all commands abort. Wait a few hours, then re-run — most lockouts clear on their own. If not, run `leads auth login` to paste fresh cookies and try again.

### Checking what state you're in

```bash
leads status
```

Shows your daily budget used, recent jobs (with their status and offset), and which companies have been scraped.

## What if a company has more than ~1000 connections?

LinkedIn people search (reached through the `search_people` call in the `linkedin-api` library) won't return more than about 1000 results per query, regardless of how many actually exist. When the tool hits this cap, it asks:

```
stripe hit the ~1000-result cap.
Auto-split across 5 regions (United States, India, UK, Germany, Singapore)? [y/N]
```

- **Press `y`** → the tool re-scrapes the same company once per region, getting up to ~1000 leads from each. Total: up to ~5000 leads instead of 1000. Costs up to 5× the search budget.
- **Press `n`** → keep the 1000 you got and move on.

To skip the prompt and always split, pass `--auto-split`:

```bash
leads find "stripe" --auto-split
```

This is also useful in `--quiet` mode (no prompts), since otherwise the tool just prints a warning and moves on without splitting.
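
The splitting idea can be sketched as follows. The `scrape_with_split` function, the `search` callable, and the lead dictionaries are all hypothetical stand-ins. De-duplicating by an `id` field is also an assumption, since a lead seen in the capped run may reappear in a regional run.

```python
from typing import Callable, Optional, Sequence

RESULT_CAP = 1000
REGIONS = ("United States", "India", "UK", "Germany", "Singapore")

# A search callable takes (company, region) and returns lead dicts.
Search = Callable[[str, Optional[str]], list[dict]]


def scrape_with_split(
    company: str,
    search: Search,
    regions: Sequence[str] = REGIONS,
    cap: int = RESULT_CAP,
) -> list[dict]:
    """Run one unscoped search; if it hits the cap, repeat once per region."""
    leads = {lead["id"]: lead for lead in search(company, None)}
    if len(leads) < cap:
        return list(leads.values())
    for region in regions:
        for lead in search(company, region):
            leads.setdefault(lead["id"], lead)  # keep the first copy seen
    return list(leads.values())


# Tiny fake backend with a cap of 3 so the split is easy to follow.
FAKE = {None: [1, 2, 3], "India": [2, 3, 4], "UK": [5]}


def fake_search(company: str, region: Optional[str]) -> list[dict]:
    return [{"id": i, "company": company} for i in FAKE.get(region, [])]


found = scrape_with_split("stripe", fake_search, regions=("India", "UK"), cap=3)
print(sorted(lead["id"] for lead in found))  # [1, 2, 3, 4, 5]
```

Each regional search is a separate query, which is why splitting can cost up to 5× the search budget.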

## Project layout

```
leads/
├── pyproject.toml
├── schema.sql
├── leads/
│   ├── cli.py            # Typer app (entry: `leads`)
│   ├── config.py         # ~/.leads/config.toml
│   ├── cookie_import.py  # cross-browser cookie autodetect
│   ├── db.py             # sqlite3 + schema init + migrations
│   ├── linkedin.py       # tomquirk lib wrapper + paced calls
│   ├── budget.py         # api_calls, daily/invocation caps
│   ├── filters.py        # structured filter flags → SQL WHERE
│   ├── regions.py        # default 5 regions for auto-split
│   ├── companies.py      # interactive disambiguation + cache
│   ├── scrape.py         # multi-company, auto-resume, auto-split
│   ├── find.py           # the one-shot orchestrator
│   ├── show.py           # universal viewer (terminal/CSV/JSONL)
│   ├── triage.py         # CSV import/export
│   ├── triage_tui.py     # Textual TUI for interactive triage
│   └── web/              # Bottle + htmx browser dashboard
│       ├── server.py
│       ├── templates/index.tpl
│       └── static/        # htmx.min.js, app.css, app.js
└── tests/test_db.py
```

## Tests

```bash
pytest
```

The tests cover the schema round-trip, the triage CSV round-trip, filter compilation, and `show` CSV export.
