Metadata-Version: 2.4
Name: lit-monitor
Version: 0.11.0
Summary: A personal literature tracker that ranks new papers by semantic similarity to your Zotero library, extracts structured fields with an LLM, builds a knowledge graph, and writes to Obsidian.
License: MIT License
        
        Copyright (c) 2026 Mayank Vats
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/max3925vats/lit-monitor
Project-URL: Repository, https://github.com/max3925vats/lit-monitor
Project-URL: Issues, https://github.com/max3925vats/lit-monitor/issues
Project-URL: Documentation, https://github.com/max3925vats/lit-monitor#documentation
Keywords: zotero,obsidian,literature,research,semantic-search,knowledge-graph,mcp,pubmed,arxiv
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX :: Linux
Classifier: Environment :: Console
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Text Processing :: Indexing
Classifier: Topic :: Database :: Front-Ends
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: lxml>=6.1.0
Requires-Dist: xmltodict>=0.12
Requires-Dist: edlib>=1.3
Requires-Dist: colorama>=0.4.3
Requires-Dist: inquirer>=2.7
Requires-Dist: pyzotero
Requires-Dist: bibtexparser
Requires-Dist: habanero
Requires-Dist: semanticscholar
Requires-Dist: requests
Requires-Dist: tenacity
Requires-Dist: aiohttp>=3.14.0
Requires-Dist: chromadb>=0.5.3
Requires-Dist: sentence-transformers>=3.0
Requires-Dist: rapidfuzz
Requires-Dist: jinja2
Requires-Dist: pyyaml
Requires-Dist: pydantic>=2.0
Requires-Dist: click>=8.0
Requires-Dist: rich>=13.0
Requires-Dist: fastapi>=0.115
Requires-Dist: uvicorn[standard]>=0.30
Requires-Dist: sse-starlette>=2.1
Requires-Dist: tomli_w>=1.0
Requires-Dist: python-multipart>=0.0.29
Requires-Dist: kuzu>=0.6
Requires-Dist: inflect>=7
Requires-Dist: mcp>=1.0
Requires-Dist: plyer>=2.1
Requires-Dist: scikit-learn>=1.3
Requires-Dist: ruamel.yaml>=0.18
Provides-Extra: dev
Requires-Dist: ruff; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: responses; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Provides-Extra: litellm
Requires-Dist: litellm; extra == "litellm"
Provides-Extra: cloud
Requires-Dist: litellm; extra == "cloud"
Provides-Extra: nlp
Requires-Dist: transformers>=4.40; extra == "nlp"
Requires-Dist: torch>=2.2; extra == "nlp"
Requires-Dist: accelerate>=0.30; extra == "nlp"
Provides-Extra: server
Provides-Extra: graph
Provides-Extra: mcp
Provides-Extra: notify
Dynamic: license-file

# lit-monitor for Zotero

**Semantic literature discovery for researchers who live in Zotero.**

lit-monitor tracks PubMed, arXiv, and Scopus on a schedule, ranks every new
paper against your existing Zotero library, extracts structured fields with an
LLM, and writes everything into your Obsidian vault — queryable from a browser,
the terminal, or any AI client that speaks the Model Context Protocol (MCP).

[![CI](https://github.com/max3925vats/lit-monitor/actions/workflows/ci.yml/badge.svg)](https://github.com/max3925vats/lit-monitor/actions/workflows/ci.yml)
[![Version](https://img.shields.io/badge/version-0.11.0-informational.svg)](https://github.com/max3925vats/lit-monitor/releases)
[![Downloads](https://img.shields.io/github/downloads/max3925vats/lit-monitor/total)](https://github.com/max3925vats/lit-monitor/releases)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
[![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://www.python.org/)
[![Platform: macOS | Linux](https://img.shields.io/badge/platform-macOS%20%7C%20Linux-lightgrey.svg)](#requirements)
[![MCP compatible](https://img.shields.io/badge/MCP-compatible-blueviolet.svg)](docs/integrations.md)

Your library is the signal. Each candidate paper is scored by semantic
similarity to embeddings of what you already keep in Zotero, so the feed adapts
to whatever you read — papers close to your interests rank higher, papers in a
different domain rank lower. Nothing in the pipeline is domain-specific: your
Zotero library, a handful of search topics, and an optional free-text domain
paragraph are the only inputs it needs.

The default configuration and some examples lean toward downstream
biopharmaceutical process development — the domain the tool was originally
developed against — but ready-made starting configs for other fields (ML
research, climate science, and more) ship in
[`config/examples/`](config/examples/). See
[Installation](docs/installation.md).

Drive it from a **localhost web UI** (`lit-monitor serve`) for setup and
day-to-day operation, or from the **CLI** (`lit-monitor --help`) for scripted
work. Same pipeline either way.

> **Local-first, free by default.** On the default configuration everything runs
> on your machine: your library is embedded and stored locally (Ollama +
> ChromaDB), with no per-call API costs. Only outbound paper searches reach
> PubMed, arXiv, and Scopus. Routing extraction or embeddings to a cloud provider
> (Anthropic, OpenAI, Vertex AI, Ollama Cloud) is opt-in.

> **Beta.** lit-monitor is feature-complete and in active daily use, but still
> maturing toward 1.0 — interfaces and on-disk formats may change between
> releases, and you may hit rough edges. Bug reports and feedback are very
> welcome via [GitHub Issues](https://github.com/max3925vats/lit-monitor/issues).
> Provided under the MIT License, **without warranty** (see [License](#license)).

## Features

- **Topic search with library-relative ranking.** Recurring searches across
  PubMed, arXiv, and Scopus, powered by a bundled copy of
  [findpapers](https://github.com/jonatasgrosman/findpapers) (see
  [Acknowledgements](#acknowledgements)). Each candidate is ranked by cosine
  similarity to embeddings of your Zotero library, computed locally via
  `mxbai-embed-large` (or any LiteLLM-compatible provider) against a per-machine
  ChromaDB store. *Explicit search coverage for individual journal publishers is
  planned for a future release.*
- **Obsidian-native output.** Every paper becomes a structured Markdown note
  with persist zones for your own annotations, two-phase LLM extraction, and a
  citation-graph rebuild path.
- **Knowledge graph with an ask interface.** A KuzuDB graph stores entities
  (topics, methods, materials, authors, journals, keywords) and ten typed
  relationships across the corpus. Ask questions in plain English from the CLI,
  HTTP, or MCP, for example
  `lit-monitor ask "what methods extend Carta 2009?"`.
  Answers are theme-aware when the library has been clustered.
- **Three retrieval modes** — vector (semantic), graph (entity-typed), and
  hybrid (reciprocal-rank fusion), selectable per command with
  `--rag-mode {vector,graph,hybrid}`.
- **MCP server for AI clients.** Twelve tools that Claude Desktop, Cursor,
  Continue, and any other MCP-capable agent can call to query the graph and
  vector index, including a read-only Cypher escape hatch with a safety guard.
- **Notifications and flexible delivery.** Choose how results reach you: an OS
  notification when a discovery run finishes, a weekly Markdown digest,
  on-demand Markdown export, or a rich terminal table.
- **Runs on any schedule.** One-command install for launchd (macOS) or systemd
  user timers (Linux), plus ad-hoc runs from the dashboard.

For the scoring model and the design behind each signal, see
[How it works](docs/how-it-works.md).
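
To make library-relative ranking concrete, here is a minimal sketch of the
idea, not lit-monitor's actual scoring code. It uses toy 3-dimensional vectors
in place of real `mxbai-embed-large` embeddings. The function names and the
choice to score each candidate by its *best* match in the library are
assumptions made for illustration. The real pipeline may aggregate differently
and combines further signals.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_candidates(library, candidates):
    """Sort candidates by their highest similarity to any library vector.

    Taking the maximum is an assumption of this sketch (hypothetical helper).
    """
    scored = []
    for title, vector in candidates.items():
        best = max(
            (cosine_similarity(vector, lib_vec) for lib_vec in library),
            default=0.0,  # an empty library gives every candidate a score of 0
        )
        scored.append((title, best))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings.
library = [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0]]
candidates = {
    "close to the library": [0.9, 0.1, 0.0],
    "different domain": [0.0, 0.0, 1.0],
}
ranking = rank_candidates(library, candidates)
```

In this toy data the first candidate points in nearly the same direction as a
library vector and ranks first, while the second is orthogonal to both library
vectors and scores 0.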

## Requirements

- macOS or Linux (ARM Linux, including Raspberry Pi 4/5, works for
  cloud-Ollama configurations)
- Python 3.11+
- [Ollama](https://ollama.com) installed locally for embeddings
  (`ollama pull mxbai-embed-large`)
- A Zotero library (Better BibTeX optional)
- An Obsidian vault (full absolute path required)

## Install

```bash
pip install lit-monitor        # or: uvx lit-monitor / pipx install lit-monitor
lit-monitor first-run
```

That's the whole install. `lit-monitor first-run` walks you through interactive
setup and then launches the web UI. [Ollama](https://ollama.com) is a separate
prerequisite for local embeddings (`ollama pull mxbai-embed-large`) — see
[Requirements](#requirements).

For optional extras — `[nlp]` (BioBERT entity extraction) and `[litellm]`
(multi-provider cloud LLM routing) — and the from-source / development install,
see the [Installation guide](docs/installation.md).

### From source (development)

```bash
git clone https://github.com/max3925vats/lit-monitor.git
cd lit-monitor
./install.sh
```

The script installs [`uv`](https://docs.astral.sh/uv/) if needed, creates a
project-local `.venv`, resolves all dependencies, and seeds working configs
from `config/*.example.yaml`.

## Quickstart

### Web UI

```bash
lit-monitor first-run   # interactive first-time setup, then launches the server
# or, once credentials are configured:
lit-monitor serve
```

Open **`http://127.0.0.1:8765/setup`** in any browser. An 8-step wizard covers
credentials, paths, extraction config, topics, domain context, theme
vocabulary, tracked researchers, and item routing, with live credential checks
at each step. After setup, the dashboards take over. See the
[Web UI guide](docs/web-ui.md) for every page.

### CLI

```bash
lit-monitor check          # verify config + Ollama + Zotero connectivity
lit-monitor brain-build    # index your existing Zotero library (one-time)
lit-monitor run            # first discovery run
lit-monitor serve          # browse results at http://127.0.0.1:8765
```

The full command surface is in the [CLI reference](docs/cli-reference.md). To
configure credentials and YAML by hand instead of using the wizard, see
[Configuration](docs/configuration.md).

## Documentation

| Guide | Covers |
|---|---|
| [Installation](docs/installation.md) | Install paths, optional extras, field-specific starter configs |
| [How it works](docs/how-it-works.md) | Library-as-signal, score decomposition, clustering, domain extraction, trending, embeddings |
| [Configuration](docs/configuration.md) | Three setup recipes, LLM and embedding providers, notifications, strict mode |
| [CLI reference](docs/cli-reference.md) | Every day-to-day command |
| [Web UI](docs/web-ui.md) | Dashboard pages and the setup wizard |
| [Integrations](docs/integrations.md) | MCP server and HTTP API |
| [Development](docs/development.md) | Running tests and deployment |

## Glossary

A few terms used throughout the docs:

- **Zotero** — reference manager that holds your library of papers; lit-monitor
  reads it as the relevance signal.
- **Obsidian** — Markdown-based knowledge base; lit-monitor writes one note per
  paper into a vault (a folder of Markdown files).
- **Embedding** — a numeric vector representing a paper's text, so similarity can
  be measured by distance. Papers near your library's embeddings rank higher.
- **Ollama** — runs language and embedding models locally on your machine (no
  cloud account needed for the default setup).
- **ChromaDB** — the local vector database that stores paper embeddings.
- **KuzuDB** — the local graph database that stores entities (methods, authors,
  …) and their typed relationships.
- **LiteLLM** — an optional adapter to route LLM or embedding calls to cloud
  providers (OpenAI, Anthropic, Vertex AI) instead of local Ollama.
- **MCP (Model Context Protocol)** — an open standard that lets AI clients (Claude
  Desktop, Cursor, …) call external tools; lit-monitor ships an MCP server.
- **Cypher** — the query language for the knowledge graph; the `ask` and MCP
  surfaces translate plain English into read-only Cypher under the hood.
- **brain-build** — the one-time step that indexes your existing Zotero library
  into the embedding store and graph.
- **RRF (reciprocal-rank fusion)** — the method behind `--rag-mode hybrid` that
  blends vector and graph rankings into one ordered list.
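
As an illustration of the RRF entry above, the sketch below fuses two ranked
lists with the standard reciprocal-rank formula: each item scores
`1 / (k + rank)` in every list it appears in, and the scores are summed. The
function name, the sample paper IDs, and the constant `k = 60` (a value
commonly used in the literature) are assumptions for this example;
lit-monitor's own implementation and constant may differ.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists into one ordered list.

    Each item earns 1 / (k + rank) per list it appears in (rank starts
    at 1). Ties are broken alphabetically so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical results from the two retrieval modes, best match first.
vector_hits = ["A", "B", "C"]
graph_hits = ["B", "D", "A"]

fused = reciprocal_rank_fusion([vector_hits, graph_hits])
print(fused)  # ['B', 'A', 'D', 'C']
```

`B` comes out ahead of `A` because it places first and second across the two
lists, while `A` places first and third. Items found by only one mode (`C`,
`D`) still appear, with lower scores.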

## Acknowledgements

Multi-source literature search is powered by
[**findpapers**](https://github.com/jonatasgrosman/findpapers) by Jonatas Grosman
(MIT License, © 2020). A copy is bundled under
[`lit_monitor/_vendor/findpapers`](lit_monitor/_vendor/findpapers) — with its
license retained — so that `pip install lit-monitor` resolves cleanly without an
upstream dependency conflict. The original project is gratefully acknowledged.

Explicit search coverage for individual journal publishers (beyond the sources
findpapers provides) is planned for a future release.

## License

[MIT](LICENSE)

This project bundles a copy of [findpapers](https://github.com/jonatasgrosman/findpapers)
(MIT License) — see [Acknowledgements](#acknowledgements) and
[`lit_monitor/_vendor/findpapers/LICENSE`](lit_monitor/_vendor/findpapers/LICENSE).
