Metadata-Version: 2.4
Name: pubchatter
Version: 0.1.0
Summary: Citation metadata resolver, formatter, verifier, and manuscript citation audit CLI.
Author: Ligandal, Inc.
License-Expression: MIT
Project-URL: Homepage, https://github.com/nanogenomic/pubchatter
Project-URL: Repository, https://github.com/nanogenomic/pubchatter
Keywords: citations,crossref,pubmed,semantic-scholar,openalex,docx,manuscripts
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Text Processing :: Markup
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: python-docx>=1.1.0
Dynamic: license-file

# PubChatter

PubChatter is a standalone citation-verification CLI and drop-in agent plugin for resolving scholarly metadata from real sources before citations are written into manuscripts.

It uses public metadata APIs and does not require API keys:

- CrossRef for DOI resolution and general scholarly search
- PubMed E-utilities for biomedical PMID lookup and PubMed search
- Semantic Scholar for citation counts, abstracts, and CS/bio search
- OpenAlex for DOI fallback and open scholarly metadata
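
To illustrate what a keyless metadata lookup looks like, here is a minimal sketch that queries the public Crossref REST API for a single DOI. It is not PubChatter's internal code: the helper names `crossref_work_url` and `fetch_crossref_title` are hypothetical, and the sketch covers only one of the four sources listed above.

```python
import json
import urllib.parse
import urllib.request

# Public Crossref REST endpoint for single works; no API key is required.
CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_work_url(doi: str) -> str:
    """Build the Crossref REST URL for one DOI (hypothetical helper)."""
    # Percent-encode unusual characters but keep the "/" between prefix and suffix.
    return CROSSREF_WORKS + urllib.parse.quote(doi.strip(), safe="/")


def fetch_crossref_title(doi: str, timeout: float = 10.0) -> str:
    """Fetch the record and return its first title, or "" if none is listed."""
    with urllib.request.urlopen(crossref_work_url(doi), timeout=timeout) as resp:
        record = json.load(resp)["message"]
    titles = record.get("title") or []
    return titles[0] if titles else ""


if __name__ == "__main__":
    # Only builds the URL; call fetch_crossref_title() to hit the network.
    print(crossref_work_url("10.1038/s41586-021-03819-2"))
```

Building the URL is kept separate from fetching it so the lookup logic can be checked without network access.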

## Capabilities

- Resolve DOI metadata into structured JSON or formatted citations.
- Resolve PubMed IDs.
- Search across CrossRef, PubMed, and Semantic Scholar.
- Search one source directly with `pubmed`, `crossref`, or `s2`.
- Verify a free-form citation string against real metadata.
- Batch-resolve DOI/PMID lists.
- Format references in `nature`, `apa`, `vancouver`, `bibtex`, `cell`, or `science` style.
- Audit `.docx` manuscripts for broken inline citation superscripts.
- Auto-fix known broken `.docx` citation superscripts for common method names.
- Report DOI coverage in manuscript reference lists.

## Standalone Install

From PyPI:

```bash
python -m pip install pubchatter
```

From GitHub:

```bash
python -m pip install "git+https://github.com/nanogenomic/pubchatter.git"
```

From a local checkout:

```bash
python -m pip install -e .
```

Then run:

```bash
pubchatter --help
```

`python-docx` is declared as a dependency, so the `.docx` audit commands work after installation.

## Plugin-Style Install

PubChatter can also be used as a drop-in local plugin by placing this directory at:

```bash
~/.claude/plugins/pubchatter
```

To expose the CLI as a local executable through a symlink (this assumes `~/.local/bin` is on your `PATH`):

```bash
mkdir -p ~/.local/bin
ln -sf ~/.claude/plugins/pubchatter/cli.py ~/.local/bin/pubchatter
chmod +x ~/.claude/plugins/pubchatter/cli.py
```

Alternatively, use a small wrapper script in place of the symlink, saved as an executable file on your `PATH` (for example `~/.local/bin/pubchatter`):

```bash
#!/usr/bin/env bash
exec python3 ~/.claude/plugins/pubchatter/cli.py "$@"
```

## Examples

Resolve a DOI:

```bash
pubchatter doi 10.1038/s41586-021-03819-2
```

Return JSON:

```bash
pubchatter doi 10.1038/s41586-021-03819-2 --json
```

Format a citation:

```bash
pubchatter format 10.1038/s41586-021-03819-2 --style nature
```

Search broadly:

```bash
pubchatter search "peptide binder design deep learning" --max-results 5
```

Search one source:

```bash
pubchatter pubmed "protein protein binding affinity"
pubchatter crossref "AlphaFold protein structure prediction"
pubchatter s2 "discrete diffusion language model"
```

Verify a citation:

```bash
pubchatter verify "Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583-589 (2021). https://doi.org/10.1038/s41586-021-03819-2"
```

Batch-resolve identifiers:

```bash
pubchatter batch dois.txt --style science
```
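
The layout of the identifier file is not specified here. The sketch below assumes one DOI or PMID per line, which is an assumption rather than documented behavior, and shows one way to produce such a file from a messy list. The helper names `clean_identifiers` and `write_identifier_file` are hypothetical and not part of PubChatter.

```python
from pathlib import Path
from typing import Iterable, List

# Common resolver prefixes that people paste along with the DOI itself.
DOI_URL_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
)


def clean_identifiers(raw: Iterable[str]) -> List[str]:
    """Strip whitespace and resolver prefixes; drop blanks and duplicates."""
    seen = set()
    cleaned = []
    for item in raw:
        ident = item.strip()
        for prefix in DOI_URL_PREFIXES:
            if ident.lower().startswith(prefix):
                ident = ident[len(prefix):]
                break
        key = ident.lower()  # DOIs are case-insensitive, so compare lowercased
        if ident and key not in seen:
            seen.add(key)
            cleaned.append(ident)
    return cleaned


def write_identifier_file(raw: Iterable[str], path: str) -> int:
    """Write one identifier per line (assumed format) and return the count."""
    idents = clean_identifiers(raw)
    Path(path).write_text("\n".join(idents) + "\n", encoding="utf-8")
    return len(idents)
```

For example, `write_identifier_file(my_dois, "dois.txt")` would produce the input file used in the command above, provided the one-per-line assumption holds.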

Audit a manuscript:

```bash
pubchatter check-super manuscript.docx
pubchatter check-super manuscript.docx --json
```

Fix known broken superscript citations in place:

```bash
pubchatter fix-super manuscript.docx
```

Write the fixed copy to a new file:

```bash
pubchatter fix-super manuscript.docx --output manuscript.citation-fixed.docx
```

## Manuscript Citation Workflow

Recommended agent workflow:

1. Resolve every known DOI with `pubchatter doi <doi>`.
2. For uncertain references, use `pubchatter search`, `pubchatter pubmed`, `pubchatter crossref`, or `pubchatter s2`.
3. Verify free-form references with `pubchatter verify "<citation>"`.
4. Run `pubchatter check-super <file.docx>` before submission.
5. Use `pubchatter fix-super <file.docx> --output <fixed.docx>` only after reviewing the reported changes.
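
The steps above can be sketched as a small Python driver. It uses only the subcommands and the `--output` flag shown in this README. The function names `workflow_commands` and `run_all` are hypothetical, and nothing is assumed about PubChatter's exit codes or output format: the output is simply printed for review. Step 2 (interactive search) is left out because it depends on human judgment.

```python
import subprocess
from typing import List, Optional, Sequence


def workflow_commands(
    dois: Sequence[str],
    citations: Sequence[str],
    manuscript: str,
    fixed_copy: Optional[str] = None,
) -> List[List[str]]:
    """Return the CLI invocations for steps 1, 3, 4 and (optionally) 5."""
    commands = [["pubchatter", "doi", doi] for doi in dois]             # step 1
    commands += [["pubchatter", "verify", text] for text in citations]  # step 3
    commands.append(["pubchatter", "check-super", manuscript])          # step 4
    if fixed_copy is not None:
        # Step 5 is opt-in: request it only after reviewing the audit report.
        commands.append(
            ["pubchatter", "fix-super", manuscript, "--output", fixed_copy]
        )
    return commands


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order and print its output for human review."""
    for cmd in commands:
        result = subprocess.run(cmd, capture_output=True, text=True)
        print("$", " ".join(cmd))
        print(result.stdout or result.stderr)
```

Passing arguments as lists rather than one shell string avoids quoting problems with citation text that contains quotes or parentheses.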

## Notes

- PubChatter is metadata-first. It should flag uncertainty instead of inventing citation details.
- Some canonical scholarly records do not have DOI metadata, especially OpenReview pages, GitHub software references, and selected arXiv or conference pages. Use their canonical URLs when a DOI is unavailable.
- Citation verification is search-based for free-form strings; always inspect warnings about keyword matches or mismatched authors/years.
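
Because free-form verification is search-based, a citation string that already contains a DOI can also be cross-checked with `pubchatter doi`. The sketch below shows one way to pull a DOI out of such a string. It is a heuristic, not PubChatter's own parser: the pattern and the `extract_doi` helper are assumptions made for illustration.

```python
import re
from typing import Optional

# Heuristic: "10.", a 4-9 digit registrant code, "/", then a suffix that
# runs until whitespace, a quote, or an angle bracket.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+", re.IGNORECASE)


def extract_doi(citation: str) -> Optional[str]:
    """Return the first DOI-like token in a citation string, or None."""
    match = DOI_PATTERN.search(citation)
    if match is None:
        return None
    # Trim sentence punctuation that commonly trails a DOI in running text.
    # This can clip the rare DOI that really ends in one of these characters.
    return match.group(0).rstrip(".,;)")


if __name__ == "__main__":
    text = (
        "Jumper, J. et al. Highly accurate protein structure prediction with "
        "AlphaFold. Nature 596, 583-589 (2021). "
        "https://doi.org/10.1038/s41586-021-03819-2"
    )
    print(extract_doi(text))
```

A `None` result means the string carries no recognizable DOI, in which case the search-based `pubchatter verify` path and its warnings remain the only check.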

## License

MIT. See [LICENSE](LICENSE).
