Metadata-Version: 2.4
Name: langchain-synapsoft
Version: 0.1.0
Summary: LangChain integration package for Synap DocuAnalyzer.
Author: Synapsoft
Maintainer: Synapsoft
License-Expression: MIT
Project-URL: Homepage, https://github.com/synapsoft-DA/langchain-synapsoft
Project-URL: Repository, https://github.com/synapsoft-DA/langchain-synapsoft
Project-URL: Documentation, https://github.com/synapsoft-DA/langchain-synapsoft/tree/main/docs
Project-URL: Issues, https://github.com/synapsoft-DA/langchain-synapsoft/issues
Keywords: langchain,synap,docuanalyzer,document-loader,tool
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx<1,>=0.28
Requires-Dist: langchain-core<2,>=1.2
Requires-Dist: pydantic<3,>=2.7
Provides-Extra: dev
Requires-Dist: build<2,>=1.4; extra == "dev"
Requires-Dist: pytest<10,>=9; extra == "dev"
Requires-Dist: pytest-asyncio<2,>=1.0; extra == "dev"
Requires-Dist: respx<1,>=0.22; extra == "dev"
Requires-Dist: ruff<1,>=0.15; extra == "dev"
Requires-Dist: langchain-tests<2,>=1.1; extra == "dev"
Requires-Dist: twine<7,>=6; extra == "dev"
Dynamic: license-file

# langchain-synapsoft

`langchain-synapsoft` is a standalone LangChain integration package for Synap DocuAnalyzer.

It exposes a small, focused public surface:

- `SynapDocuAnalyzerClient` for the REST workflow
- `SynapDocuAnalyzerLoader` for converting parsed pages into LangChain `Document` objects
- `SynapDocuAnalyzerTool` for agent-facing document conversion

The package currently targets the verified upload -> poll -> result flow and is prepared for the 0.1.0 alpha release.
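
For readers new to this style of API, the poll step of that flow can be sketched in plain Python. This is a minimal illustration, not the package's implementation: `poll_until_done`, the `check_status` callable, and the state names `"done"` and `"failed"` are all hypothetical placeholders.

```python
import time
from typing import Callable


def poll_until_done(
    check_status: Callable[[], str],
    interval: float = 1.0,
    timeout: float = 60.0,
) -> str:
    """Call check_status() until it reports a terminal state or time runs out.

    The state names "done" and "failed" are placeholders for this sketch,
    not values taken from the Synap DocuAnalyzer API.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = check_status()
        if status in ("done", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        time.sleep(interval)


# Stand-in for a real status request: reports "running" twice, then "done".
fake_statuses = iter(["running", "running", "done"])
final_status = poll_until_done(lambda: next(fake_statuses), interval=0.01)
print(final_status)
```

Using a deadline based on `time.monotonic()` keeps the timeout immune to system clock changes while the job is running.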

After the 0.1.0 GitHub release is published to PyPI, you can install and integrate this package directly in your own project. Access to a live Synap DocuAnalyzer deployment is managed separately: installing the package does not grant service access, and API keys plus usage quotas are provisioned by Synapsoft.

## Installation

Install from PyPI after the 0.1.0 release:

```bash
pip install -U langchain-synapsoft
```

Install for local development:

```bash
python -m pip install -e ".[dev]"
```

This package is intended to be easy to adopt in customer projects. After publication, `pip install` is all you need to add the integration to a project. Calling the Synap DocuAnalyzer service still requires a separately issued API key and an agreed usage policy from Synapsoft.

## Environment

To call a live Synap DocuAnalyzer service, `langchain-synapsoft` expects a base URL and API key.

```bash
export SYNAP_DA_BASE_URL="https://your-docuanalyzer-host"
export SYNAP_DA_API_KEY="your-api-key"
```

```powershell
$env:SYNAP_DA_BASE_URL = "https://your-docuanalyzer-host"
$env:SYNAP_DA_API_KEY = "your-api-key"
```
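
This README does not state whether the classes read these variables on their own, so one cautious option is to read them yourself and pass the values explicitly. The helper below is a hypothetical sketch (`synap_settings_from_env` is not part of the package); it only assumes the two variable names shown above.

```python
import os
from typing import Mapping


def synap_settings_from_env(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the base URL and API key from environment-style variables."""
    required = ("SYNAP_DA_BASE_URL", "SYNAP_DA_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        # Drop a trailing slash so later path joins stay predictable.
        "base_url": env["SYNAP_DA_BASE_URL"].rstrip("/"),
        "api_key": env["SYNAP_DA_API_KEY"],
    }


# Demonstrated with an explicit mapping; call synap_settings_from_env() with
# no argument to read the real process environment instead.
settings = synap_settings_from_env(
    {
        "SYNAP_DA_BASE_URL": "https://your-docuanalyzer-host/",
        "SYNAP_DA_API_KEY": "your-api-key",
    }
)
print(settings["base_url"])
```

The returned keys match the `base_url=` and `api_key=` keyword arguments used in the quickstarts, so the dictionary can be unpacked into a constructor call, for example `SynapDocuAnalyzerLoader(file_path="sample.pdf", **settings)`.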

## Quickstart: Loader

```python
from langchain_synapsoft import SynapDocuAnalyzerLoader

loader = SynapDocuAnalyzerLoader(
    file_path="sample.pdf",
    base_url="https://your-docuanalyzer-host",
    api_key="your-api-key",
    output_type="md",
    mode="page",
)

docs = loader.load()
print(len(docs))
print(docs[0].metadata)
print(docs[0].page_content[:300])
```

The loader returns one `Document` per page by default. Use `mode="single_document"` to join every page into one `Document`.
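
To make the difference between the two modes concrete, here is a rough plain-Python sketch of what joining pages amounts to. It is not the loader's code: `join_pages`, the blank-line separator, and the choice to skip empty pages are assumptions made for illustration only.

```python
def join_pages(pages: list[str], separator: str = "\n\n") -> str:
    """Merge per-page text into one string, skipping pages with no content."""
    return separator.join(page.strip() for page in pages if page.strip())


# Three "pages" of Markdown, one of them empty.
pages = ["# Page 1\nIntro text.", "", "# Page 2\nMore text.\n"]
combined = join_pages(pages)
print(combined)
```

Per-page documents tend to suit retrieval pipelines that cite page numbers, while a single joined document is simpler to pass to a summarizer or a custom text splitter.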

If your deployment needs a longer request timeout, or you have to disable TLS verification because of an internal certificate chain, pass `request_timeout=` and `verify_ssl=` when creating the loader.

## Quickstart: Tool

```python
from langchain_synapsoft import SynapDocuAnalyzerTool

tool = SynapDocuAnalyzerTool(
    base_url="https://your-docuanalyzer-host",
    api_key="your-api-key",
)

content = tool.invoke({"file_path": "sample.pdf", "output_type": "md"})
print(content[:300])
```

## Supported behavior

- Auto-detects whether a deployment uses root-style endpoints such as `/monitor` or the legacy `/api/monitor` layout
- Uses the verified Synap page result contract behind a simple Python API
- Pretty-prints JSON page responses while preserving the raw payload in `PageResult.raw_payload`
- Includes focused unit tests plus an opt-in live smoke test for a real server
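
The endpoint auto-detection in the first bullet can be pictured as a simple probe, trying the root-style path before the legacy one. The sketch below is an assumption about the general idea, not the package's actual logic; `detect_api_prefix` and the `is_reachable` callable are hypothetical, and a real probe would issue an HTTP request instead of the stand-in used here.

```python
from typing import Callable, Iterable


def detect_api_prefix(
    base_url: str,
    is_reachable: Callable[[str], bool],
    prefixes: Iterable[str] = ("", "/api"),
) -> str:
    """Return the first prefix whose monitor URL answers, trying root style first."""
    root = base_url.rstrip("/")
    for prefix in prefixes:
        if is_reachable(f"{root}{prefix}/monitor"):
            return prefix
    raise RuntimeError(f"no monitor endpoint found under {root}")


def legacy_only(url: str) -> bool:
    """Stand-in probe that pretends only the legacy layout exists."""
    return url.endswith("/api/monitor")


prefix = detect_api_prefix("https://your-docuanalyzer-host/", legacy_only)
print(repr(prefix))
```

Passing the probe in as a callable keeps the detection logic testable without a network connection.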

## Current limitations

- Asset ZIP download flows are intentionally out of the public API surface for now
- Remote delete endpoints are not exposed yet
- Live integration tests require a reachable Synap DocuAnalyzer deployment and credentials

## Development

From a repository checkout, create and activate a virtual environment, then install the development dependencies:

```bash
python -m venv .venv
source .venv/bin/activate  # Windows PowerShell: .venv\Scripts\Activate.ps1
python -m pip install -U pip
python -m pip install -e ".[dev]"
```

Run the local checks:

```bash
python -m ruff check .
python -m pytest -q
synap-docuanalyzer-smoke --sample-file "path/to/sample.docx"
```

To run the opt-in live pytest target, set `SYNAP_DA_BASE_URL`, `SYNAP_DA_API_KEY`, and `SYNAP_SAMPLE_FILE`, then run:

```bash
python -m pytest -q -m live_server
synap-docuanalyzer-smoke --sample-file "$SYNAP_SAMPLE_FILE"
```

## Repository guide

- [CONTRIBUTING.md](CONTRIBUTING.md) explains local setup, validation, and pull request expectations.
- [docs/README.md](docs/README.md) links the public package docs in this repository.
- [docs/API_CONTRACT.md](docs/API_CONTRACT.md) documents the verified REST surface used by this package.
- [examples/loader_basic.py](examples/loader_basic.py) and [examples/tool_basic.py](examples/tool_basic.py) are minimal runnable examples.

## License

This project is licensed under the [MIT License](LICENSE).
