Metadata-Version: 2.4
Name: invoice-ocr-mcp
Version: 1.0.0
Summary: Enterprise invoice OCR MCP server: a professional invoice recognition solution built on ModelScope
Author-email: Invoice OCR Team <team@example.com>
License: MIT
Project-URL: Homepage, https://github.com/your-org/invoice-ocr-mcp
Project-URL: Documentation, https://github.com/your-org/invoice-ocr-mcp/docs
Project-URL: Repository, https://github.com/your-org/invoice-ocr-mcp.git
Project-URL: Issues, https://github.com/your-org/invoice-ocr-mcp/issues
Keywords: mcp,ocr,invoice,modelscope,ai
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Office/Business :: Financial
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: mcp>=1.0.0
Requires-Dist: modelscope>=1.28.0
Requires-Dist: opencv-python>=4.8.0
Requires-Dist: Pillow>=10.0.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: torch>=2.0.0
Requires-Dist: transformers>=4.30.0
Requires-Dist: aiofiles>=23.0.0
Requires-Dist: jsonschema>=4.17.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: PyYAML>=6.0.0
Requires-Dist: structlog>=23.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-mock>=3.11.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"
Requires-Dist: isort>=5.12.0; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"
Provides-Extra: gpu
Requires-Dist: torch[cuda]>=2.0.0; extra == "gpu"
Requires-Dist: torchvision[cuda]>=0.15.0; extra == "gpu"
Provides-Extra: monitoring
Requires-Dist: prometheus-client>=0.17.0; extra == "monitoring"
Requires-Dist: redis>=4.6.0; extra == "monitoring"
Provides-Extra: performance
Requires-Dist: cachetools>=5.3.0; extra == "performance"
Requires-Dist: scikit-image>=0.21.0; extra == "performance"
Dynamic: license-file

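# Enterprise Invoice OCR MCP Server

A professional MCP server for enterprise invoice OCR, built on the ModelScope ecosystem to support the digitalization of corporate finance workflows.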

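## 🚀 Features

- **Standardized integration**: Conforms to the MCP protocol specification and integrates with MCP-compatible AI applications
- **Specialized invoice recognition**: Supports 13 mainstream invoice types with accuracy above 99%
- **Structured output**: Automatically extracts key invoice fields and returns them as standard JSON
- **Enterprise-grade service**: Supports batch processing for high-volume workloads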

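## 📋 Supported Invoice Types

- 01: VAT special invoice
- 02: Motor vehicle VAT special invoice
- 03: VAT general invoice
- 04: VAT electronic general invoice
- 05: VAT general invoice (roll type)
- 06: VAT general invoice (toll)
- 07: Used-car invoice
- 08: VAT electronic special invoice
- 09: Fully digitalized e-invoice (VAT special invoice)
- 10: Fully digitalized e-invoice (general invoice)
- 11: Fully digitalized e-invoice (air transport e-ticket itinerary)
- 12: Fully digitalized e-invoice (railway e-ticket)
- 13: Blockchain invoice (Shenzhen, Beijing, and Yunnan supported)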
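
Client code that needs to label results can keep the codes above in a small lookup table. The following is a minimal sketch: `INVOICE_TYPES` and `invoice_type_name` are hypothetical names, not part of the server's API, and how the server reports the type in its output is not specified here.

```python
# Hypothetical lookup table built from the list above; not part of the server's API.
INVOICE_TYPES = {
    "01": "VAT special invoice",
    "02": "Motor vehicle VAT special invoice",
    "03": "VAT general invoice",
    "04": "VAT electronic general invoice",
    "05": "VAT general invoice (roll type)",
    "06": "VAT general invoice (toll)",
    "07": "Used-car invoice",
    "08": "VAT electronic special invoice",
    "09": "Fully digitalized e-invoice (VAT special invoice)",
    "10": "Fully digitalized e-invoice (general invoice)",
    "11": "Fully digitalized e-invoice (air transport e-ticket itinerary)",
    "12": "Fully digitalized e-invoice (railway e-ticket)",
    "13": "Blockchain invoice",
}


def invoice_type_name(code: str) -> str:
    """Return the display name for a type code, or 'unknown' if it is not listed."""
    # Single-digit codes such as "7" are zero-padded to "07" before lookup.
    return INVOICE_TYPES.get(str(code).zfill(2), "unknown")
```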

## 🛠️ Installation

### Requirements

- Python 3.8+
- A ModelScope account and API token
- At least 4 GB of RAM
- GPU support (recommended)

### Quick Install

```bash
# Clone the repository
git clone https://github.com/wuyonghui0810/invoice-ocr-mcp.git
cd invoice-ocr-mcp

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env and add your ModelScope API token
```
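
Before starting the server, it can help to confirm that the token is actually set. The sketch below assumes the variable is named `MODELSCOPE_API_TOKEN`; that name is a guess, so check `.env.example` for the real key. It uses only the standard library and reads plain `KEY=VALUE` lines; python-dotenv, which is already a dependency, handles quoting and edge cases more thoroughly.

```python
from pathlib import Path

# Assumption: the real key name is defined in .env.example and may differ.
TOKEN_KEY = "MODELSCOPE_API_TOKEN"


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of optional quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_token(path=".env", key=TOKEN_KEY):
    """Return True if the file exists and defines a non-empty value for key."""
    env_path = Path(path)
    return env_path.is_file() and bool(read_env_file(env_path).get(key))


if __name__ == "__main__":
    print("token configured:", has_token())
```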

### Docker Deployment

```bash
# Build the image
docker-compose build

# Start the service in the background
docker-compose up -d
```

## 📖 Usage

### MCP Client Integration

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# stdio_client expects StdioServerParameters rather than a bare argument list
server_params = StdioServerParameters(
    command="python",
    args=["src/invoice_ocr_mcp/server.py"],
)

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Recognize a single invoice
            result = await session.call_tool(
                "recognize_single_invoice",
                {"image_data": "base64_encoded_image_data"}
            )
            print("Recognition result:", result)

if __name__ == "__main__":
    asyncio.run(main())
```
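
The `image_data` placeholder above stands for a Base64 string. A minimal sketch of producing one from a file with the standard library follows; `encode_image` and the file name are hypothetical, and whether the server also accepts data-URI prefixes or other encodings is not stated here.

```python
import base64
from pathlib import Path


def encode_image(path):
    """Read an image file and return its bytes as an ASCII Base64 string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


# Usage (hypothetical file name):
# arguments = {"image_data": encode_image("invoice.jpg")}
```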

### Batch Processing

```python
# Recognize a batch of invoices (run inside the ClientSession block shown above)
result = await session.call_tool(
    "recognize_batch_invoices",
    {
        "images": [
            {"id": "invoice1", "image_data": "base64_data1"},
            {"id": "invoice2", "image_data": "base64_data2"}
        ],
        "parallel_count": 3
    }
)
```
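
Building the `images` list by hand gets tedious for larger batches. The sketch below assembles the arguments shown above from a list of file paths. `build_batch_arguments` is a hypothetical helper, using the file stem as `id` is an arbitrary choice, and any server-side limit on batch size or `parallel_count` is not documented here.

```python
import base64
from pathlib import Path


def build_batch_arguments(paths, parallel_count=3):
    """Build the argument dict for recognize_batch_invoices from image file paths."""
    images = []
    for path in map(Path, paths):
        data = base64.b64encode(path.read_bytes()).decode("ascii")
        # Assumption: the file stem is unique enough to serve as the image id.
        images.append({"id": path.stem, "image_data": data})
    return {"images": images, "parallel_count": parallel_count}


# Usage inside an open session (hypothetical file names):
# result = await session.call_tool(
#     "recognize_batch_invoices",
#     build_batch_arguments(["invoice1.jpg", "invoice2.jpg"]),
# )
```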

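## 🔧 MCP Client Configuration

To register the server with an MCP client, add an entry like the following to the client's configuration:
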
```json
{
  "mcpServers": {
    "invoice_ocr_mcp": {
      "command": "npx",
      "args": ["node", "start-python.js"],
      "env": {}
    }
  }
}
```

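## 🔧 Configuration

The main configuration files live in the `configs/` directory:

- `models.yaml`: ModelScope model configuration
- `server.yaml`: Server configuration
- `logging.yaml`: Logging configuration

For detailed configuration instructions, see the [deployment guide](docs/deployment.md).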
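
A quick pre-flight check can catch a missing configuration file before the server starts. This is a minimal sketch using only the standard library; `missing_configs` is a hypothetical helper, and it checks only that the three files named above exist, not that their contents are valid.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_CONFIGS = ("models.yaml", "server.yaml", "logging.yaml")


def missing_configs(config_dir="configs"):
    """Return the expected config file names that are absent from config_dir."""
    base = Path(config_dir)
    return [name for name in EXPECTED_CONFIGS if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_configs()
    print("missing config files:", missing or "none")
```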

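## 📊 Performance

- **Recognition accuracy**: >99%
- **Processing speed**: under 3 seconds per invoice
- **Concurrency**: multi-threaded parallel processing
- **Service availability**: >99.9%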
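
To check the per-invoice latency figure in your own environment, a call can be wrapped in a simple timer. The sketch below is one way to do that; `timed` is a hypothetical helper, and the measured time includes client and transport overhead, so it will not match server-side processing time exactly.

```python
import time


async def timed(awaitable):
    """Await the given awaitable and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = await awaitable
    return result, time.perf_counter() - start


# Usage inside an open session (tool name taken from the usage guide):
# result, seconds = await timed(
#     session.call_tool("recognize_single_invoice", {"image_data": data})
# )
# print(f"recognized in {seconds:.2f}s")
```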

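## 🤝 Contributing

Issues and pull requests are welcome!

## 📄 License

This project is open source under the MIT License.

## 📞 Support

If you have questions, reach out through either of the following:

- GitHub Issues
- Email: wuyonghui0810@126.com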

---

© 2024 Invoice OCR MCP Server. All rights reserved. 
