Metadata-Version: 2.4
Name: by-datacloud
Version: 0.1.9
Summary: Whale DataCloud - A multi-service data platform
Project-URL: Homepage, https://github.com/beyonai/by-datacloud
Project-URL: Repository, https://github.com/beyonai/by-datacloud
Author: Whale DataCloud Team
License: MIT
License-File: LICENSE
Keywords: cloud,data,platform,service
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.12
Requires-Dist: aiochclient[aiohttp]>=2.0
Requires-Dist: aiomysql>=0.2
Requires-Dist: aiosqlite>=0.19
Requires-Dist: asyncpg>=0.29
Requires-Dist: boto3>=1.34
Requires-Dist: by-framework>=0.1.2
Requires-Dist: deepagents>=0.4.7
Requires-Dist: fastapi>=0.115.0
Requires-Dist: httpx>=0.27
Requires-Dist: langchain-core>=0.3.0
Requires-Dist: langchain-mcp-adapters>=0.2.2
Requires-Dist: langchain-openai>=1.1.11
Requires-Dist: langchain>=0.3.0
Requires-Dist: langgraph-checkpoint-postgres>=2.0.0
Requires-Dist: langgraph>=1.0.10
Requires-Dist: llama-index-embeddings-openai>=0.3.0
Requires-Dist: matplotlib>=3.7
Requires-Dist: mcp>=1.18.0
Requires-Dist: networkx>=3.0
Requires-Dist: numpy>=1.26
Requires-Dist: opengauss-sqlalchemy>=2.4.0
Requires-Dist: pandas>=2.0
Requires-Dist: psycopg[binary,pool]>=3.1
Requires-Dist: pydantic-settings>=2.0
Requires-Dist: pydantic>=2.0
Requires-Dist: pypinyin>=0.50
Requires-Dist: python-multipart>=0.0.12
Requires-Dist: pyyaml>=6.0.3
Requires-Dist: rapidfuzz>=3.14.3
Requires-Dist: rdflib>=7.0
Requires-Dist: sqlalchemy[asyncio]>=2.0
Requires-Dist: sqlglot
Requires-Dist: sqlparse>=0.5
Requires-Dist: strawberry-graphql[fastapi]>=0.260
Requires-Dist: uvicorn>=0.32.0
Requires-Dist: watchfiles>=1.0
Provides-Extra: dev
Requires-Dist: mypy>=1.11; extra == 'dev'
Requires-Dist: pre-commit>=3.8; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.6; extra == 'dev'
Description-Content-Type: text/markdown

# dataCloud 2.0

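dataCloud is a data intelligence engine. It intelligently constructs an enterprise-grade knowledge network and exposes business-oriented component capabilities to large language models, intelligent applications, and business users, with the goal of improving the efficiency of enterprise data access and the accuracy of reasoning.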

## Installing the umbrella package

```bash
pip install by-datacloud
```

The umbrella package bundles the following source modules directly:

- `datacloud-analysis`
- `datacloud-data[all]`
- `datacloud-knowledge`

After installation, the following modules can be imported directly:

```python
import by_datacloud
import datacloud_analysis
import datacloud_data_sdk
import datacloud_knowledge
```
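
As a quick sanity check after installation, the sketch below reports the installed version of `by-datacloud` and any of the modules listed above that cannot be found on the import path. The helper names `find_missing` and `installed_version` are illustrative and not part of the package; only the standard library is used.

```python
from importlib import metadata, util

# Module names taken from the import example in this README.
EXPECTED_MODULES = [
    "by_datacloud",
    "datacloud_analysis",
    "datacloud_data_sdk",
    "datacloud_knowledge",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if util.find_spec(name) is None]


def installed_version(dist_name="by-datacloud"):
    """Return the installed distribution version, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("by-datacloud version:", installed_version())
    print("missing modules:", find_missing(EXPECTED_MODULES))
```

An empty "missing modules" list suggests that the bundled modules were installed as described.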

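## Project structure and core modules

The repository is a monorepo whose directories are split into two layers:

- `packages/`: core SDKs and foundational capabilities
- `examples/`: sample applications and demo projects

The core modules are: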

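1. `datacloud-analysis` (`packages/datacloud-analysis`)
   - Top-level AI analysis and orchestration SDK (formerly `datacloud-agent`)
   - Depends on `datacloud-data`, `datacloud-knowledge`, and `datacloud-memory`

2. `datacloud-data` (`packages/datacloud-data`)
   - Core SDK for data querying and execution
   - Provides NL2Data, access to heterogeneous data sources, and the execution pipeline

3. `datacloud-knowledge` (`packages/datacloud-knowledge`)
   - Domain knowledge, ontologies, and terminology retrieval and constraint capabilities

4. `datacloud-memory` (`packages/datacloud-memory`)
   - Storage, retrieval, and compression of session-level and cross-session memory

5. `sales_analysis_demo` (`examples/sales_analysis_demo`)
   - Sample business project (`frontend/`, `backend/`, `mock_env/`)
   - `backend/datacloud_data_service/` is a sample implementation of the data service layer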
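
The dependency relationship stated for `datacloud-analysis` can be sketched as a small graph, which gives one possible order for building or testing the packages. This is an illustration only: the README lists dependencies only for `datacloud-analysis`, so the empty dependency sets for the other three packages are an assumption, and `build_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of packages it depends on.
# Only the datacloud-analysis entry is stated in this README;
# the empty sets are assumed, not confirmed.
DEPENDENCIES = {
    "datacloud-analysis": {
        "datacloud-data",
        "datacloud-knowledge",
        "datacloud-memory",
    },
    "datacloud-data": set(),
    "datacloud-knowledge": set(),
    "datacloud-memory": set(),
}


def build_order(deps):
    """Return the packages ordered so that dependencies come first."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for name in build_order(DEPENDENCIES):
        print(name)
```

Under these assumptions `datacloud-analysis` always comes last, while the relative order of the other three packages is not significant.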

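## Development conventions

All packages follow the conventions in the repository root (root-level conventions take precedence):

| Convention document | Description | Priority |
|----------|------|--------|
| [`docs/项目规范/CODING_CONVENTIONS.md`](docs/项目规范/CODING_CONVENTIONS.md) | Python coding conventions (apply to the whole project) | **Root level · highest** |
| [`docs/项目规范/TESTING_CONVENTIONS.md`](docs/项目规范/TESTING_CONVENTIONS.md) | Testing conventions and coverage requirements | **Root level · highest** |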

## Development guide

### Requirements

- Python >= 3.12
- [uv](https://docs.astral.sh/uv/) >= 0.7

### Quick start

```bash
# 1) Install all workspace dependencies
uv sync

# 2) Run the analysis package (example)
uv run --package datacloud-analysis python -m datacloud_analysis

# 3) Quality checks
uv run ruff format .
uv run ruff check .
uv run mypy .
uv run pytest
```

### Monorepo structure

```text
by_datacloud/
├── pyproject.toml
├── uv.lock
├── README.md
├── docs/
├── src/
├── tests/
├── packages/
│   ├── datacloud-analysis/
│   ├── datacloud-data/
│   ├── datacloud-knowledge/
│   └── datacloud-memory/
└── examples/
    └── sales_analysis_demo/
        ├── frontend/
        ├── backend/
        │   └── datacloud_data_service/
        └── mock_env/
```

### Workspace dependency management

The root `pyproject.toml` manages workspace members through `tool.uv.workspace`. The current members are:

- `packages/datacloud-analysis`
- `packages/datacloud-data`
- `packages/datacloud-knowledge`
- `packages/datacloud-memory`

Example: add a dependency to `datacloud-analysis`:

```bash
uv add --package datacloud-analysis <package>
```
