Metadata-Version: 2.4
Name: memect-pdfjson2x
Version: 0.1.3
Summary: A tool for converting docJson to Markdown / CSV / HTML
Author-email: lihanghang <lihanghang@memect.co>
License: MIT
Keywords: docjson,markdown,pdf,converter,html
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Text Processing :: Markup
Classifier: Topic :: Utilities
Requires-Python: >=3.9.21
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=2.3.0
Provides-Extra: test
Requires-Dist: pytest-integration>=0.2.3; extra == "test"
Dynamic: license-file

# docJson2X

docJson2X is a tool for converting docJson to Markdown, CSV, or HTML.

- `to_md` / `to_md_with_id`
- `to_csv`

## Usage

```bash
uv build
# Publish to the memect PyPI index
uv publish --index memect

# Install from the public PyPI (the distribution is named memect-pdfjson2x; the import name is still docjson2x)
pip install memect-pdfjson2x
```

```python
from docjson2x import DocJsonAnalyzer
# Example:
input_file_path = "tests/data/doc.json"
analyzer = DocJsonAnalyzer().analyze(input_file_path)

# Output to Markdown; tables are rendered as Markdown by default
output_file_path = "tests/data/doc.md"
analyzer.to_md(output_file_path)
# Markdown output can also render tables as HTML
analyzer.to_md(output_file_path, table_html=True)

```
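
To make the effect of `table_html=True` concrete, the sketch below renders the same small table both as a Markdown pipe table and as an HTML `<table>`. It is only an illustration using the standard library: `table_to_markdown`, `table_to_html`, and the sample rows are hypothetical names and data, not part of docjson2x, and the library's actual output may differ.

```python
from html import escape


def _md_cell(text):
    # Pipes inside a cell would break a Markdown pipe table, so escape them.
    return text.replace("|", "\\|")


def table_to_markdown(rows):
    """Render rows (first row is the header) as a Markdown pipe table."""
    header, *body = rows
    lines = [
        "| " + " | ".join(_md_cell(c) for c in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in body:
        lines.append("| " + " | ".join(_md_cell(c) for c in row) + " |")
    return "\n".join(lines)


def table_to_html(rows):
    """Render the same rows as an HTML table, escaping cell text."""
    header, *body = rows
    parts = ["<table>"]
    parts.append("<tr>" + "".join(f"<th>{escape(c)}</th>" for c in header) + "</tr>")
    for row in body:
        parts.append("<tr>" + "".join(f"<td>{escape(c)}</td>" for c in row) + "</tr>")
    parts.append("</table>")
    return "\n".join(parts)


# Hypothetical sample data, for illustration only.
rows = [["Item", "Amount"], ["Revenue", "100"], ["Cost", "60"]]
md_table = table_to_markdown(rows)
html_table = table_to_html(rows)
print(md_table)
print(html_table)
```

HTML tables are generally the safer choice when a table contains merged or multi-line cells, which Markdown pipe tables cannot express.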
