Metadata-Version: 2.4
Name: files3
Version: 0.10.1
Summary: (cloudpickle+lz4 based) save Python objects in binary to both the `file system` and `virtual disk in ram` and manage them.
Author-email: 2229066748@qq.com
Maintainer: Eagle'sBaby
Maintainer-email: 2229066748@qq.com
License: Apache Licence 2.0
Keywords: pickle,lz4,file system,file management
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: Microsoft :: Windows
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: lz4
Requires-Dist: cloudpickle
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: keywords
Dynamic: license
Dynamic: maintainer
Dynamic: maintainer-email
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

<h1>Files3 - Python Object File System</h1>

<ol>
<li><a href="#en_overview">English Version</a>
<ol>
<li><a href="#en_overview">Overview</a></li>
<li><a href="#en_installation">Installation</a></li>
<li><a href="#en_quick_start">Quick Start</a></li>
<li><a href="#en_use_cases">Use Cases</a></li>
<li><a href="#en_advanced">Advanced Usage</a></li>
<li><a href="#en_mem_backend">In-Memory Backend</a></li>
<li><a href="#en_embedded">Embedded Packaging</a></li>
<li><a href="#en_cli">CLI Commands</a></li>
<li><a href="#en_notice">Notice</a></li>
</ol>
</li>
<li><a href="#cn_overview">Chinese Version</a>
<ol>
<li><a href="#cn_overview">概述</a></li>
<li><a href="#cn_installation">安装</a></li>
<li><a href="#cn_quick_start">快速开始</a></li>
<li><a href="#cn_use_cases">应用场景</a></li>
<li><a href="#cn_advanced">高级用法</a></li>
<li><a href="#cn_mem_backend">内存后端</a></li>
<li><a href="#cn_embedded">嵌入式打包</a></li>
<li><a href="#cn_cli">命令行工具</a></li>
<li><a href="#cn_notice">注意事项</a></li>
</ol>
</li>
</ol>

<hr>

<a name="en_overview"></a>
<h2>Overview</h2>

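<p>Files3 is a Windows-native Python object persistence library. It saves almost any Python object to the file system through a dict-like interface (see <a href="#en_notice">Notice</a> for the exceptions). It is built on <code>cloudpickle</code> serialization with <code>lz4</code> compression.</p>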

<h3>When to Use</h3>
<table>
<tr><th>Scenario</th><th>Why Files3</th></tr>
<tr><td>Config persistence</td><td>Store complex Python objects (custom classes, lambdas, closures) instead of JSON/YAML</td></tr>
<tr><td>Local cache</td><td>Cache function results or intermediate states without a database</td></tr>
<tr><td>Data exchange</td><td>Pass Python objects between processes/scripts via the file system</td></tr>
<tr><td>Embedded packaging</td><td>Pack resources into a <code>.py</code> file with <code>packpy</code> for distribution</td></tr>
<tr><td>Experiment snapshots</td><td>Quickly save model weights, parameters, states mid-experiment</td></tr>
</table>

<h3>Core Advantages</h3>
<ul>
<li><b>Any object storage</b>: cloudpickle handles lambdas, closures, local classes, module references</li>
<li><b>Fast compression</b>: lz4 enabled by default, balancing space and speed</li>
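<li><b>Source code relinking</b>: classes and functions defined in <code>__main__</code> are saved together with their source file path, and the module name is corrected automatically on load; if the script has since been moved or renamed and loading fails, use <code>relink</code> (see <a href="#en_notice">Notice</a>)</li>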
<li><b>Dict-like API</b>: <code>f['key']</code>, <code>f.key</code>, <code>f.set()</code> all work</li>
<li><b>Sub-key support</b>: one primary key can expand into multiple sub-keys, auto-converting to a folder</li>
<li><b>Dual backend</b>: file system (<code>F3Shell</code>) or shared memory (<code>F3Mem</code>) with identical APIs</li>
</ul>
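
<p>The serialize-then-compress pipeline behind the first two points can be sketched with the standard library alone. This is only an illustration: <code>pickle</code> and <code>zlib</code> stand in for <code>cloudpickle</code> and <code>lz4</code>, the helper names are hypothetical, and plain <code>pickle</code> cannot handle lambdas or local classes the way <code>cloudpickle</code> does.</p>

```python
import pickle
import zlib


def dumps(obj):
    """Serialize an object, then compress the resulting bytes."""
    return zlib.compress(pickle.dumps(obj))


def loads(data):
    """Reverse of dumps: decompress, then deserialize."""
    return pickle.loads(zlib.decompress(data))


# A repetitive payload compresses well.
payload = {'weights': [0.1, 0.2] * 1000, 'epoch': 10}
raw = pickle.dumps(payload)
packed = dumps(payload)

print(len(raw), len(packed))
print(loads(packed) == payload)
```

<p>Compressing after serializing keeps the two concerns separate, so either stage can be swapped without touching the other.</p>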

<h3>Not Recommended For</h3>
<ul>
<li>Cross-platform data exchange (Windows only)</li>
<li>High-concurrency write scenarios (no locking, relies on filesystem atomicity)</li>
<li>Massive key-value stores (hundreds of thousands of keys or more; with one file per key, the file system becomes the bottleneck)</li>
<li>Complex querying requiring SQL-like search</li>
</ul>

<a name="en_installation"></a>
<h2>Installation</h2>
<pre><code class="language-bash">pip install files3</code></pre>
<p>After installation, associate a file extension with the <code>f3open</code> viewer:</p>
<pre><code class="language-cmd">f3assoc .ist</code></pre>

<a name="en_quick_start"></a>
<h2>Quick Start</h2>
<pre><code class="language-python">from files3 import files

f = files('./data')  # workspace directory, default suffix '.ist'

# Save
f.set('model', {'weights': [0.1, 0.2], 'epoch': 10})

# Load
print(f.get('model'))  # {'weights': [0.1, 0.2], 'epoch': 10}

# Check
print(f.has('model'))  # True

# Delete
f.delete('model')
</code></pre>

<a name="en_use_cases"></a>
<h2>Use Cases</h2>

<h3>1. Config Persistence</h3>
<pre><code class="language-python">from files3 import files

class MyModel:  # placeholder; substitute your own class
    pass

f = files('./config')

# Save a complex config with custom classes
f['app_cfg'] = {
    'lr_scheduler': lambda epoch: 0.1 ** (epoch // 10),  # lambda is fine
    'model_cls': MyModel,  # class reference is fine
    'layers': [64, 128, 256],
}

# Load it later (even in another script)
cfg = f['app_cfg']
</code></pre>

<h3>2. Function Result Cache</h3>
<pre><code class="language-python">from files3 import files

f = files('./cache')

def expensive_compute(x):
    key = f'compute_{x}'
    if f.has(key):
        return f[key]
    result = sum(i ** 2 for i in range(x))
    f[key] = result
    return result
</code></pre>
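
<p>The same pattern can be factored into a reusable decorator. The sketch below is an assumption-laden illustration, not part of Files3: <code>cached</code> is a hypothetical helper, and a plain <code>dict</code> stands in for the store. Because the helper only uses <code>in</code>, <code>store[key]</code> and <code>store[key] = value</code>, an object returned by <code>files('./cache')</code> could presumably be passed instead.</p>

```python
import functools


def cached(store, prefix):
    """Cache results of a one-argument function in a dict-like store."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(x):
            key = f'{prefix}_{x}'
            if key in store:
                return store[key]
            result = func(x)
            store[key] = result
            return result
        return wrapper
    return decorator


store = {}  # stand-in for a Files3 store (assumption)
calls = []  # records how often the real computation runs


@cached(store, 'compute')
def expensive_compute(x):
    calls.append(x)
    return sum(i ** 2 for i in range(x))


first = expensive_compute(10)   # computed and stored
second = expensive_compute(10)  # served from the store
print(first, second, calls)
```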

<h3>3. Data Exchange Between Processes</h3>
<pre><code class="language-python"># script_a.py
from files3 import files
f = files('./shared')
trained_model = {'weights': [0.1, 0.2]}  # placeholder for any picklable object
f['model'] = trained_model

# script_b.py
from files3 import files
f = files('./shared')
model = f['model']
</code></pre>

<h3>4. Batch Operations with Filter Syntax</h3>
<pre><code class="language-python">import re

from files3 import files

f = files('./data')

# Delete all keys starting with 'temp_'
del f[re.compile(r'^temp_')]

# Set multiple keys at once
f['a', 'b', 'c'] = 100

# Delete by custom filter
del f[lambda name, ftype: name.startswith('old_')]

# Clear everything
del f[...]
</code></pre>
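
<p>One way to picture how such selectors resolve to key names is a small dispatch on the selector type. This is a hypothetical sketch, not Files3's implementation: <code>select_keys</code> is an invented name, regex selectors are assumed to use <code>search</code>, and the <code>ftype</code> passed to callables is assumed to be the file suffix.</p>

```python
import re


def select_keys(names, selector, ftype='.ist'):
    """Return the key names matched by a selector (illustrative only)."""
    if selector is ...:
        # Ellipsis selects everything.
        return list(names)
    if isinstance(selector, re.Pattern):
        # Compiled regex: keep names the pattern matches (assumed: search).
        return [n for n in names if selector.search(n)]
    if callable(selector):
        # Custom filter: called with (name, ftype), as in the lambda above.
        return [n for n in names if selector(n, ftype)]
    # Plain string: exact key name.
    return [n for n in names if n == selector]


names = ['temp_a', 'old_b', 'keep']
print(select_keys(names, re.compile(r'^temp_')))
print(select_keys(names, lambda name, ftype: name.startswith('old_')))
print(select_keys(names, ...))
```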

<a name="en_advanced"></a>
<h2>Advanced Usage</h2>

<h3>Dict-Style Access</h3>
<pre><code class="language-python">from files3 import files

f = files('./data')

# Write
f.a = 1          # same as f.set('a', 1, error=True)
f['b'] = 2       # same as above
f['c', 'data'] = [1, 2, 3]  # sub-key

# Read
print(f.a)       # same as f.get('a', error=False)
print(f['b'])    # same as f.get('b', error=True)

# Delete
del f.a
del f['b']

# Check
'a' in f         # same as f.has('a', error=True)
len(f)           # count of primary keys
</code></pre>

<h3>Sub-Keys</h3>
<p>One primary key can hold multiple sub-keys. The primary key automatically becomes a folder.</p>
<pre><code class="language-python">from files3 import files

f = files('./data')
f.set('user', {'name': 'alice'})           # user.ist (file)
f.set('user', {'age': 30}, skey='age')      # user.ist/ (folder)
                                            #   _.ist   original content
                                            #   age.ist new content

f['user', '_']     # read default sub-key
f['user', 'age']   # read age sub-key
f.list('user')     # ['_', 'age']
</code></pre>
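
<p>The file-to-folder promotion described above can be mimicked with <code>pathlib</code>. The sketch below only reproduces the on-disk layout shown in the comments; <code>set_value</code> and <code>list_skeys</code> are hypothetical helpers, plain <code>pickle</code> replaces the real serializer, and none of it is Files3's actual code.</p>

```python
import pickle
import tempfile
from pathlib import Path

SUFFIX = '.ist'


def set_value(root, key, value, skey=None):
    """Store value under key; adding a sub-key promotes the file to a folder."""
    target = Path(root) / f'{key}{SUFFIX}'
    if skey is None:
        if target.is_dir():
            # Key already has sub-keys: write to the default sub-key '_'.
            target = target / f'_{SUFFIX}'
        target.write_bytes(pickle.dumps(value))
        return
    if target.is_file():
        # Promote: move the existing content to the default sub-key '_'.
        original = target.read_bytes()
        target.unlink()
        target.mkdir()
        (target / f'_{SUFFIX}').write_bytes(original)
    else:
        target.mkdir(exist_ok=True)
    (target / f'{skey}{SUFFIX}').write_bytes(pickle.dumps(value))


def list_skeys(root, key):
    """List the sub-key names stored under a primary key."""
    folder = Path(root) / f'{key}{SUFFIX}'
    return sorted(p.stem for p in folder.iterdir())


with tempfile.TemporaryDirectory() as root:
    set_value(root, 'user', {'name': 'alice'})        # user.ist (file)
    set_value(root, 'user', {'age': 30}, skey='age')  # user.ist/ (folder)
    skeys = list_skeys(root, 'user')
    default = pickle.loads((Path(root) / 'user.ist' / '_.ist').read_bytes())

print(skeys, default)
```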

<h3>Serialization Tools</h3>
<pre><code class="language-python">from files3 import files

# Serialize to bytes
b = files.dumps({'data': [1, 2, 3]})
obj = files.loads(b)

# Pack a file/directory into bytes
b = files.pack(r'C:\my_resource')
files.unpack(b, r'C:\extract_to')

# Pack into a Python module (no files3 dependency to unpack)
code = files.packpy(r'C:\my_resource')
with open('resource.py', 'w') as fh:
    fh.write(code)

# Unpack from the module
from resource import F3DATA
files.unpackpy(F3DATA, r'C:\extract_to')
</code></pre>

<a name="en_mem_backend"></a>
<h2>In-Memory Backend (F3Mem)</h2>
<p>Zero disk IO, cross-process sharing, lost on reboot. Identical API to <code>files</code>.</p>
<pre><code class="language-python">from files3 import memfiles

m = memfiles('my_ns')
m['key'] = {'speed': 'fast'}
m['key', 'sub'] = 'zero_disk_io'

# Persist to disk
m.save('./backup')

# Load from disk
m2 = memfiles('another_ns')
m2.load('./backup')

# Cleanup
m.clear()
</code></pre>

<table>
<tr><th>Feature</th><th>files (F3Shell)</th><th>memfiles (F3Mem)</th></tr>
<tr><td>Backend</td><td>File system</td><td>OS shared memory</td></tr>
<tr><td>Persistent</td><td>Yes</td><td>No (lost on reboot)</td></tr>
<tr><td>Cross-process</td><td>Via filesystem</td><td>Direct share (zero-copy)</td></tr>
<tr><td>Disk IO</td><td>Yes</td><td>None</td></tr>
</table>

<a name="en_embedded"></a>
<h2>Embedded Packaging</h2>
<p>Pack any file or directory into a <code>.py</code> file so it can travel with your source code. No database or extra dependency needed on the receiving end.</p>

<h3>Pack to a .py file</h3>
<pre><code class="language-python">from files3 import prefab

# Pack C:\my_data (file or folder) into C:\my_data.py
prefab.aspy(r'C:\my_data')
</code></pre>
<p>What happens:</p>
<ol>
<li>Zip-compresses the target.</li>
<li>Converts the zip bytes into a Python <code>bytes</code> literal.</li>
<li>Writes the literal into <code>C:\my_data.py</code> as variable <code>F3DATA</code>.</li>
</ol>
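
<p>The three steps above can be approximated with the standard library. This is a minimal sketch under stated assumptions: <code>pack_to_source</code> and <code>unpack_from_source</code> are hypothetical names, the exact archive layout Files3 uses is not documented here, and only the <code>F3DATA</code> variable name comes from the description above.</p>

```python
import io
import tempfile
import zipfile
from pathlib import Path


def pack_to_source(target):
    """Zip a file or folder and return Python source defining F3DATA."""
    target = Path(target)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
        if target.is_dir():
            for path in sorted(target.rglob('*')):
                if path.is_file():
                    zf.write(path, path.relative_to(target).as_posix())
        else:
            zf.write(target, target.name)
    # repr() of a bytes object is a valid Python bytes literal.
    return f'F3DATA = {buf.getvalue()!r}\n'


def unpack_from_source(source, dest):
    """Evaluate the generated source and extract the embedded zip."""
    namespace = {}
    exec(source, namespace)  # only run source you generated yourself
    with zipfile.ZipFile(io.BytesIO(namespace['F3DATA'])) as zf:
        zf.extractall(dest)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'my_data'
    src.mkdir()
    (src / 'hello.txt').write_text('hi', encoding='utf-8')

    code = pack_to_source(src)
    out = Path(tmp) / 'extract_to'
    unpack_from_source(code, out)
    restored = (out / 'hello.txt').read_text(encoding='utf-8')

print(code[:12], restored)
```

<p>Because the payload is an ordinary bytes literal, the generated module needs nothing beyond the standard library to be imported and unpacked.</p>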

<h3>Auto-extract on first run</h3>
<p>Use <code>astarget</code> in your script. If the target is missing, it extracts from the adjacent <code>.py</code> file automatically.</p>
<pre><code class="language-python">from files3 import prefab

# If C:\my_data does not exist, extract from C:\my_data.py
prefab.astarget(r'C:\my_data')
</code></pre>

<h3>Low-level API</h3>
<p>If you need the code string in memory instead of a file:</p>
<pre><code class="language-python">from files3 import files

# Returns a Python code string (contains F3DATA variable)
code = files.packpy(r'C:\my_data')

# Later, feed the string back to unpack
files.unpackpy(code, r'C:\extract_to')
</code></pre>

<a name="en_cli"></a>
<h2>CLI Commands</h2>
<pre><code class="language-cmd">f3 [name] [type] -d [dir]   # open a files3 object interactively
f3open [filepath]             # open a single .ist file
f3assoc [type]                # associate file extension with f3open
f3unassoc [type]              # remove file association
</code></pre>

<a name="en_notice"></a>
<h2>Notice</h2>
<ul>
<li><b>Security</b>: pickle is not safe; unpickling can execute arbitrary code. Do not call <code>loads()</code> on data from untrusted sources.</li>
<li><b>Cannot save</b>: <code>F3Bool</code> instances, <code>F3Shell</code> instances, active exception objects, generators, open file handles, and some C-extension objects.</li>
<li><b>Windows only</b>: Relies on Win32 APIs for file associations and folder icons.</li>
<li><b>ModuleNotFoundError on load</b>: If the source script was moved/renamed, use <code>f.relink(new_path, 'key')</code> to fix.</li>
</ul>

<hr>

<a name="cn_overview"></a>
<h2>概述</h2>

<p>Windows原生Python对象持久化库。以文件系统为后端，将任意Python对象序列化存储，提供类字典的交互接口。基于 <code>cloudpickle</code> + <code>lz4</code> 压缩。</p>

<h3>适用场景</h3>
<table>
<tr><th>场景</th><th>说明</th></tr>
<tr><td>配置持久化</td><td>替代json/yaml存储复杂Python数据结构（含自定义类、lambda、闭包）</td></tr>
<tr><td>本地缓存</td><td>缓存函数计算结果、中间状态，支持任意可序列化对象</td></tr>
<tr><td>数据交换</td><td>通过文件系统在不同进程/脚本间传递Python对象</td></tr>
<tr><td>嵌入式打包</td><td>将资源文件打包为Python代码（<code>packpy</code>/<code>unpackpy</code>），随代码分发</td></tr>
<tr><td>实验数据保存</td><td>快速保存实验中间结果（模型、参数、状态），无需配置数据库</td></tr>
</table>

<h3>核心优势</h3>
<ul>
<li><b>任意对象存储</b>：基于cloudpickle，支持lambda、闭包、局部类、模块引用等标准pickle无法处理的类型</li>
<li><b>lz4压缩</b>：默认启用高速压缩，平衡存储空间和读写性能</li>
<li><b>源代码重链接</b>：保存<code>__main__</code>中定义的类/函数时，自动记录源文件路径，加载时自动修正模块名</li>
<li><b>类字典接口</b>：支持<code>f['key']</code>、<code>f.key</code>、<code>f.set()</code>等多种交互方式</li>
<li><b>子键支持</b>：单主键可扩展为多子键，主键自动转为文件夹管理</li>
<li><b>双后端</b>：文件系统（<code>F3Shell</code>）或共享内存（<code>F3Mem</code>），API完全一致</li>
</ul>

<h3>不推荐的场景</h3>
<ul>
<li>跨平台数据交换（仅支持Windows）</li>
<li>高并发写入场景（无锁机制，依赖文件系统原子性）</li>
<li>超大规模键值存储（数十万级以上，文件系统inode成为瓶颈）</li>
<li>需要SQL查询的复杂检索场景</li>
</ul>

<a name="cn_installation"></a>
<h2>安装</h2>
<pre><code class="language-bash">pip install files3</code></pre>
<p>安装后，可在cmd中将文件后缀关联到<code>f3open</code>查看器：</p>
<pre><code class="language-cmd">f3assoc .ist</code></pre>

<a name="cn_quick_start"></a>
<h2>快速开始</h2>
<pre><code class="language-python">from files3 import files

f = files('./data')  # 工作目录，默认后缀 '.ist'

# 保存
f.set('model', {'weights': [0.1, 0.2], 'epoch': 10})

# 读取
print(f.get('model'))  # {'weights': [0.1, 0.2], 'epoch': 10}

# 检查
print(f.has('model'))  # True

# 删除
f.delete('model')
</code></pre>

<a name="cn_use_cases"></a>
<h2>应用场景</h2>

<h3>1. 配置持久化</h3>
<pre><code class="language-python">from files3 import files

class MyModel:  # 占位类；请替换为你自己的类
    pass

f = files('./config')

# 保存含自定义类的复杂配置
f['app_cfg'] = {
    'lr_scheduler': lambda epoch: 0.1 ** (epoch // 10),  # lambda没问题
    'model_cls': MyModel,  # 类引用没问题
    'layers': [64, 128, 256],
}

# 之后读取（甚至另一个脚本）
cfg = f['app_cfg']
</code></pre>

<h3>2. 函数结果缓存</h3>
<pre><code class="language-python">from files3 import files

f = files('./cache')

def expensive_compute(x):
    key = f'compute_{x}'
    if f.has(key):
        return f[key]
    result = sum(i ** 2 for i in range(x))
    f[key] = result
    return result
</code></pre>

<h3>3. 跨进程数据交换</h3>
<pre><code class="language-python"># script_a.py
from files3 import files
f = files('./shared')
trained_model = {'weights': [0.1, 0.2]}  # 占位：任意可序列化对象
f['model'] = trained_model

# script_b.py
from files3 import files
f = files('./shared')
model = f['model']
</code></pre>

<h3>4. 批量筛选操作</h3>
<pre><code class="language-python">import re

from files3 import files

f = files('./data')

# 删除所有以'temp_'开头的键
del f[re.compile(r'^temp_')]

# 同时设置多个键
f['a', 'b', 'c'] = 100

# 自定义条件删除
del f[lambda name, ftype: name.startswith('old_')]

# 清空全部
del f[...]
</code></pre>

<a name="cn_advanced"></a>
<h2>高级用法</h2>

<h3>字典式访问</h3>
<pre><code class="language-python">from files3 import files

f = files('./data')

# 写入
f.a = 1          # 等价 f.set('a', 1, error=True)
f['b'] = 2       # 等价同上
f['c', 'data'] = [1, 2, 3]  # 子键

# 读取
print(f.a)       # 等价 f.get('a', error=False)
print(f['b'])    # 等价 f.get('b', error=True)

# 删除
del f.a
del f['b']

# 检查
'a' in f         # 等价 f.has('a', error=True)
len(f)           # 主键数量
</code></pre>

<h3>子键</h3>
<p>一个主键下可存储多个子键，主键自动变为文件夹。</p>
<pre><code class="language-python">from files3 import files

f = files('./data')
f.set('user', {'name': 'alice'})           # user.ist（文件）
f.set('user', {'age': 30}, skey='age')      # user.ist/（文件夹）
                                            #   _.ist   原内容
                                            #   age.ist 新内容

f['user', '_']     # 读取默认子键
f['user', 'age']   # 读取age子键
f.list('user')     # ['_', 'age']
</code></pre>

<h3>序列化工具</h3>
<pre><code class="language-python">from files3 import files

# 序列化为bytes
b = files.dumps({'data': [1, 2, 3]})
obj = files.loads(b)

# 将文件/目录打包为bytes
b = files.pack(r'C:\my_resource')
files.unpack(b, r'C:\extract_to')

# 打包为Python模块（解压时无需安装files3）
code = files.packpy(r'C:\my_resource')
with open('resource.py', 'w') as fh:
    fh.write(code)

# 从模块解压
from resource import F3DATA
files.unpackpy(F3DATA, r'C:\extract_to')
</code></pre>

<a name="cn_mem_backend"></a>
<h2>内存后端（F3Mem）</h2>
<p>零磁盘IO、跨进程共享、重启丢失。接口与<code>files</code>完全一致。</p>
<pre><code class="language-python">from files3 import memfiles

m = memfiles('my_ns')
m['key'] = {'speed': 'fast'}
m['key', 'sub'] = 'zero_disk_io'

# 持久化到磁盘
m.save('./backup')

# 从磁盘加载
m2 = memfiles('another_ns')
m2.load('./backup')

# 清理
m.clear()
</code></pre>

<table>
<tr><th>特性</th><th>files (F3Shell)</th><th>memfiles (F3Mem)</th></tr>
<tr><td>后端</td><td>文件系统</td><td>OS共享内存</td></tr>
<tr><td>持久化</td><td>是（磁盘持久）</td><td>否（重启丢失）</td></tr>
<tr><td>跨进程</td><td>通过文件系统</td><td>直接共享（零拷贝）</td></tr>
<tr><td>磁盘IO</td><td>有</td><td>无</td></tr>
</table>

<a name="cn_embedded"></a>
<h2>嵌入式打包</h2>
<p>将任意文件或目录打包成 <code>.py</code> 文件，随源代码一起分发。接收端无需数据库或其他依赖。</p>

<h3>打包为 .py 文件</h3>
<pre><code class="language-python">from files3 import prefab

# 将 C:\my_data（文件或目录）打包为 C:\my_data.py
prefab.aspy(r'C:\my_data')
</code></pre>
<p>内部流程：</p>
<ol>
<li>将目标 zip 压缩。</li>
<li>将 zip 字节流转换为 Python <code>bytes</code> 字面量。</li>
<li>写入 <code>C:\my_data.py</code>，变量名为 <code>F3DATA</code>。</li>
</ol>

<h3>首次运行时自动解压</h3>
<p>在脚本中使用 <code>astarget</code>。如果目标不存在，自动从相邻的 <code>.py</code> 文件解压。</p>
<pre><code class="language-python">from files3 import prefab

# 若 C:\my_data 不存在，则从 C:\my_data.py 解压
prefab.astarget(r'C:\my_data')
</code></pre>

<h3>底层 API</h3>
<p>如果需要字符串而非直接生成文件：</p>
<pre><code class="language-python">from files3 import files

# 返回 Python 代码字符串（内含 F3DATA 变量）
code = files.packpy(r'C:\my_data')

# 后续传入字符串解压
files.unpackpy(code, r'C:\extract_to')
</code></pre>

<a name="cn_cli"></a>
<h2>命令行工具</h2>
<pre><code class="language-cmd">f3 [name] [type] -d [dir]   # 交互式打开files3对象
f3open [filepath]             # 打开单个.ist文件
f3assoc [type]                # 关联文件后缀到f3open
f3unassoc [type]              # 移除文件关联
</code></pre>

<a name="cn_notice"></a>
<h2>注意事项</h2>
<ul>
<li><b>安全性</b>：pickle不安全，反序列化可能执行任意代码。不要对不可信来源的数据调用<code>loads()</code>。</li>
<li><b>无法保存</b>：<code>F3Bool</code>实例、<code>F3Shell</code>实例、活动异常对象、生成器、打开的文件句柄、某些C扩展对象。</li>
<li><b>仅限Windows</b>：依赖Win32 API实现文件关联和文件夹图标设置。</li>
<li><b>加载时ModuleNotFoundError</b>：如果源脚本被移动/重命名，使用<code>f.relink(new_path, 'key')</code>修复。</li>
</ul>
