Metadata-Version: 2.4
Name: dbhose_utils
Version: 0.0.2.0
Summary: DBHose dumps convertors utility.
Author-email: 0xMihalich <bayanmobile87@gmail.com>
License: MIT License
        
        Copyright (c) 2025 0xMihalich
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/0xMihalich/dbhose_utils
Project-URL: Documentation, https://github.com/0xMihalich/dbhose_utils#readme
Project-URL: Repository, https://github.com/0xMihalich/dbhose_utils
Project-URL: Changelog, https://github.com/0xMihalich/dbhose_utils/blob/main/CHANGELOG.md
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: light-compressor==0.0.2.1
Requires-Dist: nativelib==0.2.2.4
Requires-Dist: pgcopylib==0.2.2.8
Requires-Dist: pgpack==0.3.1.7
Dynamic: license-file

# DBHose Utils

Utility for converting dumps between the Native, PGCopy and PGPack formats.

## Enums

### CompressionMethod

Enum for selecting the compression method

* NONE - without compression
* LZ4 - lz4 compression
* ZSTD - zstd compression
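
As a rough illustration of how an enum with these three members can be used, here is a minimal standalone sketch. It is not the library's definition: the member values and the `parse_method` helper are assumptions made for this example.

```python
from enum import Enum


class CompressionMethod(Enum):
    """Illustrative stand-in; the member values are assumptions."""

    NONE = "none"
    LZ4 = "lz4"
    ZSTD = "zstd"


def parse_method(name: str) -> CompressionMethod:
    """Look up a member by case-insensitive name (hypothetical helper)."""
    return CompressionMethod[name.upper()]


method = parse_method("lz4")
```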

### DumpType

Enum for classifying the dump format

* NATIVE - ClickHouse native
* PGCOPY - Postgres/Greenplum pgcopy
* PGPACK - Postgres/Greenplum pgpack container with metadata and pgcopy

### DumpClass

NamedTuple used as the value of each DumpType member

* name: str
* reader: object
* writer: object
* have_compress: bool
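
The field list above maps onto a `typing.NamedTuple` roughly as follows. This is a standalone sketch that mirrors the four documented fields; the placeholder reader and writer objects are assumptions, since the real ones come from the format libraries.

```python
from typing import NamedTuple


class DumpClass(NamedTuple):
    """Illustrative stand-in mirroring the four documented fields."""

    name: str
    reader: object
    writer: object
    have_compress: bool


# Placeholder reader/writer objects, used only to make the sketch runnable.
example = DumpClass(name="example", reader=object, writer=object, have_compress=True)

# A NamedTuple supports both attribute access and positional unpacking.
name, reader, writer, have_compress = example
```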

## Cython functions

### columns_from_metadata

Function for converting pgpack metadata to a native column list

### metadata_from_columns

Function for converting a native column list to pgpack metadata

### pgoid_from_metadata

Function for converting pgpack metadata to a pgcopy oid list

## Python functions

### chunk_fileobj

Function for iterating over a file as chunks of bytes
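
The general technique of reading a file object in fixed-size chunks can be sketched as below. The name `iter_chunks`, its signature and the default chunk size are assumptions for illustration and may differ from `chunk_fileobj`.

```python
import io
from collections.abc import Iterator
from typing import BinaryIO


def iter_chunks(fileobj: BinaryIO, chunk_size: int = 65536) -> Iterator[bytes]:
    """Yield successive chunks of at most chunk_size bytes (hypothetical helper)."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            # An empty read means the end of the file was reached.
            break
        yield chunk


chunks = list(iter_chunks(io.BytesIO(b"abcdefgh"), chunk_size=3))
```

Only one chunk is held in memory at a time, which keeps memory use bounded for large dumps.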

### dump_convertor

Function for converting a dump to another dump format

A dump in Native format can be converted to PGPack or PGCopy, or you can leave it in Native format but change the compression codec.
A dump in PGPack format can be converted to Native or PGCopy, or you can leave it in PGPack format but change the compression codec.
A dump in PGCopy format cannot be converted to other formats due to its storage characteristics (no metadata),
but you can change the compression codec for compact storage of the dump on disk.
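
The conversion rules above can be summarised as a small lookup table. This is a standalone sketch for checking a planned conversion up front; `ALLOWED_TARGETS` and `can_convert` are hypothetical names and are not part of the library.

```python
# Allowed output formats for each source format, as described above.
ALLOWED_TARGETS = {
    "native": {"native", "pgpack", "pgcopy"},
    "pgpack": {"native", "pgpack", "pgcopy"},
    "pgcopy": {"pgcopy"},  # no metadata, so only the codec can change
}


def can_convert(source_type: str, target_type: str) -> bool:
    """Return True if the source format can be written as the target format."""
    targets = ALLOWED_TARGETS.get(source_type.lower(), set())
    return target_type.lower() in targets
```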

Example

```python
from dbhose_utils import dump_convertor

source = "path_to_source_dump"
destination = "path_to_output_dump"
dump_type = "native"  # Native, PGPack or PGCopy output format
compression_method = "lz4"  # lz4, zstd or none

dump_convertor(
    source=source,
    destination=destination,
    dump_type=dump_type,
    compression_method=compression_method,
)
```

### dump_detective

Function for automatically detecting the dump type and compression method

Creates the appropriate reader object for the dump

```python
from dbhose_utils import dump_detective

file = "path_to_any_dump"

reader = dump_detective(file)
```
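
One common way to detect a format automatically is to inspect the first bytes of the file. The sketch below checks the standard LZ4 frame and Zstandard frame magic numbers and the PostgreSQL binary COPY signature. Whether `dump_detective` works this way is an assumption; the `sniff` helper is hypothetical, and dumps that use a different container layout would need other checks.

```python
# Magic numbers as they appear on disk (both are stored little-endian).
LZ4_FRAME_MAGIC = bytes.fromhex("04224d18")
ZSTD_FRAME_MAGIC = bytes.fromhex("28b52ffd")
# Signature at the start of a PostgreSQL binary COPY stream.
PGCOPY_SIGNATURE = b"PGCOPY\n\xff\r\n\x00"


def sniff(head: bytes) -> str:
    """Guess what a file contains from its leading bytes (hypothetical helper)."""
    if head.startswith(LZ4_FRAME_MAGIC):
        return "lz4"
    if head.startswith(ZSTD_FRAME_MAGIC):
        return "zstd"
    if head.startswith(PGCOPY_SIGNATURE):
        return "pgcopy"
    return "unknown"
```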

### dump_recovery

Function for recovering data from incomplete dumps

How it works

For pgpack and pgcopy, the damaged archive is recovered to the last valid row; for native, the damaged archive is recovered to the last valid block.
Why might this be useful? Suppose you were dumping a table when the server crashed, the connection was lost, or some other force majeure occurred.
You can now retrieve at least part of the data if you need it right away.

```python
from dbhose_utils import dump_recovery

file_path = "path_to_broken_dump"
recovery_path = "path_to_recovery_dump"

dump_recovery(
    file_path=file_path,
    recovery_path=recovery_path,
)
```
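
The idea of recovering up to the last valid row can be illustrated on a toy format. The sketch below uses a made-up layout of length-prefixed records, not the pgcopy, pgpack or native layout, and `recover_records` and `pack` are hypothetical helpers: it keeps every complete record and stops at the first truncated one.

```python
import io
import struct


def recover_records(data: bytes) -> list[bytes]:
    """Return all complete records from a possibly truncated toy dump."""
    records = []
    stream = io.BytesIO(data)
    while True:
        header = stream.read(4)
        if len(header) < 4:
            break  # no room for another length prefix
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            break  # the record was cut off, so discard it
        records.append(payload)
    return records


def pack(payload: bytes) -> bytes:
    """Encode one record as a 4-byte big-endian length plus the payload."""
    return struct.pack(">I", len(payload)) + payload


intact = pack(b"row-1") + pack(b"row-2")
broken = intact + pack(b"row-3")[:-2]  # simulate an interrupted write
recovered = recover_records(broken)
```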

## Installation

From pip

```bash
pip install dbhose-utils
```

From local directory

```bash
pip install .
```

From git

```bash
pip install git+https://github.com/0xMihalich/dbhose_utils
```
