Metadata-Version: 2.4
Name: oe2pg
Version: 1.0.0
Summary: Ready to use Progress to PostgreSQL database mirroring tool
Home-page: https://github.com/rowingdude/oe2pg
Author: Benjamin Cance
Author-email: canceb@gmail.com
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: System Administrators
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Database
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: psycopg2-binary>=2.9.6
Requires-Dist: JayDeBeApi>=1.2.3
Requires-Dist: xxhash>=3.0.0
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# OpenEdge 2 PostgreSQL Mirroring Tool

Enterprise-grade tool for mirroring Progress OpenEdge databases to PostgreSQL with support for:
- Initial full sync and incremental delta updates
- Row count verification and discrepancy repair
- Parallel processing for large tables
- Comprehensive tracking and error recovery
- Authorization error handling (tables the configured user is not authorized to read are simply skipped)

ASCII lettering made with: [Patorjk](https://patorjk.com/software/taag/#p=display&f=Graffiti&t=Type%20Something%20)


# Install from PyPI with

`pip install oe2pg`


## To get started ...

1. Create a configuration file (config.json):

```json
{
    "progress_db": {
        "host": "your-progress-host",
        "port": 1234,
        "db_name": "your_db",
        "user": "your_user", 
        "password": "your_password",
        "schema": "PUB",
        "jar_file": "path/to/openedge.jar",
        "driver_class": "com.ddtek.jdbc.openedge.OpenEdgeDriver"
    },
    "postgres_db": {
        "conn_string": "postgresql://user:password@host:5432/dbname",
        "pool_min_conn": 5,
        "pool_max_conn": 20
    },
    "mirror_settings": {
        "batch_size": 1000,
        "max_workers": 4,
        "max_cursors": 50,
        "log_file": "/var/log/db_mirror.log"
    }
}
```



2. Initial setup (first run): `oe2pg --config config.json --first-run`

3. Regular syncs: `oe2pg --config config.json`

4. Repair discrepancies: `oe2pg --config config.json --repair`
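
Before the first run it can help to check that the configuration file has the expected shape. The following is a minimal sketch, not part of oe2pg itself: the helper names `load_config` and `missing_config_keys` are hypothetical, and the list of "required" keys is an assumption taken from the sample config above, not from the tool's own validation rules.

```python
import json

# Assumption: these keys mirror the sample config.json shown above.
# oe2pg itself may treat some of them as optional.
REQUIRED_KEYS = {
    "progress_db": ("host", "port", "db_name", "user", "password",
                    "schema", "jar_file", "driver_class"),
    "postgres_db": ("conn_string",),
    "mirror_settings": ("batch_size", "max_workers"),
}


def missing_config_keys(config):
    """Return dotted names of sections or keys absent from the config dict."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        body = config.get(section)
        if not isinstance(body, dict):
            # The whole section is missing or is not a JSON object.
            missing.append(section)
            continue
        missing.extend(f"{section}.{key}" for key in keys if key not in body)
    return missing


def load_config(path):
    """Load a JSON config file and raise ValueError if keys are missing."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    missing = missing_config_keys(config)
    if missing:
        raise ValueError("config is missing: " + ", ".join(missing))
    return config
```

Calling `load_config("config.json")` before scheduling the tool catches a misspelled section name early instead of at 1am in a cron log.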


## Cron Example

I recommend running this tool via cron, well away from peak load times and ideally after your ETL imports have finished. Our third parties run theirs during business hours, so this is a general cron example:

Daily sync at 1am: `0 1 * * * /usr/local/bin/oe2pg --config /etc/oe2pg/config.json >> /var/log/oe2pg-cron.log 2>&1`


## Tracking features

By default the program runs a delta sync. It was designed against an older database built on Progress 11.7 and later updated to 12.3 without any change to the underlying data structure, so I have taken the following steps to keep track of updates:

1. Delta Sync Feature:

- Get last known state (PK, hash, row count)
- Fetch that row from Progress by PK
- Compute hash and compare
- If hash matches:
  - Proceed with normal delta sync (PK > last_pk)
- If hash doesn't match:
  - Binary search backwards to find divergence point
  - Upsert all rows from divergence point forward
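
The decision made in the steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual code: `row_hash`, `plan_delta_sync`, and the `fetch_row` callback are hypothetical names, and the standard library's `hashlib.blake2b` stands in for the `xxhash` dependency so that the snippet runs without extra packages.

```python
import hashlib


def row_hash(row):
    """Hash a row's values into a short hex digest.

    Assumption: values are joined by a separator after repr(), so that
    None and the empty string produce different hashes.
    """
    joined = "\x1f".join(repr(value) for value in row)
    return hashlib.blake2b(joined.encode("utf-8"), digest_size=8).hexdigest()


def plan_delta_sync(last_pk, last_hash, fetch_row):
    """Decide how to sync based on the last known row.

    fetch_row(pk) is a hypothetical callback that returns the row
    currently stored in Progress for that PK, or None if it is gone.
    Returns "delta" when the stored hash still matches, so only rows
    with PK > last_pk are needed, and "search" when the divergence
    point has to be located first.
    """
    current = fetch_row(last_pk)
    if current is not None and row_hash(current) == last_hash:
        return "delta"
    return "search"
```

A deleted row is treated the same as a changed one here; that is a design choice of this sketch, since either case means the mirror can no longer trust its last known state.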

2. Binary Search Feature:

- Start with current row, go back by powers of 2
- When hash matches, narrow down to find exact divergence point
- This efficiently handles cases where changes occur near the end of the table
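
One way to read this search is as a backwards "galloping" phase followed by an ordinary binary search. The sketch below assumes rows can be addressed by position, and that every row before the divergence point still matches while every row from it onward does not. `find_divergence` and the `matches` callback are hypothetical names, not oe2pg's API.

```python
def find_divergence(last_index, matches):
    """Return the smallest index whose row no longer matches the mirror.

    matches(i) is a hypothetical callback that returns True when the
    hash of source row i equals the hash stored for it. If the last
    row still matches, last_index + 1 is returned (nothing diverged).
    """
    if matches(last_index):
        return last_index + 1

    # Phase 1: step back by 1, 2, 4, 8, ... rows until a match is found.
    step = 1
    hi = last_index           # newest index known to mismatch
    lo = last_index - step
    while lo >= 0 and not matches(lo):
        hi = lo
        step *= 2
        lo = last_index - step
    if lo < 0:
        lo = -1               # sentinel: nothing before row 0 can mismatch

    # Phase 2: binary search between the match (lo) and the mismatch (hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if matches(mid):
            lo = mid
        else:
            hi = mid
    return hi
```

Because the steps double, a change close to the end of the table is found after only a handful of hash comparisons, while a change near the start still costs a number of comparisons that grows logarithmically with the table size.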

3. Upsert Strategy:

- For tables with PKs: Use `INSERT ... ON CONFLICT (pk) DO UPDATE`
- For tables without PKs: Continue with full sync approach
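
As an illustration of the PK case, the statement could be assembled like this. This is a sketch under assumptions: `build_upsert` and `quote_ident` are hypothetical helpers, the `%s` placeholders follow psycopg2's parameter style, and the table and column names in the usage example are made up.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_upsert(table, columns, pk_columns):
    """Build an INSERT ... ON CONFLICT statement for one row.

    Non-key columns are overwritten with the incoming values through
    PostgreSQL's EXCLUDED pseudo-table. If every column belongs to the
    key there is nothing to update, so DO NOTHING is used instead.
    """
    cols = ", ".join(quote_ident(c) for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    conflict = ", ".join(quote_ident(c) for c in pk_columns)
    non_pk = [c for c in columns if c not in pk_columns]
    if non_pk:
        updates = ", ".join(
            f"{quote_ident(c)} = EXCLUDED.{quote_ident(c)}" for c in non_pk
        )
        action = f"DO UPDATE SET {updates}"
    else:
        action = "DO NOTHING"
    return (
        f"INSERT INTO {quote_ident(table)} ({cols}) "
        f"VALUES ({placeholders}) ON CONFLICT ({conflict}) {action}"
    )
```

For example, `build_upsert("customer", ["cust_id", "name"], ["cust_id"])` produces a statement that can be passed to a psycopg2 cursor's `execute` or `executemany` together with the row values.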
