Metadata-Version: 2.4
Name: cnt-collector-node
Version: 4.0.0rc1
Summary: Orcfax cnt collector node - Orcfax Node
Author-email: George Orcfax <george@orcfax.io>, "R. Spencer" <ross@orcfax.io>
Project-URL: Homepage, https://orcfax.io
Project-URL: Source, https://github.com/orcfax/cnt-collector-node
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: requests==2.32.5
Requires-Dist: websocket-client==1.8.0

# cnt-collector-node (Cardano Native Tokens Collector Node)

The cnt-collector node is separated into two applications:

- a data collection indexer
- a data submitter that formats price data and submits it to an Orcfax validator
  node.

The indexer runs continuously in the background, collecting data and saving it
in a local sqlite database. The submitter runs periodically (once every minute,
for example) and sends the newest collected data to the validator node. If the
indexer application is not running, the submitter collects the data itself
before submitting it to the validator node, but this process is slow and can
take a few minutes, depending on the amount of data that needs to be collected.
Submitting correct data to the validator node is more important than quickly
submitting data that might be incorrect.

## Issues

If you find any issues with this code, please log them as a new issue under
[Orcfax Commons][commons-1].

[commons-1]: https://github.com/orcfax/issues-commons

## Quickstart

Most users outside of the Orcfax network will want to run this code and
experiment with it without forwarding data to an Orcfax validator. You can
do this as follows.

### From pypi

```bash
python -m venv venv
source venv/bin/activate
python -m pip install cnt-collector-node
```

#### Configure the collector node

The collector node needs an identity file (`/tmp/.node-identity.json`), which
can be generated by running the [script][identity-1] in `node_identity`.

```bash
python node_identity/create_identity.py
```

[identity-1]: node_identity/create_identity.py

It will generate a json file (`/tmp/.node-identity.json`) with the following
format:

```json
{
  "node_id": "<UUID node id>",
  "location": {
    "ip": "<ip address>",
    "hostname": "<hostname>",
    "city": "<City>",
    "region": "<Region>",
    "country": "<Country code>",
    "loc": "<Latitude and longitude>",
    "org": "<Organisation>",
    "postal": "<Zip code>",
    "timezone": "<Timezone>",
    "readme": "<Link>"
  },
  "initialization": "<Timestamp>",
  "init_version": null,
  "validator_web_socket": "<Websocket address of the validator, like: ws://localhost:8001/ws/node>",
  "validator_certificate": null
}
```
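
As a quick sanity check after generating the file, the identity can be loaded
and its top-level keys verified. The following is a minimal sketch and not part
of the package: the helper names `validate_identity` and `load_identity` are
hypothetical, and treating every key from the layout above as required is an
assumption.

```python
import json
from pathlib import Path

# Top-level keys copied from the layout above; treating them all as required
# is an assumption made for this sketch.
EXPECTED_KEYS = {
    "node_id",
    "location",
    "initialization",
    "init_version",
    "validator_web_socket",
    "validator_certificate",
}


def validate_identity(identity: dict) -> dict:
    """Raise ValueError if any expected top-level key is absent."""
    missing = EXPECTED_KEYS - identity.keys()
    if missing:
        raise ValueError(f"identity is missing keys: {sorted(missing)}")
    return identity


def load_identity(path: str = "/tmp/.node-identity.json") -> dict:
    """Read the identity file from disk and validate it."""
    return validate_identity(json.loads(Path(path).read_text(encoding="utf-8")))
```

Calling `load_identity()` with no arguments reads the default location used in
this README.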

#### Environment

Set up the following environment variables (an `indexer.env` file is provided
to help persist these settings):

```env
export USE_KUPO=true
export OGMIOS_URL=
export KUPO_URL=
export CNT_DB_NAME=
```
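
Before starting either command it can help to confirm that the variables are
actually set. This is a small pre-flight sketch only; the function name
`unset_variables` is hypothetical and it says nothing about how the package
itself reads its environment.

```python
import os

# The four variable names come from the list above.
EXPECTED_VARS = ("USE_KUPO", "OGMIOS_URL", "KUPO_URL", "CNT_DB_NAME")


def unset_variables(env=None) -> list:
    """Return the expected variable names that are missing or empty."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name, "").strip()]


# Report anything that still needs a value in the current shell.
print(unset_variables())
```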

#### Indexer and submit entry-points

Then run either of the following for more information:

```bash
cnt-indexer --help
```

or:

```bash
cnt-submit --help
```

Both modes/commands are documented in detail below.

### From source

Clone the repository:

```bash
git clone git@github.com:orcfax/cnt-collector-node.git
cd cnt-collector-node
```

Create the python virtual environment and activate it:

```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements/local.txt
```

#### Configuration and environment

Configure your node and environment as per the `pypi` instructions.

#### Run submit

Submit collects live data from the Kupo and Ogmios node and persists price
information in the database.

Run submit with:

```bash
just submit
```

The recipe looks as follows:

```bash
python -m src.cnt_collector_node.submitter --create-db \
 --identity-file-location /tmp/.node-identity.json \
 --pairs demo_pairs/pairs.py \
 --nopublish
```

#### Run index

The indexer indexes CNT data and stores it at `CNT_DB_NAME`.

Run the indexer:

`just index`

The recipe looks as follows:

```bash
python -m src.cnt_collector_node.indexer --pairs demo_pairs/pairs.py
```

#### Inspecting the commands

Both commands can be investigated further by running:

```bash
python submitter.py --help
```

and:

```bash
python indexer.py --help
```

respectively.

## Configuring the data sources

The cnt-collector node needs to be configured to know what CNT it needs to
collect information about and where to collect the information from.

The configuration lives in the `pairs.py` file in the following format:

```json
{
  "name": "FACT-ADA",
  "token1_policy": "a3931691f5c4e65d01c429e473d0dd24c51afdb6daf88e632a6c1e51",
  "token1_name": "6f7263666178746f6b656e",
  "token1_decimals": 6,
  "token2_policy": "",
  "token2_name": "lovelace",
  "token2_decimals": 6,
  "sources": [
    {
      "source": "MinSwap",
      "address": "addr1z8snz7c4974vzdpxu65ruphl3zjdvtxw8strf2c2tmqnxzf6g882n6sa2gxnk42heavu7uddl5jdl0ektf5f204mmc7s3ykuf9",
      "security_token_policy": "0be55d262b29f564998ff81efe21bdc0022621c12f15af08d0f2ddb1",
      "security_token_name": "b4ba2b47edce71234f328fa20efdb25c3f96e348ca19a683193880489bb368db"
    },
    {
      "source": "WingRiders",
      "address": "addr1z8nvjzjeydcn4atcd93aac8allvrpjn7pjr2qsweukpnayg6pp9snyy9v7uwarxd7dqc5k52egtc49y5w5h3nqqdy6qs2nzs8y",
      "security_token_policy": "026a18d04a0c642759bb3d83b12e3344894e5c1c7b2aeb1a2113a570",
      "security_token_name": "4c"
    },
    {
      "source": "SundaeSwap",
      "address": "addr1w9qzpelu9hn45pefc0xr4ac4kdxeswq7pndul2vuj59u8tqaxdznu",
      "security_token_policy": "0029cb7c88c7567b63d1a512c0ed626aa169688ec980730c0473b913",
      "security_token_name": "7020fb04"
    },
    {
      "source": "Spectrum",
      "address": "addr1x94ec3t25egvhqy2n265xfhq882jxhkknurfe9ny4rl9k6dj764lvrxdayh2ux30fl0ktuh27csgmpevdu89jlxppvrst84slu",
      "security_token_policy": "d740d4088886a6b7e4ab8293424308515590e93e826cb874f8c92aff",
      "security_token_name": "6f7263666178746f6b656e5f4144415f4e4654"
    }
  ]
}
```

A full demo example is in the [`demo_pairs`][demo-pairs-1] directory.

[demo-pairs-1]: demo_pairs/pairs.py

This config file is generated automatically by running the code in
[`cnt-collector-config`][cnt-config-1].

[cnt-config-1]: https://github.com/orcfax/cnt-collector-config
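
To illustrate how one entry in this format can be read, the sketch below walks
the `sources` of a pair and lists the security token that identifies each
liquidity pool. It assumes an entry is available as a plain Python dict; the
function name `list_sources` and the `policy.name` joined form are illustrative
choices, not something the collector is known to use.

```python
def list_sources(pair: dict) -> list:
    """Return (pair name, DEX name, security token id) for each source."""
    rows = []
    for source in pair["sources"]:
        # Joining policy and name with "." is an illustrative convention only.
        token_id = f"{source['security_token_policy']}.{source['security_token_name']}"
        rows.append((pair["name"], source["source"], token_id))
    return rows


# Trimmed copy of the FACT-ADA entry above (one source only).
example_pair = {
    "name": "FACT-ADA",
    "token1_name": "6f7263666178746f6b656e",
    "sources": [
        {
            "source": "WingRiders",
            "security_token_policy": "026a18d04a0c642759bb3d83b12e3344894e5c1c7b2aeb1a2113a570",
            "security_token_name": "4c",
        }
    ],
}

for row in list_sources(example_pair):
    print(row)

# In this example the token name is hex-encoded ASCII.
print(bytes.fromhex(example_pair["token1_name"]).decode("ascii"))
```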

### Config details

Each UTxO with a liquidity pool (a token pair in a DEX) contains a unique token
(NFT) which authenticates the liquidity pool. This token is used to identify
each liquidity pool. However, it's important to note that the smart contract
address for a liquidity pool from a DEX can change from time to time (e.g., if
the DEX wants to change it in order to delegate it to a stake pool). This means
that the CNT configured in `pairs.py` may require occasional updating.

In a liquidity pool, each token of the pair accounts for 50% of the pool's total
value, so the two sides are equal in value. This means that the price of the
target token can be calculated by dividing the amount of one token in the pair
by the amount of the other.

Example: if the liquidity pool ADA/TokenX has 50k ADA and 350k TokenX, then the
price of TokenX in ADA is:

```text
price = 50000 / 350000 = 0.142857143
```
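
The same calculation can be sketched in Python. The function name `pool_price`
is hypothetical, and the sketch assumes that amounts are held as integers in
each token's smallest unit and scaled by the `token1_decimals` and
`token2_decimals` values from the configuration; that scaling step is an
assumption, not confirmed behaviour of the collector.

```python
def pool_price(token1_amount: int, token1_decimals: int,
               token2_amount: int, token2_decimals: int) -> float:
    """Price of token1 expressed in token2, from raw pool reserves."""
    if token1_amount <= 0:
        raise ValueError("token1_amount must be positive")
    # Scale raw integer amounts to whole tokens (assumed convention).
    token1 = token1_amount / 10**token1_decimals
    token2 = token2_amount / 10**token2_decimals
    return token2 / token1


# The example above: 350k TokenX (token1) and 50k ADA (token2), 6 decimals each.
price = pool_price(350_000 * 10**6, 6, 50_000 * 10**6, 6)
print(round(price, 9))  # 0.142857143
```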

When calculating the average price on all DEXes, the total amount of available
tokens in the CNT on all DEXes is taken into account. From each DEX, the largest
liquidity pool is taken into account, because the others are much smaller and
they are irrelevant (this needs to be checked from time to time, to make sure it
doesn't change).
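
One possible reading of this averaging is a liquidity-weighted mean, where each
DEX's price is weighted by the number of tokens in its largest pool. The sketch
below follows that reading; it is an interpretation of the paragraph above, the
name `average_price` is hypothetical, and the pool sizes in the example are
invented.

```python
def average_price(pools: list) -> float:
    """Liquidity-weighted average price over (ada_amount, token_amount) pools."""
    total_ada = sum(ada for ada, _ in pools)
    total_tokens = sum(tokens for _, tokens in pools)
    if total_tokens <= 0:
        raise ValueError("pools contain no tokens")
    # Weighting each pool's price (ada / tokens) by its token amount
    # simplifies to total ADA divided by total tokens.
    return total_ada / total_tokens


# Invented pool sizes: one large pool and one small pool for the same pair.
pools = [(50_000, 350_000), (10_000, 60_000)]
print(average_price(pools))
```

With this weighting, a small pool with an unusual price moves the average far
less than the large pool does.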

The `pairs.py` file included in this repository is currently a correct
configuration file and can be used for testing purposes. For production, it's a
good idea to generate it again.

## Indexer

The indexer script (`indexer.py`) connects to Kupo and Ogmios to read the
configured DEX liquidity pools data and save it into a sqlite database table
for the collector script.

The table is called "utxos" and has the following structure:

```sql
CREATE TABLE utxos (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    pair TEXT NOT NULL,
    source TEXT NOT NULL,
    price FLOAT NOT NULL,
    block_height INTEGER NOT NULL,
    address TEXT NOT NULL,
    token1_policy TEXT NOT NULL,
    token1_name TEXT NOT NULL,
    token1_decimals INTEGER NOT NULL,
    token2_policy TEXT NOT NULL,
    token2_name TEXT NOT NULL,
    token2_decimals INTEGER NOT NULL,
    security_token_policy TEXT NOT NULL,
    security_token_name TEXT NOT NULL,
    token1_amount INTEGER NOT NULL,
    token2_amount INTEGER NOT NULL,
    tx_hash TEXT NOT NULL,
    output_index INTEGER NOT NULL,
    date_time timestamp
);
```
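
For troubleshooting, the table can be inspected with Python's standard
`sqlite3` module. This is a minimal sketch: the function name
`latest_pool_states` is hypothetical, and it only relies on the `pair`,
`source`, `price` and `block_height` columns from the schema above.

```python
import sqlite3


def latest_pool_states(conn: sqlite3.Connection, pair: str) -> list:
    """Return (source, price, block_height) rows for one pair, newest first."""
    query = (
        "SELECT source, price, block_height FROM utxos "
        "WHERE pair = ? ORDER BY block_height DESC"
    )
    # The pair name is passed as a bound parameter, never interpolated.
    return conn.execute(query, (pair,)).fetchall()
```

A connection can be opened with `sqlite3.connect(path)`, where `path` is
assumed to be the database file named by `CNT_DB_NAME`.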

The indexer should run continuously. It uses two threads:

1. the `populate_utxos` thread, which inserts or updates the data in the
   `utxos` table. It runs when the script starts and then every
   `UTXOS_THREAD_TIMEOUT` seconds (configurable in `config.py`).
2. the main execution thread, which connects to Ogmios and requests all new
   blocks as they are created. It parses each transaction in each block, and
   if a transaction updates the UTxO of a liquidity pair configured in
   `pairs.py`, it updates the UTxO record for that pair on that DEX in the
   `utxos` table and inserts a new data point into the `price` table.
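
The two-thread layout described above can be sketched with the standard
`threading` module. This is an illustration of the pattern only, not the
indexer's code: `run_periodically`, `run_indexer`, `refresh_utxos` and
`handle_block` are hypothetical names, and a plain iterable stands in for the
stream of blocks from Ogmios.

```python
import threading


def run_periodically(task, interval: float, stop: threading.Event) -> None:
    """Call task immediately, then again every `interval` seconds until stopped."""
    while True:
        task()
        # wait() returns True as soon as stop is set, ending the loop early.
        if stop.wait(interval):
            break


def run_indexer(blocks, refresh_utxos, handle_block, interval: float) -> None:
    """Refresh in a background thread while the main thread follows blocks."""
    stop = threading.Event()
    worker = threading.Thread(
        target=run_periodically, args=(refresh_utxos, interval, stop), daemon=True
    )
    worker.start()
    try:
        for block in blocks:  # stand-in for the stream of blocks from Ogmios
            handle_block(block)
    finally:
        # Always stop and join the background thread, even on errors.
        stop.set()
        worker.join()
```

Using an `Event` for the delay, rather than `time.sleep`, lets the background
thread exit promptly when the main thread finishes.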

The data points saved in the `price` table are not used when submitting the data
to the validator node. They are saved for archiving and troubleshooting purposes.

The table where the datapoints are saved has the following format:

```sql
CREATE TABLE IF NOT EXISTS price (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    pair TEXT NOT NULL,
    source TEXT NOT NULL,
    price FLOAT NOT NULL,
    token1_amount INTEGER NOT NULL,
    token2_amount INTEGER NOT NULL,
    epoch INTEGER NOT NULL,
    block_height INTEGER NOT NULL,
    date_time timestamp
);
```
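
Appending a data point to this table could look like the following sketch. The
function name `record_price` is hypothetical, and storing `date_time` as an ISO
8601 UTC string is an assumption about the timestamp format.

```python
import sqlite3
from datetime import datetime, timezone


def record_price(conn: sqlite3.Connection, pair: str, source: str, price: float,
                 token1_amount: int, token2_amount: int,
                 epoch: int, block_height: int) -> int:
    """Append one data point to the price table and return its row id."""
    cursor = conn.execute(
        "INSERT INTO price (pair, source, price, token1_amount, token2_amount,"
        " epoch, block_height, date_time) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (pair, source, price, token1_amount, token2_amount, epoch, block_height,
         datetime.now(timezone.utc).isoformat()),  # assumed timestamp format
    )
    conn.commit()
    return cursor.lastrowid
```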

Each update of a UTxO in the database table and each new block received from
Ogmios also triggers an update of the `status` table, which keeps track of the
latest block slot in the blockchain (and the timestamp when the record was
updated).

```sql
CREATE TABLE status (
    id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    current_block_slot INTEGER NOT NULL,
    date_time timestamp
);
```

## Submit

The script (`submitter.py`) calculates the prices of the configured CNT pairs
from Cardano DEX liquidity pools and sends the prices to the configured
validator node.

The script should be run at regular intervals, and it will save a datapoint for
each pair of tokens (in each DEX) in a database.

The script needs to connect to an Ogmios server to read the blockchain data.

As a significant improvement, the collector script does not read the liquidity
pool data directly from Ogmios, but from the sqlite database created by the
indexer script. Only when the indexer's data is outdated (which is detected by
comparing the data saved in the `status` table with the data read from Ogmios)
does the script read the data directly from Ogmios.
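
The freshness check described above can be sketched as a comparison between
the last slot recorded in the `status` table and the current chain tip. The
function name `indexer_is_stale` and the `max_lag_slots` threshold are
hypothetical, and the tip slot is passed in as a plain integer because the
Ogmios query itself is not shown here.

```python
import sqlite3


def indexer_is_stale(conn: sqlite3.Connection, chain_tip_slot: int,
                     max_lag_slots: int = 60) -> bool:
    """True when the indexer's last recorded slot lags the chain tip too far."""
    row = conn.execute(
        "SELECT current_block_slot FROM status ORDER BY id DESC LIMIT 1"
    ).fetchone()
    if row is None:
        return True  # nothing recorded yet, so the database cannot be trusted
    # max_lag_slots is an illustrative threshold, not a value from the project.
    return chain_tip_slot - row[0] > max_lag_slots
```

When this returns `True`, the caller would fall back to reading the pool data
directly from Ogmios, as the paragraph above describes.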

## Justfile

A `justfile` is included in the repo for convenience functions. See
[just][just-1] for more information on how to install this.

[just-1]: https://github.com/casey/just?tab=readme-ov-file#installation
