Metadata-Version: 2.2
Name: cherry-indexer
Version: 0.1.0
Summary: A flexible blockchain indexing and data processing pipeline
License: MIT
Requires-Python: >=3.10.11
Description-Content-Type: text/markdown
Requires-Dist: aioboto3==13.4.0
Requires-Dist: aiobotocore==2.18.0
Requires-Dist: aiofiles==24.1.0
Requires-Dist: aiohappyeyeballs==2.4.4
Requires-Dist: aiohttp==3.11.11
Requires-Dist: aioitertools==0.12.0
Requires-Dist: aiosignal==1.3.1
Requires-Dist: annotated-types==0.7.0
Requires-Dist: argon2-cffi==23.1.0
Requires-Dist: argon2-cffi-bindings==21.2.0
Requires-Dist: async-timeout==5.0.1
Requires-Dist: asyncpg==0.30.0
Requires-Dist: attrs==24.3.0
Requires-Dist: boto3==1.36.1
Requires-Dist: botocore==1.36.1
Requires-Dist: certifi==2025.1.31
Requires-Dist: cffi==1.17.1
Requires-Dist: charset-normalizer==3.4.1
Requires-Dist: cherry-core==0.0.6
Requires-Dist: click==8.1.8
Requires-Dist: clickhouse-connect==0.8.15
Requires-Dist: colorama==0.4.6
Requires-Dist: dacite==1.9.2
Requires-Dist: frozenlist==1.5.0
Requires-Dist: greenlet==3.1.1
Requires-Dist: hypersync==0.8.5
Requires-Dist: idna==3.10
Requires-Dist: jmespath==1.0.1
Requires-Dist: lz4==4.4.3
Requires-Dist: minio==7.2.15
Requires-Dist: multidict==6.1.0
Requires-Dist: numpy==2.2.1
Requires-Dist: packaging==24.2
Requires-Dist: pandas==2.2.3
Requires-Dist: polars==1.19.0
Requires-Dist: propcache==0.2.1
Requires-Dist: psycopg2-binary==2.9.10
Requires-Dist: pyarrow==19.0.0
Requires-Dist: pycparser==2.22
Requires-Dist: pycryptodome==3.21.0
Requires-Dist: pydantic==2.10.4
Requires-Dist: pydantic_core==2.27.2
Requires-Dist: python-dateutil==2.9.0.post0
Requires-Dist: python-dotenv==1.0.1
Requires-Dist: pytz==2024.2
Requires-Dist: PyYAML==6.0.2
Requires-Dist: requests==2.32.3
Requires-Dist: s3transfer==0.11.2
Requires-Dist: six==1.17.0
Requires-Dist: SQLAlchemy==2.0.37
Requires-Dist: StrEnum==0.4.15
Requires-Dist: typing_extensions==4.12.2
Requires-Dist: tzdata==2024.2
Requires-Dist: urllib3==2.3.0
Requires-Dist: wrapt==1.17.2
Requires-Dist: yarl==1.18.3
Requires-Dist: zstandard==0.23.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: isort>=5.12.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: build>=1.0.0; extra == "dev"

# Cherry Event Indexer

A flexible blockchain event indexing and data processing pipeline.

## Overview

Cherry Event Indexer is a modular system for:
- Ingesting blockchain events and logs
- Processing and transforming blockchain data
- Writing data to various storage backends

## Features

- **Modular Pipeline Architecture**
  - Configurable data providers
  - Customizable processing steps
  - Pluggable storage backends

- **Built-in Steps**
  - EVM block validation
  - Event decoding
  - Custom processing steps

- **Storage Options**
  - Local Parquet files
  - AWS S3
  - More coming soon...

## Project Structure:

```
cherry/
├── src/
│ ├── config/ # Configuration parsing
│ ├── utils/ # Pipeline and utilities
│ └── writers/ # Storage backends
├── examples/ # Example implementations
├── tests/ # Test suite
└── config.yaml # Pipeline configuration
```

## Prerequisites:
- Python 3.10 or higher
- Docker and Docker Compose
- MinIO (for local S3-compatible storage)

## Installation Steps

Clone the repository and go to the project root:

```bash
git clone https://github.com/steelcake/cherry.git
cd cherry
```

Create and activate a virtual environment:

```bash
# Create virtual environment (all platforms)
python -m venv .venv

# Activate virtual environment

# For Windows with git bash:
source .venv/Scripts/activate

# For macOS/Linux:
source .venv/bin/activate
```

Install dependencies:

```bash
pip install -r requirements.txt
```

Set up environment variables:

Create a .env file in the project root
Add your Hypersync API token:

## Quick Start

1. Create a config file (`config.yaml`):

2. Run the script:

```bash
python main.py
```

## Custom Processing Steps

To add a custom processing step, you need to:

1. Define the step function
2. Add the step to the context
3. Add the step to the config

example: get_block_number_stats.py
```python
def get_block_number_stats(data: Dict[str, pa.RecordBatch], step_config: Dict[str, Any]) -> Dict[str, pa.RecordBatch]:
    """Custom processing step for transfer events"""
    pass
```

config.yaml
```yaml
steps:
  - name: my_get_block_number_stats
    kind: get_block_number_stats
    config:
      input_table: logs
      output_table: block_number_stats
```

## Running the Project

Start MinIO server (for local S3 storage):

```bash
# Navigate to docker-compose directory
cd docker-compose

# Start MinIO using docker-compose
docker-compose up -d

# Return to project root
cd ..
```

Default credentials:

```
Access Key: minioadmin
Secret Key: minioadmin
Console URL: http://localhost:9001
```

Note: The MinIO service will be automatically configured with the correct ports and volumes as defined in the docker-compose.yml file.

Configure pipelines:

- Open config.yaml
- Adjust query, event filters, and batch sizes as needed for your pipeline
- Configure writer settings (S3/local parquet etc.)

Run the indexer:

```bash
python main.py
```
