Metadata-Version: 2.4
Name: PECLibrary
Version: 0.7.0
Summary: Event capture library for data pipelines with Kafka transport and decorator-based monitoring
Author-email: Sahabista Arkitanego Armantaera <sahabista.arkitanego@jkt1.ebdesk.com>
License: MIT
Project-URL: Homepage, http://git.blackeye.id/sahabista/palantir-event-capture.git
Project-URL: Documentation, http://git.blackeye.id/sahabista/palantir-event-capture.git
Project-URL: Repository, http://git.blackeye.id/sahabista/palantir-event-capture.git
Keywords: monitoring,logging,kafka,data-pipeline,observability
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: pydantic<3.0,>=1.10
Requires-Dist: kafka-python>=2.0.0
Requires-Dist: loguru>=0.7.0
Requires-Dist: python-dotenv>=1.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Provides-Extra: test-v1
Requires-Dist: pydantic<2.0,>=1.10; extra == "test-v1"
Requires-Dist: pytest>=7.0.0; extra == "test-v1"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "test-v1"
Provides-Extra: test-v2
Requires-Dist: pydantic>=2.0.0; extra == "test-v2"
Requires-Dist: pytest>=7.0.0; extra == "test-v2"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "test-v2"

# Palantir Event Capture

A production-ready event capture library for data pipelines with decorator-based monitoring, Kafka transport, and standardized error logging.

## Features

- 🎯 **Decorator-based capture** - Easily wrap functions with error/event monitoring
- 📊 **Standardized event models** - Consistent event structure with Pydantic validation
- 🚀 **Kafka transport** - Automatic event streaming to Kafka topics
- 🔄 **Retry logic** - Built-in retry mechanisms with exponential backoff
- 📝 **Rich context** - Captures stack traces, function metadata, and custom data
- ⚡ **Async support** - Full support for async/await patterns
- 🚀 **Background Transport** - Asynchronous event sending in background thread
- 🎛️ **Flexible configuration** - Environment-based or programmatic configuration
- 🔌 **Multiple transports** - Console, Kafka, or custom transport layers
- 🐍 **Pydantic V2 Compatible** - Works with both Pydantic V1 and V2

## Installation

```bash
pip install git+https://<USERNAME>:<TOKEN>@git.blackeye.id/sahabista/rca-event-capture.git
```

- Use git install to access the library, ensure you have been add as the maintainer for the repository.
- Use Personal Access Token (PAT) to avoid error

## Quick Start

### Basic Usage

```python
from PECLibrary.decorators import capture_errors
from PECLibrary.models.captured_models import PipelineComponent, Severity

@capture_errors(
    component=PipelineComponent.KAFKA_PRODUCER,
    severity=Severity.HIGH
)
def send_message(topic: str, message: dict):
    # Your code here
    producer.send(topic, message)
```

### Async Functions

```python
from PECLibrary.decorators import async_capture_errors

@async_capture_errors(
    component=PipelineComponent.ELASTICSEARCH,
    severity=Severity.CRITICAL
)
async def index_document(doc: dict):
    await es_client.index(index="logs", document=doc)
```

### Retry with Capture

```python
from PECLibrary.decorators import retry_with_capture

@retry_with_capture(
    max_retries=3,
    delay=1.0,
    backoff=2.0,
    component=PipelineComponent.DATABASE
)
def query_database(query: str):
    return db.execute(query)
```

### Context Manager

```python
from PECLibrary.utils.context_managers import capture_context

with capture_context(
    component=PipelineComponent.TRANSFORMATION,
    pipeline_name="etl_pipeline",
    engine_name="pandas"
):
    # Your data transformation code
    df = process_data(raw_data)
```

## Configuration

### Environment Variables

Create a `.env` file:

```env
# Kafka Configuration
KAFKA_BOOTSTRAP_SERVERS_RCA=localhost:9092
KAFKA_EVENT_TOPIC_RCA=rca.events
KAFKA_ENABLED_RCA=true
KAFKA_COMPRESSION_RCA=gzip

# Logging Configuration
LOG_LEVEL=INFO
LOG_TO_CONSOLE=true
SEND_TO_TRANSPORT=true

# Event Capture Settings
DEFAULT_COMPONENT=CUSTOM
DEFAULT_SEVERITY=MEDIUM
ENABLE_STACK_TRACES=true
USE_BACKGROUND_TRANSPORT=false
```

### Programmatic Configuration

```python
from PECLibrary.config import configure

configure(
    KAFKA_BOOTSTRAP_SERVERS_RCA="localhost:9092",
    kafka_events_topic="rca.events",
    log_level="INFO",
    send_to_transport=True
)
```

## Event Model

Events are captured with comprehensive metadata:

```python
{
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "event_type": "error",
    "pipeline_name": "data_ingestion_pipeline",
    "engine_name": "spark",
    "component": "kafka_consumer",
    "component_name": "UserEventsConsumer",
    "severity": "high",
    "message": "Failed to consume messages: Connection timeout",
    "timestamp": 1703001234.567,
    "timestamp_iso": "2024-12-01 12:34:56.789",
    "error_type": "KafkaTimeoutError",
    "stack_trace": "...",
    "function_name": "consume_messages",
    "file_path": "/app/consumers/events.py",
    "line_number": 45,
    "metadata": {
        "topic": "user_events",
        "partition": 0,
        "offset": 12345
    },
    "resolved": false
}
```

## Advanced Usage

### Custom Component Names

```python
@capture_errors(
    component=PipelineComponent.CUSTOM,
    component_name="MyCustomETL",
    pipeline_name="daily_batch_job",
    engine_name="custom"
)
def custom_pipeline():
    pass
```

### Capture All Class Methods

```python
from PECLibrary.decorators import capture_method_errors

@capture_method_errors(
    component=PipelineComponent.API,
    severity=Severity.HIGH
)
class DataAPI:
    def fetch(self):
        pass
  
    def save(self):
        pass
```

### Manual Event Capture

```python
from PECLibrary.services import get_capture_service
from PECLibrary.models.captured_models import CaptureConfig

service = get_capture_service()

try:
    risky_operation()
except Exception as e:
    config = CaptureConfig(
        component=PipelineComponent.POSTGRES,
        severity=Severity.CRITICAL,
        metadata={"query": "SELECT * FROM users"}
    )
    service.capture_exception(e, config=config)
```

### Custom Transport

```python
from PECLibrary.services.transport import Transport
from PECLibrary.models.captured_models import ErrorEvent

class CustomTransport(Transport):
    def send(self, event: ErrorEvent) -> bool:
        # Your custom transport logic
        my_logging_system.log(event.dict())
        return True

# Register your transport
from PECLibrary.services import get_capture_service
service = get_capture_service()
service.add_transport(CustomTransport())
```

## Pipeline Components

Supported pipeline components:

- **Message Brokers**: `KAFKA_PRODUCER`, `KAFKA_CONSUMER`, `RABBITMQ`, `BEANSTALK`
- **Databases**: `ELASTICSEARCH`, `POSTGRES`, `MONGODB`, `REDIS`, `MEMGRAPH`, `NEBULA`, `QDRANT`, `DATABASE`
- **Storage**: `S3`
- **Processing**: `TRANSFORMATION`, `DBT`, `PANDAS`
- **Orchestration**: `AIRFLOW`
- **Services**: `API`, `WEBHOOK`
- **Generic**: `CUSTOM`, `UNKNOWN`

## Event Severity Levels

- `CRITICAL` - System-breaking errors requiring immediate attention
- `HIGH` - Significant errors affecting functionality
- `MEDIUM` - Moderate issues that should be investigated
- `LOW` - Minor issues or warnings
- `WARNING` - Advisory notifications

## Architecture

See [ARCHITECTURE.md](ARCHITECTURE.md) for detailed design documentation.

## Testing

```bash
# Run tests
pytest

# With coverage
pytest --cov=src --cov-report=html

# Run specific test
pytest tests/test_decorators.py
```

## Contributing

Contributions are welcome! Please see our contributing guidelines.

## License

MIT License - see LICENSE file for details.

## Support

- Documentation: [ReadTheDocs](https://palantir-event-capture.readthedocs.io)
- Issues: [GitHub Issues](https://github.com/yourusername/palantir-event-capture/issues)
- Discussions: [GitHub Discussions](https://github.com/yourusername/palantir-event-capture/discussions)
