Metadata-Version: 2.4
Name: iscache
Version: 0.2.0
Summary: Distributed in-memory cache framework for Python with LRU, LFU, and TTL eviction policies
Author: isCache Contributors
License: MIT
Project-URL: Homepage, https://github.com/yourusername/iscache
Project-URL: Documentation, https://github.com/yourusername/iscache#readme
Project-URL: Repository, https://github.com/yourusername/iscache
Project-URL: Issues, https://github.com/yourusername/iscache/issues
Keywords: cache,distributed,in-memory,lru,lfu,ttl,eviction
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Distributed Computing
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: test
Requires-Dist: pytest>=7.0.0; extra == "test"
Requires-Dist: hypothesis>=6.0.0; extra == "test"
Dynamic: license-file

# isCache - Distributed In-Memory Cache Framework

## Overview

isCache is a distributed, fault-tolerant cache framework for Python that provides in-memory caching with multiple eviction strategies across a cluster of nodes. The framework ensures cache consistency and reliability during node failures through data replication and consistent hashing for efficient key distribution.

isCache is designed as an in-process distributed cache, where multiple cache nodes run within the same Python process but are managed as a logical cluster with consistent hashing, replication, and failover capabilities.

## Features

- **Multiple Eviction Policies**: Support for LRU (Least Recently Used), LFU (Least Frequently Used), and TTL (Time-To-Live) eviction strategies
- **Consistent Hashing**: Minimizes data redistribution when nodes are added or removed from the cluster
- **Data Replication**: Configurable replication factor ensures data availability during node failures
- **Automatic Failover**: Client automatically fails over to replica nodes when primary nodes are unavailable
- **Dynamic Topology Management**: Add or remove nodes at runtime without restarting the cache
- **Topology Broadcast**: Registered listeners are notified immediately when the cluster topology changes
- **Thread-Safe Operations**: All cache operations are thread-safe with fine-grained locking
- **Capacity Management**: Per-node capacity limits with automatic eviction
- **Statistics Tracking**: Monitor cache performance with hit/miss/eviction metrics
- **Background TTL Cleanup**: Automatic removal of expired entries for TTL policy

## Architecture

```
┌──────────────────────────────────────────────────────────────────────┐
│                            CacheClient                                │
│  ┌───────────────┐  ┌──────────────────┐  ┌────────────────────────┐│
│  │  Hash Ring    │  │ Replication Mgr  │  │    Cluster Manager     ││
│  │  (Consistent  │  │  (Replication    │  │  add_node / remove_node││
│  │   Hashing)    │  │   Factor: N)     │  │  get_topology          ││
│  └───────────────┘  └──────────────────┘  └────────────────────────┘│
└──────────────────────────────────────────────────────────────────────┘
                              │
         ┌────────────────────┼────────────────────┐
         │                    │                    │
         ▼                    ▼                    ▼
   ┌──────────┐         ┌──────────┐        ┌──────────┐
   │  Node 1  │         │  Node 2  │        │  Node 3  │
   │──────────│         │──────────│        │──────────│
   │ Storage  │         │ Storage  │        │ Storage  │
   │ LRU/LFU  │         │ LRU/LFU  │        │ LRU/LFU  │
   │   TTL    │         │   TTL    │        │   TTL    │
   │ Stats    │         │ Stats    │        │ Stats    │
   └──────────┘         └──────────┘        └──────────┘
```

### Components

- **CacheClient**: Main interface for interacting with the distributed cache. Handles routing, replication, failover, and dynamic topology management.
- **ClusterManager**: Manages live cluster topology — adding/removing nodes, broadcasting changes to listeners, and providing topology snapshots.
- **ClusterTopology**: Immutable snapshot of the cluster state at a point in time (nodes, version, timestamp, hash ring state).
- **CacheNode**: Individual cache storage unit with configurable eviction policy and capacity.
- **ConsistentHashRing**: Implements consistent hashing algorithm with virtual nodes for uniform key distribution.
- **ReplicationManager**: Coordinates replication of data across multiple nodes.
- **CacheEntry**: Data model representing a cached item with metadata (timestamps, access count, expiration).
- **EvictionPolicy**: Enumeration of supported eviction strategies (LRU, LFU, TTL).

## Installation

### Prerequisites

- Python 3.8 or higher

### Install Dependencies

```bash
# Clone or navigate to the project directory
cd isCache

# Create and activate a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e .

# Install test dependencies
pip install pytest hypothesis
```

## Quick Start

Here's a simple example to get started with isCache:

```python
from iscache import CacheClient

# Create a cache client with 3 nodes and replication factor of 2
client = CacheClient(
    node_addresses=["node1", "node2", "node3"],
    replication_factor=2
)

# Store data
client.set("user:1", {"name": "Alice", "age": 30})
client.set("user:2", {"name": "Bob", "age": 25})

# Retrieve data
user1 = client.get("user:1")
print(user1)  # {'name': 'Alice', 'age': 30}

# Update data
client.set("user:1", {"name": "Alice", "age": 31})

# Delete data
client.delete("user:2")

# Verify deletion
user2 = client.get("user:2")
print(user2)  # None
```

## Usage Guide

### Creating a Cache Client

The `CacheClient` is the main entry point for all cache operations:

```python
from iscache import CacheClient

# Single node (no replication)
client = CacheClient(["node1"], replication_factor=1)

# Multi-node with replication
client = CacheClient(
    ["node1", "node2", "node3", "node4"],
    replication_factor=3
)
```

**Parameters:**
- `node_addresses`: List of node identifiers (strings)
- `replication_factor`: Number of copies to maintain (default: 2, must be >= 1)

### Setting and Getting Values

```python
# Set a value (supports any Python object)
client.set("key", "value")
client.set("user:123", {"name": "Charlie", "email": "charlie@example.com"})
client.set("counter", 42)
client.set("items", [1, 2, 3, 4, 5])

# Get a value
value = client.get("key")
user = client.get("user:123")
count = client.get("counter")

# Non-existent keys return None
missing = client.get("nonexistent")  # None
```

### Deleting Values

```python
# Delete a key from all replicas
result = client.delete("key")  # Returns True

# Deleting a non-existent key also returns True
result = client.delete("nonexistent")  # Returns True
```

### Working with Individual Cache Nodes

For advanced use cases, you can work directly with `CacheNode` instances:

```python
from iscache.node import CacheNode
from iscache.models import EvictionPolicy

# Create a node with LRU policy
node = CacheNode(
    node_id="my_node",
    capacity=1000,
    eviction_policy=EvictionPolicy.LRU
)

# Use the node
node.set("key", "value")
value = node.get("key")
node.delete("key")

# Get statistics
stats = node.get_stats()
print(stats)  # {'hits': 1, 'misses': 0, 'evictions': 0, 'size': 0}
```

## Eviction Policies

### LRU (Least Recently Used)

Evicts the entry that was accessed (read or written) least recently.

```python
from iscache.node import CacheNode
from iscache.models import EvictionPolicy

node = CacheNode("node1", capacity=3, eviction_policy=EvictionPolicy.LRU)

node.set("a", 1)
node.set("b", 2)
node.set("c", 3)

# Access 'a' to make it recently used
node.get("a")

# Adding 'd' will evict 'b' (least recently accessed)
node.set("d", 4)

print(node.get("a"))  # 1 (still exists)
print(node.get("b"))  # None (evicted)
print(node.get("c"))  # 3 (still exists)
print(node.get("d"))  # 4 (newly added)
```

**Best for:** General-purpose caching where recent access patterns are good predictors of future access.

### LFU (Least Frequently Used)

Evicts the entry with the lowest access count (frequency).

```python
from iscache.node import CacheNode
from iscache.models import EvictionPolicy

node = CacheNode("node1", capacity=3, eviction_policy=EvictionPolicy.LFU)

node.set("a", 1)
node.set("b", 2)
node.set("c", 3)

# Access 'a' multiple times
for _ in range(5):
    node.get("a")

# Access 'c' twice
node.get("c")
node.get("c")

# Don't access 'b' at all

# Adding 'd' will evict 'b' (lowest frequency: 0)
node.set("d", 4)

print(node.get("a"))  # 1 (still exists, freq=5)
print(node.get("b"))  # None (evicted, freq=0)
print(node.get("c"))  # 3 (still exists, freq=2)
print(node.get("d"))  # 4 (newly added, freq=0)
```

**Best for:** Caching scenarios where access frequency is more important than recency (e.g., popular content).

### TTL (Time-To-Live)

Automatically evicts entries after a specified duration. Expired entries are removed lazily on access and proactively via background cleanup.

```python
from iscache.node import CacheNode
from iscache.models import EvictionPolicy
import time

# Create node with 2-second TTL
node = CacheNode(
    "node1",
    capacity=10,
    eviction_policy=EvictionPolicy.TTL,
    ttl_seconds=2
)

node.set("temp", "temporary data")
print(node.get("temp"))  # "temporary data"

# Wait for expiration
time.sleep(2.1)

print(node.get("temp"))  # None (expired)

# Stop the background cleanup thread when done
node.stop()
```

**Best for:** Caching time-sensitive data like session tokens, temporary credentials, or rate-limiting counters.

**Features:**
- Expired entries return `None` on access and are automatically removed
- Background thread cleans up expired entries every 60 seconds
- When cache is full, expired entries are prioritized for eviction

## Replication and Failover

isCache provides data replication to ensure availability during node failures:

```python
from iscache import CacheClient

# Create cluster with 4 nodes and replication factor 3
# Each key will be stored on 3 nodes
client = CacheClient(
    ["node1", "node2", "node3", "node4"],
    replication_factor=3
)

# Data is automatically replicated
client.set("important_data", {"value": 42})

# If primary node fails, client automatically fails over to replicas
# The get operation will try primary first, then replicas in order
data = client.get("important_data")  # Still works even if primary is down
```

### How Replication Works

1. **Key Routing**: Consistent hashing determines the primary node for each key
2. **Replication**: Data is replicated to N-1 additional nodes (clockwise on the hash ring)
3. **Read Operations**: Client tries primary first, falls back to replicas if needed
4. **Write Operations**: Data is written to all replica nodes
5. **Failover**: Automatic and transparent to the application

### Handling Node Failures

```python
from iscache import CacheClient, NodeUnreachableException

client = CacheClient(["node1", "node2", "node3"], replication_factor=2)

try:
    value = client.get("key")
except NodeUnreachableException:
    # All nodes (primary and replicas) are unreachable
    print("Cache is unavailable")
```

## Dynamic Topology Management

isCache supports adding and removing nodes at runtime without stopping the cache. The `ClusterManager` coordinates these changes, updates the hash ring, and notifies any registered listeners.

### Adding a Node

```python
from iscache import CacheClient

client = CacheClient(["node1", "node2"], replication_factor=2)

# Add a new node with default settings (capacity=1000, LRU)
client.add_node("node3")

# Add a node with custom settings
client.add_node("node4", capacity=5000, eviction_policy=EvictionPolicy.LFU)

# Cache operations immediately use the new node
client.set("key", "value")
```

### Removing a Node

```python
# Remove a node — hash ring is updated, existing operations continue
client.remove_node("node2")

# Raises ConfigurationException if node does not exist
from iscache import ConfigurationException
try:
    client.remove_node("ghost")
except ConfigurationException as e:
    print(e)
```

### Inspecting the Current Topology

```python
from iscache import CacheClient, ClusterTopology

client = CacheClient(["node1", "node2", "node3"], replication_factor=2)

topo: ClusterTopology = client.get_topology()
print(topo.version)          # 1
print(topo.nodes)            # {"node1": "active", "node2": "active", "node3": "active"}
print(topo.hash_ring_state)  # ["node1", "node2", "node3"]
print(topo.timestamp)        # Unix timestamp of the snapshot
```

### Subscribing to Topology Changes

Register a callback to be notified whenever a node is added or removed:

```python
def on_topology_change(topo: ClusterTopology):
    print(f"Topology changed! version={topo.version}, nodes={list(topo.nodes.keys())}")

client.cluster_manager.register_topology_listener(on_topology_change)

client.add_node("node4")     # triggers: "Topology changed! version=2, nodes=[...]"
client.remove_node("node1")  # triggers: "Topology changed! version=3, nodes=[...]"
```

### Notes on Dynamic Topology

- After `add_node`, the `ReplicationManager` is automatically rebuilt to include the new node.
- After `remove_node`, the node's TTL cleanup thread (if any) is stopped cleanly.
- `add_node` raises `ConfigurationException` if a node with the same ID already exists.
- `remove_node` raises `ConfigurationException` if the node ID is not found.
- All topology operations are thread-safe.

## API Reference

### CacheClient

**Constructor:**
```python
CacheClient(node_addresses: List[str], replication_factor: int = 2)
```

**Methods:**
- `get(key: str) -> Optional[Any]`: Retrieve value for key
- `set(key: str, value: Any, ttl: Optional[int] = None) -> bool`: Store key-value pair
- `delete(key: str) -> bool`: Remove key from cache
- `add_node(node_id, capacity=1000, eviction_policy=LRU, ttl_seconds=None) -> bool`: Add a new node to the cluster at runtime
- `remove_node(node_id) -> bool`: Remove a node from the cluster at runtime
- `get_topology() -> ClusterTopology`: Return current cluster topology snapshot

**Raises:**
- `ConfigurationException`: Invalid initialization parameters
- `NodeUnreachableException`: All nodes unreachable for a key
- `CacheException`: General cache operation failure

### CacheNode

**Constructor:**
```python
CacheNode(
    node_id: str,
    capacity: int,
    eviction_policy: EvictionPolicy,
    ttl_seconds: Optional[int] = None
)
```

**Methods:**
- `get(key: str) -> Optional[Any]`: Retrieve value
- `set(key: str, value: Any) -> bool`: Store key-value pair
- `delete(key: str) -> bool`: Remove key
- `get_stats() -> Dict[str, int]`: Get cache statistics
- `stop() -> None`: Stop background cleanup thread (TTL only)

**Statistics:**
Returns dictionary with:
- `hits`: Number of successful gets
- `misses`: Number of failed gets
- `evictions`: Number of evicted entries
- `size`: Current number of entries

### EvictionPolicy (Enum)

```python
from iscache.models import EvictionPolicy

EvictionPolicy.LRU   # Least Recently Used
EvictionPolicy.LFU   # Least Frequently Used
EvictionPolicy.TTL   # Time-To-Live
```

### ClusterManager

**Constructor:**
```python
ClusterManager(hash_ring: ConsistentHashRing, nodes: Dict[str, CacheNode])
```
> Normally accessed via `client.cluster_manager` — not created directly.

**Methods:**
- `add_node(node_id, capacity=1000, eviction_policy=LRU, ttl_seconds=None) -> bool`: Add a node to the cluster
- `remove_node(node_id) -> bool`: Remove a node from the cluster
- `get_topology() -> ClusterTopology`: Return a snapshot of current cluster state
- `register_topology_listener(callback) -> None`: Subscribe to topology change notifications

**Raises:**
- `ConfigurationException`: Duplicate node_id on add, or unknown node_id on remove

### ClusterTopology

A read-only dataclass snapshot of the cluster at a point in time.

**Fields:**
- `nodes: Dict[str, str]` — mapping of node_id → status (`"active"`)
- `version: int` — monotonically increasing version number
- `timestamp: float` — Unix timestamp when snapshot was created
- `hash_ring_state: List[str]` — list of active node_ids currently in the ring

```python
from iscache import ClusterTopology

topo = client.get_topology()
print(topo.version)        # 3
print(topo.nodes)          # {"node1": "active", "node3": "active"}
print(topo.hash_ring_state)  # ["node1", "node3"]
```

### Exceptions

```python
from iscache import (
    CacheException,              # Base exception
    NodeUnreachableException,    # Node connectivity issues
    ReplicationException,        # Replication failures
    ConfigurationException,      # Invalid configuration
    ConsensusException           # Consensus failures (future)
)
```

## Testing

### Run All Tests

```bash
# Run full test suite with verbose output
pytest tests/ -v

# Run with short traceback
pytest tests/ -v --tb=short

# Run specific test file
pytest tests/test_client.py -v

# Run smoke tests
pytest tests/test_smoke.py -v
```

### Run Specific Test Categories

```bash
# Run only hash ring tests
pytest tests/test_hashring.py -v

# Run only eviction tests
pytest tests/test_node.py::TestCacheNodeCapacity -v

# Run only replication tests
pytest tests/test_replication.py -v
```

### Property-Based Tests

The test suite includes property-based tests using Hypothesis:

```bash
# Run property-based tests
pytest tests/ -k "property" -v

# Run with increased iterations
pytest tests/ -v --hypothesis-iterations=200
```

## Project Structure

```
isCache/
├── iscache/                    # Main package
│   ├── __init__.py            # Public API exports
│   ├── client.py              # CacheClient implementation
│   ├── node.py                # CacheNode implementation
│   ├── hashring.py            # Consistent hash ring
│   ├── replication.py         # Replication manager
│   ├── models.py              # Data models (CacheEntry, EvictionPolicy)
│   ├── exceptions.py          # Exception hierarchy
│   ├── cluster.py             # ClusterManager and ClusterTopology
│   └── persistence.py         # Persistence engine (future)
├── tests/                      # Test suite
│   ├── test_client.py         # Client tests
│   ├── test_node.py           # Node tests
│   ├── test_hashring.py       # Hash ring tests
│   ├── test_replication.py    # Replication tests
│   ├── test_models.py         # Model tests
│   ├── test_exceptions.py     # Exception tests
│   ├── test_lfu_eviction.py   # LFU eviction tests
│   ├── test_ttl_expiration.py # TTL expiration tests
│   ├── test_ttl_cleanup.py    # TTL cleanup tests
│   ├── test_basic.py          # Basic sanity tests
│   ├── test_cluster.py        # ClusterManager topology tests
│   └── test_smoke.py          # End-to-end smoke tests
├── pyproject.toml             # Project metadata
└── README.md                  # This file
```

## Implemented Requirements

isCache implements the following requirements from the specification:

### ✅ Core Cache Operations (Requirement 1)
- Set, get, and delete operations
- Key routing via consistent hashing
- Return None for non-existent keys
- Replication to all replicas

### ✅ Eviction Policy Selection (Requirement 2)
- Support for LRU, LFU, and TTL policies
- Per-node policy configuration
- Metadata tracking for each policy

### ✅ LRU Eviction (Requirement 3)
- Timestamp-based eviction
- Access time tracking
- Least recently used entry removal

### ✅ LFU Eviction (Requirement 4)
- Frequency counter tracking
- Lowest frequency eviction
- Tie-breaking by creation time

### ✅ TTL Eviction (Requirement 5)
- Automatic expiration tracking
- Lazy removal on access
- Background cleanup thread
- Prioritized eviction of expired entries

### ✅ Consistent Hashing (Requirement 6)
- Virtual nodes for uniform distribution
- Minimal redistribution on topology changes
- Clockwise key-to-node mapping

### ✅ Data Replication (Requirement 7)
- Configurable replication factor
- Primary-replica architecture
- Automatic failover to replicas
- Delete propagation to all replicas

### ✅ Cache Client Initialization (Requirement 10)
- Multi-node client creation
- Connection to all nodes
- Graceful handling of unavailable nodes
- Hash ring initialization

### ✅ Capacity Management (Requirement 11)
- Per-node capacity limits
- Automatic eviction at capacity
- Accurate size tracking

### ✅ Concurrent Access Safety (Requirement 12)
- Thread-safe operations
- Fine-grained locking
- Linearizability of operations

### ✅ Error Handling (Requirement 13)
- Descriptive exception messages
- Clear exception hierarchy
- Configuration validation

### ✅ Cache Statistics (Requirement 14)
- Hit/miss/eviction counters
- Current size tracking
- Thread-safe statistics

### ✅ Dynamic Topology Management (Requirements 9.1, 9.2, 9.5, 10.4)
- Add nodes to the cluster at runtime via `add_node()`
- Remove nodes from the cluster at runtime via `remove_node()`
- Topology broadcast to all registered listeners on every change
- `get_topology()` returns a versioned snapshot of current cluster state

### 🚧 Future Enhancements

The following features are planned for future releases:

- **Node Failure Detection (Requirement 8)**: Heartbeat monitoring and automatic node removal
- **Cluster Consensus (Requirement 9.3/9.4)**: Majority-vote consensus for topology changes in multi-process deployments
- **Persistence**: Optional disk-backed storage
- **Network Communication**: True distributed nodes across processes/machines

## Performance Characteristics

- **Get Operations**: O(log N) hash ring lookup + O(1) node access
- **Set Operations**: O(log N) hash ring lookup + O(1) node write + O(R) replication
- **Delete Operations**: O(log N) hash ring lookup + O(R) replication
- **Eviction**: O(N) scan for victim selection (where N is entries per node)
- **Memory Usage**: O(C × N) where C is capacity per node and N is number of nodes

Where:
- N = number of nodes
- R = replication factor
- C = capacity per node

## Thread Safety

All cache operations are thread-safe:
- `CacheNode` uses `threading.Lock` for all shared state access
- `ConsistentHashRing` uses `threading.RLock` for thread safety
- Statistics counters are protected by locks
- TTL cleanup runs in a daemon thread

## License

This project is provided as-is for educational and development purposes.

## Contributing

Contributions are welcome! Areas for improvement:

- Additional eviction policies (ARC, SLRU, etc.)
- Network-based distribution
- Persistence layer
- Performance optimizations
- Additional test coverage

## Support

For issues, questions, or contributions, please refer to the project repository or documentation.
