Metadata-Version: 2.4
Name: mcard
Version: 0.1.6
Summary: MCard: Memory Card with TDD approach
Home-page: https://github.com/yourusername/MCard_TDD
Author: MCard Team
Author-email: MCard Team <user@example.com>
Project-URL: Homepage, https://github.com/yourusername/MCard_TDD
Project-URL: Issues, https://github.com/yourusername/MCard_TDD/issues
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.11.0
Description-Content-Type: text/markdown
Requires-Dist: python-dateutil==2.8.2
Requires-Dist: SQLAlchemy==1.4.47
Requires-Dist: aiosqlite==0.17.0
Requires-Dist: duckdb>=0.9.2
Requires-Dist: lancedb>=0.3.3
Requires-Dist: python-dotenv==1.0.0
Requires-Dist: structlog>=23.2.0
Requires-Dist: python-json-logger==2.0.7
Provides-Extra: dev
Requires-Dist: pytest==7.4.3; extra == "dev"
Requires-Dist: pytest-asyncio==0.23.2; extra == "dev"
Requires-Dist: pytest-cov==4.1.0; extra == "dev"
Requires-Dist: mypy>=1.7.1; extra == "dev"
Requires-Dist: black>=23.11.0; extra == "dev"
Requires-Dist: isort>=5.12.0; extra == "dev"
Requires-Dist: twine>=6.1.0; extra == "dev"
Dynamic: author
Dynamic: home-page
Dynamic: requires-python

# MCard Core

A Python library implementing an algebraically closed data structure for content-addressable storage. MCard ensures that every piece of content in the system is uniquely identified by its hash and temporally ordered by its claim time, enabling robust content verification and precedence ordering. It supports a monadic approach to namespace management and content deduplication for any type of data.

## Documentation

- [Card Collection Guide](docs/card_collection_guide.md): Detailed guide on MCard collection management and hash collision handling
- [Global Time Design](docs/design_g_time.md): Documentation on the global time (`g_time`) implementation
- [Test-Driven Development Guide](docs/tdd_guide.md): Guide on our TDD approach and methodology

## Core Concepts

MCard implements an algebraically closed system where:
1. Every MCard is uniquely identified by its content hash (configurable, defaulting to SHA-256).
2. Every MCard has an associated claim time (timezone-aware timestamp with microsecond precision).
3. The database maintains these invariants automatically.
4. Content integrity is guaranteed through immutable hashes.
5. Temporal ordering is preserved at microsecond precision.

This design provides several key guarantees:
- **Content Integrity**: The content hash serves as both identifier and verification mechanism.
- **Temporal Signature**: All cards are associated with a timestamp: `g_time`.
- **Precedence Verification**: The claim time enables determination of content presentation order.
- **Algebraic Closure**: Any operation on MCards produces results that maintain these properties.
- **Type Safety**: Built on Pydantic with strict validation and type checking.

### Required Attributes for Each MCard

Each MCard **must** have the following three required attributes:

1. **`content`**: The actual data being stored (string or bytes).
2. **`hash`**: A cryptographic hash of the content, using SHA-256 by default (configurable to other algorithms).
3. **`g_time`**: A timezone-aware timestamp with microsecond precision, representing the global time when the card was claimed.

## Directory Structure

- `mcard/`: Contains the main application code.
- `tests/`: Contains test files for the application.
- `logs/`: Contains log files generated by the application.
- `data/db/`: Directory for storing database files used by the application.
- `data/files/`: Directory reserved for storing general files used by the application.

## Database Technologies

MCard initially uses embedded databases (SQLite, DuckDB, and LanceDB) for data storage. Embedded engines run in-process and need no separate server, which suits content-addressable storage and keeps integration and data management inside the application simple.

## API Endpoint

MCard can serve as an API endpoint for serving data content. By using FastAPI with Uvicorn, you can easily create and manage API routes for accessing and manipulating MCard data. FastAPI provides automatic interactive API documentation and is designed for high performance, making it an excellent choice for this project.

FastAPI is a modern, fast (high-performance), web framework for building APIs with Python 3.6+ based on standard Python type hints. Uvicorn is an ASGI server that allows you to serve the FastAPI application. To run the API server, use the following command:

```bash
uvicorn mcard.api:app --reload
```

This command will start the Uvicorn server with the FastAPI application defined in `mcard/api.py`, allowing you to access the MCard API at `http://localhost:8000`. You can also view the interactive API documentation at `http://localhost:8000/docs`. 

## PyTest Configuration

- The project uses [PyTest](https://docs.pytest.org/en/stable/) for testing.
- Tests are located in the `tests` directory.
- The configuration file `pytest.ini` specifies test paths and naming conventions.

## Logging Configuration

- The project uses Python's built-in `logging` module for logging.
- Logs are written to `logs/mcard.log` with a maximum size of 10MB and up to 5 backup files.
- The logging format includes timestamps, log levels, and detailed information about the source of the log messages.
- The logging level is set to DEBUG for console output and INFO for file output.
- To initialize logging, call `setup_logging()` from `mcard.logging_config` before running tests or application code.
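
The settings listed above could be wired up roughly as follows. This is a sketch under stated assumptions: `setup_logging_sketch`, the logger name `mcard_sketch`, and the format string are hypothetical and may not match the actual `setup_logging()` in `mcard.logging_config`.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def setup_logging_sketch(log_file: str = "logs/mcard.log") -> logging.Logger:
    """Hypothetical stand-in for setup_logging(); mirrors the settings listed above."""
    Path(log_file).parent.mkdir(parents=True, exist_ok=True)
    # Assumed format: timestamp, level, and the source of the message.
    fmt = logging.Formatter(
        "%(asctime)s %(levelname)s %(name)s:%(funcName)s:%(lineno)d %(message)s"
    )

    # File output: INFO and above, rotated at 10 MB with 5 backups.
    file_handler = RotatingFileHandler(
        log_file, maxBytes=10 * 1024 * 1024, backupCount=5
    )
    file_handler.setLevel(logging.INFO)
    file_handler.setFormatter(fmt)

    # Console output: DEBUG and above.
    console_handler = logging.StreamHandler()
    console_handler.setLevel(logging.DEBUG)
    console_handler.setFormatter(fmt)

    logger = logging.getLogger("mcard_sketch")
    logger.setLevel(logging.DEBUG)  # let each handler apply its own threshold
    logger.handlers.clear()         # avoid duplicate handlers on repeated calls
    logger.addHandler(file_handler)
    logger.addHandler(console_handler)
    return logger
```

Setting the logger itself to DEBUG and filtering per handler is what allows the console and the file to use different levels.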

## Running Tests

To run tests:
```bash
pytest
```

To run tests with coverage:
```bash
pytest --cov=mcard
```

## Hegel's Dialectic in Testing and CI/CD

Hegel's dialectic is a philosophical framework that describes the process of development and change through a triadic structure: thesis, antithesis, and synthesis. Here's how it relates to software testing and Continuous Integration/Continuous Deployment (CI/CD):

1. **Thesis (Initial Code)**: Represents the initial code or feature implementation, the starting point where a developer writes code to fulfill a specific requirement or feature.

2. **Antithesis (Testing and Bugs)**: Arises during the testing phase, where tests are executed. If tests fail or bugs are discovered, they represent a challenge to the initial implementation, highlighting discrepancies between intended functionality and actual behavior.

3. **Synthesis (Refinement and Improvement)**: Occurs when developers address the issues identified during testing, leading to a refined version of the code that resolves conflicts between the initial implementation and testing outcomes.

### CI/CD Integration
In a CI/CD pipeline, this dialectical process is continuous:

- **Continuous Integration**: Developers frequently integrate code changes into a shared repository. Each integration triggers automated tests, allowing for rapid identification of issues against the current codebase.

- **Continuous Deployment**: Once the code passes testing, it can be automatically deployed, representing the synthesis where refined code is made available to users.

This iterative process fosters continuous improvement, where each round of testing and deployment leads to better software quality and functionality. By applying Hegel's dialectic, teams can embrace the idea that conflict (in the form of bugs and failures) is a natural and necessary part of the development process, ultimately leading to a more robust and effective product.

## Handling Duplicate Events

When a duplicate card is detected, the `duplicate_event_card` is assigned a new timestamp value. This ensures that even though the content is identical to the original card, the hash value will be unique due to the different timestamp. This mechanism allows for robust handling of duplicate content while maintaining the integrity of the system.

## MD5 Collision Testing

The test suite includes verification of MD5 collision detection using known collision pairs from the FastColl attack. These pairs produce identical MD5 hashes despite having different content:

### MD5 Collision Pair
```
Input 1:
4dc968ff0ee35c209572d4777b721587d36fa7b21bdc56b74a3dc0783e7b9518afbfa200a8284bf36e8e4b55b35f427593d849676da0d1555d8360fb5f07fea2
                                                                     ^^^                                    ^^^

Input 2:
4dc968ff0ee35c209572d4777b721587d36fa7b21bdc56b74a3dc0783e7b9518afbfa202a8284bf36e8e4b55b35f427593d849676da0d1d55d8360fb5f07fea2
                                                                     ^^^                                    ^^^
```

Key differences:
1. `200` vs `202`
2. `d15` vs `d1d`

Both inputs produce the same MD5 hash value, demonstrating MD5's vulnerability to collision attacks. This is why MCard defaults to using more secure hash functions like SHA-256.
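
The pair above can be checked with the standard library's `hashlib`. This is a small verification sketch, not the project's test code; the constant names are chosen here for illustration.

```python
import hashlib

# The two 64-byte inputs from the collision pair listed above.
INPUT_1 = bytes.fromhex(
    "4dc968ff0ee35c209572d4777b721587d36fa7b21bdc56b74a3dc0783e7b9518"
    "afbfa200a8284bf36e8e4b55b35f427593d849676da0d1555d8360fb5f07fea2"
)
INPUT_2 = bytes.fromhex(
    "4dc968ff0ee35c209572d4777b721587d36fa7b21bdc56b74a3dc0783e7b9518"
    "afbfa202a8284bf36e8e4b55b35f427593d849676da0d1d55d8360fb5f07fea2"
)

# usedforsecurity=False keeps MD5 available on FIPS-restricted builds.
md5_1 = hashlib.md5(INPUT_1, usedforsecurity=False).hexdigest()
md5_2 = hashlib.md5(INPUT_2, usedforsecurity=False).hexdigest()
md5_match = md5_1 == md5_2
sha256_match = hashlib.sha256(INPUT_1).digest() == hashlib.sha256(INPUT_2).digest()

print("inputs differ:", INPUT_1 != INPUT_2)
print("MD5 digests match:", md5_match)
print("SHA-256 digests match:", sha256_match)
```

The SHA-256 digests of the two inputs differ, which is how a stronger hash function tells colliding content apart.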

## Testing Behavior

The current tests, particularly `test_sqlite_persistence.py`, clear the database after each test function runs. As a result, `test_mcard.db` only ever contains the data from the last test executed. If the `clear()` call in the fixture is uncommented, the content of the last test is removed as well.

## Core Dependencies

- `SQLAlchemy==1.4.47`: SQL toolkit and ORM
- `aiosqlite==0.17.0`: Async SQLite database driver
- `python-dateutil==2.8.2`: Date/time utilities
- `python-dotenv==1.0.0`: Environment management

## Description
MCard is a project designed to facilitate card management with a focus on validation and logging features.

## Installation

### Using uv

You can install the MCard package from PyPI (once published):

```bash
uv pip install mcard
```

### Installing from source

To install MCard directly from the source code:

```bash
# Clone the repository
git clone https://github.com/yourusername/MCard_TDD.git
cd MCard_TDD

# Install in development mode with uv
uv pip install -e .

# Install with development dependencies
uv pip install -e ".[dev]"
```

### Development Environment Setup

1. Set up a virtual environment using uv:
```bash
# Simply run the activate script which handles uv setup
source activate_venv.sh
```

This script will:
- Ensure conda is disabled (if present)
- Create a virtual environment using uv if it doesn't exist
- Activate the virtual environment
- Install dependencies from pyproject.toml using uv

Alternatively, you can manually set up the environment:
```bash
# Create and activate virtual environment with uv
uv venv .venv
source .venv/bin/activate

# Install dependencies with uv
uv pip sync pyproject.toml
```

## Usage

After installation, you can use MCard in your Python code:

```python
from mcard.model.card import MCard
from mcard.model.card_collection import CardCollection

# Create a new card
card = MCard(content="Hello, MCard!")

# Create a card collection
collection = CardCollection()

# Add the card to the collection
collection.add(card)

# Retrieve the card by its hash
retrieved_card = collection.get_by_hash(card.hash)
print(retrieved_card.content)  # Outputs: Hello, MCard!
```

```bash
# Or use the installed command-line entry point
mcard
```

## Recent Updates

### MCard Detail View Component
- Created a new component `mcard_detail_view.html` to display detailed information about MCards, including:
  - Full hash string
  - g_time string
  - Content type
  - Appropriate content display for images, videos, PDFs, and plain text.

### Dynamic Content Loading
- Implemented functionality to dynamically load and display card details when a card entry is clicked.
- Added JavaScript functions to handle click events and fetch card details from the server.

### Error Handling and Logging
- Enhanced error handling in the Flask backend to log errors and provide better feedback.
- Added detailed logging in the JavaScript to track the fetching and rendering process.

### Template Updates
- Updated existing templates to integrate the new detail view component and ensure proper rendering.

### User Experience Improvements
- Improved visual feedback for selected cards.
- Ensured that the focused area updates correctly without becoming blank.

### Configuration Management Refactoring (2024-12-18)
- Renamed `EnvConfig` to `EnvParameters` for better clarity and consistency
- Moved configuration management from `env_config.py` to `env_parameters.py`
- Updated all references to use the new class name across the codebase
- Enhanced test coverage for configuration parameters
- Maintained singleton pattern for configuration management
- Ensured backward compatibility with existing environment variable handling

### Database Enhancements
- Implemented `get_all()` method in SQLiteEngine for efficient pagination
- Added support for page size and page number parameters
- Enhanced error handling for invalid pagination parameters
- Improved performance by optimizing SQL queries
- Added comprehensive test coverage for pagination functionality
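
The pagination behaviour described above can be sketched with `sqlite3` and `LIMIT`/`OFFSET`. The helper name `get_page`, the `card` table, and its columns are assumptions for illustration; the actual `get_all()` signature and schema in `SQLiteEngine` may differ.

```python
import sqlite3


def get_page(conn: sqlite3.Connection, page_number: int = 1, page_size: int = 10) -> list[tuple]:
    """Hypothetical pagination helper; table and column names are assumptions."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be positive integers")
    offset = (page_number - 1) * page_size
    cursor = conn.execute(
        "SELECT hash, content, g_time FROM card ORDER BY g_time LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return cursor.fetchall()


# Populate an in-memory database with 25 rows for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE card (hash TEXT PRIMARY KEY, content BLOB, g_time TEXT)")
conn.executemany(
    "INSERT INTO card VALUES (?, ?, ?)",
    [(f"h{i}", f"content {i}", f"2024-01-01T00:00:{i:02d}+00:00") for i in range(25)],
)
print(len(get_page(conn, page_number=3, page_size=10)))  # 5 rows remain on the last page
```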

## Recent Changes

### Directory Structure Updates
- The `hash_algorithms` directory has been renamed to `algorithms` for simplicity and clarity.
- The `hash_validator.py` file has been renamed to `validator.py` to simplify the naming convention.

### Updated Imports
- All relevant import statements across the codebase have been updated to reflect the new structure and naming.

### Engine Refactor
- Removed the abstract `search_by_content` method from `SQLiteEngine` and `DuckDBEngine`.
- Integrated search functionality into the `search_by_string` method (in `mcard/model/card_collection.py`), allowing searches across content, hash, and g_time fields.

### Event Generation
- Updated `generate_duplication_event` and `generate_collision_event` (in `mcard/model/event_producer.py`) to return JSON strings.
- Enhanced event structure to include upgraded hash functions and content size.

### Logging
- Integrated logging into test cases for better traceability and debugging.

### MCard Class Update
- The `MCard` constructor now accepts a `hash_function` parameter, providing more flexibility in hash generation.

### Tests
- Adjusted tests to verify the new event generation logic and ensure search functionality works as intended.

## Centralized Configuration Management

### Overview
MCard has adopted a centralized configuration management approach to improve maintainability, scalability, and readability. This involves consolidating all configuration constants into a single location, making it easier to manage and update configuration values across the application.

### Configuration Constants
All configuration constants are now defined in `config_constants.py`. This file contains named constants for various configuration values, including:

- Database schema and paths
- Hash algorithm constants and hierarchy
- Environment variable names
- API configuration
- HTTP status codes
- Error messages
- Event types and structure

### Benefits
Centralized configuration management provides several benefits, including:

- **Single Source of Truth**: All configuration constants are managed in one location.
- **Type Safety**: Constants are properly typed and documented.
- **Maintainability**: Changes to configuration values only need to be made in one place.
- **Code Completion**: IDE support for constant names improves developer productivity.
- **Documentation**: Each constant group is documented with its purpose and usage.
- **Testing**: Test files use the same constants as production code, ensuring consistency.

### Implementation
The `config_constants.py` file uses an enum-based approach for hash algorithms, ensuring type safety and readability. The file is organized into logical groups, making it easier to find and update specific configuration values.

### Example Usage
To use a configuration constant, simply import the `config_constants` module and access the desired constant. For example:
```python
from config_constants import HASH_ALGORITHM_SHA256

# Use the SHA-256 hash algorithm
hash_algorithm = HASH_ALGORITHM_SHA256
```
By adopting a centralized configuration management approach, MCard has improved its maintainability, scalability, and readability, making it easier to manage and update configuration values across the application.
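
An enum-based layout with a hash hierarchy might look like the following sketch. The names `HashAlgorithm`, `HASH_HIERARCHY`, and `next_stronger`, as well as the ordering itself, are assumptions for illustration; the real contents of `config_constants.py` may differ.

```python
import hashlib
from enum import Enum
from typing import Optional


class HashAlgorithm(str, Enum):
    """Hypothetical enum; the real names in config_constants.py may differ."""

    MD5 = "md5"
    SHA1 = "sha1"
    SHA256 = "sha256"
    SHA512 = "sha512"


# Assumed strength ordering, weakest to strongest.
HASH_HIERARCHY = [
    HashAlgorithm.MD5,
    HashAlgorithm.SHA1,
    HashAlgorithm.SHA256,
    HashAlgorithm.SHA512,
]


def next_stronger(algorithm: HashAlgorithm) -> Optional[HashAlgorithm]:
    """Return the next algorithm in the hierarchy, or None at the top."""
    index = HASH_HIERARCHY.index(algorithm)
    return HASH_HIERARCHY[index + 1] if index + 1 < len(HASH_HIERARCHY) else None


# Enum values double as hashlib algorithm names.
digest = hashlib.new(HashAlgorithm.SHA256.value, b"Hello, MCard!").hexdigest()
print(next_stronger(HashAlgorithm.MD5).value, len(digest))
```

Keeping the ordering in one list means an upgrade after a detected collision only has to look up the next entry.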

## Using MCardFromData for Stored Values

When retrieving stored MCard data from the database, always use the subclass `MCardFromData`. This approach allows you to bypass unnecessary and unwanted algorithms, significantly speeding up the MCard instantiation process.
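
The idea can be sketched with two small classes. `SketchCard` and `SketchCardFromData` are hypothetical stand-ins, not the library's `MCard` and `MCardFromData`; they only show why a subclass that accepts stored values can skip the hashing and timestamping done on creation.

```python
import hashlib
from datetime import datetime, timezone


class SketchCard:
    """Hypothetical base class: hashes content and stamps g_time on creation."""

    def __init__(self, content: bytes) -> None:
        self.content = content
        self.hash = hashlib.sha256(content).hexdigest()
        self.g_time = datetime.now(timezone.utc)


class SketchCardFromData(SketchCard):
    """Hypothetical subclass: trusts stored values instead of recomputing them."""

    def __init__(self, content: bytes, hash: str, g_time: datetime) -> None:
        # Deliberately skip SketchCard.__init__ so no hashing or clock read happens.
        self.content = content
        self.hash = hash
        self.g_time = g_time


original = SketchCard(b"Hello, MCard!")
restored = SketchCardFromData(original.content, original.hash, original.g_time)
print(restored.hash == original.hash, restored.g_time == original.g_time)
```

Reusing the stored `g_time` also matters for correctness: recomputing it on load would overwrite the original claim time.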

## Project Structure

```plaintext
MCard_TDD/
├── mcard/
│   ├── algorithms/          # Hash algorithm implementations
│   ├── engine/             # Database engines (SQLite, DuckDB)
│   ├── model/              # Core data models
│   ├── api.py             # FastAPI endpoints
│   └── logging_config.py   # Logging configuration
├── tests/
│   ├── persistence/       # Database persistence tests
│   └── unit/             # Unit tests
├── docs/                  # Project documentation
├── data/
│   ├── db/               # Database files
│   └── files/            # General files
└── logs/                 # Application logs
```

## Configuration

### Environment Setup

Create a `.env` file with the following variables:

```plaintext
MCARD_DB_PATH=data/db/mcard_demo.db
TEST_DB_PATH=data/db/test_mcard.db
MCARD_SERVICE_LOG_LEVEL=DEBUG
```
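
Once these variables are in the process environment (for example after python-dotenv's `load_dotenv()` has read the `.env` file), they can be read with fallbacks. The helper below is a standard-library sketch; `load_settings` and `DEFAULTS` are hypothetical names, and the fallback values are simply the example values above.

```python
import os
from collections.abc import Mapping

# Fallback values are assumptions taken from the example .env above.
DEFAULTS = {
    "MCARD_DB_PATH": "data/db/mcard_demo.db",
    "TEST_DB_PATH": "data/db/test_mcard.db",
    "MCARD_SERVICE_LOG_LEVEL": "DEBUG",
}


def load_settings(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Read the three variables, falling back to the example values."""
    return {name: environ.get(name, default) for name, default in DEFAULTS.items()}


settings = load_settings({"MCARD_SERVICE_LOG_LEVEL": "INFO"})
print(settings["MCARD_DB_PATH"], settings["MCARD_SERVICE_LOG_LEVEL"])
```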

## Development Guidelines
### Using MCardFromData
When retrieving stored data, use MCardFromData instead of the base MCard class:

```python
from mcard.model.card import MCardFromData

stored_card = MCardFromData(content=content, hash=hash, g_time=g_time)
```

### Hash Algorithm Configuration
The default hash algorithm is SHA-256, but it's configurable:
```python
from mcard.algorithms import HASH_ALGORITHM_SHA256
```

## Installation

To set up the project, follow these steps:

1. Create a virtual environment:
   ```bash
   python -m venv .venv
   ```

2. Activate the virtual environment:
   - On macOS and Linux:
     ```bash
     source .venv/bin/activate
     ```
   - On Windows:
     ```bash
     .venv\Scripts\activate
     ```

3. Configure your environment:
   - Copy `.env.example` to create your own `.env` file.
   - The default configuration uses:
     - Database path: `data/db/mcard_demo.db`.
     - Hash algorithm: SHA-256.
     - Connection pool size: 5.
     - Connection timeout: 30 seconds.

## Directory Structure

- **mcard/**
  - **engine/**: Contains the database engine implementations, including SQLite and DuckDB.
  - **model/**: Contains the core data models, including `MCard`.
  - **tests/**: Contains all test cases for the MCard library, ensuring functionality and correctness.

## SQLite Persistence Testing

- **tests/persistence/sqlite_test.py**: Contains test cases for SQLite persistence, ensuring data integrity and consistency.

The tests in `test_sqlite_persistence.py` are designed to clear the database after each test function runs, so the `test_mcard.db` file only ever contains the data from the last test executed. If the `clear()` call in the fixture is uncommented, the content of the last test is removed as well. This behavior ensures that each test starts with a clean database, which makes results more reliable.
