Metadata-Version: 2.4
Name: django-hashed-encrypted-fields
Version: 0.1.0
Summary: Django model fields with transparent encryption and searchable hash columns.
Project-URL: Homepage, https://github.com/kolanos/django-hashed-encrypted-fields
Project-URL: Repository, https://github.com/kolanos/django-hashed-encrypted-fields
Project-URL: Issues, https://github.com/kolanos/django-hashed-encrypted-fields/issues
Author-email: Michael Lavers <kolanos@gmail.com>
License-Expression: MIT
License-File: LICENSE
Keywords: cryptography,django,encrypted-fields,encryption,fernet,hash,privacy,searchable
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Web Environment
Classifier: Framework :: Django
Classifier: Framework :: Django :: 4.2
Classifier: Framework :: Django :: 5.0
Classifier: Framework :: Django :: 5.1
Classifier: Framework :: Django :: 6.0
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Security :: Cryptography
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Requires-Dist: cryptography>=41.0
Requires-Dist: django>=4.2
Description-Content-Type: text/markdown

# django-hashed-encrypted-fields

Django model fields that store encrypted values with separate searchable hash columns.

Data is encrypted at rest using Fernet symmetric encryption and can be queried via HMAC-SHA256 hashes without ever decrypting. Encryption keys, hashing, and storage are all pluggable.

## Features

- Transparent encryption/decryption on model fields
- Searchable fields via a separate indexed hash column (`{field}_hash`)
- Encrypted column stored as `{field}_encrypted`
- Pluggable key providers (use Django settings, a vault, or any custom source)
- Pluggable encryption/hash providers (Fernet + HMAC-SHA256 by default)
- Per-field provider overrides
- Key rotation support via MultiFernet
- Django admin integration with configurable value masking
- Django forms support (encryption is transparent)
- Encrypted file storage for FileField and ImageField
- 11 field types: Char, Text, Integer, Boolean, Date, DateTime, Decimal, JSON, Binary, File, Image

## Requirements

- Python 3.10+
- Django 4.2+
- cryptography 41.0+

## Installation

```bash
pip install django-hashed-encrypted-fields
```

Add to `INSTALLED_APPS`:

```python
INSTALLED_APPS = [
    # ...
    "encrypted_fields",
]
```

Generate an encryption key:

```bash
python manage.py generate_encryption_key
```

Add the output to your settings:

```python
ENCRYPTED_FIELDS_KEY = "your-generated-fernet-key"
ENCRYPTED_FIELDS_HASH_SALT = "your-generated-salt"
```

## Quick Start

```python
from django.db import models
from encrypted_fields import EncryptedCharField, EncryptedIntegerField

class Patient(models.Model):
    name = EncryptedCharField(max_length=100)
    ssn = EncryptedCharField(max_length=11, searchable=True, unique=True)
    age = EncryptedIntegerField()
```

This creates one database column for `name` (`name_encrypted`), two for `ssn` (`ssn_encrypted` + `ssn_hash`), and one for `age` (`age_encrypted`):

```python
# Create
patient = Patient.objects.create(name="Alice", ssn="123-45-6789", age=30)

# Read — decryption is transparent
patient = Patient.objects.get(pk=patient.pk)
print(patient.name)  # "Alice"
print(patient.ssn)   # "123-45-6789"
print(patient.age)   # 30

# Search — only on searchable fields
Patient.objects.filter(ssn="123-45-6789")           # exact
Patient.objects.filter(ssn__in=["123-45-6789"])      # in
Patient.objects.filter(ssn__isnull=False)             # isnull
```

## Field Types

| Field | Base Type | Notes |
|---|---|---|
| `EncryptedCharField` | CharField | Requires `max_length` |
| `EncryptedTextField` | TextField | |
| `EncryptedIntegerField` | IntegerField | |
| `EncryptedBooleanField` | BooleanField | |
| `EncryptedDateField` | DateField | Stored as ISO format |
| `EncryptedDateTimeField` | DateTimeField | Stored as ISO format |
| `EncryptedDecimalField` | DecimalField | Requires `max_digits`, `decimal_places` |
| `EncryptedJSONField` | JSONField | JSON serialized with sorted keys |
| `EncryptedBinaryField` | BinaryField | `searchable=True` not supported |
| `EncryptedFileField` | FileField | Uses encrypted file storage |
| `EncryptedImageField` | ImageField | Uses encrypted file storage |
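
The "ISO format" and "sorted keys" notes matter for searchable fields: a hash lookup can only match if equal values always serialize to the same bytes. The following is a minimal sketch of that idea using only the standard library. The helper name `to_canonical_bytes` is hypothetical and is not part of this package's API, and the package's actual serialization may differ in detail.

```python
import json
from datetime import date, datetime
from decimal import Decimal


def to_canonical_bytes(value) -> bytes:
    """Hypothetical helper: turn a value into stable bytes before hashing."""
    if isinstance(value, (date, datetime)):
        # ISO 8601 text is unambiguous and independent of locale settings.
        text = value.isoformat()
    elif isinstance(value, (dict, list)):
        # Sorted keys make the output independent of dict insertion order.
        text = json.dumps(value, sort_keys=True, separators=(",", ":"))
    else:
        text = str(value)
    return text.encode("utf-8")


# Two dicts with the same content but different key order serialize identically.
first = to_canonical_bytes({"b": 1, "a": 2})
second = to_canonical_bytes({"a": 2, "b": 1})
iso_date = to_canonical_bytes(date(2024, 1, 31))
price = to_canonical_bytes(Decimal("1.50"))
```

Without sorted keys, two equal JSON objects could produce different hashes and an exact-match lookup would miss.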

All fields accept standard Django field options (`null`, `blank`, `default`, etc.) plus:

| Parameter | Default | Description |
|---|---|---|
| `searchable` | `False` | Creates a `{name}_hash` column for lookups |
| `unique` | `False` | Enforced on the hash column (requires `searchable=True`) |
| `mask` | `'last4'` | Admin display masking (`'last4'`, `'full'`, or callable) |
| `key_provider` | `None` | Per-field key provider override |
| `encryption_provider` | `None` | Per-field encryption provider override |

## Searchable Fields

When `searchable=True`, a companion `HashField` column is created with a database index. Lookups are performed against the hash, so the encrypted data never needs to be decrypted for queries.

Supported lookups:

```python
# Exact match — WHERE ssn_hash = hash('123-45-6789')
Patient.objects.filter(ssn="123-45-6789")

# IN — WHERE ssn_hash IN (hash('...'), hash('...'))
Patient.objects.filter(ssn__in=["123-45-6789", "987-65-4321"])

# IS NULL — WHERE ssn_encrypted IS NULL
Patient.objects.filter(ssn__isnull=True)
```

Unsupported lookups (`contains`, `startswith`, `gt`, `lt`, etc.) raise `LookupNotSupported`. Non-searchable fields only support `isnull`.
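
To illustrate why only exact-style lookups are possible, here is a rough, self-contained sketch of a hash-column lookup using the standard library's `hmac`, `hashlib`, and `sqlite3`. It assumes the salt is used as the HMAC key; how this package combines salt and value internally is not shown here, so treat `search_hash` and the table layout as hypothetical.

```python
import hashlib
import hmac
import sqlite3

SALT = "random-salt-string"  # assumption: stands in for ENCRYPTED_FIELDS_HASH_SALT


def search_hash(value: str, salt: str = SALT) -> str:
    """Keyed, deterministic digest: always 64 hex characters."""
    return hmac.new(salt.encode(), value.encode(), hashlib.sha256).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, ssn_encrypted TEXT, ssn_hash TEXT)"
)
conn.execute("CREATE UNIQUE INDEX patient_ssn_hash ON patient (ssn_hash)")

# The ciphertext is an opaque placeholder; only the hash column is ever queried.
conn.execute(
    "INSERT INTO patient (ssn_encrypted, ssn_hash) VALUES (?, ?)",
    ("<ciphertext>", search_hash("123-45-6789")),
)

# Similar in spirit to Patient.objects.filter(ssn="123-45-6789").
match = conn.execute(
    "SELECT id FROM patient WHERE ssn_hash = ?", (search_hash("123-45-6789"),)
).fetchone()

# A prefix of the value hashes to something unrelated, so it cannot match.
miss = conn.execute(
    "SELECT id FROM patient WHERE ssn_hash = ?", (search_hash("123-45-678"),)
).fetchone()
```

Because the digest of a prefix bears no relation to the digest of the full value, lookups such as `startswith` or `gt` have nothing to compare against.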

## Settings

```python
# Required: Fernet encryption key (string or list for key rotation)
ENCRYPTED_FIELDS_KEY = "base64-encoded-fernet-key"

# Optional: Salt for HMAC-SHA256 hashing (recommended)
ENCRYPTED_FIELDS_HASH_SALT = "random-salt-string"

# Optional: Global key provider (dotted import path)
ENCRYPTED_FIELDS_KEY_PROVIDER = "myapp.providers.VaultKeyProvider"

# Optional: Global encryption provider (dotted import path)
ENCRYPTED_FIELDS_ENCRYPTION_PROVIDER = "myapp.providers.AESGCMProvider"
```

## Key Rotation

Key rotation is supported via MultiFernet. Provide keys as a list — the first key is used for new encryptions, and all keys are tried for decryption:

```python
ENCRYPTED_FIELDS_KEY = [
    "new-primary-key",    # Used for encrypting new data
    "old-key-1",          # Can still decrypt existing data
    "old-key-2",          # Can still decrypt older data
]
```

Hashes depend on the salt, not on the encryption key, so searches keep working after key rotation. To finish the migration, re-encrypt existing records with the new primary key (re-save each record, or run the `reencrypt` management command), then remove the old keys.
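
The rotation behavior can be tried directly with the `cryptography` library's `Fernet` and `MultiFernet` classes. This sketch only shows the underlying library calls, not how this package wires them up, and the keys are generated on the spot for illustration.

```python
from cryptography.fernet import Fernet, MultiFernet

old_key = Fernet.generate_key()
new_key = Fernet.generate_key()

# Before rotation: the value was encrypted with the old key only.
token = Fernet(old_key).encrypt(b"123-45-6789")

# After rotation: new key first (primary), old key kept for decryption.
rotated = MultiFernet([Fernet(new_key), Fernet(old_key)])
plaintext = rotated.decrypt(token)  # still readable through the old key

# Re-encrypt under the primary key; the old key is then no longer needed.
fresh_token = rotated.rotate(token)
new_only = Fernet(new_key).decrypt(fresh_token)
```

Once every stored token has been rotated this way, the old keys can be dropped from the list.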

## Custom Providers

### Key Provider

Implement `BaseKeyProvider` to load keys from a vault or other source:

```python
from encrypted_fields import BaseKeyProvider

class VaultKeyProvider(BaseKeyProvider):
    def __init__(self, secret_path="/encryption/keys"):
        self.secret_path = secret_path

    def get_keys(self) -> list[str]:
        # First key = primary (for encryption), rest = decryption only.
        # `vault_client` is a placeholder for your own secrets-manager client.
        return vault_client.get_secret(self.secret_path)
```

### Encryption Provider

Implement `BaseEncryptionProvider` to use a different algorithm:

```python
from encrypted_fields import BaseEncryptionProvider

class AESGCMProvider(BaseEncryptionProvider):
    def encrypt(self, value: bytes, keys: list[str]) -> str:
        # Encrypt using primary key, return string
        ...

    def decrypt(self, value: str, keys: list[str]) -> bytes:
        # Decrypt trying all keys, return bytes
        ...

    def hash(self, value: bytes, salt: str | None) -> str:
        # Return deterministic 64-char hex hash
        ...
```

### Configuration

Providers can be set globally in settings or per-field:

```python
# Global (in settings.py)
ENCRYPTED_FIELDS_KEY_PROVIDER = "myapp.providers.VaultKeyProvider"

# Per-field (overrides global)
class Patient(models.Model):
    ssn = EncryptedCharField(
        max_length=11,
        searchable=True,
        key_provider=VaultKeyProvider(secret_path="/pii/keys"),
        encryption_provider="myapp.providers.AESGCMProvider",
    )
```

Resolution order: field-level parameter > Django settings > library default.
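
As a rough illustration of that precedence, the sketch below resolves a provider from a field-level value, a settings value, and a default. The function names `load_dotted` and `resolve_provider` are hypothetical, and `collections.OrderedDict` is used only as a stand-in importable class so the example runs without Django; the package's real resolution code may look different.

```python
from importlib import import_module


def load_dotted(path: str):
    """Import a class from a dotted path such as 'myapp.providers.VaultKeyProvider'."""
    module_path, _, attr = path.rpartition(".")
    return getattr(import_module(module_path), attr)


def resolve_provider(field_value, settings_value, default_factory):
    """Hypothetical resolver: field-level > settings > library default."""
    for candidate in (field_value, settings_value):
        if candidate is None:
            continue
        if isinstance(candidate, str):
            # Dotted import path: import the class and instantiate it.
            return load_dotted(candidate)()
        # Already an instance: use it as given.
        return candidate
    return default_factory()


class DefaultProvider:
    """Stand-in for the library default."""


field_level = object()  # stand-in for a provider instance passed to the field

from_field = resolve_provider(field_level, "collections.OrderedDict", DefaultProvider)
from_settings = resolve_provider(None, "collections.OrderedDict", DefaultProvider)
from_default = resolve_provider(None, None, DefaultProvider)
```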

## Admin Integration

Encrypted fields appear in Django admin forms as normal input fields. In list views, values are masked automatically.

Masking options via the `mask` parameter:

```python
# Show last 4 characters (default): "*******6789"
ssn = EncryptedCharField(max_length=11, mask="last4")

# Fully masked: "***********"
secret = EncryptedCharField(max_length=100, mask="full")

# Custom callable
token = EncryptedCharField(max_length=100, mask=lambda v: v[:4] + "****")
```

The companion `_hash` column is never shown in admin forms (`editable=False`).
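
The masking options can be sketched in a few lines of plain Python. `apply_mask` is a hypothetical helper, not the package's actual function, and fully masking values of four characters or fewer is an assumption made here.

```python
def apply_mask(value: str, mask="last4") -> str:
    """Hypothetical helper mirroring the three `mask` options."""
    if callable(mask):
        return mask(value)
    if mask == "full":
        return "*" * len(value)
    if mask == "last4":
        # Assumption: values of 4 characters or fewer are masked entirely.
        visible = value[-4:] if len(value) > 4 else ""
        return "*" * (len(value) - len(visible)) + visible
    raise ValueError(f"Unknown mask: {mask!r}")


masked_ssn = apply_mask("123-45-6789")                             # "*******6789"
masked_full = apply_mask("123-45-6789", "full")                    # "***********"
masked_custom = apply_mask("tok_abcdef", lambda v: v[:4] + "****")  # "tok_****"
```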

## File Fields

`EncryptedFileField` and `EncryptedImageField` encrypt file content at rest using `EncryptedFileSystemStorage`. Files are encrypted on save and decrypted on open.

```python
from encrypted_fields import EncryptedFileField, EncryptedImageField

class Document(models.Model):
    file = EncryptedFileField(upload_to="documents/")
    photo = EncryptedImageField(upload_to="photos/")
```

File content on disk is always ciphertext. Reading through Django's file API returns decrypted content transparently.
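
The encrypt-on-save, decrypt-on-open pattern looks roughly like the sketch below, written against the `cryptography` library and a temporary directory. `save_encrypted` and `open_decrypted` are hypothetical helpers, not the API of `EncryptedFileSystemStorage`. Note that Fernet works on whole byte strings in memory rather than streams.

```python
import tempfile
from pathlib import Path

from cryptography.fernet import Fernet

fernet = Fernet(Fernet.generate_key())  # illustration only; use your configured key


def save_encrypted(path: Path, content: bytes) -> None:
    """Write ciphertext to disk; the plaintext never touches the file."""
    path.write_bytes(fernet.encrypt(content))


def open_decrypted(path: Path) -> bytes:
    """Read ciphertext from disk and return the original bytes."""
    return fernet.decrypt(path.read_bytes())


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "report.txt"
    save_encrypted(target, b"lab results")
    on_disk = target.read_bytes()        # Fernet token, not the plaintext
    round_trip = open_decrypted(target)  # b"lab results"
```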

## Management Commands

### generate_encryption_key

Generate a Fernet encryption key and optional hash salt:

```bash
python manage.py generate_encryption_key
```

Output:

```
ENCRYPTED_FIELDS_KEY = "base64-encoded-key"
ENCRYPTED_FIELDS_HASH_SALT = "random-hex-salt"
```

Pass `--no-salt` to skip salt generation.

### reencrypt

Bulk re-encrypt and re-hash all encrypted field values. Use after key rotation or hash salt changes:

```bash
# Re-encrypt all encrypted fields across all models
python manage.py reencrypt

# Re-encrypt a specific model
python manage.py reencrypt myapp.Patient

# Re-encrypt a specific field
python manage.py reencrypt myapp.Patient.ssn

# Control batch size (default: 100)
python manage.py reencrypt --batch-size=500

# Preview without modifying data
python manage.py reencrypt --dry-run
```

## Development

### Setup

```bash
uv sync
```

### Testing

```bash
uv run pytest
```

The test suite includes 94 tests covering fields, lookups, providers, admin integration, forms, encrypted file storage, and the re-encryption command. Tests use pytest-django with an in-memory SQLite database.

### Type Checking

```bash
uv run mypy encrypted_fields
```

Uses mypy with the [django-stubs](https://github.com/typeddjango/django-stubs) plugin for full Django type awareness.

### Linting

```bash
uv run ruff check .    # Lint
uv run ruff format .   # Format
```

## License

MIT. See [LICENSE](LICENSE) for details.

## Limitations

- **`QuerySet.update()`** encrypts the value but does not update the companion `_hash` column, so hash lookups can miss the updated rows. Use model instance `save()` instead.
- **`values()` / `values_list()`** returns ciphertext, not decrypted values. Access encrypted fields through model instances.
- **`order_by()`** is not meaningful on encrypted fields (ciphertext has no sort order).
- **Lookups** are limited to `exact`, `in`, and `isnull`. Partial matches (`contains`, `startswith`) and comparisons (`gt`, `lt`) are not possible on encrypted data.
- **`EncryptedBinaryField`** does not support `searchable=True`.
