Encrypted Storage Process¶
This guide documents how lx-annotate encrypts managed files on disk today,
how files are read back through Django storage, and how operators can verify or
repair the managed storage tree.
Scope¶
This guide is about local managed storage implemented by:
lx_annotate.storage.encrypted.EncryptedStorage
lx_annotate.storage.encryption
It does not describe outbound transfer encryption. Per the current architecture, transport security is handled separately, and raw media export remains prohibited.
Summary¶
When encrypted storage is enabled, Django default_storage is configured to use
EncryptedStorage. Plaintext content is accepted at the application boundary,
but ciphertext is what gets persisted on disk. Each saved file gets:
a fresh per-file Data Encryption Key (DEK)
a file header containing wrapped key metadata
chunked AES-GCM encryption for the file payload
atomic replacement semantics when writing or repairing files
The long-lived master key is used only to wrap and unwrap the per-file DEK. It is not embedded into file payloads and must not be committed to version control or transmitted over the network.
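As a rough illustration, on Django 4.2 or later the default storage backend could be selected through the STORAGES setting. The dotted path comes from the Scope section above. Whether the project wires it up exactly this way, and the staticfiles entry shown here, are assumptions.

```python
# settings.py (sketch; the project's actual settings layout is an assumption)
STORAGES = {
    "default": {
        # Dotted path taken from the Scope section of this guide.
        "BACKEND": "lx_annotate.storage.encrypted.EncryptedStorage",
    },
    "staticfiles": {
        # Django's stock static files backend, shown only for completeness.
        "BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
    },
}
```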
Key Loading¶
EncryptedStorage resolves its master key at initialization time.
LX_ANNOTATE_MASTER_KEY may contain the key directly.
LX_ANNOTATE_MASTER_KEY_FILE may point to a file containing the key.
The key material must be urlsafe-base64 encoded.
After decoding, the raw key length must be 16, 24, or 32 bytes for AES-GCM.
If neither variable is set, storage initialization fails closed with a runtime error. The application does not generate a fallback key.
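The key resolution rules above can be sketched as follows. This is a minimal illustration, not the project's code: the function name `load_master_key` is hypothetical, and giving the direct variable precedence over the file variable is an assumption, since the guide does not state which one wins when both are set.

```python
import base64
import os
from pathlib import Path


def load_master_key(environ=os.environ) -> bytes:
    """Resolve and validate the master key, failing closed if it is missing."""
    encoded = environ.get("LX_ANNOTATE_MASTER_KEY")
    if not encoded:
        key_file = environ.get("LX_ANNOTATE_MASTER_KEY_FILE")
        if key_file:
            encoded = Path(key_file).read_text(encoding="ascii").strip()
    if not encoded:
        # No fallback key is generated; initialization stops here.
        raise RuntimeError("no master key configured")
    raw = base64.urlsafe_b64decode(encoded)
    if len(raw) not in (16, 24, 32):
        raise RuntimeError("master key must decode to 16, 24, or 32 bytes")
    return raw
```

A suitable 32-byte key can be produced with `base64.urlsafe_b64encode(os.urandom(32))` and stored outside version control.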
On-Disk Format¶
Each encrypted file starts with a small framing structure before the encrypted payload bytes:
Magic prefix: LXENC01\n
Four-byte big-endian header length
JSON-encoded header
Repeated encrypted chunks:
four-byte big-endian ciphertext length
ciphertext bytes for that chunk
The header contains:
version
algorithm
chunk_size
wrapped_dek
wrap_nonce
nonce_prefix
The current algorithm value is AESGCM-chunked-v1.
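The framing described above can be parsed with a few lines of standard-library code. The sketch below is illustrative only: `read_header` and `iter_chunk_frames` are hypothetical names, the length fields are assumed to be unsigned, and the header is assumed to be UTF-8 JSON. How binary fields such as wrapped_dek are encoded inside the JSON is not covered here.

```python
import json
import struct

MAGIC = b"LXENC01\n"


def read_header(stream) -> dict:
    """Read the magic prefix, header length, and JSON header from a stream."""
    if stream.read(len(MAGIC)) != MAGIC:
        raise ValueError("missing LXENC01 magic prefix")
    length_bytes = stream.read(4)
    if len(length_bytes) != 4:
        raise ValueError("truncated header length")
    (header_len,) = struct.unpack(">I", length_bytes)  # big-endian uint32
    header_bytes = stream.read(header_len)
    if len(header_bytes) != header_len:
        raise ValueError("truncated header")
    return json.loads(header_bytes)


def iter_chunk_frames(stream):
    """Yield each raw ciphertext chunk that follows the header."""
    while True:
        length_bytes = stream.read(4)
        if not length_bytes:
            return  # clean end of file
        if len(length_bytes) != 4:
            raise ValueError("truncated chunk length")
        (chunk_len,) = struct.unpack(">I", length_bytes)
        chunk = stream.read(chunk_len)
        if len(chunk) != chunk_len:
            raise ValueError("truncated chunk")
        yield chunk
```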
Write Path¶
Normal encrypted writes happen through EncryptedStorage._save().
Django hands plaintext content to storage.
EncryptedStorage creates a temporary file in the target directory.
encrypt_stream() generates a fresh 32-byte per-file DEK.
The DEK is wrapped with the long-lived master key using AES-GCM and stored in the file header.
The plaintext stream is read in chunks. The default chunk size is 1 MiB.
Each chunk is encrypted with AES-GCM using the raw (not wrapped) DEK.
The chunk nonce is derived from:
a random per-file nonce_prefix
a monotonically increasing chunk counter
The encrypted temp file is flushed and fsync()’d.
The temp file is atomically moved into place with endoreg_db.utils.file_operations.atomic_move_file.
If any step fails, the temp file is removed and the original target path is not left half-written.
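The chunking and nonce derivation steps can be sketched without any cryptographic library. The field widths below are assumptions, since this guide does not specify them: a 4-byte prefix and an 8-byte big-endian counter, which together give the 12-byte nonce commonly used with AES-GCM. The function names are hypothetical.

```python
import secrets

NONCE_PREFIX_LEN = 4  # assumption: prefix width is not specified in this guide
COUNTER_LEN = 8       # assumption: 4 + 8 = 12-byte (96-bit) GCM nonce


def derive_chunk_nonce(nonce_prefix: bytes, counter: int) -> bytes:
    """Combine the per-file prefix with a big-endian chunk counter."""
    if len(nonce_prefix) != NONCE_PREFIX_LEN:
        raise ValueError("unexpected nonce prefix length")
    return nonce_prefix + counter.to_bytes(COUNTER_LEN, "big")


def iter_chunks_with_nonces(stream, chunk_size=1024 * 1024):
    """Yield (nonce, plaintext_chunk) pairs for a plaintext stream."""
    nonce_prefix = secrets.token_bytes(NONCE_PREFIX_LEN)  # random per file
    counter = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield derive_chunk_nonce(nonce_prefix, counter), chunk
        counter += 1
```

Each pair would then be handed to an AES-GCM implementation together with the per-file DEK. Because the counter never repeats within a file and the DEK is fresh for every file, no nonce is reused under the same key.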
Read Path¶
Reads go through EncryptedStorage._open().
The raw on-disk file is opened as ciphertext.
A DecryptedStream reads the header and unwraps the per-file DEK using the configured master key.
Each ciphertext chunk is decrypted on demand.
Django callers receive plaintext bytes through the storage API.
This means application code that uses Django storage sees normal file contents, while the filesystem only stores ciphertext.
Random Access And Indexing¶
The storage implementation also supports byte-range reads for streamable access.
build_chunk_index() walks the encrypted file and records the ciphertext and plaintext offsets for each chunk.
iter_decrypted_byte_range() uses that index to decrypt only the required chunk range.
EncryptedStorage caches the chunk index using path, modification time, and file size as the cache key.
This avoids decrypting the whole file when only a portion of the plaintext is needed.
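The indexing idea can be illustrated with a small sketch. It is not the project's build_chunk_index(): the function names and tuple layout are hypothetical, and it assumes each ciphertext chunk carries a 16-byte GCM tag, so the plaintext length of a chunk is its ciphertext length minus 16.

```python
import bisect
import io
import struct

GCM_TAG_LEN = 16  # assumption: each ciphertext chunk ends with a 16-byte tag


def build_chunk_index_sketch(stream, payload_start: int):
    """Return (ciphertext_offset, ciphertext_len, plaintext_offset) per chunk."""
    index = []
    stream.seek(payload_start)
    plaintext_offset = 0
    while True:
        length_bytes = stream.read(4)
        if len(length_bytes) < 4:
            break  # end of file
        (ct_len,) = struct.unpack(">I", length_bytes)
        index.append((stream.tell(), ct_len, plaintext_offset))
        plaintext_offset += ct_len - GCM_TAG_LEN
        stream.seek(ct_len, io.SEEK_CUR)  # skip the ciphertext without reading
    return index


def chunks_for_range(index, start: int, end: int):
    """Return the chunk numbers covering plaintext bytes [start, end)."""
    offsets = [entry[2] for entry in index]
    first = bisect.bisect_right(offsets, start) - 1
    last = bisect.bisect_left(offsets, end) - 1
    return list(range(first, last + 1))
```

Only the chunks returned by `chunks_for_range` need to be read and decrypted; the caller then trims the first and last decrypted chunk to the requested byte range.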
How Encryption Is Detected¶
EncryptedStorage.is_encrypted(name) checks whether the raw file starts with
the expected magic prefix LXENC01\n.
That check is intentionally simple:
if the magic prefix is present, the file is treated as encrypted
if the prefix is absent, the file is treated as plaintext or unsupported
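A check of this kind amounts to comparing the first eight bytes of the raw file. The sketch below uses a hypothetical helper name, `looks_encrypted`, and works on a filesystem path rather than a storage name.

```python
MAGIC = b"LXENC01\n"


def looks_encrypted(path) -> bool:
    """Return True if the file at `path` starts with the magic prefix."""
    with open(path, "rb") as raw:
        return raw.read(len(MAGIC)) == MAGIC
```

Note that a prefix check says nothing about whether the rest of the file is intact or decryptable with the current master key.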
Repair Path For Accidentally Plaintext Files¶
The repair_managed_payloads management command exists for cases where a file
was copied directly into managed storage and bypassed the encrypted save path.
The command:
Verifies that Django default_storage is actually EncryptedStorage
Scans the managed storage root, or a --path-prefix subtree
Skips symlinks
Uses is_encrypted() to classify each regular file
Re-encrypts plaintext files in place with EncryptedStorage.repair_plaintext_file()
repair_plaintext_file() works by:
Opening the existing on-disk file as raw plaintext
Encrypting it into a temp file in the same directory
Flushing and syncing the temp file
Copying the original file mode onto the temp file
Atomically replacing the plaintext file with the encrypted version
This is an in-place repair of storage state, not a decryption step.
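The temp-file-and-replace pattern behind these steps can be sketched with the standard library. This is an illustration under assumptions, not the project's repair_plaintext_file(): `replace_atomically` is a hypothetical helper, the `transform` callback stands in for the encryption step, and `os.replace` stands in for whatever atomic move helper the project uses.

```python
import os
import shutil
import tempfile


def replace_atomically(target: str, transform) -> None:
    """Rewrite `target` via a temp file in the same directory, then swap it in."""
    directory = os.path.dirname(os.path.abspath(target))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp, open(target, "rb") as src:
            transform(src, tmp)     # e.g. encrypt src into tmp
            tmp.flush()
            os.fsync(tmp.fileno())  # push bytes to disk before the swap
        shutil.copymode(target, tmp_path)  # keep the original permission bits
        os.replace(tmp_path, target)       # atomic on the same filesystem
    except BaseException:
        # Remove the temp file; the original target is left untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Creating the temp file in the same directory keeps the final rename on one filesystem, which is what makes the replacement atomic on POSIX systems.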
Verification Command¶
The verify_encrypted_storage management command performs a round-trip probe:
Writes a unique plaintext probe through Django storage
Reads the probe back through Django storage and verifies the plaintext
Reads the raw file directly from disk
Fails if the plaintext appears directly on disk
Fails if the file does not start with the encrypted-file magic prefix
This command is useful after deployment changes, key provisioning changes, or storage migrations.
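The pass and fail conditions of such a probe can be expressed as a small pure function. The sketch below is an assumption about how the checks might be combined, not the command's actual code: `check_probe` is a hypothetical name, and the caller is expected to supply the bytes read back through Django storage and the bytes read directly from disk.

```python
MAGIC = b"LXENC01\n"


def check_probe(raw_on_disk: bytes, read_back: bytes, probe: bytes) -> list:
    """Return a list of failure messages; an empty list means the probe passed."""
    failures = []
    if read_back != probe:
        failures.append("storage did not return the original plaintext")
    if probe in raw_on_disk:
        failures.append("plaintext probe is visible in the raw file")
    if not raw_on_disk.startswith(MAGIC):
        failures.append("raw file lacks the encrypted-file magic prefix")
    return failures
```

Using a unique, high-entropy probe value makes an accidental match between the probe and ciphertext bytes practically impossible.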
Operational Notes¶
The application service must be able to read the configured master key file.
If encrypted storage is active but the key cannot be read, Django storage initialization fails and repair or verification commands will not run.
The file header contains the wrapped per-file DEK and nonce metadata, but not the long-lived master key itself.
Encryption at this layer protects managed files on disk. It does not replace deployment requirements such as protected mounts, strict permissions, and transport security controls.