Unfortunately the above documentation is more focused on its
properties and usage instead of how this construct works.
libsodium's secretstream is defined in the documentation, although I will admit that some of the notation looks a bit confusing. As a summary:
- Like XChaCha20, it uses HChaCha20 to derive a subkey and takes the last 64 bits of the nonce as the nonce for encryption. Importantly, this is only done once for all chunks, not for every chunk, so it's not the same as chunked XChaCha20-Poly1305. That would be less efficient.
- Encryption is done using ChaCha20-Poly1305, with the subkey and a 32-bit counter starting at 1 prepended to the 64-bit nonce. The message is the tag specifying what type of chunk it is (
TAG_MESSAGE
, TAG_FINAL
, TAG_PUSH
, or TAG_REKEY
) concatenated with zeros for padding and the actual message. When outputting the ciphertext, the first block (containing the tag and zeros) is truncated to the tag size, which reduces ciphertext expansion.
- The 64-bit nonce becomes the nonce XOR the first 64 bits of the Poly1305 tag.
- The 32-bit counter gets incremented by 1.
- If the counter is 0, secretstream automatically rekeys, which avoids any limits on the total length of the stream.
- This process repeats for the number of chunks. Rekeying manually or automatically involves encrypting the subkey concatenated with the nonce using ChaCha20. The output is used as the new subkey and 64-bit nonce, and then the 32-bit counter gets reset to 1.
TAG_FINAL
performs rekeying to indicate the final chunk.
Why is this secure?
- Stream truncation: avoided by using and checking
TAG_FINAL
.
- Chunk removal: the wrong nonce would be used, producing an AEAD decryption error.
- Chunk reordering: the wrong nonce would be used, producing an AEAD decryption error.
- Chunk duplication: the wrong nonce would be used, producing an AEAD decryption error.
- Chunk modification: this is what an AEAD is designed to detect.
However, this is relatively complicated to implement compared to alternatives, it doesn't support random-access decryption because of the nonce XOR, it requires encrypting an extra block because of the tag and padding, and the ciphertext is 1 byte longer per chunk than normal due to the tag functionality.
With that said, if you have access to the API, I would recommend using it because it prevents mistakes and the tag and rekeying functionality may be useful.
Is there any documentation or similar patterns which explain simply
how nonce / key is derived for the subsequent chunks and how certain
protections (E.g. reordering prevention) can be implemented in
practice (e.g. if Authenticated Data is used / if chunks are linked).
The most influential paper in this area is probably Online Authenticated-Encryption and its Nonce-Reuse Misuse-Resistance, which introduced the STREAM and CHAIN constructions. STREAM is for nonce-based AEAD schemes, whereas CHAIN is for misuse-resistant AEAD schemes.
Variations of STREAM have been used in several projects, like age, Google Tink, Miscreant, and Kryptor. Tink's scheme was analysed in this paper. The original STREAM looks like this:
The nonce consists of a random prefix, a counter starting at 1 that gets incremented per chunk, and a single 0 or 1 byte to indicate the final chunk.
However, because a 256-bit key protects against batch/multi-target attacks, this can be simplified to using a counter nonce rather than a random nonce prefix combined with a counter. This is a lot easier to implement than secretstream and supports random-access decryption.
STREAM is also more efficient than using the associated data to indicate the chunk because associated data is extra data that needs to be processed compared to the nonce. It also frees up the associated data for specifying other information more easily, which often only needs to be specified for the first chunk.
Why is this secure?
- Stream truncation: avoided by using the single 1 byte in the nonce to indicate the final chunk.
- Chunk removal: the wrong nonce would be used, producing an AEAD decryption error.
- Chunk reordering: the wrong nonce would be used, producing an AEAD decryption error.
- Chunk duplication: the wrong nonce would be used, producing an AEAD decryption error.
- Chunk modification: this is what an AEAD is designed to detect.
In contrast, the CHAIN construction looks like this:
Like secretstream, this dependency between chunks prevents random-access decryption.
For nonce-based AEAD schemes, a similar technique is possible where you use the previous tag as the associated data of the next chunk, with any header information as the associated data for the first chunk. This makes swapping chunks between files harder with nonce reuse. To prevent stream truncation, you can use the length of the ciphertext as associated data for the first chunk or include a flag as associated data for the last chunk.
As a final example, I will mention the recently added Monocypher API. Instead of using associated data or nonces, it changes the key per chunk by taking the 32 spare bytes after those used for the Poly1305 key in ChaCha20-Poly1305. Since these bytes were being encrypted before, this is efficient. This also prevents the need for separate rekeying functionality.
However, the API provides no built-in way of detecting stream truncation, and this is trickier to implement than STREAM in the sense that you can't use existing ChaCha20-Poly1305 APIs from cryptographic libraries.
I would also like to understand if it's feasible to design such stream
cipher to achieve random access (think of decrypting the video stream
from certain point in time).
Yes, you can seek to the final chunk to verify that the stream hasn't been truncated before seeking to the target chunk.
However, seeking can reveal that the same ciphertext chunk was accessed/transmitted twice for services like Netflix if you're not careful.
I would like to encrypt big files using an authenticated cipher. I am
convinced to use approach where file is divided into smaller
manageable chunks that fit easily in memory (e.g. 1-10MB size) which
are encrypted, authenticated separately. Unfortunately such approach
is prone to "reordering" attack at least.
I am looking for some theory behind the streamed protocols that apply
best practices in order to implement continuous encrypted stream. I
would be looking to implement it using: either Chacha20-Poly1305 or
XChacha20-Poly1305 which uses 24 bytes nonce.
I would recommend a chunk size of 16, 32, or 64 KiB as this uses little memory and matches some existing protocols. If you use a new key per stream, ChaCha20-Poly1305 with STREAM is a simple, efficient approach. XChaCha20-Poly1305 will be less efficient due to the subkey derivation, and the random nonce isn't necessary when you rekey for every stream.