Score:2

Encrypt multiple chunks of data with an AEAD


Suppose I want to encrypt a 1 GB file with, e.g., AES in GCM mode or ChaCha20Poly1305.

[I'm specifically referring to the cryptography module for Python: https://cryptography.io/en/latest/hazmat/primitives/aead/ but I can't find anything in the documentation]

For "non-AEAD" ciphers, the syntax is basically

cipher = Cipher(key)
cipher.encrypt(data)

and I can call .encrypt() as many times as I like.
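For concreteness, with the cryptography module that pattern looks roughly like the sketch below. It assumes AES in CTR mode (which provides no authentication); the key, nonce, and sample chunks are made up for illustration.

```python
import os

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)    # 256-bit AES key (illustrative)
nonce = os.urandom(16)  # CTR mode takes a 16-byte initial counter block

# One encryptor object keeps the cipher state across calls.
encryptor = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()

ciphertext = b""
for chunk in (b"first chunk, ", b"second chunk, ", b"third chunk"):
    # update() can be called as many times as needed
    ciphertext += encryptor.update(chunk)
ciphertext += encryptor.finalize()
```

The result is the same as encrypting the concatenated data in a single call, which is why chunking is trivial here.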

However for AEAD ciphers, the syntax is

aead = AEAD(key)
aead.encrypt(nonce, data, associated_data)

So I need to pass "nonce" and "associated_data" as arguments when I want to encrypt "data".

I understand that the last 16 bytes of the ciphertext are the authentication tag that ensures integrity, and that the library's AEAD classes accept at most 2^31-1 bytes of data per call.
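To illustrate the one-shot call and the trailing tag, here is a minimal sketch assuming ChaCha20Poly1305 from the same module. The key, nonce, message, and associated data are placeholders chosen for the example.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

key = ChaCha20Poly1305.generate_key()
aead = ChaCha20Poly1305(key)
nonce = os.urandom(12)        # 96-bit nonce, must never repeat under one key
data = b"a secret message"
associated_data = b"header"   # authenticated but not encrypted

ciphertext = aead.encrypt(nonce, data, associated_data)
print(len(ciphertext) - len(data))  # 16: the tag is appended to the output

# Decryption needs the same nonce and associated data.
assert aead.decrypt(nonce, ciphertext, associated_data) == data

# Flipping a single bit makes decryption fail instead of returning garbage.
tampered = bytes([ciphertext[0] ^ 1]) + ciphertext[1:]
try:
    aead.decrypt(nonce, tampered, associated_data)
    tamper_detected = False
except InvalidTag:
    tamper_detected = True
```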

Does that mean that I can't read and encrypt the file in chunks, I have to load the whole file into memory and there's essentially no way to encrypt anything beyond 2 GB? (It can't possibly be correct to pass the nonce as an argument every time I encrypt a data chunk...)

Score:3

This is a good question that should be answered more frequently in documentation.

"Does that mean that I can't read and encrypt the file in chunks, I have to load the whole file into memory and there's essentially no way to encrypt anything beyond 2 GB?"

Nope, you can and should encrypt files in chunks. The main benefit aside from reduced memory usage is that you detect modification/corruption sooner.

I would recommend 16, 32, or 64 KiB as the chunk size and that you use a unique key for each file.
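As a rough illustration of that setup, the sketch below reads a stream in 64 KiB pieces and generates a fresh key for the file. The helper name `read_chunks` and the use of `os.urandom` for the key are assumptions made for the example, not part of any library API.

```python
import io
import os

CHUNK_SIZE = 64 * 1024  # 64 KiB of plaintext per chunk


def read_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive chunks of at most chunk_size bytes until EOF."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


file_key = os.urandom(32)  # fresh 256-bit key, generated once per file

# Only one chunk needs to be held in memory at a time.
sizes = [len(c) for c in read_chunks(io.BytesIO(bytes(150_000)))]
print(sizes)  # [65536, 65536, 18928]
```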

"(It can't possibly be correct to pass the nonce as an argument every time I encrypt a data chunk...)"

Actually, it is. However, the nonce should change each time.

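  • With AEADs that have a short nonce (e.g. 96 bits) like AES-GCM and ChaCha20-Poly1305, it's best to use a counter starting at 0 and increment it for each chunk. Because the counter restarts at 0 for every file, this is only safe when each file has its own key.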
  • With AEADs that have a longer nonce (e.g. 128, 192, or 256 bits) like XChaCha20-Poly1305 and AEGIS-256, it makes more sense to randomly generate the nonce for each chunk. Otherwise, you may as well just use an AEAD with a shorter nonce.
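The two nonce strategies can be sketched as follows. The function names are hypothetical, and the sizes (12 bytes for the counter, 24 bytes for the random nonce, as used by XChaCha20-Poly1305) are assumptions picked to match the examples in the list above.

```python
import os


def counter_nonce(chunk_index: int, size: int = 12) -> bytes:
    """Short-nonce AEADs: encode the chunk index as a big-endian counter."""
    return chunk_index.to_bytes(size, "big")


def random_nonce(size: int = 24) -> bytes:
    """Long-nonce AEADs: draw a fresh random nonce for every chunk."""
    return os.urandom(size)


print(counter_nonce(0).hex())  # 000000000000000000000000
print(counter_nonce(1).hex())  # 000000000000000000000001
```

A counter nonce can be recomputed during decryption, whereas a random nonce has to be stored next to each chunk.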

Unfortunately, it's not this simple.

  1. You need to prevent an attacker rearranging/duplicating chunks. A counter nonce solves this problem. With random nonces, one approach is to use the previous authentication tag as associated data for the next chunk.
  2. You need to prevent an attacker truncating the entire file. If you can calculate the encrypted file length in advance, it can be converted to bytes and used as associated data for the first chunk. However, a more common solution is the STREAM construction, which involves a counter nonce with the last byte reserved to indicate whether it's the final chunk.
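A minimal sketch of a STREAM-style scheme, under stated assumptions: ChaCha20Poly1305 from the cryptography module, a 12-byte nonce laid out as an 11-byte big-endian counter plus a 1-byte final-chunk flag, 64 KiB chunks, a unique key per file, and streams whose `read(n)` returns fewer than `n` bytes only at end of file. The function names are hypothetical, and this is an illustration rather than a vetted file format.

```python
import io
import os
from typing import BinaryIO

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

CHUNK_SIZE = 64 * 1024  # plaintext bytes per chunk (assumed)
TAG_SIZE = 16           # Poly1305 tag appended to every chunk


def stream_nonce(counter: int, is_last: bool) -> bytes:
    # 11-byte counter, then one byte that is 1 only for the final chunk.
    return counter.to_bytes(11, "big") + (b"\x01" if is_last else b"\x00")


def encrypt_stream(key: bytes, src: BinaryIO, dst: BinaryIO) -> None:
    aead = ChaCha20Poly1305(key)
    counter = 0
    chunk = src.read(CHUNK_SIZE)
    while True:
        # Read one chunk ahead so we know whether the current one is last.
        next_chunk = src.read(CHUNK_SIZE)
        is_last = not next_chunk
        dst.write(aead.encrypt(stream_nonce(counter, is_last), chunk, None))
        if is_last:
            return
        chunk, counter = next_chunk, counter + 1


def decrypt_stream(key: bytes, src: BinaryIO, dst: BinaryIO) -> None:
    aead = ChaCha20Poly1305(key)
    counter = 0
    chunk = src.read(CHUNK_SIZE + TAG_SIZE)
    while True:
        next_chunk = src.read(CHUNK_SIZE + TAG_SIZE)
        is_last = not next_chunk
        # Raises InvalidTag if a chunk was modified, reordered, or dropped.
        dst.write(aead.decrypt(stream_nonce(counter, is_last), chunk, None))
        if is_last:
            return
        chunk, counter = next_chunk, counter + 1
```

Reordering or duplicating chunks fails because each chunk only verifies under its own counter value. Cutting off the end of the file fails because the chunk that is now last was encrypted with the flag byte set to 0. Since `decrypt_stream` writes plaintext as it goes, a caller should discard the output if `InvalidTag` is raised partway through.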

