Score:3

Does AEAD provide any benefit over raw cipher in this setting?


I'm working on a cryptographic data store where blobs need to be identified and referenced via a hash of the encrypted data. Think Merkle tree with encrypted nodes. In such a setting, where the hash already establishes authenticity (assuming the hash function itself is not broken), is there any value in using an AEAD rather than just using the cipher directly?

I believe this is different from the classic encrypt-then-MAC topic because there is no hash or MAC stored with the blob to authenticate it. Rather, the hash is an external reference from elsewhere that is already authenticated (not subject to alteration by an attacker).

A further detail I originally omitted thinking it was irrelevant, but in hindsight it seems to clarify the problem: there is no preshared symmetric key; the key that could be used in a raw cipher or AEAD is one derived from an ephemeral secret and the receiving party's public key via ECDH. As such, any attacker who knows the public key can produce a blob with a valid AEAD tag using their own ephemeral secret. However, assuming the hash function is not broken, such a blob will not hash to a value the receiving party expects, and thus will never be used.
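
To make that last point concrete, here is a toy sketch. Everything in it is an assumption made for illustration: finite-field Diffie-Hellman over a small Mersenne prime stands in for ECDH, HMAC-SHA256 stands in for the AEAD tag, SHA-256 stands in for the reference hash, and the payload is left unencrypted to keep the example short. It is not secure and is not the real system's code.

```python
import hashlib
import hmac
import secrets

# Toy parameters, for illustration only (not secure, not from the real system).
P = 2**127 - 1   # small Mersenne prime standing in for an elliptic-curve group
G = 3


def keypair() -> tuple[int, int]:
    """Return (secret, public) for the toy Diffie-Hellman group."""
    secret = secrets.randbelow(P - 3) + 2   # value in [2, P-2]
    return secret, pow(G, secret, P)


def derive_key(own_secret: int, peer_public: int) -> bytes:
    """Hash the shared DH value into a 32-byte symmetric key."""
    shared = pow(peer_public, own_secret, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def seal(receiver_public: int, payload: bytes) -> bytes:
    """Build a blob: ephemeral public value || payload || tag."""
    eph_secret, eph_public = keypair()
    key = derive_key(eph_secret, receiver_public)
    header = eph_public.to_bytes(16, "big")
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def tag_is_valid(receiver_secret: int, blob: bytes) -> bool:
    """Recompute the tag the way the receiver would and compare."""
    header, payload, tag = blob[:16], blob[16:-32], blob[-32:]
    key = derive_key(receiver_secret, int.from_bytes(header, "big"))
    expected = hmac.new(key, header + payload, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


receiver_secret, receiver_public = keypair()

genuine = seal(receiver_public, b"trusted script")
trusted_ref = hashlib.sha256(genuine).digest()   # the already-authenticated reference

# The attacker needs nothing but the receiver's public key.
forged = seal(receiver_public, b"malware sample")

print(tag_is_valid(receiver_secret, genuine))            # True
print(tag_is_valid(receiver_secret, forged))             # True: the tag proves nothing about the sender
print(hashlib.sha256(forged).digest() == trusted_ref)    # False: the reference hash rejects it
```

Both blobs carry a tag that verifies, so the tag alone cannot tell them apart; only the comparison against the trusted reference does.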

Score:1

If there are already means to authenticate the plaintext then it is indeed possible to skip authentication of plaintext or ciphertext by other means.

There are of course some gotchas.

First of all, the plaintext should not be used by any means before the authenticated hash value is verified. If this isn't the case then the attacker can change the plaintext, which means that the party is subject to many types of attacks including plaintext oracle or possibly fault injection.
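
As a minimal sketch of that ordering, assuming SHA-256 as the reference hash: the names `load_blob`, `fetch`, and `decrypt` below are hypothetical placeholders, not part of any real API. The point is only that the cipher never sees a byte of the blob until the hash comparison has passed.

```python
import hashlib
import hmac
from typing import Callable


class HashMismatch(Exception):
    """Raised when a fetched blob does not match its trusted reference."""


def load_blob(
    expected_hash: bytes,
    fetch: Callable[[bytes], bytes],
    decrypt: Callable[[bytes], bytes],
) -> bytes:
    """Fetch a blob by its hash and decrypt it only after the hash checks out.

    `fetch` and `decrypt` are hypothetical callables supplied by the caller;
    SHA-256 is an assumption, not something the question specifies.
    """
    blob = fetch(expected_hash)
    actual = hashlib.sha256(blob).digest()
    # Reject the blob before the cipher touches attacker-controlled bytes.
    if not hmac.compare_digest(actual, expected_hash):
        raise HashMismatch("blob does not match its reference")
    return decrypt(blob)
```

With this shape, a tampered blob produces the same error regardless of how it was modified, so the decryption routine cannot be used as an oracle.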

Furthermore, the implementation should not leak any information about the decryption process. If it does, it may become subject to side-channel attacks. Worse, if e.g. CBC is used, then padding oracle attacks apply. So it makes more sense to use AES-CTR or a stream cipher such as ChaCha20.

The last two design/implementation mistakes can be avoided when using an authenticated mode. So there can be some use to authenticated mode even if the key cannot be trusted. One disadvantage is that other developers could assume that authenticated mode does provide the authentication required, and start using the plaintext even if the messages haven't been authenticated yet.

Personally I would not use authenticated mode for this.


Note that IV handling is not specified in the protocol you describe. The IV does not need to be covered by the hash if the hash protects the plaintext rather than the ciphertext. However, you should make sure that it is unique for each message encrypted with the same key (and, if you use CBC mode, unpredictable).
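
A rough sketch of those two requirements, under assumed sizes (a 96-bit nonce for CTR or ChaCha20-style modes, a 128-bit IV for CBC with AES). `NonceSource` and `random_cbc_iv` are hypothetical helper names: a counter guarantees uniqueness under one key, while a CBC IV is drawn from a CSPRNG so that it is also unpredictable.

```python
import secrets


class NonceSource:
    """Hands out 96-bit nonces that never repeat for one key (one instance per key)."""

    def __init__(self) -> None:
        self._prefix = secrets.token_bytes(4)   # random per-instance prefix
        self._counter = 0

    def next_nonce(self) -> bytes:
        if self._counter >= 2**64:
            raise OverflowError("nonce space exhausted; rotate the key")
        nonce = self._prefix + self._counter.to_bytes(8, "big")
        self._counter += 1
        return nonce


def random_cbc_iv() -> bytes:
    """CBC needs an unpredictable IV, so take 16 bytes from a CSPRNG."""
    return secrets.token_bytes(16)
```

A predictable counter is fine for CTR-style modes but not for CBC, which is why the two cases are kept separate here.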

I also can't tell whether the protocol is susceptible to replay attacks and plaintext guessing attacks if the authenticated hash is used both for identification and authentication.

Maarten Bodewes
Is there a better term for "plaintext guessing attacks" or "message guessing attacks"?
R.. GitHub STOP HELPING ICE
It's actually the ciphertext (including the nonce and ephemeral public) that's already authenticated because data is addressed by hash of the ciphertext. Because of this there is no information for plaintext guessing. IVs are handled correctly (not reused).
R.. GitHub STOP HELPING ICE
If you're interested in more details of the system this is about, it's now public: https://github.com/richfelker/bakelite
Maarten Bodewes
Very interesting! It would be nice if you could write a more formal data model for it. Currently it is just a textual description and code. Furthermore, since reliability is the main concern for a backup program, I would expect some code to perform functional testing of the routines used. If the design is known, other people may be able to help you with these kinds of things. A white paper could do wonders to help people adopt this.
Score:-1

A hash establishes integrity, not authenticity. An AEAD tag or MAC of the ciphertext establishes both integrity and authenticity.

If the hash is of the ciphertext, the attacker can simply modify the ciphertext and compute a new hash, since hashes don't depend on a secret key.

If the hash is of the plaintext, you get the same weaknesses as with "MAC-then-encrypt" schemes where you violate the cryptographic doom principle.

R.. GitHub STOP HELPING ICE
I think you misread the question. The hash is known and trusted out-of-band, so an attacker computing a new hash is useless. They won't produce a blob matching the expected hash unless the hash function is broken.
SAI Peregrinus
The problem is that trust. A hash of the ciphertext out of band can provide authentication if the out of band channel provides the authentication (MACs or signs the hash), but without that the out of band channel isn't as safe as a simple in-band MAC of the ciphertext or AEAD tag.
R.. GitHub STOP HELPING ICE
The OOB channel is fundamental to the setting I'm asking about; the hash there is the *identity* of the data being referenced. (Think analogously to how hashes are identities in git or a hash based filesystem.) I'm asking in this setting what if any protection AEAD could provide.
Manish Adhikari
It may be safe if your out-of-band channel is trustworthy, but I always use AEAD regardless.
R.. GitHub STOP HELPING ICE
I think you're still missing the point. If the user of the data has been given the wrong hash, it's already a critical breach because the wrong data will be accessed. For example, they might load `$malware_sample` rather than `$trusted_script`. In the setting here, the concept of authenticity seems meaningless without authenticity of the reference.
R.. GitHub STOP HELPING ICE
Moreover - I omitted this from the question because I thought it wasn't relevant, but maybe I shouldn't have - the symmetric key the blobs are encrypted with is not a preshared key but output of public key crypto. As such, anyone who knows the public key can make a blob with a valid AEAD tag, but this isn't useful because the resulting hash will not match anything the process consuming the blob is looking for.
SAI Peregrinus
A hash doesn't say who created the hash. If the trusted channel provides authenticity (only an authenticated party could send messages on the channel, and no attacker can modify the messages) then the hash does too. If either of those things isn't true, then a hash isn't enough and you want a MAC or AEAD tag. AEADs are also often a lot faster than hashing the ciphertext, since you usually only have to process the data once instead of twice.
R.. GitHub STOP HELPING ICE
@SAIPeregrinus: It's not a question of "hash or AEAD". The hash is fundamentally necessary (Merkle tree type construct) and the question was just whether AEAD adds any value. But I think it's become clear that it doesn't.