Score:2

Can I use last N blocks of AES-CBC +IV as hash?

in flag

Suppose I want to take advantage of hardware accelerated symmetric encryption like AES and use it to compute a “hash”. How collision resistant would it be if I’d take the IV + last N blocks of an AES CBC transformation and use it as my hash. What would be good Values for N?

In my hypothetical use case I am not immediately concerned about adversarial inputs / pre image attacks.

DannyNiu avatar
vu flag
The [SPHINCS+](http://sphincs.org/resources.html) hash-based digital signature scheme has a variant that uses Haraka as the underlaying hash function, which can be computed using AES-NI CPU instructions. It's mostly suited for short inputs.
Marc Ilunga avatar
tr flag
If you fix the IV and assuming, inputs are prefix free, CBC is considered a PRF. So the last block is a kind of hash. But I don't know by heart the security bound for input length limits
Gilles 'SO- stop being evil' avatar
cn flag
@MarcIlunga A PRF doesn't give a hash. It gives a MAC, but it doesn't say anything about the ability to generate collisions if you know the key.
Marc Ilunga avatar
tr flag
The question is somewhat underspecified and not sure we should assume a secret key and that the "hashes" are more like "unique" fingerprints or not.
Konrads avatar
in flag
@MarcIlunga you're right - I wasn't too much concerned about adversarial inputs, I was more after: "is this file/data the same as that other file/data and can I use hardware acceleration for this".
Konrads avatar
in flag
@DannyNiu SPHINCS+ is quite interesting for this purpose, thanks!
Score:3
cn flag

No. You can't use the last block or even the last blocks of CBC encryption as a cryptographic hash. What you describe is a slight generalization of CBC-MAC. CBC-MAC is a one-time authentication code: an adversary who does not have the secret key, and does not have access to authentication tags from distinct messages with the same IV, cannot forge an authentication tag for a new message. CBC-MAC can be made into an actual MAC (not requiring an IV) with a slight modification, with CMAC being a popular standard. However, both for CBC-MAC and CMAC, anyone who knows the secret key can construct collisions.

Consider a block cipher $E$, an IV $\mathrm{IV}$ and a message $P_1||P_2||\ldots$ (cut into blocks for the block cipher). The CBC accumulator calculation after processing the first two plaintext block is $E(E(\mathrm{IV} \oplus P_1) \oplus P_2)$. Let $P'_1$ be any block value:

$$E(E(\mathrm{IV} \oplus P_1) \oplus P_2) = E(E(\mathrm{IV} \oplus P'_1) \oplus E(\mathrm{IV} \oplus P'_1) \oplus E(\mathrm{IV} \oplus P_1) \oplus P_2)$$

Let $P'_2 = E(\mathrm{IV} \oplus P'_1) \oplus E(\mathrm{IV} \oplus P_1) \oplus P_2$: the messages $P_1||P_2||\ldots$ and $P'_1||P'_2||\ldots$ will have the same CBC accumulator, and thus the same hash. This shows that it's possible to construct collisions for any message that's large enough.

It is possible to construct a hash function generally from a block cipher, but you can't just use the block cipher in CBC mode. A popular technique is to construct a one-way compression function and use the Merkle-Damgård construction to build a hash on top of that. At least with this approach, you can't just run the block cipher once per input block.

There are several well-established generic ways to construct a one-way compression function and thus a hash on top of a block cipher: Davies-Meyer, Matyas-Meyer-Oseas, Miyaguchi–Preneel, Hirose, MDC-2 and MDC-4... I'm not familiar with all the constructions and their respective security properties and performance. The only one that I'm aware of that has an established standard is Matyas-Meyer-Oseas (MMO), which is used in a deprecated method to prepare passwords for PBKDF2 in EAP-PSK and in Zigbee (§B.6) (where it's used with HMAC as well).

Marc Ilunga avatar
tr flag
Why is CBC-MAC a one time MAC?
Gilles 'SO- stop being evil' avatar
cn flag
@MarcIlunga Because it's possible to forge collisions by extending a message with a known MAC. It can be used as a MAC if there's an additional constraint that all valid messages must have the same length, or (I think) if the message encodes its own length in a sufficiently unforgeable way (in a fixed-size mandatory header).
Konrads avatar
in flag
@Gilles'SO-stopbeingevil' thank you for the detailed explanation! I'll mark the answer as accepted but I'd like to also thank others who commented
I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.