Score:0

Meaning of the term "irreversible" for hashing

I was in an interesting discussion with Jon Skeet on StackOverflow. He indicated that hashes are irreversible, and he extended this to non-cryptographic hashes. A hash function has a fixed output size, while it accepts messages of any size. So if you argue from that point of view, a hash is always irreversible, as there are many messages that have the same hash.
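
The counting argument can be sketched in a few lines. This is only an illustration, assuming a 16-bit toy hash made by truncating CRC-32; `tiny_hash` and `find_collision` are hypothetical names, not something from the original discussion. Hashing 2^16 + 1 distinct messages into 2^16 possible outputs must produce at least one collision.

```python
import zlib


def tiny_hash(message: bytes) -> int:
    """Toy 16-bit hash: the low 16 bits of CRC-32 (non-cryptographic)."""
    return zlib.crc32(message) & 0xFFFF


def find_collision(limit: int = 2**16 + 1):
    """Hash distinct messages until two of them share an output.

    With more messages than possible outputs, a collision is guaranteed.
    """
    seen = {}
    for i in range(limit):
        message = str(i).encode()
        digest = tiny_hash(message)
        if digest in seen:
            return seen[digest], message
        seen[digest] = message
    return None


first, second = find_collision()
print(first, second, hex(tiny_hash(first)))
```

Given only the 16-bit output, nothing distinguishes `first` from `second`, which is the sense of "irreversible" that applies to every fixed-size hash.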

However, I wonder whether this is the correct definition of "irreversible". In cryptography, would irreversible mean that we cannot establish the correct message with any great degree of certainty, or does it mean that we cannot find any information about the message, other than by comparing the hash to known message / hash pairs?
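
To make the two readings concrete, here is a hedged sketch of a deliberately leaky construction. `leaky_hash` is a hypothetical function invented for illustration: it appends the lowest bit of the final message byte to a SHA3-256 digest. Assuming SHA3-256 is preimage resistant, recovering the full message from the output should remain infeasible, yet the output plainly reveals one bit of the message.

```python
import hashlib


def leaky_hash(message: bytes) -> bytes:
    """SHA3-256 digest plus one extra byte leaking the message's final bit.

    Hypothetical construction, for illustration only.
    """
    digest = hashlib.sha3_256(message).digest()
    last_bit = message[-1] & 1 if message else 0
    return digest + bytes([last_bit])


def leaked_bit(hash_value: bytes) -> int:
    """Read the leaked bit back out of the hash value."""
    return hash_value[-1]


print(leaked_bit(leaky_hash(b"abc")))  # final byte 0x63 is odd
print(leaked_bit(leaky_hash(b"abd")))  # final byte 0x64 is even
```

Under the first reading (the message cannot be established), this function would still count as irreversible; under the second (no information about the message leaks), it would not.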

Score:1

In cryptography, hashes are often used as placeholders for the original message; by using the hash, we are effectively using the original message. For example, when signing a long message, we hash the long message and then run the signature algorithm on the hash. If we assume that the signature algorithm is strong (so that only a message that generates that hash will validate), and we assume that it is hard to find a second message that generates the same hash (so that it must be the original message that was signed), then the entire system is strong.
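
A rough sketch of hash-then-sign, under loud assumptions: this uses textbook RSA with no padding and a toy modulus built from two small Mersenne primes, so it is insecure and only shows the data flow. The names `sign`, `verify` and `digest_as_int` are hypothetical, and reducing the digest modulo `N` is a simplification; real schemes use a proper padding method such as RSA-PSS.

```python
import hashlib

# Toy key material: two known Mersenne primes. Far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the (arbitrarily long) message and map the digest into Z_N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sign the hash of the message, not the message itself."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Accept any message whose hash matches the signed value."""
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"a very long message " * 1000)
print(verify(b"a very long message " * 1000, signature))
print(verify(b"a different message", signature))
```

The verifier only ever sees the hash, so any second message with the same hash would validate under the same signature. That is why the second-preimage assumption carries the weight here.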

As such, we generally don't consider 'information leakage' when talking about hash functions; that is, we generally don't worry whether examining the hash output would leak information about the original message. Instead, we worry about the difficulty of finding a message that generates that hash (either preimage resistance, second preimage resistance or collision resistance), because that directly relates to the security assumptions we commonly need from hash functions.
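
As a hedged illustration of what preimage resistance rests on, the sketch below brute-forces a preimage for SHA-256 truncated to 16 bits. The truncation, `truncated_hash` and `find_preimage` are assumptions made for the example. The search takes on the order of 2^16 attempts; the same search against the full 256-bit output would take on the order of 2^256, which is what makes it infeasible.

```python
import hashlib

BITS = 16  # toy output size, chosen so the search finishes quickly


def truncated_hash(message: bytes, bits: int = BITS) -> int:
    """Top `bits` bits of SHA-256, as an integer."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_preimage(target: int, bits: int = BITS, max_tries: int = 1 << 22):
    """Try 4-byte counter values until one hashes to `target`."""
    for i in range(max_tries):
        candidate = i.to_bytes(4, "big")
        if truncated_hash(candidate, bits) == target:
            return candidate
    return None


original = b"the original message"
found = find_preimage(truncated_hash(original))
print(found, found == original)
```

Note that the message found is a preimage, not the original message: the search succeeds without establishing what was actually hashed.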

That said, if someone were to find a leak in (say) SHA-3, for example a way to determine the length of the message by examining the hash, well, that'd be quite worrisome (and we'd likely start deprecating SHA-3); we sometimes want to treat our hash functions as a random oracle, and such a leak would indicate that the function doesn't act like one [1]. However, such leakage appears unlikely.

[1]: Yes, I know length extension attacks show that SHA-2 doesn't act like a random oracle; that is well understood, and we just avoid using SHA-2 in a way where that specific attack is applicable.

Maarten Bodewes
I can see your reasoning here and I agree with it. Still, if we used a cryptographic hash as a primitive for a password hash or KDF, and the hash leaked parts of the message, wouldn't that pose a problem? Do we then assume that it *should* act as a random oracle, and thus not leak information about the message?
poncho
@MaartenBodewes: well, if we assume that the Keccak permutation is a 'random permutation', then I believe that it is straightforward to show that SHA-3 doesn't leak any information about the preimage (and similarly with SHA-2 and the hash compression operation). Is that good enough?
Maarten Bodewes
Sure, that's good enough for the purpose. My question is: "*if* they would leak data, say whether the final bit is set or not, would we still consider them 'irreversible'?" Not sure if we can actually answer this, as "irreversible" is not really a security requirement for hash functions, so maybe we're just outside the scientific terminology, and that's it.
poncho
@MaartenBodewes: hmmm, the problem is that 'irreversible' doesn't have a standard, agreed-upon definition in the crypto community, and so it's hard to say whether such a hash function would meet that definition. On the other hand, for SHA-2/3, it would indicate that the hash compression operation/permutation doesn't meet the security assumptions, and because of that, we shouldn't trust that hash function in general...