Pre-image attack on non-cryptographic hash functions

Eugene Sirkiza

3/12/23, 1:12 PM

I am not good at cryptography so please :)

After reading this discussion it is now clear to me that xxHash is not resistant to collision attacks and is not secure for MAC usage. But after reading it, I still don't understand how resistant XXH3 (one of xxHash family) is to preimage attacks.

Yes, XXH3 output is $64$/$128$ bits which means that the probability to find image is $2^{64}$/$2^{128}$ corresponding. But I also read that hashes that are not resistant to collision attack are easier to revert. Consider this, to revert this hash function ($128$ bit version) using brute-force will around $2^{64}$ has calculations which is huge amount of time. According to this question even if XXH3 is 1000 times faster than SHA-512/64 to revert it will take at least $\frac{2^{64}}{1000 * 2^{20}}$ seconds or 557 years.

Which brings me to my first question. Is my conclusion is true and XXH3 is almost as resistant to pre-image attack as SHA-512/64? Am I missing something? Maybe there is some kind of not brute-force attack?

And the second question that bother me much. It is not a secret that hashes (especially XXH3) produce collisions. For cryptographical hashes it is hard to find them but they are exist. When we have successfully run the preimage attack, how can we be sure that the value we find is the original value? If an attacker manages to get an image, how likely is it that this is the original image and not one of the many collisions?

Thank you in advance.

128

0 + 0

hash

preimage-resistance

Score:3

Crypto

poncho

3/12/23, 9:16 PM

But after reading it, I still don't understand how resistant XXH3 (one of xxHash family) is to preimage attacks.

Not at all - it's easy to create an image that hashes to an arbitrary value.

Yes, XXH3 output is 64/128 bits which means that the probability to find image is $2^{64}/2^{128}$ corresponding.

The probabilities hold if the strategy you use is "pick a image, hash it, and see what's the result". There are far more efficient strategies.

Maybe there is some kind of not brute-force attack?

Sure is.

xxhash consists of steps that:

Transform the current state in an invertable way; or
Add in a word that depends (again, in an invertable way) on the next word of the image (and no other operation within xxhash depends on that input word).

Hence, to generate a image that hashes to a specific value, all you need to do is select a template that consists of values except for one word; then, you evaluate the xxhash forward until you get to that one word, coming up with an internal state value A. Then, you take the target hash value, and you compute the xxhash backwards until you get to that one word, coming up with an internal state value B. Then, all you need to do is select the one unknown word to be something that converts the state value A into the state value B - that's easy; insert that word into the template and you're done.

And, to address your last question:

For cryptographical hashes it is hard to find them but they are exist. When we have successfully run the preimage attack, how can we be sure that the value we find is the original value?

Obviously, collisions are inevitable for any function that converts a potentially long string into a fixed length shorter one. Now, if we find an image that hashes to the same value, how do we ensure that it's the same image that was originally hashed? Answer: in general, we can't, unless we know a lot about the original image (e.g. in the above attack on xxhash, we knew all the words of the input except for one). On the other hand, for most attacks against cryptographical hashes, we don't care - the attacker usually wins if he can find any preimage.

0 + 0

Eugene Sirkiza

3/15/23, 11:30 AM

Thanks a lot @poncho for your answer. Is there any way to estimate how much it takes to find the "right" hash? I mean, it's for sure another kind of question, but given your answer now I am interested in the following. What is the complexity of finding all possible images that hash into specified value? And can we estimate how many of these preimages exist?

Elon Musk

I sit in a Tesla and translated this thread with Ai:

EN: Pre-image attack on non-cryptographic hash functions

TH: ภาพล่วงหน้าโจมตีฟังก์ชันแฮชที่ไม่ใช่การเข้ารหัส

RO: Atacul pre-imagine asupra funcțiilor hash non-criptografice

RU: Pre-image атака на некриптографические хеш-функции

VI: Tấn công trước hình ảnh vào các hàm băm không mã hóa

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.