Score:2

Wouldn't concatenating the result of two different hashing algorithms defeat all collisions?

in flag

Let's say I have three messages: A B C

And I run each of these through two different Hashing algorithms: MD5 and SHA1 for this example

MD5(A) = X
MD5(B) = Y
MD5(C) = Y

SHA1(A) = N
SHA1(B) = N
SHA1(C) = M

Notice the MD5 hash of B and C collide. And the SHA hash of A and B collide.

If I simply concatenate the digests, however, the results would be unique:

Combined Digest of A:  XN
Combined Digest of B:  YN
Combined Digest of C:  YM

The underlying principle would be that whatever pair of messages could be found or constructed to form a collision with one hashing algorithm, wouldn't also form a collision with another hashing algorithm.

The combined digest length (for MD5/SHA1) would be 288 bits (128+160) -- but unless I'm missing something, this would be significantly more secure than a single hashing algorithm with a 288-bit digest.

Granted, in the example above I'm using MD5 and SHA1 which are both known to be effectively broken, but I'm hoping an answer exists that applies more conceptually to the premise than simply the choice of algorithms.

i.e., In a situation where collision resistance is critical, wouldn't the combination of SHA2-256 + SHA3-256 concatenated be more secure than a single iteration of SHA2-512, or SHA3-512?

Eddie avatar
in flag
FYI, the closest question I found that is similar is this one: https://crypto.stackexchange.com/questions/81634/security-implication-of-concatenating-two-hashes-using-different-algorithms But it didn't quite ask the exact premise of my question.
Gilles 'SO- stop being evil' avatar
The short answer is no: [a collision for MD5 || SHA-1 is almost as easy as a collision for SHA-1 alone](https://crypto.stackexchange.com/questions/36988/how-hard-is-it-to-generate-a-simultaneous-md5-and-sha1-collision) — much worse than SHA-256 despite SHA-256 being slightly shorter, but unbroken.
Eddie avatar
in flag
@Gilles'SO-stopbeingevil' Thanks for the insight, Gilles. This would help in the specific case of `md5` and `sha1`, but not in the *conceptual*, underlying case.
Gilles 'SO- stop being evil' avatar
No, it's not specific to MD5 and SHA1: the fact that the attack on the concatenation is basically only as hard as the attack on the longest hash is mostly generic.
Eddie avatar
in flag
@Gilles'SO-stopbeingevil' Would you be willing to expand on that in an answer =). Specifically why the "difficulty" to find a collision wouldn't increase multiplicatively ?
Vaelus avatar
co flag
What leads you to believe "whatever pair of messages could be found or constructed to form a collision with one hashing algorithm, wouldn't also form a collision with another hashing algorithm"?
fgrieu avatar
ng flag
@Gilles'SO-stopbeingevil' : your "almost as easy" is a stretch of "wouldn't be that much more expensive" in the [accepted answer](https://crypto.stackexchange.com/a/36993/555), which proposes a strategy where the cost is increased by a factor $64=2^6$, and adds a parallel collision search for the MD5 part.
kodlu avatar
sa flag
You cannot defeat all collisions for a hash function with finite codomain.
Eddie avatar
in flag
@Vaelus Admittedly, I don't know the true inner workings of many hashing algorithms. But my theory was based on the internal mechanisms being different, which would make finding a collision for two algorithms that "go about creating a digest" differently exponentially more difficult. However... you're answer has clarified that my theory was incorrect. Thank you.
JimmyJames avatar
cn flag
The [pigeon-hole principle](https://en.wikipedia.org/wiki/Pigeonhole_principle) alone tells us that collisions are still possible.
Score:4
in flag

No, concatenating two hashes gives you at least the collision resistance of either but in many practical cases it will give you little more.

This is especially truely for MD hash functions where we know how to convert collisions into many way multi collisions. We can make 2^64 multi way sha1 collision and expect one will collide also in MD5.

Eddie avatar
in flag
`but in many practical cases it will give you little more` -- could you expand on this a bit? And specifically speak to why it wouldn't give you an additional "the full collision resistance" of the other hashing algorithm?
Meir Maor avatar
in flag
try this question: https://crypto.stackexchange.com/questions/103630/does-using-multiple-hashes-to-check-if-a-file-has-been-spoofed-reduce-collisio
Score:4
co flag

No, concatenating the result of two different hashing algorithms does not defeat all collisions. You've overlooked the case where $\text{MD5}(A)=\text{MD5}(B)=X$ and $\text{SHA1}(A)=\text{SHA1}(B)=N$. In English, that's when a pair of inputs collides for both hash functions.

Furthermore, assuming a hash function's output is truly uniformly distributed for any given set of inputs (this isn't actually true, but for our purposes, it's close enough to true for modern cryptographic has functions), the collision resistance of $\text{HASH}^P_\text{256-bit}(A) +\text{HASH}^Q_\text{256-bit}(A)$ is exactly equal to the collision resistance of $\text{HASH}_\text{512-bit}(A)$.

Again, assuming a uniform distribution, the chance two inputs collide for an N-bit hash is $\left(\frac{1}{2}\right)^N$, or one in two for each bit of output. Assuming the chance that a pair of inputs collides for hash $P$ is independent of the chance of collision of the pair for hash $Q$, the chance a pair collides for both is the product of the chance it collides for each hash individually. Given this, it's clear the chance of collision is identical either way, since $\left(\frac{1}{2}\right)^{256}\cdot\left(\frac{1}{2}\right)^{256}=\left(\frac{1}{2}\right)^{512}$.

Jeremy Friesner avatar
ws flag
Perhaps some people are sufficiently paranoid that they would prefer not to assume a uniform distribution -- e.g. if in the future it turns out that one of the hashing algorithms contains an exploitable flaw but the other algorithm does not, then using the concatenation of the two algorithms' output could be more resistant to exploitation than just using a single algorithm to generate twice the number of hash-bits?
Eddie avatar
in flag
This was the confirmation I was seeking. Thank you. Follow on question, is there a reason why doing a single 512-bit hash is "better" than concatenating two different 256-bit hashes (faster, more secure, simpler, lower cpu, anything)?
Vaelus avatar
co flag
@Eddie This will depend entirely on the specific hash functions. For example, some processors implement hardware acceleration for specific hash functions, so one method may be significantly faster than the other on some hardware. In general though, assuming none of the hash functions have a known weakness, I don't think there's a strong technical reason to choose one or the other. That said, I think many people would find the decision to concatenate two cryptographic hashes perplexing in the absence of a strong technical reason, since it presumably increases implementation complexity.
mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.