General wisdom
As a general bit of wisdom, combining cryptographic primitives is rarely a straight win. It can strengthen some properties if done right, but weakens other properties. So it should only be done if it strengthens the properties you care about and doesn't weaken the properties you care about.
combining H and F is a common practice to stay safe when one of them becomes unsafe, but there were mixed opinions when it comes to whether such combination is as safe as MAX(H, F) or MIN(H, F) in terms of safety, almost nobody suggested than safety will be higher or lower than this.
You found mixed opinions because it depends how you do the combination. (That, or because people were uninformed. It happens.)
I also have some personal thoughts on this, originating from school math. It seems obvious to me that if f:A -> B
and greatly |A| > |B|
and g:B -> C
and greatly |B| > |C|
then g(f(x))
should have much more collisions than f or g.
Counting collisions is irrelevant. What matters is the difficulty of finding them. MD5(x)||SHA1(x)
is a 288-bit hash and probably has fewer collisions on 257-bit strings than SHA256(x)
— in fact it's plausible that MD5(x)||SHA1(x)
has no collisions at all on 257-bit strings, whereas SHA256(x)
must have collisions by the pigeonhole principle. But we know how to find collisions on MD5(x)||SHA1(x)
for slightly longer strings in long-but-doable time (it's only a little more expensive than SHA1 collisions), whereas we don't know how to find SHA256 collisions at all.
But also a fact that we combine two complex functions should be somewhat harder to "break".
No, this is not a fact. It entirely depends on what you're combining and how.
Composing two hashes
Regarding hashes and collision resistance, composing two hashes reduces the security. In other words, $H \circ F$ is less collision-resistant than $H$ or $F$ on its own.
It's easy to see that if $F$ has a collision, then so does $H \circ F$: if $F(x_1 = F(x_2)$ with $x_1 \ne x_2$ then $(H \circ F)(x_1) = H(F(x_1)) = H(F(x_2)) = (H \circ F)(x_2)$. This alone means that composing two hashes is, at best, useless if what you care about is collision resistance: you might as well just use $F$.
The collision resistance of $H \circ F$ can be better than that of $H$ alone. If $H$ has a collision $H(y_1) = H(y_2)$ with $y_1 \ne y_2$, to use this fact to find a collision for $H \circ F$, you need to find a preimage for both $y_1$ and $y_2$. So you're confident that $F$ is collision-resistant and preimage-resistant, then $H \circ F$ can be safer than $H$ in terms of collision resistance. However, since it isn't safer than $F$, the only reason to use this construction is if it improves some other property.
If $H \circ F$ has a collision, i.e. if $H(F(x_1)) = H(F(x_2))$ with $x_1 \ne x_2$, then either $F(x_1) = F(x_2)$ and this is a collision for $F$, or $F(x_1) \ne F(x_2)$ and this is a collision for $H$. So the composition is no worse than the worst of the two, for collision resistance.
Composing two hashes can improve preimage resistance. To take advantage of the structure of $H \circ F$ when looking for a preimage, you have to both find a preimage through $H$ and then a preimage of that through $F$. However, no cryptographic hash function that I would qualify as having been mainstream has ever had preimage resistance broken, so it's not a practical concern.
Preimage resistance is a concern for password hashing functions. But password hashing functions are a completely different kind of cryptographic primitive from “ordinary” hash functions. They have different parameters and different security objectives. Collision resistance is not relevant to password hashing. Composing two password hashing functions can make sense, but you need to be careful: it definitely is possible to do it wrong.
Composing two hashes can also improve security properties other than the properties that define a cryptographic hash function. Hashes are often used as random oracles, and commonly used hashes (such as the SHA-2 family) are known to be imperfect as random oracles due to a length extension attack. Composing two hashes, even the same function, as in SHA256d(x) = SHA256(SHA256(x))
, eliminates the length extension attack and does not introduce another known weakness on its own. (“On its own” is important: if you mix SHA256d and SHA256 on the same data, the results can be disastrous.)
Concatenating two hashes
Another simple way to combine two hashes is to concatenate them: $H(x) || F(x)$. This is a definite improvement in terms of collision resistance: if there's a collision for $H || F$, it's also a collision for $H$ and $F$. Concatenation also improves second-preimage resistance: if you know $H(x_1) || F(x_1)$ and you want to find $x_2 \ne x_1$ such that $H(x_1) || F(x_1) = H(x_2) || F(x_2)$, you need to find a second preimage for both $F$ and $H$. For both collision resistance and second-preimage resistance, the concatenation is at least as strong as the weaker of the two, and possibly stronger.
An example where concatenation was used in a real-world protocol is older versions of the SSL/TLS protocol, up to version 1.1. They used MD5(x)||SHA1(x)
for the handshake signature, where the essential security concern is second-preimage resistance. TLS 1.2 replaced this by a single, customizable hash function (typically SHA-256 or SHA-384): the additional complexity wasn't worth it (and there was no sensible popular hash function to combine with SHA-256 at the time anyway).
Concatenation is not an unambiguous win. For example, it reduces first-preimage resistance to the weaker of the two functions: if you know $H(x) || F(x)$ then you can find $x$ if you know how to do this either via $H(x)$ or via $F(x)$.
Xoring two hashes
Yet another way to combine two hashes is to xor them: $H(x) \oplus F(x)$. The security properties of the result depend on the choice of functions. This has an obvious potential for going horribly wrong: the special case $H = F$ results in output that's all-bits-zero. On the other hand, if $H$ and $F$ are independent, then $H \oplus F$ is at least as good as the stronger of the two as a random oracle. I'm not sure off the top of my head if there's a sensible condition that would give any guarantee on collision resistance.