Score:2

Does SHA-256 have (128-time + 128-space = 256-overall)-bit collision resistance?

vu flag

First, we consider those hash functions that can actually provide 256-bit pre-image security, and not something like SHAKE128<l=256bits> where the sponge parameters provides only a security capacity of 128-bit.

We know that cryptanalysis doesn't have just a time dimension - it also has a space dimension, i.e. the amount of working memory needed to execute the cryptanalysis algorithm. So if we expect to find a SHA-256 collision after about $2^{128}$ attempts, this in theory also mean that it needs $2^{128}$ space to hold candidate collision inputs.

So is this true? Does this mean that SHA-256 has an overall 256-bit collision resistance when taken space into account?

vn flag
Apart from the fact that you don't need $2^{128}$ space to conduct the collision attack, as the accepted answer goes into, there's also the fact that $2^{128} + 2^{128} = 2^{129}$, so even if there was no low-memory attack, and we (mis)counted the resource usage this way, it would still be nowhere near to $2^{256}$.
Score:4
ng flag

No, breaking the collision property of SHA-256 does not require any close to $2^{128}$ space. We know how to exhibit collision in any $n$-bit hash $H$ with $\mathcal O(2^{n/2})$ hash evaluations and $\mathcal O(n)$ space.

The simplest suitable method is Floyd's cycle finding, which will exhibit with non-vanishing probability two distinct $n$-bit bitstrings $r$ and $s$ with $H(r)=H(s)$, in the orbit a given initial point $t$ when iterating $H$

  • $m\gets\lceil\,2^{n/2+1}\,\rceil$ (increasing $+1$ reduces failures).
  • $u\gets H(t)$ .
  • $r\gets u$ ; $s\gets H(u)$ .
  • while $r\ne s$ :  (find cycle)
    • if $m=0$ then stop in failure (long orbit, rare).
    • $m\gets m-1$ .
    • $r\gets H(r)$ ; $s\gets H(H(s))$ .
  • if $t=s$ then stop in failure ($t$ in cycle, vanishingly rare).
  • $s\gets H(s)$ .
  • while $r\ne s$ :  (offset $u$ by one cycle)
    • $s=H(s)$ ; $u=H(u)$ .
  • while $t\ne u$:  (locate collision)
    • $r\gets t$ ; $s\gets u$ .
    • $t\gets H(t)$ ; $u\gets H(u)$ .
  • output $(r,s)$ and stop in success.

Try it online! for collision of a 24-bit hash (the first $k=3$ bytes of SHA-256). Please be kind enough to run this Python code on your machine if increasing $k$.

The method uses that the orbit of $t$, defined as the $u$ reached by iterating $u\gets H(u)$ starting from $u=t$, tends to cycle within $\mathcal \Theta(2^{n/2})$ steps. The algorithm detects a cycle is reached, find $u$ after as many steps from $t$ as the cycle's length, then where the cycle is entered, which yields a collision. It can be shown that for random function $H:\{0,1\}^*\mapsto\{0,1\}^n$ and except for very small $n$, the success probability of this algorithm from any starting point $t$ is at least $3/4$ (failures are for overly long orbit of $t$, or when $t$ belongs to a cycle).

In case of failure, it's often enough to restart from another random point. That typically works fine for common cryptographic hashes $H$, but even for these it can happen that most starting points lead to a cycle too large to be found. In a general case, we want to switch to using the algorithm with $H'$ defined by $H'(x)=H(F(x))$ for a suitably random efficiently computable and invertible injection $F$ chosen at start of the algorithm. That's so exhibiting a collision for $H'$ using the algorithm exhibits a collision for $H$, but $H'$ can have a different cycle structure. For most $n$-bit hash $H$ suitable for cryptographic use, $F$ can be XOR with a fixed $n$-bit bitstring, or adding a fixed prefix or/and suffix. This is not illustrated in the above pseudocode and linked Python code.

It's possible to distribute the work on many machines running in parallel, each with little memory, moderate communication, and moderate extra work. See Paul C. van Oorschot and Michael J. Wiener, Parallel Collision Search with Cryptanalytic Applications, in Journal of Cryptology, 1999.

gnasher729 avatar
kz flag
Outside crypto, the article you linked seems very hard to apply to the pollard-rho factoring algorithm itself, because there we don’t look for x=y but for gcd(x-y, N) > 1.
DannyNiu avatar
vu flag
I think if experiment results of the Floyd's algorithm over smaller transformations (e.g. 24 or 32 bits) were added, it would be more than convincing.
fgrieu avatar
ng flag
@DannyNiu: The original algorithm was wrong. This one comes with a working demo.
gnasher729 avatar
kz flag
The Pollard-tho algorithm can easily find factors of a 128 bit number in a few seconds. Which involves finding collisions between 64 bit numbers. Usually takes around 2^32 steps.
Score:0
kz flag

No, SHA-256 collisions can be found with practically no space at all.

The trick is that you don’t calculate for example H(1), H(2), H(3) etc. which would require storing all the results in a table. Instead we let x0 = some value, x1 = H(x0), x2 = H(x1) and so on. Then we compare x1 and x2, x2 and x4, x3 and x6, xk and x2k, and this will find a collision in about 2^128 steps with no storage.

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.