Score:1

Crypto

Is recursive hashing cyclic?

Sherwood Botsford

1/11/23, 5:08 AM

If I feed the output of H back into H will it cover the entire output space of H before repeating?

Consider the following scenario:

A=1;
While(){
   A=H(A);
   print(A)
}

Will there be short cycles? E.g. Are there values of A that H(A) = A (a cycle of 1) Values of A where H(A) = B and H(B) = A (cycle of 2)

Is there a way to prove that either no such cycles exist for a given hash, or to put a lower bound on the shortest cycle.

Douglas Hofstadter in one of his books (Either "The Mind's I", or "Metamathematical Themas" I think) has a sentence

This sentence has one 'a', one 'b', ...

The problem is to find the values that make it true. If you just count the current crop, and correct the sentence, and repeat, the sentence converges on a true statement fairly quickly.

So how does this connect to cryptography.

I suspect, but haven't demonstrated, that the 'strange attractor' above is fairly arbitrary.

{Entire text of War and Peace} This statement has...

will also converge on a correct statement.

Consider a hash, H. Suppose that there are short cycles of H. Will there also be cycles of (D+H) where D is an arbitrary chunk of data. Will these be the same length? (I don't see a good reason why they should be)

If an appreciable fraction of hashes are members of a short cycle (for some value of short: 1 million? 1 billion?) then the following vulnerability exists:

Take file D.

Modify D to suit your purposes. Part of this process should be to shorten by length(hash value) Call this file D'

+ stands or concatenate

Compute A = H(D)
Compute A2 = H(A+D')
Compute A3 = H(A2+D')

If A is cyclic you will eventually reach some A_n whose next iteration produces A.

Replace file D with A_n+D'

In terms of security we need to know what the probability of a given A being cyclic in reasonable time. Of course reasonable depends on how important the document is. But if you can show that there is a vanishingly small probability to find a cycle under 10^15 or so, then this sort of attack isn't practical. If there is a 2% chance of a given hash being cyclic in under a million iterations, there is a potential problem.

0 + 0

hash

collision-attack

CodesInChaos

1/11/23, 7:11 AM

Short cycles exist, but they only cover a tiny fraction of the space, so you'll never find them.

fgrieu

1/11/23, 8:02 AM

The part _"If an appreciable fraction of hashes are members of a short cycle..."_ needs to be reformulated. A hash is a function, not a value reached by that hash, which is what can belong to that hash's cycles. Perhaps it's meant: _if a particular hash function was such that an appreciable fraction of the values it reaches when hashing an input from the set of it's possible output are members of a short cycle..._. That reformulated hypothesis is meaningfull, but is vanishingly unlikely for a cryptographic hash.

Score:2

Crypto

fgrieu

1/11/23, 7:28 AM

When studying such problem, cryptographers assimilate a cryptographic hash $H$ to a random function or random oracle (or random mapping though that's a bit dated) that's iterated. Probabilities are computed over the set of all possible hashes. That's an approximation, but if actual results where distinguishably different, that would be a way to distinguish the hash from a random function, thus a break of the hash by modern defintions. Therefore, said approximation is satisfactory, and the better there can be for an unbroken cryptographic hash.

Will there be short cycles?

Likely, and the more so as we loosen the definition of short. But (except for very small hashes) it's unlikely that a short cycle would be reached from a random starting point $A$; and (for standard cryptographic hashes with large enough output that they are collision-resistant) it's unlikely we could exhibit any cycle, regardless of it's size.

Are there values of $A$ such that $H(A)=A$ (a cycle of 1)

If the destination set of the hash has size $n$ (where $n=2^b$ for a $b$-bit hash, e.g. $n=2^{256}$ for SHA-256), then the probability that there is a cycle of size 1 is easily computed as the complement of the probability that none of the $n$ points hashes to itself, that is $p_1(n)=1-(1-1/n)^n$. This starts from $p_1(1)=1$, $p_1(2)=3/4=0.75$, $p_1(3)=19/27\approx0.7037$, and $p_1$ quickly converges to $1-1/e\approx0.6321$.

Thus there's >63.2% probability there exists a cycle of size 1 in a given cryptographic hash like SHA-256, SHA-384, or SHA-512. And we can't tell better for any of these hashes.

I don't know how to properly do the same for cycles of size 2, but do not doubt it is feasible, and the probability is sizable for $n\ge2$.

It's possible to compute the expected value of many characteristics of cycles for an iterated random function/hash. In particular, for large $n$, the expected number of steps before hitting a previous value starting from a random point is $\approx\sqrt{\pi\,n/2}$, and the expected length of the cycle reached is half that. A classical reference is Philippe Flajolet and Andrew M. Odlyzko, Random mapping statistics, in proceedings of Eurocrypt 1989, and Research Report RR-1114, INRIA, 1989.

Is there a way to prove that either no such cycles exist for a given hash, or to put a lower bound on the shortest cycle.

No, for an unbroken cryptographic hash of large enough output size that it's collision resistant (that is, $\sqrt n$ is so large that this number of hashes can't be computed); say, any unbroken hash 256-bit or wider. For the part of the question asking is it's possible to prove that no such cycle exist, we'd even need to compute $n$ hashes (until the last we can't be sure there's not a cycle of size 1), thus any unbroken hash 128-bit or wider will do.

So how does (Douglas Hofstadter's iterated construction) connect to cryptography?

A quite similar technique is used in attacking some cryptosystems. We construct a function, iterate it until finding a cycle (which thus is short, else we could not find it), and the entry point in the cycle gives a collision, and the collision solves the problem because the function was constructed with that explicit purpose. That's the heart of Pollard's rho method to solve the Discrete Logarithm Problem, which is the best theoretical attack we have for this problem in some groups used in cryptography.

Notice that neither the function in the above, nor the function constructed by Douglas Hofstadter, constitutes a secure cryptographic hash. It's because they are not that we can find a cycle, and solve the problem at hand.