Displaying the hash of a file on a website in order to provide data integrity relies only on the preimage-resistance property of the underlying hash function.
Is this true or false?
False, for several reasons:
1. The practice of "displaying the hash of a file on a website in order to provide data integrity" for that file relies, among other things, on the integrity of the displayed hash. Whether that assumption holds has nothing to do with the "preimage-resistance property of the underlying hash function".
2. If (on top of the assumption in 1) we assume that the file was prepared randomly and is known to the attacker, then the safety of said practice coincides with (a non-quantitative definition of) second-preimage-resistance. But the usual meaning of "preimage-resistance" is first-preimage-resistance, which makes a different assumption†, not met in the use case at hand. And first-preimage-resistance does not imply second-preimage-resistance. Thus, even under hypotheses that make second-preimage-resistance the relevant property, (first-)preimage-resistance of the hash is not sufficient.
3. Further, the assumption (of 2) that the file was prepared randomly is practically unwarranted. The safer thing to assume is that the file may have been intentionally prepared to allow undetected substitution without changing the hash. Under that sometimes realistic assumption, the safety of said practice coincides with (a non-quantitative definition of) collision-resistance. And preimage-resistance (first or second) does not imply collision-resistance (in non-quantitative definitions, and in quantitative definitions for the same fixed security level).
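To illustrate the non-implications in 2 and 3, here is a minimal Python sketch of a standard textbook-style counterexample. The function `weak_hash` is hypothetical and built for illustration only: it clears the lowest bit of the last byte before applying SHA-256. Assuming SHA-256 is preimage-resistant, inverting `weak_hash` is plausibly no easier than inverting SHA-256, yet second preimages and collisions are trivial to produce.

```python
import hashlib


def weak_hash(message: bytes) -> bytes:
    """Hypothetical hash: SHA-256 of the message with the lowest bit
    of its last byte cleared (illustration only, never use it)."""
    if not message:
        return hashlib.sha256(message).digest()
    normalized = message[:-1] + bytes([message[-1] & 0xFE])
    return hashlib.sha256(normalized).digest()


# Two different files that differ only in the lowest bit of the last byte.
original = b"pay 100 to Alice\x00"
substitute = b"pay 100 to Alice\x01"

assert original != substitute
# Same digest: a published hash of `original` would also validate `substitute`.
assert weak_hash(original) == weak_hash(substitute)
```

Finding an input for a given digest of `weak_hash` still seems to require inverting SHA-256, so a property limited to (first-)preimage-resistance says nothing about whether a file can be substituted.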
SHA-1 is a practical example of a (first- and second-)preimage-resistant hash that's unsafe for the practice considered, under the assumption in 3. It's even unsafe under an assumption in between those of 2 and 3: a file bound to be in a prescribed and common format (e.g. PDF) and prepared by a non-malicious actor (e.g. the one who also non-maliciously and correctly computes the hash), but with a maliciously crafted computer tool (e.g. without the knowledge of said non-malicious actor). See the 2017 SHAttered attack for illustration.
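For concreteness, the practice under discussion could be sketched as below in Python. This is only a sketch under stated assumptions: the helper name `file_matches_digest` is hypothetical, and SHA-256 is my choice of a hash currently believed collision-resistant, not something the question prescribes. The check is only as trustworthy as the displayed digest itself (point 1).

```python
import hashlib


def file_matches_digest(path: str, published_hex: str,
                        algorithm: str = "sha256") -> bool:
    """Hypothetical helper: hash the file at `path` and compare the result
    with the hex digest displayed on the website."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so that large files need not fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    # Plain comparison suffices here: the digest is public, not a secret.
    return hasher.hexdigest() == published_hex.strip().lower()
```

Passing `algorithm="sha1"` would run just as well, which is exactly the problem: the comparison succeeds for both files of a colliding pair.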
For simple (non-quantitative) definitions of (first-)preimage-resistance, second-preimage-resistance, and collision-resistance, refer to this.
The following would-be answer reaches the correct conclusion, but uses the incorrect argument of proposing an unrelated method to reach the goal considered:
False because why are other cryptographic algorithms with a key not used?
For several reasons, this other would-be answer is wrong:
True since pre-image resistance is enough as it's a one way function; this therefore makes it impossible to crack. All the system does is compare if the hashes are the same to prove integrity.
- "impossible to crack" refers to (first) pre-image resistance and the one-way function property (which are synonymous, for non-quantitative definitions at least). As developed in 2, that's not sufficient in the use case, even with strong assumptions.
- The definition of (first) pre-image resistance, or of a one-way function, is not in terms of the equality of two hashes, unlike the reasoning in the second part of that answer.
As an aside, answering with "pre-image" when the problem statement uses "preimage" is bad from the standpoint of maximizing the odds of succeeding in exams. Towards this goal, it's best to use the problem statement as the reference, unless it's indisputably wrong; in which case clearly pointing out why can sometimes be a reasonable course of action. On the other hand, it's often better (especially in MCQ) to correct the problem statement. From this standpoint, my argument 1, though formally correct, is perhaps best omitted, for my guess is that the intended question was:
The hash of a file is displayed on a website in order to provide data integrity. Does the hash function require only the preimage-resistance property?
† Depending on the author, the (non-quantitative) definition of first-preimage-resistance assumes that the target hash is random, or that it is the hash of an unknown random secret. Whichever definition is used, that assumption does not match the use case.