From a cryptographic standpoint, a hash function with fixed-size output:
- Must be a deterministic function: the same input must always generate the same output.
- Must accept input messages in a wide input set. Ideally that's should be arbitrary bitstring, but often is arbitrary bytestrings or character strings, perhaps up to some size.
- Baring special qualifier (like keyed or secret or random), must be public: every step and constant needed for the calculation of the output for any given input is known to all.
- Must compute it's result in time polynomial with the input size. That should be time roughly linear with the input size.
- Should have collision-resistance: it is computationally impossible to exhibit two distinct messages with the same output (this property implies second preimage resistance, which is that given a random input, it is computationally impossible to find another input yielding the same output). Notice that if there are $n$ possible outputs, there are generic attacks that break collision-resistance with about $\sqrt n$ evaluations of the function, thus collision-resistance implies $n$ is large (say, at least $2^{192}$ nowadays); if not, the definition of collision-resistance must be loosened to: the best method to exhibit two distinct messages with the same output is a generic attack.
- Should have (first) preimage-resistance: given the output for a random unknown input, it is computationally impossible to find that input (or another input with the same output) easier than by trying inputs. Notice that if there are $n$ possible outputs, there are generic attacks that break preimage-resistance with about $n$ evaluations of the function, thus preimage-resistance implies $n$ is large (say, at least $2^{96}$ nowadays); if not, the definition of preimage-resistance must be loosened.
More generally, a modern cryptographic hash should broadly behave as a random oracle, that is a box implementing a function, which output is random for every particular input. For large-enough output set, that implies collision-resistance, preimage resistance, and more:
- For unknown random input with any natural characteristic (that is, independent of the definition of the hash function; for example, input that consists of 40 decimal digits which sum is divisible by 10), the output should be computationally indistinguishable from uniformly random in the output set.
- Resistance to length-extension attack: given output for unknown input $M$, it should be impossible to compute output for input an extension of $M$ (that is $M\mathbin\|E$ for non-empty $E$) easier than by finding $M$ by trying inputs. Notice that still-common hashes such as SHA-256 do not have that later property.