I have been experimenting with SHA-2-256 in Julia and noticed that the hashes it produces don't appear to follow a uniform distribution. My understanding is that the output of a secure hashing algorithm should closely approximate a uniform distribution, so that it is not predictable.
Here is the Julia code I'm using:
using BitIntegers, Distributions, HypothesisTests, Random, SHA
function sha256_rounds()
    rounds::Array{Array{UInt8,1}} = Array{Array{UInt8,1}}(undef, 10000) # 10000 Samples
    hash::Array{UInt8} = Array{UInt8}(undef, 64) # 64-byte array
    for i = 1:10000
        hash = sha2_256(string(rand(UInt64), base = 16)) # Random number, convert to hex string, then seed
        rounds[i] = hash
    end
    return rounds
end
sha256_str_vals = [join([string(x, base = 16) for x in y]) for y in sha256_rounds()] # Stitch the bytes together into strings
sha256_num_vals_control = [parse(UInt256, x, base = 16) for x in sha256_str_vals] # Get the numerical value from the strings
OneSampleADTest(sha256_num_vals_control, Uniform()) # One sample Anderson-Darling test
And the result of the test:
One sample Anderson-Darling test
--------------------------------
Population details:
    parameter of interest:   not implemented yet
    value under h_0:         NaN
    point estimate:          NaN

Test summary:
    outcome with 95% confidence: reject h_0
    one-sided p-value:           <1e-7

Details:
    number of observations:   10000
    sample mean:              8.73991847621225e75
    sample SD:                2.2742656031884893e76
    A² statistic:             Inf
To me this says that the produced hashes do not conform to a uniform distribution. Am I using the test incorrectly, or is my sample faulty? Thank you for your thoughts.
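Two things in my setup may matter here, and I sketch a variant below in case it helps narrow things down. First, `Uniform()` in Distributions is the uniform distribution on [0, 1], while my values range up to 2^256. Second, `string(x, base = 16)` does not zero-pad, so bytes below 0x10 contribute one hex digit instead of two. The helper `digest_to_unit` is a hypothetical name of my own, not part of any package, and this is only a sketch under those two assumptions, not a verified fix.

```julia
using Distributions, HypothesisTests, Random, SHA

# Hypothetical helper (not from any library): map a 32-byte digest to [0, 1].
function digest_to_unit(digest::Vector{UInt8})
    hex = bytes2hex(digest)            # zero-padded: always 2 hex characters per byte
    n = parse(BigInt, hex, base = 16)  # exact 256-bit integer value of the digest
    return Float64(n / big(2)^256)     # rescale by 2^256 so values lie in [0, 1]
end

Random.seed!(1)  # fixed seed only so the sketch is repeatable

# Same sampling scheme as above: hash the hex string of a random UInt64.
samples = [digest_to_unit(sha2_256(string(rand(UInt64), base = 16))) for _ in 1:10_000]

# Uniform() is Uniform(0, 1), which now matches the range of the samples.
OneSampleADTest(samples, Uniform())
```

If the apparent non-uniformity came from the scaling and the missing padding rather than from SHA-256 itself, I would expect the A² statistic to be finite with this setup.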