Score:3

ChaCha-based Sponge PRNG fails PractRand suite

in flag

TL;DR: My simple ChaCha-based sponge PRNG is getting "unusual" evaluation from PractRand test battery pretty reliably, sometimes even within the first GB; I'm trying understand why.

I was in need of a fast, non-cryptographic PRNG, preferably one without a bloated state, or one that would use a portion of its internal state as the output buffer. One of my initial ideas was to use a low-round count ChaCha, but N=4 is considered insufficient (cf. https://cr.yp.to/snuffle/diffusion.html, althogh ChaCha diffusion is much better than Salsa's) and N>6 proved to be a bit too slow. Additionally, an argument against pure ChaCha is that for each chunk of data (64B) the whole state has to be recreated instead of just reshuffled (which means twice the memory footprint).

This prompted me to look into creating a ChaCha-based sponge, as shown in the image below:

ChaCha-based sponge

The principle is simple - after the initial shuffle, 128 bits of state are kept hidden (first row), while the remaining 384 bits (48B, 12 words) is used as PRNG output. When the current state is depleted, another N rounds are applied and the process is repeated.

The construction worked well in our setting, expectedly beating comparable "normal" ChaCha implementation, but I was surprised, when I learned that (at least with some seeds) it is getting "unusual" evaluation from PractRand, even for values of N that are considered safe for ChaCha.

For instance:

N = 2

RNG_test using PractRand version 0.95
RNG = RNG_stdin, seed = 00112233
test set = core, folding = standard(unknown format)

rng=RNG_stdin, seed=00112233
length= 8 gigabytes (2^33 bytes), time= 43.7 seconds
  Test Name                         Raw       Processed     Evaluation
  [Low1/32]Gap-16:A                 R=  +5.4  p =  4.8e-4   unusual
  ...and 299 test result(s) without anomalies

N = 4

RNG_test using PractRand version 0.95
RNG = RNG_stdin, seed = 00112233
test set = core, folding = standard(unknown format)

rng=RNG_stdin, seed=00112233
length= 1 gigabyte (2^30 bytes), time= 6.8 seconds
  Test Name                         Raw       Processed     Evaluation
  [Low4/32]BCFN(2+0,13-3,T)         R=  -7.3  p =1-5.0e-4   unusual
  ...and 250 test result(s) without anomalies

N = 6

RNG_test using PractRand version 0.95
RNG = RNG_stdin, seed = 77777777
test set = core, folding = standard(unknown format)

rng=RNG_stdin, seed=77777777
length= 2 gigabytes (2^31 bytes), time= 12.4 seconds
  Test Name                         Raw       Processed     Evaluation
  [Low1/8]DC6-9x1Bytes-1            R=  -4.6  p =1-2.9e-3   unusual
  [Low4/32]Gap-16:B                 R=  +4.7  p =  6.9e-4   unusual
  ...and 267 test result(s) without anomalies

N = 8

RNG_test using PractRand version 0.95
RNG = RNG_stdin, seed = 00112233
test set = core, folding = standard(unknown format)

rng=RNG_stdin, seed=00112233
length= 2 gigabytes (2^31 bytes), time= 12.2 seconds
  Test Name                         Raw       Processed     Evaluation
  [Low1/8]Gap-16:A                  R=  -4.4  p =1-3.9e-4   unusual
  ...and 268 test result(s) without anomalies

This doesn't apply to every seed w/ every N (for instance N=6 with seed=00112233 went up to 2^38 without a hitch), but it's quite easy to come up with one that fails. Still, if this was purely due to low quality of the seed, I'd expect it to fail early on, before the diffusion makes the state indistinguishable from random.

Question

The construction since has been replaced with JSF32 in our application (with comparable performance to ChaCha sponge with N=2), but it left me wondering which part of the construction is responsible for the "failed" tests; so this question is mostly academic. My understanding is that for N=8, ChaCha internal function should be indistinguishable from a random permutation, and a random permutation can be used to create a PRNG via a sponge. One guess is that I'm hitting some short permutation cycle, but then, from the examples above, I have found at least 2 "short cycles" for double rounds with cycle lengths under 2^30.

Final notes:

  • k = 20, but it doesn't really matter. Any value that is a multiple of N is equivalent to skipping k/N rounds of output.
  • The seeds I posted here are ones that I came up with while writing this question; in reality I've tested many more and the results are reproducible.
  • The tests "failing" are usually of the following classes:
    • Gap-16
    • BCFN
    • DC6-9x1Bytes-1
  • I know that "unusual" is not strictly "FAIL", but it's appearing way too regularly to ignore.
fgrieu avatar
ng flag
Does your PractRand reliably recognizes the output of your RNG from that of /dev/urandom, for equal input length? Only then is your generator to question.
Marandil avatar
in flag
@fgrieu not only /dev/urandom, but also ChaCha4 and all "traditional" RNGs I've tested worked fine.
I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.