Score:1

About two sequence conversion methods in NIST 800-90B

in flag

I am reading the 800-90B document. In particular, I'm looking at Chapter 5, the chapter on checking that samples conform to IID. There are 11 tests such as Excursion Test Statistic and Number of Directional Runs. All these tests can be performed on binary data as well as non-binary data.

In the case of some tests, in the case of binary data, the test is performed through conversion in one of two ways.

Conversion I partitions the sequences into eight-bit non-overlapping blocks, and counts the number of ones in each block. Zeroes are appended when the last block has less than eight bits. For example, let the 20-bit input be (1,0,0,0,1,1,1,0,1,1,0,1,1,0,1,1,0,0,1,1). The first and the second eight-bit blocks include four and six ones, respectively. The last block, which is not complete, includes two ones. The output sequence is (4, 6, 2).

Conversion II partitions the sequences into eight-bit non-overlapping blocks, and calculates the integer value of each block. For example, let the input message be (1,0,0,0,1,1,1,0,1,1,0,1,1,0,1,1,0,0,1,1). The integer values of the first two blocks are 142, and 219. Zeroes are appended when the last block has less than eight bits. Then, the last block becomes (0,0,1,1,0,0,0,0) with an integer value of 48. The output sequence is (142, 219, 48).

From my point of view, conversion 1 follows a normal distribution, and conversion 2 seems to follow a uniform distribution.

But I don't understand why the conversion method is different depending on the test. For example, average collision test and maximum collision test use conversion 2, and tests such as number of directional runs test and length of directional runs tests use conversion 1 (even excursion tests do not require conversion).

In summary, I would like to know why NIST proposed two conversion methods and why the conversion method is different for each test.

Thank you.

Paul Uszak avatar
cn flag
A higher level question that encompasses yours is why is conversion necessary at all? I have two of my own novel IID tests, and neither requires funky conversion.
Score:-1
sa flag

Runs are an intrinsic property of bit sequences which are not preserved/obvious without a table lookup when the integer conversion is used. If the hypothesis is that we have an IID uniform sequence of bits, we know the run properties it has but need to preserve it as bits to check these.

There is no Gaussian involved, under ideal conditions hamming weight (the number $w$ of 1 bits in an $n$ bit window) is distributed as binomial, $\textrm{Bin}(n,p)$ which can look like Gaussian if $p\approx 1/2,$ otherwise Poisson $\textrm{Poi}(w/n)$ if $p$ is near zero or one. Actually it's always poisson but Poisson converges to Gaussian when $p$ is near $1/2$ due to symmetry.

So if structural (memory dependent) properties are being tested bit conversions make sense. If all you are testing for is uniformity then it doesn't matter, you can just test for uniformity of all (say) 8 bit patterns or all integers between $0$ and $255.$

Paul Uszak avatar
cn flag
Re. Last para: This is an IID test, not a cryptographic randomness test. Thus all bit patterns and all integers are not required, expected or tested for. That's not how permutation testing works. The sample files can be extremely biased yet still be IID.
Paul Uszak avatar
cn flag
pioneer's 20-bit example has 1 bit/byte only samples.
mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.