About some tests in NIST SP 800-22 rev 1a and erfc function

pioneer

7/19/23, 6:40 AM

I'm learning the randomness test of the NIST SP 800-22 rev 1a documentation.

https://csrc.nist.gov/publications/detail/sp/800-22/rev-1a/final

As I was reading, a few questions came up and I put them up like this. My questions are:

In the Frequency Test within a Block of 2.2, looking at (3) of 2.2.4, there is a part that is calculated as follows. $$\chi^2(obs)=4M\sum_{i=1}^{N}(\pi_i - 1/2)^2$$ I don't understand why it's multiplied by 4 here.
In the Runs Test of 2.3, the p-value is calculated as $$P-value = erfc(\frac{|V_n(obs)-2n\pi(1-\pi)|}{2\sqrt{2n}\pi(1-\pi)}).$$ What is the value of the denominator here? I know that $2\pi(1-\pi)$ is the mean, but I don't know where the denominator came from.
Is there any reason to find the p-value through erf instead of the normal distribution in (2)?

Thank you.

0 + 0

random-number-generator

randomness

Score:0

Crypto

Daniel S

7/19/23, 9:09 AM

This is worrisome. The usual Pearson $\chi^2$ test would be $$M\sum_{i=1}^n\frac{\left(\pi_i-\frac12\right)^2}{\frac12}=2M\sum_{i=1}^n\left(\pi_i-\frac12\right)^2.$$ I will think some more, but it might be worth dropping aquifers to NIST.
It's supposed to be the standard deviation. For independent, identically distributed bits the run statistic should be binomially distributed. The central limit theorem then tells us that the binomial can be approximated with the normal distribution. Note however that the square root sign should extend over the $\pi(1-\pi)$ factor.
The erf function computes the two-tailed normal distribution. We're interested in both the cases where runs are prevalent (i.e. where there are many 00 and 11 consecutive pairs) and where they are deficient (i.e. where there are many 01 and 10 pairs). Both show issues with the distribution, but are accounted for by the opposite tails.

0 + 0

pioneer

7/19/23, 10:39 AM

Thanks for your reply. But I don't quite understand the second answer. If I write the expression as you say, shouldn't $p-value$ be $erfc(\frac{|V_n(obs)-n\pi|}{\sqrt{n\pi(1-\pi)}})$?because for CLT I need to construct the expression $\frac{V_n-np}{\sqrt{npq}}$.

Daniel S

7/19/23, 11:13 AM

Your statistic would be correct for testing the number of 1s ($\pi$ is the probability that a 1 is output). They're counting the number of 01s and 10s. These occur with probability $(1-\pi)\pi$ and $\pi(1-\pi)$ so that we have a $\mathrm{Bin}(n-1,2\pi(1-\pi))$ distribution. Looking deeper, there seems to be some strange approximation going on. Fortunately, this document is being revised, so there may be a chance to clear things up in the new version.

pioneer

7/19/23, 11:31 AM

As you mentioned, it is understandable that $V_n(obs)$~$Bin(n-1, 2\pi(1-\pi))$ is satisfied because they count the number of 10s and 01s. However, since even such a case satisfies expression $P-value = erfc(\frac{|V_n(obs)-2n\pi(1-\pi)|}{\sqrt{n2\pi(1-\pi)(1-2\pi(1-\pi))}}).$, the expression in the document is not satisfied. Is the expression in the document wrong? (+Actually, it should be written as $n-1$ instead of $n$, is it okay to change it arbitrarily?)

Daniel S

7/19/23, 12:04 PM

I hesitate to claim incorrectness as they might be using some approximation to the variance that I'm not aware of. Note that your erf expression requires a $\sqrt 2$ in the denominator. The error switching from $n-1$ to $n$ is negligible as $n$ grows. I do think that it's worth raising this and your first query with NIST if the expressions still appear in the revised version and I will try to do so.

pioneer

7/19/23, 12:34 PM

I get it. Next time, I hope that NIST will revise these facts properly or publish a document with sufficient explanations in this regard. Thank you very much for the good answer.

Elon Musk

I sit in a Tesla and translated this thread with Ai:

EN: About some tests in NIST SP 800-22 rev 1a and erfc function

TH: เกี่ยวกับการทดสอบบางอย่างในฟังก์ชัน NIST SP 800-22 rev 1a และ erfc

RO: Despre unele teste în NIST SP 800-22 rev 1a și funcția erfc

RU: О некоторых тестах в NIST SP 800-22 rev 1a и функции erfc

VI: Giới thiệu về một số thử nghiệm trong NIST SP 800-22 rev 1a và chức năng erfc

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.