Score:1

About some tests in NIST SP 800-22 rev 1a and erfc function

in flag

I'm learning the randomness test of the NIST SP 800-22 rev 1a documentation.

https://csrc.nist.gov/publications/detail/sp/800-22/rev-1a/final

As I was reading, a few questions came up and I put them up like this. My questions are:

  1. In the Frequency Test within a Block of 2.2, looking at (3) of 2.2.4, there is a part that is calculated as follows. $$\chi^2(obs)=4M\sum_{i=1}^{N}(\pi_i - 1/2)^2$$ I don't understand why it's multiplied by 4 here.

  2. In the Runs Test of 2.3, the p-value is calculated as $$P-value = erfc(\frac{|V_n(obs)-2n\pi(1-\pi)|}{2\sqrt{2n}\pi(1-\pi)}).$$ What is the value of the denominator here? I know that $2\pi(1-\pi)$ is the mean, but I don't know where the denominator came from.

  3. Is there any reason to find the p-value through erf instead of the normal distribution in (2)?

Thank you.

Score:0
ru flag
  1. This is worrisome. The usual Pearson $\chi^2$ test would be $$M\sum_{i=1}^n\frac{\left(\pi_i-\frac12\right)^2}{\frac12}=2M\sum_{i=1}^n\left(\pi_i-\frac12\right)^2.$$ I will think some more, but it might be worth dropping aquifers to NIST.

  2. It's supposed to be the standard deviation. For independent, identically distributed bits the run statistic should be binomially distributed. The central limit theorem then tells us that the binomial can be approximated with the normal distribution. Note however that the square root sign should extend over the $\pi(1-\pi)$ factor.

  3. The erf function computes the two-tailed normal distribution. We're interested in both the cases where runs are prevalent (i.e. where there are many 00 and 11 consecutive pairs) and where they are deficient (i.e. where there are many 01 and 10 pairs). Both show issues with the distribution, but are accounted for by the opposite tails.

pioneer avatar
in flag
Thanks for your reply. But I don't quite understand the second answer. If I write the expression as you say, shouldn't $p-value$ be $erfc(\frac{|V_n(obs)-n\pi|}{\sqrt{n\pi(1-\pi)}})$?because for CLT I need to construct the expression $\frac{V_n-np}{\sqrt{npq}}$.
Daniel S avatar
ru flag
Your statistic would be correct for testing the number of 1s ($\pi$ is the probability that a 1 is output). They're counting the number of 01s and 10s. These occur with probability $(1-\pi)\pi$ and $\pi(1-\pi)$ so that we have a $\mathrm{Bin}(n-1,2\pi(1-\pi))$ distribution. Looking deeper, there seems to be some strange approximation going on. Fortunately, this document is being revised, so there may be a chance to clear things up in the new version.
pioneer avatar
in flag
As you mentioned, it is understandable that $V_n(obs)$~$Bin(n-1, 2\pi(1-\pi))$ is satisfied because they count the number of 10s and 01s. However, since even such a case satisfies expression $P-value = erfc(\frac{|V_n(obs)-2n\pi(1-\pi)|}{\sqrt{n2\pi(1-\pi)(1-2\pi(1-\pi))}}).$, the expression in the document is not satisfied. Is the expression in the document wrong? (+Actually, it should be written as $n-1$ instead of $n$, is it okay to change it arbitrarily?)
Daniel S avatar
ru flag
I hesitate to claim incorrectness as they might be using some approximation to the variance that I'm not aware of. Note that your erf expression requires a $\sqrt 2$ in the denominator. The error switching from $n-1$ to $n$ is negligible as $n$ grows. I do think that it's worth raising this and your first query with NIST if the expressions still appear in the revised version and I will try to do so.
pioneer avatar
in flag
I get it. Next time, I hope that NIST will revise these facts properly or publish a document with sufficient explanations in this regard. Thank you very much for the good answer.
mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.