Score:2

Why does the von Neumann corrector only work when the input bits are independent of each other and same bias?

de flag

I read some papers that mention the von Neumann corrector. They always assume that for the von Neumann corrector, the input bits are independent of each other and same bias

Why does the von Neumann corrector only work when the input bits are independent of each other and same bias?

Score:2
my flag

Why Von Neumann corrector only works when the input bits are independent of each other and same bias ?

The basic Von Neumann dibiaser works like this: you generate two bits X, Y, and then output a bit (or not output a bit) with this logic:

X=0 Y=0 -> No output
X=0 Y=1 -> Output a 0
X=1 Y=0 -> Output a 1
X=1 Y=1 -> No output

If X and Y are uncorrelated and share the same bias (that is, the probability of them being 0 is $p$ and the probability of 1 is $(1-p)$), then it immediately follows that the probability of the above procedure outputing a 0 is $p(1-p)$ and the probability of a 1 is $p(1-p)$, which is identical (and there's a probability $p^2 + (1-p)^2$ of not outputing anything - we'll just run it again with two more input bits). In addition, because we assume that the input bits are independent, running the debiaser multiple times on independent input bits will generate independent outputs.

However, you're question is "what if the bits aren't independent or don't share the same bias").

In that case, the above logic doesn't hold.

If the bits contain different biases, that is, if the first bit is 0 with probability $p$, and the second bit is 0 with probability $q \ne p$), then probability of a 0 is $p(1-q) = p - pq$ and a probability of a 1 is $q(1-p) = q - pq$, as we assumed that $p \ne q$, these are obviously different, and so the output of the dibiaser is biased.

If the bits share the same average bias, but are correlated, what may happen is that sequence bits from the debiaser may be correlated. For example, if a 01 sequence is followed by another 01 sequence 90% of the time (and similarly for a 10 sequence), then adjacent outputs from the debiaser will themselves be strongly correlated, that is, the dibiaser has failed to turn the raw input string into something 'random'.


Previous answer (which just gave counterexamples):

Consider a case where the even input bits are 0 90% of the time, and the odd input bits are 1 90% of the time.

In that case, the output of the Von Neumann corrector is even more strongly biased than the input.

Exercise for you: devise a similar counterexample where the bits are not independent of each other...


Paul complained that my example was "too extreme" (whatever that means - I had thought that any counterexample would apply). In any case, I'll respond with another, even more extreme, example.

Consider a raw source that generated pairs of bits with this probability distribution:

  • 00 with probability 1/3
  • 01 with probability 1/3
  • 11 with probability 1/3

(with each pair being uncorrelated with any other pair).

Simple computation shows that each pair has Shannon entropy of $log_2(3) \approx 1.5850$ bits (or about $0.7925$ entropy bits per bit).

However, if we run that through a Von Neumann corrector, we get a stream that has, well, 0 bits of entropy...

Paul Uszak avatar
cn flag
Well, Paul's a git which is why nobody likes him. But on the substance, your final para is mathematically wrong. Sorry. Entirely probabilistic runs will occur like `000000110001` from which vN can extract entropy. H(stream) $\ne$ 0. And the more advanced vN version (that uses more than pairwise comparisons) will do even better.
Paul Uszak avatar
cn flag
Not very good with the maths thingie, but `01` means $a < b$ which is a valid extracted bit. So I estimate H(stream) efficiency = 17%, or 0.27 bits/pair in the long view. Nobody said vN is efficient. And the more advanced vN version (that uses more than pairwise comparisons) will do even better.
Paul Uszak avatar
cn flag
And please upvote the question if you thought it worthwhile answering it...
Paul Uszak avatar
cn flag
Re: _"The basic Von Neumann dibiaser (sic)_" is also incorrect. vN isn't bitwise based, it's sample based as detailed [here](https://crypto.stackexchange.com/a/87362/23115). That's what I meant with $a$ and $b$.
mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.