Score:1

What are the expected values of a particular rotational-XOR property of a sequence of random bitstrings?


Assuming that $x$ is a sequence of $l$ bits and $0 \le n < l$, let $R(x, n)$ denote the result of the left bitwise rotation of $x$ by $n$ bits. For example, if $x = 0100110001110000$, then $$\begin{array}{l} R(x,0) = {\rm{0100110001110000}},\\ R(x,1) = {\rm{1001100011100000}},\\ R(x,2) = {\rm{0011000111000001}},\\ \ldots \\ R(x,15) = {\rm{0010011000111000}}. \end{array}$$
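For concreteness, the rotation can be sketched in Python, modeling an $l$-bit string as an unsigned integer (the name `rotate_left` is mine, not from the question):

```python
def rotate_left(x: int, n: int, l: int) -> int:
    """R(x, n): left bitwise rotation of the l-bit value x by n positions.
    Bitstrings are modeled as unsigned integers of width l."""
    n %= l                        # rotations repeat with period l
    mask = (1 << l) - 1           # keep only the low l bits
    return ((x << n) | (x >> (l - n))) & mask
```

For the example above, `rotate_left(0b0100110001110000, 1, 16)` gives `0b1001100011100000`, matching $R(x,1)$.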

Let $A \oplus B$ denote the result of the XOR operation for two sequences of $l$ bits. For example, $$0100110001110000 \oplus 1010010001000010 = 1110100000110010.$$

Let $H(x)$ denote the number of non-zero bits in $x$ (i.e. the Hamming weight of $x$).

Assuming that $x$ and $y$ are two bitstrings of the same length $l$, let $f(x, y)$ denote the minimal element (the smallest number) in the tuple $$\begin{array}{l} (H(x \oplus y),\\ H(x \oplus R(y,1)),\\ H(x \oplus R(y,2)),\\ \ldots \\ H(x \oplus R(y,l - 1))). \end{array}$$
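A minimal Python sketch of $f$, again modeling bitstrings as $l$-bit unsigned integers:

```python
def hamming(x: int) -> int:
    # H(x): number of set bits (Hamming weight).
    return bin(x).count("1")

def f(x: int, y: int, l: int) -> int:
    """Minimum over all rotation amounts n of H(x XOR R(y, n)),
    following the definition above directly."""
    mask = (1 << l) - 1
    return min(hamming(x ^ (((y << n) | (y >> (l - n))) & mask))
               for n in range(l))
```

Note that $f(x, y) = 0$ exactly when $x$ is some rotation of $y$.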

Suppose that we have a TRNG which generates sequences of random bits. Generate a sequence of $L = k \times l$ bits. Split this sequence into $k$ words (so the length of each word is $l$): $w_0, w_1, \ldots, w_{k-1}$. Then compute the following tuple $T$ of numbers:

$$\begin{array}{l} (f({w_0},{w_1}),\\ f({w_0},{w_2}),\\ \ldots \\ f({w_0},{w_{k - 1}}),\\ f({w_1},{w_2}),\\ f({w_1},{w_3}),\\ \ldots \\ f({w_1},{w_{k - 1}}),\\ f({w_2},{w_3}),\\ \ldots \\ f({w_{k - 2}},{w_{k - 1}})). \end{array}$$

In other words, for every pair of words $(w_i, w_j)$ such that $i < j$, compute the corresponding $f({w_i},{w_j})$.
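The whole experiment can be sketched as follows; `random.getrandbits` stands in for the TRNG here, and `f` is restated inline so the snippet is self-contained:

```python
from itertools import combinations
from random import getrandbits

def f(x: int, y: int, l: int) -> int:
    # min over rotations n of H(x XOR R(y, n)); bitstrings as l-bit ints.
    mask = (1 << l) - 1
    return min(bin(x ^ (((y << n) | (y >> (l - n))) & mask)).count("1")
               for n in range(l))

def experiment(k: int, l: int):
    """Draw k random l-bit words, build T over all pairs i < j,
    and return the pair (M_T, A_T)."""
    words = [getrandbits(l) for _ in range(k)]
    T = [f(wi, wj, l) for wi, wj in combinations(words, 2)]
    return min(T), sum(T) / len(T)
```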

Question 1: given $k$ and $l$, how to compute the expected value of the minimal number $M_T$ in $T$?

Question 2: given $k$ and $l$, how to compute the expected value of the average number $A_T$ in $T$? Here the number $A_T$ is computed as follows: sum all elements of $T$, then divide the sum by the total number of elements in $T$.

The expected number here implies the number with the maximum probability. For example, the expected number of zero bits in a sequence of $l$ random bits is $l/2$.

kodlu: "The expected number here implies the number with the maximum probability. For example, the expected number of zero bits in a sequence of $l$ random bits is $l/2$." This statement need not be true.
kodlu: The question as asked is very difficult. I will type up an answer to a related question when I have more time.
@kodlu: I am interested in an explanation of the first comment. If the probability of either 0 or 1 is 50%, what is the expected number of zero bits in a sequence of $l$ bits, assuming that $l$ is even?
kodlu: I only meant that, for more general distributions that can arise in the analysis, that property need not hold; you are taking Hamming weights and minimizing over them.
Score:0

Since you say the input strings are outputs of a TRNG, I shall take each of the terms as uniformly distributed random vectors.

You define
$$f(x, y)=\min\{H(x \oplus y),H(x \oplus R(y,1)),\ldots,H(x \oplus R(y,\ell-1))\},$$ which I shall model as the minimum Hamming weight of a set of $\ell$ uniformly and independently chosen random binary $\ell$-tuples.

Each Hamming weight in this set is distributed as $\textsf{Binomial}(\ell,1/2).$ So we have the minimum of $\ell$ unbiased binomial samples on $\{0,1,\ldots,\ell\}$. The minimum is also called the first order statistic, and letting $F(u)=\mathbb{Pr}[X\leq u]$, where $X$ is binomially distributed as above (you are already using $x$ as a variable, hence the $u$), we have $$ \mathbb{Pr}[f(x,y)\leq u]=1-(1-F(u))^{\ell}. $$
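These quantities can be sketched in Python with the exact binomial CDF via `math.comb`; treating the $\ell$ rotated samples as independent is an assumption of the model, since rotations of the same word are correlated:

```python
from math import comb

def F(u: int, l: int) -> float:
    # CDF of Binomial(l, 1/2) evaluated at u.
    return sum(comb(l, j) for j in range(u + 1)) / 2 ** l

def cdf_f(u: int, l: int) -> float:
    # P[f(x, y) <= u] = 1 - (1 - F(u))^l under the independence model.
    return 1 - (1 - F(u, l)) ** l
```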

If the randomness hypothesis holds, then your next step looks at all $\binom{k}{2}$ pairs of these minima, so in effect you are looking at the minimum of a collection of $$\binom{k}{2}\ell$$ binomial samples. This is $O(k^2\ell)$ samples, and the minimum will go to zero quite fast if the randomness hypothesis above holds, so you'd want to consider $$ 1-\left[1-(1-F(u))^{\ell}\right]^{\binom{k}{2}}. $$

Based on this my tentative answers are:

Question 1: given $k$ and $l$, how to compute the expected value of the minimal number $M_T$ in $T$?

This will be close to zero whenever $k$ is sizable compared to $\ell.$ You might want to use the Gaussian approximation to the binomial and use the Gaussian cumulative distribution function to evaluate an estimate.
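Under the independence model the expectation can also be evaluated exactly, using the identity $E[X]=\sum_{u\ge 0}\mathbb{Pr}[X>u]$ for a nonnegative integer-valued variable. A sketch (treating all $\binom{k}{2}\ell$ samples as independent is an approximation):

```python
from math import comb

def expected_min(k: int, l: int) -> float:
    """Approximate E[M_T]: expected minimum of n = C(k,2)*l iid
    Binomial(l, 1/2) samples, via E[X] = sum_u P[X > u]."""
    n = comb(k, 2) * l
    cum = 0                      # running sum of binomial coefficients
    total = 0.0
    for u in range(l):           # P[X > u] = 0 for u >= l
        cum += comb(l, u)
        total += (1 - cum / 2 ** l) ** n   # P[min > u] = (1 - F(u))^n
    return total
```

For instance, `expected_min(2, 1)` is exactly $0.5$, and the value drops toward zero as $k$ grows with $\ell$ fixed.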

Question 2: given $k$ and $l$, how to compute the expected value of the average number $A_T$ in $T$? Here the number $A_T$ is computed as follows: sum all elements of $T$, then divide the sum by the total number of elements in $T$.

This is now the average of the individual order statistics instead of the minimum of the minima. Each term is distributed as the minimum of $\ell$ binomial samples, with CDF $$ 1-(1-F(u))^{\ell}. $$
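By linearity of expectation, $E[A_T]$ equals the expected value of a single $f(w_i,w_j)$, so under this model it does not depend on $k$ at all; the same summation trick as for the minimum applies:

```python
from math import comb

def expected_f(l: int) -> float:
    """Approximate E[f(w_i, w_j)]: expected minimum of l iid
    Binomial(l, 1/2) samples; equals E[A_T] by linearity."""
    cum = 0
    total = 0.0
    for u in range(l):
        cum += comb(l, u)
        total += (1 - cum / 2 ** l) ** l   # P[f > u] = (1 - F(u))^l
    return total
```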

Remark: If you want to directly use approximations to the binomial you can see my answer to the following mathoverflow question. Note that for the binomial $\textsf{Binomial}(\ell,1/2),$ and for any $u \in \{0,1,\ldots,\ell\}$ we have $$ 2^{-\ell}\sum_{j\leq u-1} \binom{\ell}{j}\leq F(u)\leq 2^{-\ell}\sum_{j\leq u} \binom{\ell}{j} $$

Is it possible to have an explicit formula to find the expected value of $M_T$, assuming that the formula must only contain $k$ and $l$ as variables?