How fast does revealing parity bits leak information?

interfect

12/15/23, 3:59 AM

I've got a scheme where I XOR a secret key value with a public (but random) value, XOR together all the bits of the result, and publish that bit (0 or 1), which is the parity of the result of the XOR. My goal is for this published bit to be hard to predict, for a given public value, before it is published.

I'm certain that this will leak information about the key: I'm doing XOR with a static key and a known plaintext, which is a classic easy-to-break cipher, and even if I'm only publishing a little information about each value that results from the reused key, eventually there's got to be a way to get the key from the published bits.

My question is:

1. How fast does this leak information? If the key and the public values are, say, 128 bits, is the key "good for" producing unpredictable bits for 128 values on average before the next value can be reliably predicted? Or more? Or less?
2. How do you go about exploiting the leak to recover the key from the public values and the parity bits from the XORs? You end up knowing the parity of a bunch of XORs against the same number, which is like a certain number of equations over unknowns which are the key bits, but I never learned how to do boolean linear algebra.
3. Is there an easy way to fix this scheme? The key material is supposed to be (handwaving) shared between two parties; I want them to both be able to agree on this unpredictable bit. Do I just swap out the static shared key with a real stream cipher on a shared secret seed, so the key can rotate?

2 + 1

cryptanalysis

stream-cipher

algorithm-design

fgrieu

12/15/23, 8:03 AM

The goal _"published bit be hard to predict"_ is not met: knowing messages $m_0$ and $m_1$ (of the same size), and the output $b_0$ for message $m_0$, we can compute the output $b_1$ for message $m_1$ as $$b_1\gets b_0\oplus\operatorname{parity}(m_0)\oplus\operatorname{parity}(m_1)$$ What leaks about the key depends on whether message and key are required to be the same size (as assumed in [this answer](https://crypto.stackexchange.com/a/103286/555)), and on what happens if they are not, which the question does not say.
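The prediction formula above can be tried out with a minimal sketch, assuming 128-bit values held as Python integers. The function names `parity`, `publish_bit` and `predict_bit` are illustrative and not from the thread.

```python
import secrets

def parity(x: int) -> int:
    """Return the XOR of all bits of x (0 or 1)."""
    return bin(x).count("1") & 1

def publish_bit(key: int, public: int) -> int:
    """The scheme from the question: parity of (key XOR public value)."""
    return parity(key ^ public)

def predict_bit(m0: int, b0: int, m1: int) -> int:
    """Predict the bit for m1 from one observed pair (m0, b0), without the key."""
    return b0 ^ parity(m0) ^ parity(m1)

key = secrets.randbits(128)   # secret; the predictor never sees it
m0 = secrets.randbits(128)
b0 = publish_bit(key, m0)     # the single observed reveal

# Every later reveal is predicted correctly from (m0, b0) alone.
for _ in range(1000):
    m1 = secrets.randbits(128)
    assert predict_bit(m0, b0, m1) == publish_bit(key, m1)
```

A single observed reveal is enough here; no further samples improve the attack because nothing else about the key is needed.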

Score:3

Crypto

Daniel S

12/15/23, 7:13 AM

As noted by Mark, the parity bit leaks precisely one bit of information about the key. This bit of information is the parity of the key. The key observation here is that $$\mathrm{parity}(k\oplus p)=\mathrm{parity}(k)\oplus \mathrm{parity}(p)$$
The leak cannot be further exploited to enable key recovery, because the same bit of information is leaked each time. To see this, note that if we swap out $k$ for a different number with the same parity, we will leave the sequence of bit reveals unchanged. However, the leak of the first reveal means that the desire to keep future reveals hard to predict is not met. Seeing the first reveal lets me compute $\mathrm{parity}(k)$. Seeing any subsequent public value $p$ now allows me to compute $\mathrm{parity}(k\oplus p)$ using the formula above even though I know nothing further about $k$.
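For readers who want the parity identity spelled out, here is a short derivation. It assumes $k$ and $p$ both have $n$ bits, with $k_j$ and $p_j$ denoting their individual bits (notation introduced here for illustration); it only uses the fact that XOR is associative and commutative.

$$\mathrm{parity}(k\oplus p)=\bigoplus_{j=1}^{n}(k_j\oplus p_j)=\Big(\bigoplus_{j=1}^{n}k_j\Big)\oplus\Big(\bigoplus_{j=1}^{n}p_j\Big)=\mathrm{parity}(k)\oplus\mathrm{parity}(p)$$

Since $\mathrm{parity}(p)$ is public, each reveal is $\mathrm{parity}(k)$ XORed with a known bit, which is why every reveal carries the same single bit about $k$.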
Could you elaborate further on your design goals? Why are the bit reveals made public? I’m reluctant to offer suggestions when I’m not sure what the end goal is.

+ 1

interfect

12/15/23, 4:26 PM

I'm trying to produce a set of bit streams from $N$ parties, where each stream is, independently, a cryptographically-strong keystream against up to $N-2$ other colluding participants, but each stream equals the other $N-1$ streams XORed together. If I structure the $k$ values right I can do a multi-party dance to set them up so each party only has their own, while the parity of everybody's "unpredictable" bits is collectively 0. I think I can skip this whole widget and use pairwise-shared stream ciphers, where each party's keystream is the XOR of all the shared ones with the other parties.
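The pairwise idea in the last sentence above can be sketched roughly as follows. This is an illustration under assumptions, not a vetted protocol: SHAKE-256 stands in for "a real stream cipher", the pairwise seeds are simply generated in one place instead of coming from a key exchange, and the names `keystream`, `xor_bytes` and `party_streams` are made up for the example.

```python
import hashlib
import secrets
from functools import reduce
from itertools import combinations

def keystream(seed: bytes, nbytes: int) -> bytes:
    """Expand a shared seed into nbytes of keystream.
    SHAKE-256 is an assumed stand-in for a real stream cipher."""
    return hashlib.shake_256(seed).digest(nbytes)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def party_streams(n_parties: int, nbytes: int) -> list[bytes]:
    """Give each party the XOR of the keystreams it shares with every other party."""
    # One seed per unordered pair {i, j}; in practice each pair would agree
    # on its seed privately (hypothetical key exchange, not shown).
    seeds = {pair: secrets.token_bytes(32)
             for pair in combinations(range(n_parties), 2)}
    streams = []
    for i in range(n_parties):
        shared = [keystream(s, nbytes) for pair, s in seeds.items() if i in pair]
        streams.append(reduce(xor_bytes, shared))
    return streams

streams = party_streams(n_parties=5, nbytes=16)
# Each pairwise keystream appears in exactly two parties' streams,
# so the XOR over all parties cancels to zero.
assert reduce(xor_bytes, streams) == bytes(16)
# Equivalently, each stream equals the XOR of the other N-1 streams.
assert streams[0] == reduce(xor_bytes, streams[1:])
```

Party $i$'s stream includes one keystream shared with each other party, so a coalition missing even one of those pairwise seeds cannot compute it, provided the underlying keystream generator is secure.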

Score:1

Crypto

Mark

12/15/23, 6:38 AM

A few things.

This (clearly) leaks at most one bit per invocation, so the key might be good for slightly more than 128 calls, but only slightly. Roughly what happens is that each public value $\vec p_i$ gives you a linear equation $\langle \vec p_i, \vec k\rangle = b_i$, where $b_i$ is the computed parity bit and $\vec k$ is the key. Here $\langle \vec p_i, \vec k\rangle$ is notation for an inner product, which modulo 2 is the parity of the bitwise AND of the two values.

Anyway, given sufficiently many equations, one can stack them into a linear system $P\vec k = \vec b$, where the rows of $P$ are the $\vec p_i$ and $\vec b$ collects the parity bits. Provided $P$ is "full rank", one can solve the system uniquely over $\mathbb{F}_2$ and recover $\vec k$. The problem then reduces to computing the probability that $P$ is full rank. If you assume each $\vec p_i$ is uniformly random this can be done, but it's not pretty (meaning the analysis is ugly; actual key recovery is simple and efficient). You might need slightly more than 128 samples to recover $\vec k$, but I'd be surprised if you needed more than some small additive term more (for example, by the time you get 150 samples I'd expect you to be able to solve the linear system).
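As a rough illustration of the "boolean linear algebra" involved, the sketch below recovers $\vec k$ by Gauss-Jordan elimination over $\mathbb{F}_2$ in the inner-product model described in this answer. Vectors are packed into Python integers, and the names `dot2` and `solve_gf2` are hypothetical helpers written for this example.

```python
import secrets

def dot2(a: int, b: int) -> int:
    """Inner product over GF(2): parity of the bitwise AND."""
    return bin(a & b).count("1") & 1

def solve_gf2(rows: list[int], bits: list[int], n: int):
    """Solve P k = b over GF(2) by Gauss-Jordan elimination.

    rows[i] is the n-bit public value p_i and bits[i] the observed bit b_i.
    Returns k as an int, or None if P does not have rank n.
    """
    # Augmented rows: the low n bits hold p_i, bit n holds b_i.
    aug = [r | (b << n) for r, b in zip(rows, bits)]
    pivots = []  # pivots[c] is the reduced row whose pivot is column c
    for col in range(n):
        mask = 1 << col
        idx = next((i for i, r in enumerate(aug) if r & mask), None)
        if idx is None:
            return None  # no remaining row has this column set: rank deficient
        piv = aug.pop(idx)
        # Clear this column from every other row (adding rows is XOR).
        aug = [r ^ piv if r & mask else r for r in aug]
        pivots = [r ^ piv if r & mask else r for r in pivots]
        pivots.append(piv)
    # Each pivot row now reads "k_c = (bit n of the row)".
    return sum(((row >> n) & 1) << c for c, row in enumerate(pivots))

n = 128
k = secrets.randbits(n)
samples = [secrets.randbits(n) for _ in range(150)]
observed = [dot2(p, k) for p in samples]
recovered = solve_gf2(samples, observed, n)
# Recovery fails only if the 150 random samples happen not to span all 128 dimensions.
assert recovered is None or recovered == k
```

For uniformly random samples, the probability that $m \ge n$ samples give a rank-$n$ system is $\prod_{j=0}^{n-1}\left(1-2^{\,j-m}\right)$, which is greater than $1-2^{\,n-m}$; with $n=128$ and $m=150$ the failure probability is therefore below $2^{-22}$.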
Yes, there are many ways. Swapping out $\vec k$ for the output of a stream cipher (or block cipher) should work. Note that (for a block cipher) both parties will need to maintain synchronization. But you can compute $b_i = \langle \vec p_i, E_{\vec k}(i)\rangle$ and it should be fine ("maintain synchronization" means remembering $i$ in order to compute $E_{\vec k}(i)$). The downside of this is that if you lose synchronization you lose agreement, and reusing $i$ is also very bad (it doesn't obviously lead to key recovery, i.e. it is not as bad as before, but it will make things predictable).
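A minimal sketch of the counter-based variant follows. Python's standard library has no block cipher, so HMAC-SHA256 truncated to 128 bits is swapped in as a stand-in for $E_{\vec k}(i)$; that substitution, the class name `BitSource`, and the helper names are assumptions of this example and not part of the answer.

```python
import hashlib
import hmac

def dot2(a: int, b: int) -> int:
    """Inner product over GF(2): parity of the bitwise AND."""
    return bin(a & b).count("1") & 1

def prf_block(key: bytes, counter: int) -> int:
    """Stand-in for E_k(i): HMAC-SHA256 of the counter, truncated to 128 bits."""
    tag = hmac.new(key, counter.to_bytes(16, "big"), hashlib.sha256).digest()
    return int.from_bytes(tag[:16], "big")

class BitSource:
    """Holds the shared key and the synchronization counter i."""

    def __init__(self, key: bytes) -> None:
        self.key = key
        self.counter = 0

    def next_bit(self, public: int) -> int:
        bit = dot2(public, prf_block(self.key, self.counter))
        self.counter += 1  # each counter value is used exactly once
        return bit

key = bytes(range(16))  # demo key only; a real key must be secret and random
alice, bob = BitSource(key), BitSource(key)
publics = [0x0123456789ABCDEF << i for i in range(32)]
# Both parties agree as long as they process the same values in the same order.
assert [alice.next_bit(p) for p in publics] == [bob.next_bit(p) for p in publics]
```

If one party skips or repeats a call, the counters diverge and the bits stop agreeing, which is the synchronization downside mentioned above.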

There are other things you can do as well that might be simpler. In particular, I imagine computing $H(\vec k||\vec p_i)$ for a cryptographic hash function $H$ should be fine. This will give many output bits at once, but you could always just take the parity of the result. The downside of this compared to before is that hash function calls tend to be more expensive than block/stream cipher calls. If that doesn't matter in your case, it's maybe preferable though, because it is stateless, and the only risk of things being predictable is if the same public value $\vec p_i$ occurs again.
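The stateless hash variant could look something like the sketch below, assuming SHA-256 as $H$ and fixed-length byte strings for the key and public value. The function name `hash_bit` is illustrative.

```python
import hashlib
import secrets

def hash_bit(key: bytes, public: bytes) -> int:
    """Parity of SHA-256(key || public), as one unpredictable-looking bit."""
    digest = hashlib.sha256(key + public).digest()
    return bin(int.from_bytes(digest, "big")).count("1") & 1

key = secrets.token_bytes(16)     # shared secret, fixed length
public = secrets.token_bytes(16)  # public random value, fixed length
bit = hash_bit(key, public)
assert bit in (0, 1)
# Stateless: both parties get the same bit for the same public value.
assert hash_bit(key, public) == bit
```

Keeping the key at a fixed length avoids ambiguity about where the key ends and the public value begins in the concatenation. A keyed construction such as HMAC would be a more conservative choice than plain concatenation.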

+ 0
