Score:3

Privacy intuition vs formal definition

Suppose we define privacy as a game where a machine $M$ holds a coin $b$ and, on input a pair $(M_0, M_1)$, always replies with an encryption of $M_0$ if $b=0$ and of $M_1$ if $b=1$. The adversary can send as many pairs $(M_0, M_1)$ as he wants; his goal is to guess $b$. If he cannot do better than guessing at random (i.e., succeed with probability $1/2$), then privacy holds. (This is a simplified version of Benaloh's ballot-privacy definition, if anyone is curious.)
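For concreteness, here is a minimal Python sketch of that game, assuming a left-or-right oracle keyed by the secret coin $b$; the names (`PrivacyGame`, `encode`) are illustrative and not from any library, and a concrete machine is described below:

```python
import secrets

class PrivacyGame:
    def __init__(self, encode):
        self.b = secrets.randbelow(2)   # the machine's secret coin
        self.encode = encode            # the machine's (randomized) encoding function

    def query(self, m0, m1):
        # The adversary may call this as often as it likes; the oracle
        # always encodes the message selected by the hidden coin b.
        return self.encode(m1 if self.b else m0)

    def wins(self, guess):
        # The adversary wins if it guesses b; privacy holds when no
        # adversary wins with probability noticeably better than 1/2.
        return guess == self.b
```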

Our toy machine $M$ works as follows:

  • samples a random permutation $P$;
  • applies $P$ to the base mapping ["Yes" = $A$; "No" = $B$; "Maybe" = $C$];
  • replies with the new code corresponding to the answer.

Example: if $P = [2,3,1]$, then the new map is "Yes" = $C$, "No" = $A$, "Maybe" = $B$, and $M$ will reply with $C$ for "Yes", $A$ for "No", and $B$ for "Maybe".
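As a sanity check, here is a minimal Python sketch of $M$, assuming (as the answers below make explicit) that $M$ keeps no memory between runs and samples a fresh uniform permutation on every reply; all names are illustrative:

```python
import random

ANSWERS = ["Yes", "No", "Maybe"]   # base mapping order: Yes=A, No=B, Maybe=C
CODES = ["A", "B", "C"]

def machine_encode(answer: str) -> str:
    # Fresh uniform permutation of the codes on every reply
    perm = random.sample(CODES, len(CODES))
    table = dict(zip(ANSWERS, perm))   # e.g. {"Yes": "C", "No": "A", "Maybe": "B"}
    return table[answer]
```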

We know $M$ will use one of the $3!=6$ possible permutations, which means that in $2/3$ of the cases "Yes" will not be encoded as $A$. However, I cannot construct a distinguisher for the privacy game from this fact.

So, on the one hand, intuition says that the toy machine $M$ leaks information: looking at the code, we can make a reasonable guess about the answer. On the other hand, I cannot formally prove it's not private under the simplified definition.

I don't know if it's the (simplified) definition's drawback, faulty intuition, or my poor understanding of conditional probabilities. Any ideas?

Edit: there is no leakage if we cannot distinguish between 'real usage' ($N$ different replies with (possibly) unequal, publicly known prior probabilities over "Yes"/"No"/"Maybe") and the 'ideal case' (where $M$ outputs $N$ codes $A$, $B$, or $C$ uniformly at random, regardless of the distribution of the answers).
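A hedged sketch of the two worlds in this definition, under the same fresh-uniform-permutation assumption; the 75/20/5 prior is the example used later in the comments:

```python
import random

ANSWERS = ["Yes", "No", "Maybe"]
CODES = ["A", "B", "C"]
PRIOR = [0.75, 0.20, 0.05]   # illustrative known prior over the answers

def machine_encode(answer):
    # Same toy machine M as above: fresh uniform permutation per reply
    return dict(zip(ANSWERS, random.sample(CODES, len(CODES))))[answer]

def real_world(n):
    # 'Real usage': n answers drawn from the known prior, each encoded by M
    answers = random.choices(ANSWERS, weights=PRIOR, k=n)
    return [machine_encode(a) for a in answers]

def ideal_world(n):
    # 'Ideal case': n codes chosen uniformly, ignoring the answers entirely
    return random.choices(CODES, k=n)
```

Under that assumption both worlds output i.i.d. uniform codes, so the two transcripts are identically distributed and no distinguisher can do better than $1/2$.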

fgrieu:
It's correct that Yes in 2/3 of the cases will not be A. You can't conclude from this fact that the output leaks information about the input (argument: No in 2/3 of the cases will not be A either). And the output does not leak information; that's why you cannot construct a distinguisher. You want to state your definition of "leaks information", and then verify that it does not hold.
pintor:
@fgrieu, why don't you think it's a leakage? Say someone uses the machine $M$ to reply to "Do you support a cause X?". If I see "A", most likely the person is not a supporter. It's not 100%, but more than nothing, which counts as leakage, no? So, you say intuition is faulty, right?
fgrieu:
In the case at hand, you can't conclude anything about what someone encoded from seeing "A" out of the machine $M$. That has probability 1/3 regardless of what someone encoded. To convince you of this, check that for any of the 3 encoded choices, 2 out of the 6 permutations lead to an output of A. Any contrary intuition is incorrect. Again, you want to state your definition of "leaks information", and then verify that it does not hold.
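A quick exact enumeration of that check (illustrative Python, not part of the original comment):

```python
from itertools import permutations

ANSWERS = ["Yes", "No", "Maybe"]

for answer in ANSWERS:
    # Count the permutations of A/B/C that encode this answer as "A"
    hits = sum(dict(zip(ANSWERS, perm))[answer] == "A"
               for perm in permutations("ABC"))
    print(answer, hits, "of 6 permutations")   # 2 of 6, i.e. 1/3, for each answer
```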
pintor:
@fgrieu I added the leakage definition. As I understand it, no leakage means the codes are uniform (1/3 for each letter) even when the answer selections are not (say we know for a fact that "Yes" is selected in 75% of cases, "No" in 20%, and "Maybe" in 5%).
fgrieu:
By my reading, your definition of "no leakage" is met, because regardless of the input, the output of $M$ is A, B or C, each with probability 1/3. Thus nothing can be told about the input or its distribution from the output.
pintor:
@fgrieu, I'm slowly getting your point. I'm still looking for possible scenarios that would cause privacy problems, but so far they all fail when I compute the probabilities. Do you mind writing an answer before the bots take over? Special kudos if you cite a formal definition of information leakage.
Score:2

No information can leak about the input of machine $M$ from the output of machine $M$, in any game. That's because $M$ can be viewed as a probabilistic Turing machine (with no memory of earlier runs), and for each fixed input of $M$ the distribution of the output is the same. It's easy to check the latter by examination: there are three possible inputs, and for each there are six equally likely permutations, of which two yield output A, two yield output B, and two yield output C, so that each of these outputs has probability 1/3.

Note: what matters is not that the output values have the same probability. What matters is that the probability of each output value for a given input is independent of the input.
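A small exact-enumeration sketch of this point: the table of $P(o|i)$ has identical rows, so the output distribution is independent of the input (illustrative code, not from the answer):

```python
from fractions import Fraction
from itertools import permutations

ANSWERS = ["Yes", "No", "Maybe"]
CODES = ["A", "B", "C"]

for answer in ANSWERS:
    row = {c: Fraction(0) for c in CODES}
    for perm in permutations(CODES):        # the 6 equally likely permutations
        row[dict(zip(ANSWERS, perm))[answer]] += Fraction(1, 6)
    print(answer, row)   # every row is {'A': 1/3, 'B': 1/3, 'C': 1/3}
```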

Note: in the privacy game, the input of machine $M$ is restricted to two inputs A/Yes and B/No. Therefore, to prove that $M$ does not leak information in the privacy game, we only need to check that the distributions of the output of $M$ for each of these two inputs of $M$ are the same.

For a quantitative approach to information leakage, I think we could define it as the base-2 logarithm of the ratio of the probability of guessing the input correctly knowing the output of $M$, to the probability of guessing the input correctly without knowing the output of $M$. That could be computed from the distribution of inputs in the experiment, and the conditional probabilities $P(o|i)$ of output $o$ given input $i$ for the machine under consideration. I wish I could tell exactly how, and hope it's in this.
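One plausible reading of that definition, sketched in Python under the assumption of an optimal Bayesian guesser and the 75/20/5 example prior from the comments (the prior and names are illustrative, not necessarily the intended computation):

```python
from math import log2

PRIOR = {"Yes": 0.75, "No": 0.20, "Maybe": 0.05}     # example prior over inputs
COND = {i: {o: 1/3 for o in "ABC"} for i in PRIOR}   # P(o|i) for machine M

# Best guess without seeing the output: the a-priori most likely input.
p_blind = max(PRIOR.values())

# Best guess given the output: for each output o, pick argmax_i P(i) * P(o|i).
p_seen = sum(max(PRIOR[i] * COND[i][o] for i in PRIOR) for o in "ABC")

print(log2(p_seen / p_blind))   # 0.0: the output does not improve the guess
```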

Score:1

I’m going to assume that when you say $M$ samples a random permutation, it does so uniformly, so that each permutation is equally likely. If this is not the case, then there may be information leakage.

One way to measure information leakage is to compute the mutual information of the input and output. If we write $p_y$, $p_n$ and $p_m$ for the probability of input “Yes”, “No” and “Maybe” respectively, we can easily see that $$\mathbb P(\mathrm{input}=\mathrm{“Yes”},\mathrm{output}=A)=p_y\times\frac13=\mathbb P(\mathrm{input}=\mathrm{“Yes”})\mathbb P(\mathrm{output}=A)$$ and similar independence in the other eight possible combinations. It follows that $$\log\left(\frac{\mathbb P(\mathrm{input}=\mathrm{“Yes”},\mathrm{output}=A)}{\mathbb P(\mathrm{input}=\mathrm{“Yes”})\mathbb P(\mathrm{output}=A)}\right)=0$$ and likewise for the other 8 logarithms in the expression for mutual information. Hence $$I(\mathrm{input},\mathrm{output})=0.$$
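A short numerical check of this computation (illustrative Python; the 75/20/5 prior is the example from the comments):

```python
from math import log2

PRIOR = {"Yes": 0.75, "No": 0.20, "Maybe": 0.05}     # example prior over inputs
COND = {i: {o: 1/3 for o in "ABC"} for i in PRIOR}   # P(output | input) for M

# Marginal distribution of the output: P(o) = sum_i P(i) P(o|i) = 1/3 each.
P_OUT = {o: sum(PRIOR[i] * COND[i][o] for i in PRIOR) for o in "ABC"}

# Mutual information I(input; output) = sum_{i,o} P(i,o) log2( P(o|i) / P(o) ).
mi = sum(PRIOR[i] * COND[i][o] * log2(COND[i][o] / P_OUT[o])
         for i in PRIOR for o in "ABC")
print(mi)   # 0.0: input and output are independent
```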
