Score:1

The Diffie-Hellman-based Private set intersection protocol cannot pass simulation proof?

vn flag

Given the popular Private Set Intersection (PSI) protocol first described in [1]:

  • Alice choose a random $a$, and sends $\{H(x_i)^{a}\bmod p\}| (i=1,...m)$ to Bob.
  • Bob choose a random $b$, and sends $\{H(y_i)^{b}\bmod p\}| (i=1,...n)$ to Alice.
  • Alice computes and sends $\{H(y_i)^{ba}\bmod p\}| (i=1,...n)$ to Bob.
  • Bob computes and sends $\{H(x_i)^{ab}\bmod p\}| (i=1,...m)$ to Alice.
  • Each party locally compute the intersections.

Question 1: If (with a very small chance) there exist $x_1$ and $x_2$ that $H(x_1)=H(x_2)^2\bmod p$, then Bob could discover it because $H(x_1)^a=(H(x_2)^a)^2\bmod p$. Surely Bob could not simulate this information. Does it mean that this protocol violate semihonest security?

I discussed this with friends, some said this does not violate semihonest security, because the chance of $H(x_1)=H(x_2)^2\bmod p$ is negligable. But I think it does, because in Definition 4.1. of the simulation proof tutorial [2], the simulation should always succeed, and should not depend on the input $\{x,y\}$.

Question 2: In the PSI protocol (Fig.3) of [3], they use $H(H(x_i)+H(x_i)^{a})$ instead of $H(x_i)^{a}$ (notice that the protocol is still correct if they use $H(x_i)^{a}$), this seems to strengthen my point (naive DH-PSI protocol is not provable), because $H(H(x_i)+H(x_i)^{a})$ is easier to simulate than $H(x_i)^{a}$ . Is my understanding correct?

Thanks.

  1. Huberman B A, Franklin M, Hogg T. Enhancing privacy and trust in electronic communities[C]// ACM Conference on Electronic Commerce. ACM, 1999:78-86
  2. Lindell Y, How to simulate it, https://eprint.iacr.org/2016/046.pdf
  3. Heinrich A, Hollick M, Schneider T, et al. PrivateDrop: Practical Privacy-Preserving Authentication for Apple AirDrop, USENIX'SEC 21
Score:1
us flag

If (with a very small chance) there exist $x_1$ and $x_2$ that $H(x_1) = H(x_2)^2$, then Bob could discover it ... the simulation should always succeed, and should not depend on the input $\{x,y\}$.

I agree with your friend that this observation doesn't violate semi-honest security.

In the semi-honest model, the inputs are independent of the random oracle. In other words, the inputs are fixed first, and then $H$ is sampled. The environment does not have access to the random oracle (in the local random oracle model), so its choice of inputs for the honest parties' doesn't depend on the random oracle. So the event that $H(x_1) = H(x_2)^2$ does not depend on $x_1, x_2$. It is negligibly likely for all inputs, so the simulator can safely ignore this case.

In the PSI protocol (Fig.3) of [3],

I'm not sure what your question here is. In [3] they need a malicious-secure PSI protocol, and the classic DH-PSI protocol is not. So they use the more complicated protocol of Jarecki & Liu.

The bigger question seems to be about whether classic DH-PSI "can't be simulated," where presumably you mean against malicious adversaries. I agree. Just think about the case where the parties have 1 item each. Consider the case where a corrupt Alice queries the random oracle at $x_0$ and $x_1$, flips a coin $b$, and sends $H(x_b)^a$ for random exponent $a$. The random exponent $a$ makes the simulator's view (random oracle queries $x_0, x_1$ and protocol message $H(x_b)^a$) perfectly independent of $b$. But the simulator must guess $b$ in order to extract correctly.

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.