Score: 3

Question about Theorem 2 in CRYSTALS-Kyber Paper


I have some questions about the Kyber paper, specifically about Theorem 2 on page 6, which I would like to ask here. First I quote the theorem from the paper, then I ask my questions.

Theorem 2. For any adversary A, there exists an adversary B such that $Adv_{Kyber.CPA'}^{cpa}(A) \leq 2 \cdot Adv_{k+1,k,\eta}^{mlwe}(B)$.


  1. My first question is about the definition of $Adv_{Kyber.CPA'}^{cpa}(A)$: how exactly is it defined here? If I understand CPA security correctly, the idea in game $G_0$ is that the adversary picks two plaintext messages, transmits them to the challenger, and the latter encrypts one of the two at random. The adversary then has to guess which message the challenger chose (the bit). In this setup, do all variables remain as defined in Algorithms 1-3 of the paper? But how exactly is $Adv_{Kyber.CPA'}^{cpa}(A)$ actually defined, and what are its parameters?

  2. In the change from game $G_0$ to game $G_1$, the $\mathbf{t}$, which was not uniformly distributed before, is replaced in game $G_1$ with $\mathbf{t}' = \mathbf{As} + \mathbf{e}$, which is now uniformly randomly distributed. It is then claimed that there is an adversary $B$ for which $|Pr(b=b' \text{ in game } G_0) - Pr(b=b' \text{ in game } G_1)| \leq Adv_{k,k,\eta}^{mlwe}(B) \leq Adv_{k+1,k,\eta}^{mlwe}(B)$ holds. I have some questions about this claim and the inequality.

  • First, what would such an adversary $B$ look like? If I understand correctly, it would have to try to distinguish between games $G_0$ and $G_1$, and the only difference between the games is the $\mathbf{t}$? Just an idea: could we perhaps define the adversary $B$ to take a $\mathbf{t}$ as input and otherwise proceed as in the encryption, so that the values $\mathbf{u}$ and $v$ follow the distribution $\beta$ if we are in game $G_0$ ($\mathbf{t} = \mathbf{t}$) and the uniformly random distribution if we are in game $G_1$ ($\mathbf{t} = \mathbf{t}'$)? If $B$ could distinguish games $G_0$ and $G_1$, surely it could also distinguish the samples from each other, and then MLWE would be no harder than distinguishing the games? Would this be a viable approach? Unfortunately, it is not clear from the paper how the adversary $B$ is specifically defined.

  • How is the inequality $|Pr(b=b' \text{ in game } G_0) - Pr(b=b' \text{ in game } G_1)| \leq Adv_{k,k,\eta}^{mlwe}(B)$ to be interpreted here? Does it mean that distinguishing games $G_0$ and $G_1$ is harder than winning the ordinary MLWE game, and why would that be the case? Or is it rather that distinguishing the games is no harder than winning the MLWE game?

  • Last, I am interested in why $Adv_{k,k,\eta}^{mlwe}(B) \leq Adv_{k+1,k,\eta}^{mlwe}(B)$ holds. It says that the adversary's advantage in the setting $(k,k,\eta)$ is smaller than in the setting $(k+1,k,\eta)$; how can that be? I thought that with increasing $k$ the complexity of MLWE increases and thus so does the security, so an attacker should find it harder to extract an advantage? This question ties back to the previous point about interpreting the inequality.


I hope that even though the questions are very technical, they are still understandable. I would be very happy to receive helpful comments and answers!

Score: 2
  1. Your understanding is correct. If by "what are the parameters" you mean "what are the parameters of Kyber", it (mostly) doesn't matter. You have some cryptosystem $Kyber.CPA'$ with some implicit parameters, and you use those parameters. The only part that matters is that the parameters of Kyber match the MLWE instance used to bound the advantage of Kyber. In particular, I assume Kyber is defined relative to constants $k, \eta$; as these are used in the MLWE instance, they should be used in Kyber as well.

  2. You misread. Game $G_0$ has $\mathbf{t}$ being structured, while game $G_1$ has $\mathbf{t}$ being uniformly random. Distinguishing $G_0$ and $G_1$ will (eventually) reduce to distinguishing $(A, As + e)$ from $(A, u)$, i.e. an LWE-type assumption (MLWE in this case).

  3. The MLWE adversary is as follows. It gets $(A, b)$, where $b$ is either $u$ (uniform) or $As + e$. It then uses this pair the way MLWE samples are used in Kyber to construct a PKE scheme. This PKE scheme is either Kyber in game $G_0$ or Kyber in game $G_1$. That is to say, $B$ simulates a game that is either game $G_0$ or $G_1$, and then uses a Kyber adversary to distinguish between these two games, and thereby distinguishes which MLWE distribution its own sample came from (see the calculation after this list).

    Note that there is some technical nuance here --- one has to use two hybrids in Kyber's security proof. First, one switches the public keys from being MLWE samples to being uniformly random. Then one does a similar switch for the encryption itself. This is where the factor 2 in the advantage expression comes from.

  4. Roughly, any adversary that can distinguish between games $G_0$ and $G_1$ yields an adversary with that advantage (or better) in the MLWE game.

  5. It depends on precisely what the advantage expression means. There are two natural parameters for MLWE --- a dimension/rank-type parameter, and a number-of-samples parameter. Increasing the dimension/rank should make the problem harder (as you seem to be saying), so in that case it is not clear the inequality should hold. Increasing the number of samples should make the problem easier (one can always ignore some samples). So if that parameter corresponds to dimension/rank, I agree it seems odd; if it corresponds to the number of samples, it makes sense --- just have the adversary ignore its last sample.
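To make points 1, 3, and 4 concrete, here is the game-hopping calculation spelled out (a sketch; the normalization of the advantage below is one common convention, so check the paper's preliminaries for the exact constant):

$$Adv_{Kyber.CPA'}^{cpa}(A) = \left| Pr[b = b'] - \tfrac{1}{2} \right|.$$

In game $G_2$ both the public key and the ciphertext have been replaced by uniformly random values, so $A$'s view is independent of $b$ and $Pr(b=b' \text{ in game } G_2) = \tfrac{1}{2}$. The triangle inequality then gives

$$Adv_{Kyber.CPA'}^{cpa}(A) \leq |Pr(b=b' \text{ in } G_0) - Pr(b=b' \text{ in } G_1)| + |Pr(b=b' \text{ in } G_1) - Pr(b=b' \text{ in } G_2)| \leq Adv_{k,k,\eta}^{mlwe}(B) + Adv_{k+1,k,\eta}^{mlwe}(B) \leq 2 \cdot Adv_{k+1,k,\eta}^{mlwe}(B),$$

which is exactly the bound of Theorem 2.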

P_Gate:
Hey @Mark, thanks for your reply! One minor question: in the second point you say that $t$ would be "structured". What do you mean by that specifically? I would have thought that in $G_0$ it is $t = As + e$, where $e$ comes from some distribution (it's the original $t$, unchanged).
Mark:
Yes, sorry: in $G_0$, $t = As + e$ is "structured". In $G_1$, it is uniformly random ("unstructured").
Daniel S:
In re point 5, the parameter does indeed correspond to $m$, the number of samples, per section 2.3. The inequality is saying that the MLWE problem with $m=k+1$ is at least as easy as the MLWE problem with $m=k$ (and the same $\eta$), which, as Mark remarks, is clearly true since one can choose to ignore one sample.
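Spelled out, the sample-dropping reduction is the following (a small sketch; the truncation notation is mine): given a $(k+1)$-sample instance $(A, b)$ with $A \in R_q^{(k+1) \times k}$, the adversary $B'$ deletes the last row of $A$ and the last entry of $b$, which yields a perfectly distributed $k$-sample instance, and runs the $k$-sample adversary $B$ on it. Hence

$$Adv_{k,k,\eta}^{mlwe}(B) = Adv_{k+1,k,\eta}^{mlwe}(B'),$$

so every advantage achievable with $m=k$ samples is also achievable with $m=k+1$, which is the estimate used in the theorem.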
P_Gate:
I would then be interested in the following: how is it possible to say $|Pr(b=b' \text{ in game } G_1) - Pr(b=b' \text{ in game } G_2)| \leq Adv_{k+1,k,\eta}^{mlwe}(B)$? Why does this have to be $k+1$ here? Is, for example, $|Pr(b=b' \text{ in game } G_1) - Pr(b=b' \text{ in game } G_2)| \leq Adv_{k,k,\eta}^{mlwe}(B)$ not right? Or did the authors just apply the estimate $\leq Adv_{k,k,\eta}^{mlwe}(B) \leq Adv_{k+1,k,\eta}^{mlwe}(B)$ here, or am I missing something?
Daniel S:
@P_Gate In $G_1$ vs. $G_2$ the goal is to distinguish "real Kyber" `Kyber.CPA.Enc` output from fake (as opposed to $G_0$ vs. $G_1$, where the goal was to distinguish the `Kyber.CPA.KeyGen` output). In this case there is an extra sample that might have been substituted with a uniformly random element, in the form of $v$.
Mark:
There are two places where one uses a hybrid to switch things in LPR-type schemes (of which Kyber is one): first to switch the public key for a uniformly random string (the $G_0$ to $G_1$ switch), then to switch the encryption output for a uniformly random string ($G_1$ to $G_2$). This typically leads to an advantage bound of $adv' + adv''$, where both terms are (slightly differently parameterized) MLWE advantage terms. It wouldn't surprise me if the authors did this and then applied the estimate you refer to in order to simplify the final expression somewhat --- this is a typical approach, and it loses essentially nothing.
Mark:
because MLWE is (roughly) independent of the number of samples given, as you can take a fixed (I think linear in $n$? I'd have to check) number of samples and generate an *arbitrary* number of samples (at a slightly higher noise rate), roughly by taking random linear combinations of your pre-existing samples. This is to say that the bound $Adv_{k,k,\eta}^{mlwe}(B) \leq Adv_{k+1,k,\eta}^{mlwe}(B')$ is obvious from dropping samples, but the reverse bound $Adv_{k+1,k,\eta'}^{mlwe}(B') \leq Adv_{k,k,\eta}^{mlwe}(B'')$ also holds, for $\eta'$ only mildly larger than $\eta$ (I would guess by $O(\sqrt{k})$).
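For intuition on the linear-combination trick, here is the plain-LWE version (a rough sketch that ignores the precise noise analysis): given samples $(a_i, b_i)$ with $b_i = \langle a_i, s \rangle + e_i$, one can generate a fresh sample as

$$(a', b') = \Big( \sum_i c_i a_i,\ \sum_i c_i b_i \Big) = \Big( a',\ \langle a', s \rangle + \sum_i c_i e_i \Big)$$

for small random coefficients $c_i$. The new sample has the correct form, only with the somewhat larger noise term $\sum_i c_i e_i$; and if the $b_i$ were uniform, then $b'$ is (essentially) uniform as well, so the distinguishing problem is preserved.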
P_Gate:
Hey Mark, thanks for your reply! Can you maybe elaborate on point 3 a bit? I'm still not quite clear on the connection between adversaries $A$ and $B$ and the relation to the MLWE adversary.
P_Gate:
Point 2 is well explained! How does this work between game $G_1$ and game $G_2$? What do we distinguish there? Is it, for example, $(v', u')$ in $G_2$ (where both are uniform) versus $(v, u)$ in $G_1$?
Mark:
$B$ is the MLWE adversary. It runs $A$ on game $G_0$ or on game $G_1$, depending on which distribution its own challenge came from. Any observable difference in $A$'s behavior between these games can be directly leveraged to break MLWE.
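Here is a toy sketch of this reduction in code (my own illustration, not from the paper: it uses plain LWE over $\mathbb{Z}_q$ with tiny parameters instead of Kyber's module lattices, and all names such as `small`, `enc`, `B`, and `DummyAdversary` are made up):

```python
import numpy as np

q, k, eta = 3329, 4, 2   # toy parameters, far too small for real security
rng = np.random.default_rng()

def small(shape=None):
    """Centered binomial noise, the shape of distribution Kyber uses."""
    return (rng.binomial(eta, 0.5, shape) - rng.binomial(eta, 0.5, shape)) % q

def enc(pk, m):
    """LPR-style encryption of a single bit m under pk = (A, t)."""
    A, t = pk
    r, e1, e2 = small(k), small(k), small()
    u = (A.T @ r + e1) % q
    v = (t @ r + e2 + m * (q // 2)) % q
    return u, v

def B(A, b, adversary):
    """The MLWE distinguisher: embeds its challenge (A, b) as the public key.

    If b = As + e this simulates game G_0 for the CPA adversary;
    if b is uniform it simulates G_1. B outputs 1 iff the adversary
    guesses the challenge bit correctly, so B's distinguishing
    advantage is exactly |Pr(b=b' in G_0) - Pr(b=b' in G_1)|.
    """
    pk = (A, b)
    m0, m1 = adversary.choose_messages(pk)
    beta = int(rng.integers(0, 2))          # the CPA challenge bit
    ct = enc(pk, (m0, m1)[beta])
    return int(adversary.guess(ct) == beta)

class DummyAdversary:
    """Placeholder CPA adversary; a real attack would inspect pk and ct."""
    def choose_messages(self, pk):
        self.pk = pk
        return 0, 1
    def guess(self, ct):
        return int(rng.integers(0, 2))

# MLWE challenger: hand B either a structured or a uniform b
A = rng.integers(0, q, (k, k))
s, e = small(k), small(k)
b_real = (A @ s + e) % q                    # the G_0 case
b_rand = rng.integers(0, q, k)              # the G_1 case
print(B(A, b_real, DummyAdversary()), B(A, b_rand, DummyAdversary()))
```

With the dummy adversary both success probabilities are $1/2$, so $B$ gains nothing; any CPA adversary whose success rate differs between the two games hands $B$ exactly that difference as MLWE advantage.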
Mark:
And yes (I think so). The second application of MLWE is a little more subtle in LPR-type schemes. You have $v = r^T A$ for $A$ the random matrix in the public key, and $r$ uniformly random. In terms of this $v$, one can then see that $u$ is an MLWE sample, typically of different dimension/rank parameters. Anyway, after swapping out $(A, t)$ for uniformly random strings, one then just applies MLWE to $(v, u)$ to swap these out for uniformly random strings as well.
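To spell out the $k+1$ here in the paper's own notation (glossing over the compression and rounding steps): after the first hop $t$ is uniform, and the ciphertext components are $u = A^T r + e_1$ and $v = t^T r + e_2 + \lceil q/2 \rfloor m$. Stacking them,

$$\begin{pmatrix} u \\ v - \lceil q/2 \rfloor m \end{pmatrix} = \begin{pmatrix} A^T \\ t^T \end{pmatrix} r + \begin{pmatrix} e_1 \\ e_2 \end{pmatrix},$$

which is an MLWE instance with a uniformly random $(k+1) \times k$ matrix, secret $r$, and $k+1$ noise terms, i.e. $m = k+1$ samples of rank $k$. This is exactly where the $Adv_{k+1,k,\eta}^{mlwe}$ term for the $G_1$ to $G_2$ hop comes from.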