Question on the proof of correctness in CRYSTALS-Kyber

P_Gate

11/13/23, 6:17 PM

I am currently trying to follow the proof of correctness in the CRYSTALS-Kyber paper. The following is an excerpt of the proof:

Excerpt of the proof, in origional see source above.

On the one hand, I am interested in how exactly one justifies/argues that $\mathbf{y}$ is pseudorandom, based on the MLWE assumption. About the difference $(\mathbf{y} - \text{Decompress}_q(\text{Compress}_q(\mathbf{y},d),d))$ I think myself as $(\mathbf{y} - inaccuracy)$ conditioned by the deviation of the concatenation of the functions Compress and Decompress. But I can't quite reconcile this with the MLWE assumption.
Secondly, I am interested in how one derives the inequality $\lceil q/4 \rfloor \geq || v - \mathbf{s}^T \mathbf{u} - \lceil q/2 \rfloor\cdot m'||_\infty$. The rest of the following algebraic transformations is understandable, but not the deduction of this inequality. We have with the previous calculations $||v - \mathbf{s}^T \mathbf{u} - \lceil q/2 \rfloor \cdot m||_\infty = ||w||_\infty < \lceil q/4 \rfloor$ this is a rearrangement from the corresponding paragraph. Next, the $m'$ is then defined in the paper and somehow the inequality comes up, I'm not quite clear on that.
Last, why does $\bigl\| \lceil q/2 \rfloor \cdot (m - m') \bigr\|_\infty < 2 \lceil q/4 \rfloor$ implies for odd $q$ that $m = m'$, from what does this follow? I'm thinking of a coefficient comparison here, but I can't understand the restriction to odd $q$'s.

Thanks for upcoming answers, I hope my questions are understandable.

176

1 + 0

lattice-crypto

kyber

Score:2

Crypto

Mark

11/14/23, 5:10 AM

First Bullet Point:

Kyber is a "LPR-type" scheme, or often called a "Noisy Diffie Hellman"-based cryptosystem. The main idea is to have

Alice send a "Noisy Diffie Hellman" share $(A, b_A:=As_A + e_A)$
Bob respond with a share with respect to $A^t$, i.e. $(A^t, b_B:=A^t s_B + e_B)$

Then, using knowledge of $(s_A, b_B)$, or $(s_B, b_A)$, both parties can reconstruct $s_B^t A s_A$, up to lower order error terms. Specifically

Alice can reconstruct $b_B^t s_A = (A^ts_B + e_B)^t s_A = s_B^t As_A + e_B^ts_A$, and
Bob can reconstruct $s_B^tb_A = s_B^t(As_A + e_A) = s_B^tAs_A + s_B^te_A$.

Bob then proceeds as follows. Under the MLWE assumption, $(A, b_A)$ is indistinguishable from uniform, i.e. we can use a hybrid argument to switch to a world where $b_A\approx u$. Then, Bob uses this $u$ to MLWE-encrypt his message, namely computes $b' := s_B^tu + \mathsf{encode}(m) + e'$, where $\mathsf{encode}(m)$ is some form of error-correction (typically $m\mapsto (q/2)m$), and $e'$ is appropriately-sized MLWE noise.

Anyway, we invoke MLWE a few times, namely to show that $b_A, b_B$, and $b'$ are indistinguishable from uniform. These are precisely what are fed into the compression algorithm.

Second Bullet Point:

I believe all that's happening here is that $\mathsf{Decompress}(\mathsf{Compress}(x,d),d)$ satisfies a certain bound (see equation 1). What they have here isn't precisely this expression --- but it is close. In particular, if they had written $\lfloor q/2\cdot m'\rceil$ rather than $\lfloor q/2\rceil \cdot m'$, it would be exactly this expression.

I'm honestly kind of surprised their definition here --- $\lfloor q/2\rceil\cdot m'$ is really the more natural way to define decompression from my understanding of things --- then compression is simply solving CVP on a lattice $\lfloor(q/2^{d})\rceil\mathbb{Z}^n$, and decompression is encoding $m\mapsto \lfloor q/2^d\rceil\mathbb{Z}^n$ back into this lattice. I imagine this is a small difference, but perhaps doesn't help understanding Kyber's exact specification. But also, I only now found out that my (conceptual) understanding of Kyber differs from the exact specification in this point, so perhaps I'm the wrong person to go to for clearing up this particular difference.

That being said, if you redefine compression to compute $m\mapsto \lfloor q/2^d\rceil\cdot m$, everything else still works, you get the above nice conceptual definition, and then the expression $\lfloor q/2\rceil \cdot m'$ is precisely $\mathsf{Decompress}(\mathsf{Compress}(m,d),d)$, and one gets the bound you are curious about via the equation just under equation 1.

Last bullet point

It might not be fundamental. For simplity, consider the case that $4\mid q$. Then $2\lfloor q/4\rceil = q/2$ exactly. In this setting, it should be simple to see that $m = m'$ (otherwise, $m-m'$ has a non-zero component, and the inequality shows that $(q/2) \leq \lVert (q/2)(m-m')\rVert_\infty < q/2$, a contradiction). This is to say that the argument works for $4\mid q$ as well. It's likely that one can get something to work when $2\mid q$ as well, but the above shows that the restriction to odd $q$ is not necessary.

+ 1

P_Gate

11/21/23, 10:26 AM

I can't quite understand your comments on the second point, can you maybe concretize it a bit? I understand the third point, the example with 4|p is also well chosen, but for 2 as a divisor of p, I don't see such a nice transformation, which then leads to a contradiction.

Elon Musk

I sit in a Tesla and translated this thread with Ai:

EN: Question on the proof of correctness in CRYSTALS-Kyber

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.