In the case of lattice-based public-key encryption, the answer is mostly "no".
For security reasons, we need $n\geq 512$ (roughly).
If we assume the modulus $q$ is a free parameter, then we want to send
- a payload (ciphertext) of size at least $n\log_2 q\approx 512\log_2 q$ bits, in order to
- transmit generally $128$ to $256$ bits of data (i.e. a shared secret key).
The rate the scheme needs to achieve is therefore only between $\frac{1}{4\log_2 q}$ and $\frac{1}{2\log_2 q}$, which is quite low.
Even for aggressively small $q$ (say $\log_2 q \approx 8$, which I believe appeared in schemes such as LAC and ROUND5), one can easily get by with quite bad encoding techniques.
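To make these numbers concrete, here is a quick back-of-the-envelope computation (the choices of $n$, $\log_2 q$, and payload size below are purely illustrative, not tied to any specific scheme):

```python
# Back-of-the-envelope check of the rates quoted above.
# Parameter choices (n, log2(q), payload size) are illustrative only.

def required_rate(n, log2_q, payload_bits):
    """Fraction of the ~n*log2(q)-bit ciphertext that carries useful data."""
    return payload_bits / (n * log2_q)

for log2_q in (8, 13):
    for payload_bits in (128, 256):
        r = required_rate(512, log2_q, payload_bits)
        print(f"log2(q) = {log2_q:2d}, payload = {payload_bits} bits: rate = {r:.4f}")
```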
Instead, PKE often gets smaller ciphertexts by adding a post-processing compression step.
For a value modulo $q$, one often chops off the lowest-order (noisy) bits.
The linked paper calls this the quantized Regev cryptosystem, i.e. $\mathsf{LWE}[(q/p)\mathbb{Z}^n, k\mathbb{Z}^n]$ in Section 4.1.
This can help one get smaller ciphertexts, but is somewhat limited in effectiveness due to the results of Section 5.3.
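For concreteness, here is a minimal sketch of this kind of bit-dropping compression, written as the usual "round to a smaller modulus" maps (the parameters $q$ and $p$ below are illustrative assumptions, and this is not the notation of the linked paper):

```python
# Minimal sketch of coefficient compression by dropping low-order bits:
# a value x mod q is mapped to round(p*x/q) mod p (keeping only the top
# log2(p) bits of precision), and later re-expanded to round(q*y/p).
# The parameters below (q = 3329, p = 2**4) are illustrative only.

q, p = 3329, 2**4

def compress(x):
    return round(p * (x % q) / q) % p

def decompress(y):
    return round(q * (y % p) / p) % q

x = 1234
y = compress(x)
x_approx = decompress(y)
err = min(abs(x - x_approx), q - abs(x - x_approx))  # distance mod q
print(y, x_approx, err)  # the rounding error is at most about q/(2p)
```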
One can do better, but at least in the paper you linked this is only done for FHE-relevant parameters.
Essentially, the issue is that $E = (q/p)\mathbb{Z}^n, Q = k\mathbb{Z}^n$ have "shared low-dimensional structure", which leads to strong bounds on the rate of the resulting ciphertexts (this is the aforementioned Section 5.3).
There is a particular known object $Q = \Lambda_{q/p}((1,1,\dots,1)^t)$ such that post-processing ciphertexts by compressing with $Q$ leads to much smaller ciphertexts.
The issue is that maintaining correctness requires something like
$$q/p \geq \Omega(n^2),$$
which for $n\approx 512 = 2^9$ means $q/p\geq 2^{18}$, larger than one wants for PKE ($q\approx 2^{13}$ is more typical).
That is to say, one can send vastly more data (which we don't care about for PKE) at the cost of slightly bigger ciphertexts, so it is a tradeoff that isn't really worth it here.
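As a quick sanity check on why the $q/p \geq \Omega(n^2)$ requirement is prohibitive at PKE-sized parameters (the hidden constant is taken to be $1$ here purely for illustration):

```python
import math

n = 512        # typical PKE dimension
log2_q = 13    # a typical PKE-sized modulus, q ~ 2^13 (illustrative)

# Correctness needs roughly q/p >= n^2 (hidden constant taken as 1 here).
log2_qp_needed = 2 * math.log2(n)      # = 18
log2_p_max = log2_q - log2_qp_needed   # = -5

print(f"need log2(q/p) >= {log2_qp_needed:.0f}, leaving log2(p) <= {log2_p_max:.0f}")
# log2(p) <= -5: at q ~ 2^13 there is no room left for a payload at all,
# which is why this style of compression only pays off at FHE-sized moduli.
```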
Note that it is plausible such $Q$ exist without this limitation; I just don't believe any are currently known.
Also, such a $Q$ is plausibly useful for FHE (and was initially introduced for this purpose some years ago, though it appears to have mostly been of theoretical interest so far).
It is worth mentioning there may be a social reason the work you describe has not been done.
As mentioned before, LAC and ROUND5 tried doing something similar to what you are suggesting, namely using fancier coding-theoretic techniques to shrink ciphertexts while (hopefully) maintaining the same decryption failure rate.
Note that these fancier coding-theoretic techniques are different from what your linked paper talks about (they use standard error-correcting codes, while the linked paper uses a "euclidean" variant of these).
Moreover, it was not properly done, leading to attacks.
I haven't read this paper in some years, but the short of it is (I think):
- analyses of algebraically structured cryptosystems often assume the noise in each coordinate is independent, even though this isn't true
- in these schemes, the noise in different coordinates ended up being correlated
- errors therefore come in "bigger clumps" than one expects
- so the error correction is less likely to be useful (the clumps mean that either error correction isn't needed at all, or it "overwhelms" the error-correction capacity all at once); a toy simulation of this is sketched below.
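Here is a toy simulation of that last point: with the same average error rate, errors that arrive in bursts exceed a fixed error-correction capacity $t$ far more often than independent errors do (all parameters below are illustrative assumptions, not taken from LAC, ROUND5, or the attack paper):

```python
import random

def independent_errors(n, rate):
    """Number of errors when each of n coordinates fails independently."""
    return sum(random.random() < rate for _ in range(n))

def bursty_errors(n, rate, burst=8):
    """Same expected number of errors, but they arrive in clumps of `burst`."""
    count, i = 0, 0
    while i < n:
        if random.random() < rate / burst:
            count += min(burst, n - i)   # a whole clump of errors at once
            i += burst
        else:
            i += 1
    return count

def overflow_rate(sampler, trials, t, **kw):
    """Fraction of trials where the error count exceeds the correction capacity t."""
    return sum(sampler(**kw) > t for _ in range(trials)) / trials

n, rate, t, trials = 256, 0.02, 12, 20_000
print("independent:", overflow_rate(independent_errors, trials, t, n=n, rate=rate))
print("bursty     :", overflow_rate(bursty_errors, trials, t, n=n, rate=rate))
```

Even though both samplers have the same expected number of errors, the bursty one blows past the correction capacity far more often: most of the time there are no errors at all (so the code wasn't needed), and occasionally two clumps land at once (so the code is overwhelmed).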
The paper you link to is in the algebraically unstructured setting, where this isn't an issue (for example, in Table 1 of the linked attack paper, FrodoKEM, the only algebraically unstructured scheme, shows no reduction in security at all).
It is plausible people could work out the details in the algebraically structured setting correctly, but it just hasn't been done yet.