Score:2

Why does NTRUEncrypt fail on different values for large modulus?

max

I am trying to closely follow the algorithm here (keeping the same variable names) and reconstruct the cryptosystem in the SageMath engine. It seems to work with parameters (N, p, q) = (11, 3, 31) but raises an assertion error at the end with (N, p, q) = (11, 3, 29). Please see the code below:

N = 11
p = 3
q = 31 # fails for 29

R.<x> = PolynomialRing(ZZ)
RR = R.quotient(x^N - 1)
P.<x> = PolynomialRing(Zmod(p))
PP = P.quotient(x^N - 1)
Q.<x> = PolynomialRing(Zmod(q))
QQ = Q.quotient(x^N - 1)

# private key: two small ternary polynomials
f = RR([-1, 1, 1, 0, -1, 0, 1, 0, 0, 1, -1])
g = RR([-1, 0, 1, 1, 0, 1, 0, 0, -1, 0, -1])

# inverses of f modulo p and modulo q
fp = PP(f) ^ -1
fq = QQ(f) ^ -1
assert fp * f == 1
assert fq * f == 1

# public key
h = QQ( (p * fq) * g )
# message and blinding polynomial
m = PP([-1, 0, 0, 1, -1, 0, 0, 0, -1, 1, 1])
r = RR([-1, 0, 1, 1, 1, -1, 0, -1, 0, 0])
# encryption
e = QQ(r * h + m)
# decryption: multiply by f, lift to ZZ[x] with centered coefficients, reduce mod p
a = QQ(f) * e
a = ZZ['x']([coeff.lift_centered() for coeff in a.lift()])
b = PP(a)
c = fp * b
assert c == m

Am I doing something wrong here?

P.S. To quickly check a piece of Sage code you may use this site: https://sagecell.sagemath.org/

Score:1
Daniel S

There's nothing incorrect: lattice-based cryptography has a chance of decryption failure, and this chance increases as the degree and the noise grow relative to the modulus. I think it's quite instructive to look at the actual numbers here in order to get a feel for some of the uncertainty that is inherent in lattice-based cryptography. If we expand your Sage code to generate the two cases side by side:

N = 11
p = 3
q1 = 31
q2 = 29

R.<x> = PolynomialRing(ZZ)
RR = R.quotient(x^N - 1)
P.<x> = PolynomialRing(Zmod(p))
PP = P.quotient(x^N - 1)
Q1.<x> = PolynomialRing(Zmod(q1))
QQ1 = Q1.quotient(x^N - 1)
Q2.<x> = PolynomialRing(Zmod(q2))
QQ2 = Q2.quotient(x^N - 1)


f = RR([-1, 1, 1, 0, -1, 0, 1, 0, 0, 1, -1])
g = RR([-1, 0, 1, 1, 0, 1, 0, 0, -1, 0, -1])

fp = PP(f) ^ -1
fq1 = QQ1(f) ^ -1
fq2 = QQ2(f) ^ -1
assert fp * f == 1
assert fq1 * f == 1
assert fq2 * f == 1

h1 = QQ1( (p * fq1) * g )
h2 = QQ2( (p * fq2) * g )
m = PP([-1, 0, 0, 1, -1, 0, 0, 0, -1, 1, 1])
r = RR([-1, 0, 1, 1, 1, -1, 0, -1, 0, 0])
e1 = QQ1(r * h1 + m)
e2 = QQ2(r * h2 + m)
a1 = QQ1(f) * e1
al1 = ZZ['x']([coeff.lift_centered() for coeff in a1.lift()])
a2 = QQ2(f) * e2
al2 = ZZ['x']([coeff.lift_centered() for coeff in a2.lift()])
b1 = PP(al1)
b2 = PP(al2)
c1 = fp * b1
c2 = fp * b2

Note that I've also used separate variables for $a(x)\bmod q$ and its lift to $\mathbb Z[x]$. If we compare a1 and a2 we see

sage: a1
27*xbar^10 + 3*xbar^9 + 30*xbar^8 + 4*xbar^7 + 15*xbar^6 + 10*xbar^5 + 4*xbar^4 + 20*xbar^3 + 27*xbar^2 + 24*xbar
sage: a2
25*xbar^10 + 3*xbar^9 + 28*xbar^8 + 4*xbar^7 + 15*xbar^6 + 10*xbar^5 + 4*xbar^4 + 18*xbar^3 + 25*xbar^2 + 22*xbar

These are reductions of the same polynomial $$a(x)=r(x)g(x)p+f(x)m(x)$$ $$=-4x^{10}+3x^9-x^8+4x^7+15x^6+10x^5+4x^4-11x^3-4x^2-7x$$ over the integers, and for decryption we would like to lift each of the reductions back to this original integer polynomial. The coefficients of the integer polynomial are sums of products of small integers, and so we hope that they do not grow too large. In particular, if all of the coefficients lie in the range $(-q/2,q/2)$ then the residues can be interpreted as lying in this interval and lifted back to the original polynomial without error. However, the coefficient of $x^6$ is 15. Although this lies in the interval $(-31/2,31/2)$, it does not lie in the interval $(-29/2,29/2)$, so in the second case the reduction cannot simply be undone and the centered lift returns the wrong coefficient. Specifically, if we look at al1 and al2 we have

sage: al1
-4*x^10 + 3*x^9 - x^8 + 4*x^7 + 15*x^6 + 10*x^5 + 4*x^4 - 11*x^3 - 4*x^2 - 7*x
sage: al2
-4*x^10 + 3*x^9 - x^8 + 4*x^7 - 14*x^6 + 10*x^5 + 4*x^4 - 11*x^3 - 4*x^2 - 7*x

and in the second case we have recovered the wrong polynomial over the integers, causing the remaining steps to fail. The larger $q$ is, the less likely it is for coefficients of the integer polynomial to fall outside the recoverable range, and so decryption failures become rarer. However, increasing $q$ can make life easier for the attacker, so a trade-off must be reached.
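As a quick check with the same lift_centered method the code above relies on, the offending residue 15 is its own centered representative modulo 31 but wraps to $-14$ modulo 29:

sage: Zmod(31)(15).lift_centered()
15
sage: Zmod(29)(15).lift_centered()
-14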

In the case of decryption failure, it is possible to correct such errors by scanning the lifted polynomial for coefficients that are close to $q/2$ and then seeing what happens if we adjust these suspect coefficients by $\pm q$. However, this leads to non-constant decryption time and creates all manner of side-channel, spoofing and fault-injection vulnerabilities.
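A minimal sketch of such a repair attempt, continuing from the q2 = 29 variables above: scan al2 for coefficients whose magnitude is close to q2/2 and try shifting them by $\pm q_2$. (Here the known message m is used only to confirm a candidate correction; a real decryptor would of course need some other validity check, e.g. re-encryption in a padded scheme.)

coeffs = al2.list()
# coefficients suspiciously close to q2/2 (the slack of 2 is an arbitrary choice)
suspects = [i for i, c in enumerate(coeffs) if abs(c) >= q2/2 - 2]
for i in suspects:
    for shift in (q2, -q2):
        cand = list(coeffs)
        cand[i] += shift          # guess that this coefficient wrapped around
        if fp * PP(cand) == m:    # toy setting: check the guess against the known m
            print("coefficient of x^%d corrected by %+d" % (i, shift))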

max
Thank you very much for this great explanation. Would you like to take a look at my follow-up question, maybe? https://crypto.stackexchange.com/questions/102999/ntruencrypt-fails-on-complex-algebra
Mark
It is worth mentioning that your suggestion at the end does show up in practice. Instead of precisely what you write though, one *preemptively* checks where various decryption errors could occur (say for worst-case errors), and includes information to shift those (possibly) suspect coefficients. This process is called "error reconciliation", and it has shown up in some (earlier) lattice-based KEM proposals, for example the initial version of NewHope, as well as in some lattice-based compression mechanisms, namely BDGM19's work on [rate 1 FHE](https://eprint.iacr.org/2019/720.pdf).
Daniel S
@Mark I agree that there's much more that could be said about information reconciliation, hints and inducing faults (e.g. by illegal choices of $r(x)$), but I'm not sure where the best place to stop would be. It's a broad topic.
max
@DanielS : Would you please mind looking at my follow-up question? It might be about the illegal choice of $r$ indeed. https://crypto.stackexchange.com/questions/103511/ntruencrypt-fails-on-quaternion-algebra
Score:1
Mark

I won't check the details now, but I doubt you are doing anything wrong. The modulus $q$ can be chosen lower or higher to achieve different decryption failure rates, and I would guess that $q\approx 31$ is on the cusp of being correct for your chosen parameters. This is to say that, even for "correct" NTRU implementations, if one misparameterizes them they are expected to produce the wrong answer.

To check, you would need to look into an analysis of the decryption failure rate of your particular instantiation of NTRU (or rederive it). For example, see Theorem 1 of this; I'll copy it briefly below.

Theorem 1: The algorithms $\mathsf{KeyGen}, \mathsf{E}$, and $\mathsf{D}$ with parameters $p = 3$ and $q > 8\sqrt{n}$ are a correct probabilistic encryption scheme.

I would not blindly assume this theorem applies to the variant of NTRU that you have implemented, but something similar to it should, i.e. there should be some explicit threshold value $q_{thresh}$ ($8\sqrt{n}$ in the above) such that

  1. for $q > q_{thresh}$, decryption is always correct (or correct with high probability --- there are many different analyses you can do)
  2. for $q < q_{thresh}$, decryption starts failing (or starts failing with noticeable probability).

If $8\sqrt{n}$ is the right threshold for your cryptosystem, then $q_{thresh} = 8\sqrt{11}\approx 26.5$, i.e. where you have started noticing decryption failures is approximately where the threshold is. But this threshold value depends on many of the specifics of the underlying scheme (for example, the particular noise distributions one samples the various keys from), which many different authors make mild modifications to without claiming to fundamentally change the underlying cryptosystem. So one should check that everything matches up before saying anything conclusive.
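As a rough numerical sanity check of this (a sketch, reusing the fixed key (f, g) from the question and drawing random ternary messages and blinding polynomials; the exact counts will vary from run to run):

N, p = 11, 3
R.<x> = PolynomialRing(ZZ)
RR = R.quotient(x^N - 1)
P.<x> = PolynomialRing(Zmod(p))
PP = P.quotient(x^N - 1)

f = RR([-1, 1, 1, 0, -1, 0, 1, 0, 0, 1, -1])
g = RR([-1, 0, 1, 1, 0, 1, 0, 0, -1, 0, -1])
fp = PP(f) ^ -1

trials = 1000
for q in (29, 31):                      # f is invertible modulo both (see the question)
    Q.<x> = PolynomialRing(Zmod(q))
    QQ = Q.quotient(x^N - 1)
    fq = QQ(f) ^ -1
    h = QQ((p * fq) * g)
    failures = 0
    for _ in range(trials):
        mc = [randint(-1, 1) for _ in range(N)]   # random ternary message
        rc = [randint(-1, 1) for _ in range(N)]   # random ternary blinding polynomial
        e = QQ(RR(rc) * h + RR(mc))
        a = ZZ['x']([c.lift_centered() for c in (QQ(f) * e).lift()])
        if fp * PP(a) != PP(mc):
            failures += 1
    print("q = %d: %d decryption failures out of %d" % (q, failures, trials))

One would expect a noticeably higher failure count for q = 29 than for q = 31, consistent with the threshold sitting between them.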

max
Thank you very much for this insight. Would you like to take a look at my follow-up question too? https://crypto.stackexchange.com/questions/102999/ntruencrypt-fails-on-complex-algebra
Mark
@max I'm unfamiliar with that cryptosystem, and lattice attacks against NTRU have gotten much stronger in recent years (look up "NTRU overstretched"), so I would not blindly trust what it says with respect to being resistant to lattice attacks.