Score:12

Number of bit operations required for information set decoding attacks on code-based cryptosystems?

dk flag

This question is potentially relevant to NIST post-quantum cryptography standards, involving code-based cryptosystems such as McEliece, BIKE and HQC.

This paper estimates the concrete number of bit operations required to perform the May–Meurer–Thomae (MMT) information set decoding (ISD) attack. It shows this to be significantly less than for any of the other ISD variants considered, including the BJMM algorithm, which is widely considered an improvement over MMT. (E.g., for Classic McEliece's 8192128 parameter set, the paper gives the complexity of MMT as $2^{249.74}$, as opposed to $2^{299.89}$ for BJMM.) Is this analysis correct?

If the analysis is correct, it seems like this could threaten not just some of the McEliece parameters, but also some of the parameters of the other code-based schemes. If not, what's a good source for accurate numbers?

(The paper claims $2^{87.95}$ memory complexity for MMT, compared to $2^{79.97}$ for BJMM and $2^{67.64}$ for Stern's algorithm; see Table 5. This does not seem large enough to make up for the discrepancy under any plausible model of the cost of memory versus bit operations. MMT is claimed to be similarly cheaper for other parameter sets and schemes.)
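As a rough sanity check, even under a hypothetical and rather conservative cost model that charges $\sqrt{M}$ bit operations per access to a memory of size $M$, the gap does not close:

```python
# Sanity check under an assumed sqrt(memory) cost per memory access:
# MMT's adjusted cost still stays below BJMM's plain bit-operation count.
mmt_time, mmt_mem = 249.74, 87.95   # log2 values from Table 5 of the paper
bjmm_time = 299.89
adjusted = mmt_time + mmt_mem / 2   # charge sqrt(memory) per access
print(adjusted, adjusted < bjmm_time)   # 293.715 True
```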

This question is crossposted on the NIST PQC forum here.

Score:5
jp flag

I could not reproduce the exact bit complexities from the mentioned paper [1]; the authors did not provide the source code. I'm posting my estimators for the MMT and BJMM attacks here.

The conclusion that the BJMM algorithm is worse than MMT is incorrect, because MMT is a special case of BJMM. Briefly, MMT is BJMM with no representations of type $0 = 1+1 \bmod 2$ for the error vector. The confusion comes from the fact that the authors of [1] consider BJMM with a certain fixed depth of the search tree (namely, depth 3, as in the original paper [2]). However, this depth may not be optimal for a given sparsity regime; hence the incorrect conclusion that BJMM is worse than MMT. I give more details below.

At the core of both the MMT and BJMM algorithms lies the idea of ambiguously splitting the error vector $e \in \{0,1\}^k$ of weight $p$ as $e = e_1 + e_2$. MMT takes $e_1, e_2 \in \{0,1\}^k$ each of weight $p/2$, thus giving $\binom{p}{p/2}$ ways to represent the $1$-coordinates of $e$ as $1+0$ or $0+1$. BJMM takes $e_1, e_2 \in \{0,1\}^k$ each of weight $p/2 + \varepsilon$, thus giving $\binom{p}{p/2} \cdot \binom{k-p}{\varepsilon}$ representations (the second factor counts the representations of type $0 = 1+1$).
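For concreteness, a toy computation of these representation counts (the values of $k$, $p$, $\varepsilon$ are illustrative only, not optimized attack parameters):

```python
from math import comb

# Illustrative values only; real attacks optimize p and eps per parameter set.
k, p, eps = 2720, 12, 4

mmt_reps = comb(p, p // 2)                      # 1-coordinates split as 1+0 or 0+1
bjmm_reps = comb(p, p // 2) * comb(k - p, eps)  # plus 0-coordinates split as 1+1

print(f"MMT:  {mmt_reps}")    # 924
print(f"BJMM: {bjmm_reps}")   # many more representations as soon as eps > 0
```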

It turns out that for the decoding problem in the dense regime, i.e., when the weight of $e$ is of order $\Theta(n)$, the BJMM algorithm is faster when we repeat the process by also representing $e_1$ and $e_2$ ambiguously. Thus one ends up with a search-tree structure, where the depth of the tree is a parameter to be optimized. According to [2], in the dense regime the optimal depth happens to be 3.

For Classic McEliece parameters, it turns out that depth 2 is optimal. Intuitively, the sparser the error, the shallower the optimal tree, because one cannot split a small weight in halves too many times; the toy computation below illustrates this.
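A toy computation, using hypothetical values of $p$ in the range where an optimizer typically lands for such sparse instances (not the actual optimized parameters):

```python
# Illustration with hypothetical p values: after d levels of splitting, each
# base list carries weight about p / 2^d (ignoring the small +eps additions).
# For small p, depth 3 already leaves almost no weight to split ambiguously.
for p in (8, 12, 16):
    for depth in (2, 3):
        print(f"p={p:2d}, depth={depth}: base-list weight ~ {p / 2**depth:.1f}")
```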

In particular, I obtain the following bit securities (and memory complexities) with my estimator. These are likely to be underestimates, as $\mathrm{poly}(n)$ factors are ignored. You can reproduce them by running the script (a simplified sketch of the general estimator structure is included after the results).

-------------------- MMT --------------------

n = 3488 k = 2720 w = 64 {'runtime': 133.610045757394, 'mem': 61.4108701659799}

n = 4608 k = 3360 w = 96 {'runtime': 174.170500202444, 'mem': 76.7183814578096}

n = 6960 k = 5413 w = 119 {'runtime': 244.706198594600, 'mem': 123.451330788046}

n = 8192 k = 6528 w = 128 {'runtime': 277.268692835670, 'mem': 140.825234863797}

BJMM depth 2 slightly outperforms both MMT and BJMM depth 3.

----------------BJMM depth 2 ----------------

n = 3488 k = 2720 w = 64 {'runtime': 127.121142192395, 'mem': 65.4912086419963, 'eps': 4}

n = 4608 k = 3360 w = 96 {'runtime': 164.510671206084, 'mem': 88.1961572319148, 'eps': 6}

n = 6960 k = 5413 w = 119 {'runtime': 231.228308300186, 'mem': 118.193072674123, 'eps': 8}

n = 8192 k = 6528 w = 128 {'runtime': 262.367185095806, 'mem': 134.928413147468, 'eps': 8}

----------------BJMM depth 3 ----------------

n = 3488 k = 2720 w = 64 {'runtime': 132.929177320070, 'mem': 30.8744409777593, 'eps': [2, 0]}

n = 4608 k = 3360 w = 96 {'runtime': 167.443669507488, 'mem': 45.4015594758618, 'eps': [6, 0]}

n = 6960 k = 5413 w = 119 {'runtime': 236.346159271338, 'mem': 67.5649403829532, 'eps': [6, 0]}

n = 8192 k = 6528 w = 128 {'runtime': 269.431207750362, 'mem': 70.1193015124538, 'eps': [6, 0]}
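The full script is too long to inline here, but the following minimal sketch shows the general structure such an estimator has, using Stern's algorithm for simplicity: the cost is (expected number of iterations) $\times$ (per-iteration list cost), optimized over the weight $p$ placed on the information set and the number $\ell$ of collision bits. Polynomial factors are ignored, and this toy will not reproduce the MMT/BJMM figures above, which require the full search-tree analysis:

```python
from math import comb, log2

def log2_comb(n, r):
    # log2 of binomial(n, r); -inf when the coefficient is zero
    return log2(comb(n, r)) if 0 <= r <= n else float("-inf")

def stern_cost(n, k, w):
    # Rough Stern-style ISD estimate: log2(iterations * cost per iteration),
    # optimized over p (weight on the information set) and ell (collision bits).
    best = None
    for p in range(2, 22, 2):
        lg_list = log2_comb(k // 2, p // 2)          # size of each base list
        for ell in range(1, 100):
            if n - k - ell < w - p:
                continue
            # log2 success probability of one random permutation
            lg_succ = 2 * lg_list + log2_comb(n - k - ell, w - p) - log2_comb(n, w)
            # per-iteration work: build two lists, then check expected collisions
            lg_work = log2(2 ** (lg_list + 1) + 2 ** (2 * lg_list - ell))
            total = -lg_succ + lg_work
            if best is None or total < best[0]:
                best = (round(total, 2), p, ell)
    return best

for n, k, w in [(3488, 2720, 64), (4608, 3360, 96),
                (6960, 5413, 119), (8192, 6528, 128)]:
    print(n, k, w, stern_cost(n, k, w))
```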

[1] https://www.mdpi.com/1999-4893/12/10/209/pdf

[2] https://eprint.iacr.org/2012/026.pdf

Alessandro Barenghi
mu flag
The source code is available [here](https://github.com/LEDAcrypt/LEDAtools); the link is provided as reference [26] in the paper.

TMM
sa flag
Should $1 = 0+0 \bmod 2$ be $0 = 1+1 \bmod 2$?