Score:1

Given $i$ keyed-$PRP$ labels $\ell_{i,x}$ from a $2^{256} \times 2^{256}$ Sudoku (Latin-square), how difficult is it for an adversary to solve?

in flag

There's a keyed-permutation I'm playing with, $\ell_{i,x} = \pi_i(x_i)$, which is a bijection $X \leftrightarrow X$, where $|X| = 2^{256}$, and whose evaluations on plaintext inputs $x_i$ perfectly fill out a Latin-square, $L$, when an appropriate incrementing function, $\pi_{i+1} = \pi_i\mathrm{.step()}$, is applied on the inner-state.

Each $x_i$ is the $i$-th block of some plaintext. Each row in $L$ is the enumerated list of all evaluations $\pi_i(x_i)$ for each $x_i \in X$ for the static step $i$ of the row. Each column in $L$ is the enumerated list of all evaluations $\pi_i(x_i)$ for the static $x$ of the column for each step $i < 2^{256}$. When $L$ is completed, every label only appears once in each row and column, and in total, each label appears $2^{256}$ times. However, in execution, ignoring the obvious computational limitations, $L$ is never completed, as only one $\ell_{i, x}$ is ever evaluated at each step $i$.

\begin{array}{|c|c|c|c|c|c|c|} \hline \\ \space\space\space\space\space\space i \space\space \backslash \space x & 0 & 1 & \space\space\space \cdots \space\space\space\space & 2^{256}-2 & 2^{256}-1 \\ \hline 0 & & \ell_{0, 1} \\ \hline 1 & & & & & \ell_{1, 2^{256}-1} \\ \hline \\ \vdots & & & \ddots \\ \\ \hline 2^{256}-3 & & \ell_{2^{256}-3, 1} \\ \hline 2^{256}-2 & \ell_{2^{256}-2, 0} \\ \hline 2^{256}-1 & & & & \ell_{2^{256}-1, 2^{256}-2} \\ \hline \end{array}

Fig. 1 | Example distribution of how labels $\ell_{i, x}$, as evaluations of $\pi$ on non-unique inputs $x$ at step $i$, may map to a partial Latin-square.

There's tons of confusing, seemingly contradictory statements about the hardness of solving Latin-squares. From it being reducible to 3SAT, to it being $\mathcal{O}(1)$, to isotopies being distinguishable in $\mathcal{O}(n^{\log_2(n)})$, I can't wrap my head around how solvable a concrete instance like this is. So, assuming only a simplified case where $\pi$ is some ideal random permutation, I'd like to know the following:

  1. How many labels $\ell_{i, x}$, with unknown $x_i$, does an adversary need for $L$ to be at all uniquely solvable?
  2. How does hardness scale as more $\ell_{i, x}$ are revealed?
  3. What's $\mathbf{NP}$ got to do with the hardness of this Latin-square?
  4. And, is a similar Latin-cube problem easier or harder?
fgrieu avatar
ng flag
Is "Sudoku" in the title used as an equivalent of "latin square", without any of the additional properties usually implied by "Sudoku", that is groupings beyond line and columns where each value appears exactly once? Here that could be dividing lines in $2^{128}$ groups of $2^{128}$, same for columns, forming $2^{256}$ sub-squares of size $2^{128}\times2^{128}$ bound to each contain each of the $2^{256}$ values exactly once.
in flag
AFAIK, the permutation does not produce any additional structure beyond unique labels for rows and columns.
kodlu avatar
sa flag
What do you mean "AFAIK, ..." you designed it but you suspect it does not?
in flag
@kodlu, precisely. There's no indication that it would behave that way, but I have not proved that it does not have sub-grid structure.
Score:1
in flag

A best attempt at answering my own question, provided for archival purposes.

Preliminary research insights:

  • Regarding ERT12, without fully understanding how to translate the findings:

    A. If the problem is massaged into one without sub-grid constraints, it can be represented by $(|X| \times 2 \cdot |X| + |X|^2)$ constraints in +$1$-in-$k$-SAT form, which is $(3 k^2)$ constraints, where $k = |X|$.

    B. Adding $i$-many $\ell_{i,x}$ clues / labels reduces the constraints to $c = (3 k^2 - 3 i)$.

    C. Any +$1$-in-$k$-SAT form problem can be written as a "$k$-SAT problem and $\frac{1}{2}k(k-1)$ of $2$-SAT problems." So, that would be $c$-many $k$-SAT problems and $\frac{1}{2}ck(k - 1)$-many of the $2$-SAT problems.

    D. Assuming $r$-many runs of an $\mathcal{O}{(y)}$ algorithm can be denoted as $r \cdot \mathcal{O}{\left( y \right)}$, then $r \cdot \mathcal{O}{\left( g(n) \right)}$ $\in $ $\mathcal{O}(r \cdot g(n))$;

    E. And, Assuming $n = c$ (unjustified??...)

    F. If $\mathcal{O}{\left( k\mathrm{-SAT} \right)}$ $\in $ $\mathcal{O}(f(n))$, then $c \cdot \mathcal{O}{\left( k\mathrm{-SAT} \right)}$ $\in $ $\mathcal{O}(c \cdot f(c))$

    G. If $\mathcal{O}{\left( 2\mathrm{-SAT} \right)}$ $\in $ $\mathcal{O}(n^2)$, then $\frac{1}{2} ck(k-1) \cdot \mathcal{O}{\left( 2\mathrm{-SAT} \right)} \in \mathcal{O}{\left(k^2 c \cdot c^2 \right)}$

    H. So, if $s(L_i)$ is an algorithm $s$ which solves $L$ given $i$ labels, then

$$\mathcal{O}{\left( s(L_i) \right)} \in \mathcal{O}{\left( (k^2 - i) f(k^2 - i) + k^2(k^2 - i)^3 \right)}$$

  • And, IP01 offers:

    A. If $f(n)$ is an exponential function of the form $2^{\delta n}$, where $\delta > 0$, then as $k \to \infty$, $\delta$ is proven to be "strictly increasing infinitely often" ($-§$Intro):

    B. Then, the polynomial terms are negligible, giving

$$\mathcal{O}{\left(s(L_i)\right)} \in \mathcal{O}{\left((k^2 - i) \space 2^{\delta \left( k^2 - i \right)} \right)}$$

Best-case scenario for solver: \begin{bmatrix} ? & \ell_{0,1} & \ell_{0,2} & \ell_{0,3} \\ \ell_{1,0} & ? & \ell_{1,2} & \ell_{1,3} \\ \ell_{2,0} & \ell_{2,1} & ? & \ell_{2,3} \\ \ell_{3,0} & \ell_{3,1} & \ell_{3,2} & ? \\ \end{bmatrix}

Worst-case scenario for solver: \begin{bmatrix} \ell_{0,0} & \ell_{0,1} & \ell_{0,2} & \ell_{0,3} \\ \ell_{1,0} & ? & ? & \ell_{1,3} \\ \ell_{2,0} & ? & ? & \ell_{2,3} \\ \ell_{3,0} & \ell_{3,1} & \ell_{3,2} & \ell_{3,3} \\ \end{bmatrix}

In the best-case, there's no ambiguity to the puzzle. Each row and column is missing at most one label. Surprisingly, the complexity for solving is still $\mathcal{O}(m \cdot k)$. This is seen from the solving algorithm: "for each row, $m$, missing a label, create a copy of set $X \to C$, then remove each existing label in the row from the set $C$, the remaining label fills in the blank space."

In the worst-case for a solver, the clues leave a sub-square in the center of the larger Latin-square, whose rows & columns are missing the same labels. It can be seen that the sub-square is itself a Latin-square whose sides are of length $\sqrt{\left(k^2 - i\right)}$ with no provided clues. So, in the worst-case, the solution to the puzzle is perfectly concealed.

Score:1
ng flag

How many labels $\ell_{i, x}$, with unknown $x_i$, does an adversary need for $L$ to be at all uniquely solvable?

J. H. Van Lint and R. M. Wilson's A course in combinatoric show that the number $L(n)$ of latin squares of order $n$ verifies $$L(n)>\frac{(n!)^{2n}}{n^{(n^2)}}$$ That approximation is very much by default: it's low by a factor $>2^{22}$ for $n=11$, and that factor grows with $n$.

This helps proving that for $n\ge8$, it holds $L(n)>2^{(n^2)}$. This approximation is even more very much by default: it's low by a factor $>2^{38}$ for $n=11$, and that factor grows with $n$.

Therefore, $L(2^{256})>2^{(2^{512})}$. In other words, $2^{512}$ bits of information is not enough to designate an arbitrary latin square as considered in the question.

Since revealing an $\ell_{i,x}$ reveals at most $2^8$ bits of information, we need on average more than $2^{504}$ values of $\ell_{i,x}$ (that is, more than one cell in $256$) to uniquely designate an arbitrary latin square as considered in the question.

We just demonstrated that the general problem can't be used as the basis of anything practical, even in symmetric crypto (much less in asymmetric crypto). That remains true if we changed $256$ to $64$ in the problem statement. That parameter is just too high.

How does hardness scale as more $\ell_{i,x}$ are revealed?

It can only decrease. However, since we need over $2^{512}$ bits of input, it remains infeasible.

What's $\mathbf{NP}$ got to do with the hardness of this latin square?

We have not tackled the problem of if solving a latin square with about the minimal necessary amount of information for that is

  • polynomial in that information's size,
  • or non-polynomial assuming that's a different complexity class.

Nor if, assuming the later, there's a phase transition to polynomial/feasible past a certain threshold, as in 3SAT. I would not be surprised of that.

Is a similar latin-cube problem easier or harder?

That would depend on what is taken as the security parameter. I think it's easier if that's the number of cells or amount of information, harder if that's the side size.


Note: this answer ignores whatever "Sudoku" in the title could imply beyond latin square.

in flag
Thank you! This was extremely helpful. I found $\pi$ by accident while trying to create something that ensured a repeated plaintext block never mapped to a repeated label (ciphertext block). Which I think may be useful in some use cases, particularly contingent on if the added constraints don't lead to non-negligible predictive advantages on future labels. Also, I saw this paper examining secret sharing using Latin-squares (https://eprint.iacr.org/2023/333.pdf).
mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.