Score:1

Questions on LWE with a repeated secret matrix S

BlackHat18

Consider a formulation of LWE where we are given either $(x, Sx+e)$ or $(x,u)$ --- where $S$ is an $m \times n$ secret/hidden matrix, $x$ is a randomly sampled $n \times 1$ vector, $e$ is an $m \times 1$ Gaussian error vector, and $u$ is a uniformly random $m \times 1$ vector --- and told to distinguish between the two cases. This should be hard for classical algorithms, according to the post here. Call this problem "reverse LWE."

I had a few questions about the setting.


Is the distinguishing problem hard without $e$?

Note that in standard LWE, when we are given $(A, As+e)$ or $(A, u)$ and told to distinguish between the two cases, the problem is easy without the error: we just solve a system of linear equations to get the $n$ entries of $s$.
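For concreteness, here is a minimal sketch of this noiseless attack, with toy parameters of my own choosing; `sympy`'s `inv_mod` does the linear algebra mod $q$:

```python
# Minimal sketch of the noiseless attack on standard LWE (toy parameters,
# my own choice): given (A, b = A s mod q), recover s by linear algebra.
import sympy as sp

q, n = 97, 4

# sample a random A that is invertible mod q
seed = 0
while True:
    seed += 1
    A = sp.randMatrix(n, n, min=0, max=q - 1, seed=seed)
    if sp.gcd(A.det(), q) == 1:
        break

s = sp.randMatrix(n, 1, min=0, max=q - 1, seed=123)  # the hidden secret
b = (A * s) % q                                      # noiseless "samples"
s_recovered = (A.inv_mod(q) * b) % q                 # solve the linear system
assert s_recovered == s
```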

However, here we need to determine the $mn$ entries of the secret matrix $S$, and I do not see how to do that with just $m$ equations.


Consider a variant of the problem, where we are given either $$ \{ (x_1, Sx_1 + e_1), (x_2, Sx_2 + e_2), \ldots, (x_k, Sx_k + e_k)\} ~~\text{or}$$

$$\{ (x_1, u_1), (x_2, u_2), \ldots, (x_k, u_k)\},$$

and told to distinguish between the two cases, where $k$ is polynomially large in $n$. Call this problem "reverse LWE with a repeated secret." Is this problem hard?

Note that a hybrid argument (like the one used in one of the answers here) indicates that the problem remains hard. Here is the hybrid $H_i$:

$$H_i = \{ (x_1, Sx_1 + e_1), (x_2, Sx_2 + e_2), \ldots, (x_i, Sx_i + e_i), (x_{i+1}, u_{i+1}) \ldots, (x_{k}, u_{k}) \} .$$

Then, distinguishing adjacent hybrids $H_{i-1}$ and $H_i$ seems to amount to distinguishing a single reverse-LWE sample, so if we can solve "reverse LWE with a repeated secret," we can solve reverse LWE. Since reverse LWE is hard, our problem should also be hard.

However, I am having difficulty wrapping my head around this fact.

Note that if we do not have the error term, there is a very easy way of distinguishing between the two cases for $k \geq n+1$. There can be at most $n$ linearly independent $x_i$'s, so the distinguisher looks for $n$ linearly independent $x_i$'s among the given samples, notes where the matrix $S$ takes these vectors, writes the $(n+1)$-th sample's input as a linear combination of them, uses linearity to compute where $S$ must take it, and then checks whether that is consistent with the second component it was given.
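For concreteness, here is a minimal sketch of this noiseless distinguisher (toy parameters of my own choosing; I take the first $n$ inputs to be the standard basis so that linear independence is guaranteed):

```python
# Minimal sketch of the noiseless distinguisher for the repeated-secret
# problem (toy parameters, my own choice).
import sympy as sp

q, n, m = 97, 3, 5

def looks_like_Sx(xs, ys):
    """xs: n+1 inputs (first n linearly independent mod q); ys: responses.
    Returns True iff the (n+1)-th response is the one forced by linearity."""
    X = sp.Matrix.hstack(*xs[:n])
    c = (X.inv_mod(q) * xs[n]) % q     # coefficients with x_{n+1} = X c
    Y = sp.Matrix.hstack(*ys[:n])
    predicted = (Y * c) % q            # where S must send x_{n+1}
    return predicted == ys[n]

S = sp.randMatrix(m, n, min=0, max=q - 1, seed=1)  # hidden secret matrix
xs = [sp.eye(n)[:, j] for j in range(n)]           # standard basis inputs
xs.append(sp.randMatrix(n, 1, min=0, max=q - 1, seed=2))  # one extra input
ys_lwe = [(S * x) % q for x in xs]
ys_unif = [sp.randMatrix(m, 1, min=0, max=q - 1, seed=10 + j) for j in range(n + 1)]
print(looks_like_Sx(xs, ys_lwe))   # True
print(looks_like_Sx(xs, ys_unif))  # almost certainly False (match prob. q^{-m})
```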

Why is it that the error terms make this distinguisher fail? Even with a Gaussian error, because of the linear dependence, shouldn't the second component of the $(n+1)$-th distinct sample be sufficiently concentrated around the predicted value for my distinguisher to succeed?

Score:3
Daniel S

The distinguishing problem with a single sample $x$ is impossible: for any non-zero $x$ and any $u$ there exists an $S$ such that $Sx=u$.
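Here is a minimal sketch of that observation (toy parameters of my own choosing; a prime modulus $q$ is assumed so that any nonzero coordinate of $x$ is invertible):

```python
# Minimal sketch (toy parameters, my own choice): for any nonzero x and
# any target u, build S with S x = u (mod q).
import sympy as sp

q, n, m = 97, 4, 3
x = sp.Matrix([5, 0, 12, 7])                       # any nonzero x
u = sp.randMatrix(m, 1, min=0, max=q - 1, seed=7)  # an arbitrary target

j = next(i for i in range(n) if x[i] % q != 0)     # a nonzero coordinate of x
S = sp.zeros(m, n)
S[:, j] = (u * sp.mod_inverse(x[j], q)) % q        # column j carries u / x_j
assert (S * x) % q == u
```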

ETA 2022-04-05:

For the broader question of distinguishing $(\mathbf x_i, S\mathbf x_i+\mathbf e_i)$ from $(\mathbf x_i, u_i)$ with unknown $S$, write $X_{i,j}$ for the $m\times m$ scalar matrix $x_{i,j}I_m$, where $x_{i,j}$ is the $j$th entry of $\mathbf x_i$. Then the rows of the $mn\times km$ matrix $$\left[\begin{matrix} X_{1,1} & X_{2,1} & X_{3,1} &\ldots & X_{k,1}\\ X_{1,2} & X_{2,2} & X_{3,2} &\ldots & X_{k,2}\\ \vdots & \vdots & \vdots & & \vdots\\X_{1,n} & X_{2,n} & X_{3,n} &\ldots & X_{k,n}\\\end{matrix}\right]$$ generate a lattice in which the vector $((S\mathbf x_1+\mathbf e_1)^T,(S\mathbf x_2+\mathbf e_2)^T,\ldots,(S\mathbf x_k+\mathbf e_k)^T)$ is a close vector: the components of the difference vector are precisely the entries of the $\mathbf e_i$. For large $k$, a vector this close is highly unlikely to arise from a uniform distribution. This only tells us that the information for a distinguisher exists; actually finding such a close vector will be computationally very hard as $n$ and the variance of the Gaussian distribution grow.
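Here is a minimal `numpy` sketch of this construction (toy parameters of my own choosing, with small interval errors standing in for the Gaussian). It checks that the concatenated samples differ from an integer combination of the rows of the block matrix exactly by the error entries, working mod $q$ throughout:

```python
# Sketch of the block-matrix lattice above (toy parameters, my own choice).
import numpy as np

rng = np.random.default_rng(0)
q, n, m, k = 97, 3, 4, 6

S = rng.integers(0, q, size=(m, n))    # hidden secret matrix
xs = rng.integers(0, q, size=(k, n))   # the x_i, stored as rows
es = rng.integers(-2, 3, size=(k, m))  # small errors standing in for Gaussians
samples = np.concatenate([(S @ x + e) % q for x, e in zip(xs, es)])

# B is mn x km; block (j, i) is X_{i,j} = x_{i,j} * I_m
B = np.zeros((n * m, k * m), dtype=np.int64)
for j in range(n):
    for i in range(k):
        B[j*m:(j+1)*m, i*m:(i+1)*m] = xs[i, j] * np.eye(m, dtype=np.int64)

# combining row (j, r) with coefficient S[r, j] gives (S x_i)^T in block i
coeffs = S.T.reshape(-1)
lattice_point = (coeffs @ B) % q   # mod q; the full lattice also contains q*Z^{km}
diff = (samples - lattice_point) % q
assert np.array_equal(diff, es.reshape(-1) % q)  # the gap is exactly the errors
```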

BlackHat18:
This makes me very confused. Consider the problem of distinguishing $$ \{ (x_1, Sx_1), (x_2, Sx_2), \ldots, (x_k, Sx_k)\} ~~\text{or}$$ $$\{ (x_1, u_1), (x_2, u_2), \ldots, (x_k, u_k)\}.$$ This should also be hard, by the same hybrid argument. However, we know this problem is easy for $k \geq n+1$ by the distinguisher I outlined (checking for linear dependence, and noting that there can be only $n$ linearly independent $x_i$'s). How can both be true?
Daniel S:
It's because linear systems have a sharp transition between underdetermined ($k<n$, a large family of solutions), determined ($k=n$, exactly one solution), and overdetermined ($k>n$, zero or one solution). An overdetermined system is highly likely to have no solution, so the existence of even a single solution is a strong distinguisher.
BlackHat18:
I didn't follow. Does it mean that the two cases: $$ \{ (x_1, Sx_1), (x_2, Sx_2), \ldots, (x_k, Sx_k)\} ~~\text{or}$$ $$\{ (x_1, u_1), (x_2, u_2), \ldots, (x_k, u_k)\},$$ are actually indistinguishable? Doesn't the hybrid argument say that they are?
Daniel S:
They're indistinguishable for $k$ linearly independent $x_i$ with $k \le n$. Note that unlike my answer to [your previous question](https://crypto.stackexchange.com/questions/99051/lwe-with-the-matrix-a-repeated), I cannot produce additional samples here, as $S$ is secret.
BlackHat18:
Thanks! It's clear now. One last question, for the noisy case, what can we say for the $k > n$ case? That is, are $$ \{ (x_1, Sx_1 + e_1), (x_2, Sx_2 + e_2), \ldots, (x_k, Sx_k + e_k)\} ~~\text{or}$$ $$\{ (x_1, u_1), (x_2, u_2), \ldots, (x_k, u_k)\}$$ distinguishable for $k > n$? I couldn't prove anything as, like you said, reductions that involve producing additional samples do not work.
Daniel S:
There's a Close Vector Problem that can be set up for large $k$, but judicious choices of $e$ and $n$ should render it thoroughly intractable.
BlackHat18:
Might you elaborate on this case (i.e., the noisy case for $k > n$) in your answer?