Score:1 Crypto

# Solving a system of linear equations over the binary field, with errors

I have a system of linear equations $$f_1, \ldots, f_m$$ over binary variables $$x_1,\ldots,x_n$$, where $$m$$ is much larger than $$n$$. If all equations are correct, we can find a solution easily using Gaussian elimination. Among those $$m$$ equations, 90% are correct. For the remaining 10%, the constant terms are flipped: if the actual constant term is 0, it is given as 1, and vice versa. Can we solve the system in polynomial time? If 99% of the equations are correct instead of 90%, can we solve it in polynomial time?

Comments:

- I would note that solving for $n=256$, $p=0.9$ is tractable, by the simple expedient of selecting 256 random equations; with probability circa $2^{-39}$ (namely $0.9^{256}$), all 256 equations are correct, and so Gaussian elimination gives the right answer (which is easy to verify).
- Thanks. But if $n$ gets larger, say $n=1024$, we cannot solve it using this idea.
- Not for $p=0.9$; it'd still be quite tractable with $p=0.99$.
- And schemes based on this: [Are there any signatures based on matrix](https://crypto.stackexchange.com/q/37562/18298)
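The random-subset idea from the comments can be sketched in code. This is a hedged illustration, not a vetted implementation: the function names are mine, and the 0.75 agreement threshold (sitting between the ~$p$ agreement of the true secret and the ~$1/2$ agreement of a wrong candidate) is my own choice.

```python
import numpy as np

def solve_gf2(A, b):
    """Gaussian elimination over GF(2). Returns one solution of A x = b
    (free variables set to 0), or None if the system is inconsistent."""
    A = A.copy().astype(np.uint8)
    b = b.copy().astype(np.uint8)
    m, n = A.shape
    row, pivots = 0, []
    for col in range(n):
        piv = next((r for r in range(row, m) if A[r, col]), None)
        if piv is None:
            continue
        A[[row, piv]], b[[row, piv]] = A[[piv, row]], b[[piv, row]]
        for r in range(m):
            if r != row and A[r, col]:
                A[r] ^= A[row]
                b[r] ^= b[row]
        pivots.append(col)
        row += 1
    if b[row:].any():  # an all-zero row of A with a nonzero right-hand side
        return None
    x = np.zeros(n, dtype=np.uint8)
    for r, col in enumerate(pivots):
        x[col] = b[r]
    return x

def solve_with_errors(A, b, threshold=0.75, max_tries=10_000, seed=0):
    """Randomized approach from the comment: repeatedly pick n random
    equations, solve them exactly, and keep a candidate that agrees with
    a large fraction of all m equations (the true secret agrees with
    ~p of them, while a wrong candidate agrees with only ~1/2)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    for _ in range(max_tries):
        idx = rng.choice(m, size=n, replace=False)
        x = solve_gf2(A[idx], b[idx])
        if x is None:
            continue
        agree = np.mean((A.astype(int) @ x.astype(int)) % 2 == b)
        if agree >= threshold:
            return x
    return None
```

The expected number of tries is roughly $p^{-n}$, which is why this works at $n=256$, $p=0.9$ but not at $n=1024$.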
Score:3 Crypto

The Learning Parity with Noise (LPN) problem is as follows.

Let $$\vec s\in\mathbb{F}_2^n$$ be a fixed secret, and let $$\mathcal{O}_{\vec s}$$ be an oracle that samples $$\vec a_i\gets \mathcal{U}(\mathbb{F}_2^n)$$ and $$e_i \gets \mathsf{Bern}(\tau)$$, and returns $$(\vec a_i, \langle \vec a_i, \vec s\rangle + e_i)$$.

The (search) LPN problem is, given query access to the oracle $$\mathcal{O}_{\vec s}$$, to recover the secret $$\vec s$$.
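For concreteness, the oracle itself is easy to simulate (a small sketch; the function name and the NumPy representation of $\mathbb{F}_2^n$ vectors are my own choices):

```python
import numpy as np

def lpn_oracle(s, tau, rng):
    """One sample from the oracle O_s: a uniform a in F_2^n together with
    the noisy inner product <a, s> + e mod 2, where e ~ Bern(tau)."""
    n = len(s)
    a = rng.integers(0, 2, n, dtype=np.uint8)
    e = int(rng.random() < tau)  # Bernoulli(tau) error bit
    return a, (int(a.astype(int) @ s.astype(int)) + e) % 2
```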

If you impose a query bound of $$m$$ on how many times you may call $$\mathcal{O}_{\vec s}$$, this is precisely the problem you're interested in.

When the noise rate $$\tau$$ is constant (as a function of $$n$$), the best-known algorithm for solving LPN is, if I recall correctly, the Blum–Kalai–Wasserman (BKW) algorithm, which runs in time, memory, and sample complexity $$2^{O(n/\log n)}$$. So we shouldn't (asymptotically) expect polynomial-time complexity.

Concretely, though, for a small enough noise rate the situation is tractable. For more reading on this, see *LPN Decoded*. I have included an image of the running time of various LPN algorithms below. Here, $$\tau \in [0, 1/2]$$ is the "noise rate", and can be written as $$\tau = \min(p, 1-p)$$ in your notation. Note that for $$p = 0.99$$, we have $$\tau = 1/100$$. Theorem 5 of the linked paper then solves LPN with time/query complexity

$$\tilde{\Theta}\left(\left(\frac{1}{(1-\tau)^n}\right)^{\frac{1}{1+\log\left(\frac{1}{1-\tau}\right)}}\right).$$
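Dropping the polylogarithmic factors hidden by the $\tilde\Theta$, the exponent of this bound can be evaluated numerically (taking the logarithm base 2 here is my assumption about the paper's convention):

```python
import math

def log2_cost(n, tau):
    """log2 of (1/(1-tau))^(n / (1 + log(1/(1-tau)))), i.e. the bound
    above with the soft-Theta constants dropped; log taken base 2."""
    c = math.log2(1.0 / (1.0 - tau))
    return n * c / (1.0 + c)
```

For $\tau = 1/100$ this comes out to only a handful of bits of work even at $n = 1024$, consistent with the remark below that moderately-sized $n$ remains quite reasonable.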

This gives time/query complexity $$\approx (100/99)^{\frac{n}{1+\log(100/99)}}$$, which, while not polynomial time, should be quite reasonable for moderately-sized $$n$$.

Comments:

- What is the "Samples" column? Is it $m$ in my problem? In my case $m$ is around $4n$.
- Yes it is. That will make your problem somewhat harder than the problem this paper solves. It's worth mentioning, though, that a similar problem (the "LWE problem") has what is called a "randomized self-reduction" --- given a number of (fixed) samples, you can generate an *arbitrary* number of samples (although at a higher noise rate). I don't know whether LPN also has such a self-reduction, but I would expect that it does. In particular, if you "stack" your samples $\vec a_i$ into a matrix $A$, you can write them as $(A, As + e)$. You should then be able to make a new sample via $(xA, x(As + e))$ for $x$ suitably distributed (probably uniformly random). I haven't formally analyzed this, though, and a brief search for "LPN randomized self-reduction" didn't turn up anything, so it is possible this sketch of a reduction isn't useful or is invalid for some reason.
- I found a result, see [here](https://cseweb.ucsd.edu/~vlyubash/papers/parityproblem.pdf). Note that this particular result is too weak to "amplify samples" in your situation, but it is possible that follow-up work gets something strong enough.
- Thank you so much. I will look into those papers.
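The stacking idea in that comment can be sketched in code. The comment leaves the distribution of $x$ open; the sketch below uses an $x$ of small fixed Hamming weight $w$ (my own assumption, not something the comment specifies), for which the piling-up lemma gives a new noise rate of $(1-(1-2\tau)^w)/2$:

```python
import numpy as np

def rerandomize(A, b, w, rng):
    """Combine w of the m fixed samples (A, b = A s + e) into one
    fresh-looking sample (x A, <x, b>) with x of Hamming weight w.
    The new label equals <x A, s> + <x, e>, so by the piling-up lemma
    the new noise rate is (1 - (1 - 2*tau)**w) / 2 -- amplifying
    samples costs noise."""
    m = A.shape[0]
    idx = rng.choice(m, size=w, replace=False)  # the support of x
    a_new = (A[idx].astype(int).sum(axis=0) % 2).astype(np.uint8)
    b_new = int(b[idx].astype(int).sum()) % 2
    return a_new, b_new
```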