Score:18

Quantum Computing Used to Break RSA by "fixing" Schnorr's Recent Factorization Claim?

sa flag

There is a claim by Chinese researchers making the rounds (Schneier's blog here) that RSA can be broken by Quantum Computers. The paper is on arXiv.

Wading through the discussion in Schneier's blog, and distinguishing between noisy physical qubits and error-corrected logical qubits, the claim does not seem as catastrophic, but also not as outlandish, as some comments might suggest. I think this comment summarizes the issue:

Schnorr’s algorithm is much slower than he proposed in his paper, because he incorrectly estimated how many relations (components of the factoring process) would come from solving a lattice problem.

The lattice problem has to be much bigger, to get the job done (which kills Schnorr’s supposed efficiency).

If the Chinese team has found a way to use their not-so-huge QC to solve the scaled-up lattice problem in reasonable time — which is what their paper seems to claim — then perhaps this is the RSA killer.

If they publish factors for the challenge number RSA-1024 anytime soon, the consequences could be world-shaking.

Question: Does the comment above stand up under scrutiny? I am hopeful of getting good answers here, given recent discussions and answers on PQC and similar topics.

Edit: Peter Shor is skeptical on Twitter. Scott Aaronson is extremely skeptical on his blog.

fgrieu avatar
ng flag
Some links: [Two](https://crypto.stackexchange.com/q/88582/555) [questions](https://crypto.stackexchange.com/q/92080/555) about [Claus Peter Schnorr's article](https://eprint.iacr.org/2021/933). An updated [answer](https://crypto.stackexchange.com/a/59796/555) on largest integers factored by QC.
kelalaka avatar
in flag
They should break one of the RSA challenges if they are really sure about their claim. Snake oil as it stands.
Score:16
ru flag

TL;DR

This appears unlikely to work at scale, at least with current parameter choices. There's a classical post-processing step which is worse than the quadratic sieve in complexity unless the quantum part of the algorithm starts returning answers that are significantly better than those produced in initial experiments.

I'll try and walk through what's going on step-by-step.

Factoring 101: difference of squares

The last step of the method is well-understood and is the basis of almost all modern factoring methods. To factor a number $N$ we try and find two congruent squares modulo $N$: $$a^2\equiv b^2\pmod N\iff a^2-b^2\equiv 0\pmod N.$$ Given two such integers $a$, $b$, we can use the high school identity $a^2-b^2=(a-b)(a+b)$ to deduce that $$(a+b)(a-b)\equiv 0\pmod N$$ and so every prime divisor of $N$ divides either $a+b$ or $a-b$. If the prime divisors divide different brackets (which they will a positive proportion of the time) we can take $\mathrm{GCD}(a+b,N)$ and $\mathrm{GCD}(a-b,N)$ using Euclid's algorithm and recover factors.
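To make this last step concrete, here is a minimal Python sketch (with my own toy numbers, nothing from the paper) of turning a pair of congruent squares into factors:

```python
from math import gcd

def split_from_congruent_squares(N, a, b):
    """Given a^2 ≡ b^2 (mod N), try to split N via gcd(a±b, N)."""
    assert (a * a - b * b) % N == 0
    return gcd(a + b, N), gcd(a - b, N)

# Toy example: 10^2 = 100 ≡ 9 = 3^2 (mod 91), and the two primes of
# 91 = 7 * 13 land in different brackets, so both gcds are non-trivial.
print(split_from_congruent_squares(91, 10, 3))   # (13, 7)
```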

Index Calculus: Making squares from smooth numbers

Modern factoring methods approach the challenge of finding congruent squares by making squares from products of smooth numbers. These are integers all of whose prime factors lie below some bound $B$. We can write smooth numbers in index notation, which is a $(\pi(B)+1)$-long vector recording the exponents of the prime factorisation of the number. For example, if $B=10$ then smooth numbers are those whose only prime divisors are 2, 3, 5 or 7 (and also possibly the unit -1), and I can write -700 in index notation thus: $$-700=(-1)^1\times 2^2\times 3^0\times 5^2\times 7^1=(1,2,0,2,1).$$ Note that if I multiply two smooth numbers, I get a smooth number whose index representation is the sum of the two index representations. Also note that if all of the components of an index representation are even, this is precisely equivalent to a smooth number being a perfect square. This allows us to find products of smooth numbers that are squares using linear algebra on index representations: if we find more than $\pi(B)+1$ $B$-smooth numbers then there is a linear combination of their index representations all of whose entries are 0 mod 2, and we can find this linear combination e.g. with Gaussian elimination. Dixon's random squares method is a simple example of a factorisation method using this idea. The general number field sieve (the best known classical factoring algorithm) essentially uses a slightly more complex version of this idea.
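Here is a small Python sketch of that linear-algebra step (the factor base and the five smooth numbers are my own toy choices; real implementations such as Dixon's method or the NFS do the same thing at vastly larger scale with sparse linear algebra):

```python
from math import isqrt
from functools import reduce

PRIMES = [2, 3, 5, 7]          # toy factor base for B = 10 (positive numbers only)

def exponent_vector(n):
    """Exponents of n over the factor base, or None if n is not 10-smooth."""
    exps = []
    for p in PRIMES:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exps.append(e)
    return exps if n == 1 else None

def find_square_product(numbers):
    """Return a subset of `numbers` whose product is a perfect square,
    found by Gaussian elimination on exponent vectors over GF(2)."""
    pivots = {}                                   # leading column -> (vector, member set)
    for i, n in enumerate(numbers):
        ev = exponent_vector(n)
        if ev is None:                            # skip anything that is not B-smooth
            continue
        vec = [e % 2 for e in ev]
        members = {i}
        while any(vec):
            lead = next(c for c, bit in enumerate(vec) if bit)
            if lead not in pivots:
                pivots[lead] = (vec, members)     # new pivot row, keep going
                break
            pvec, pmembers = pivots[lead]
            vec = [a ^ b for a, b in zip(vec, pvec)]
            members ^= pmembers
        else:
            return [numbers[j] for j in members]  # exponent vector all even: a square
    return None

nums = [12, 15, 20, 27, 50]                       # five 10-smooth numbers, four primes in the base
subset = find_square_product(nums)
product = reduce(lambda a, b: a * b, subset)
print(subset, product, isqrt(product) ** 2 == product)   # e.g. [12, 15, 20] 3600 True
```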

There's a trade-off in the choice of $B$: choose $B$ too small and smooth numbers become rarer and very hard to find; choose $B$ too large and we have to find too many smooth numbers.

Schnorr: Finding smooth numbers using short vector methods

The previous two sections are well-established and quite well-understood; we now head into choppier waters. Established factoring algorithms have found the smooth numbers necessary to make squares by taking a large set of small candidates (small numbers are more likely to be smooth) and testing each of them for smoothness using souped-up trial division methods (sieving). Schnorr's lattice-factoring method aims to build a machine that spits out smooth numbers directly, thereby massively saving on the search time. This could also allow us to use much smaller $B$ values.

Schnorr's idea is to find pairs of smooth numbers $u$, $v$ such that $u-vN$ is very small indeed (and hence very likely to be smooth), giving triples $(u, v, u-vN)$ of smooth values. Each pair gives us $$u\equiv u-vN\pmod N$$ and index calculus can then find a subset of the triples such that the product of the $u$ values and the product of the $u-vN$ values are both squares, and we're home. The new paper gives examples of such triples from its smaller experiments on pages 22 and 24.

Schnorr's idea is to write $u=p_0^{e_0}\cdots p_k^{e_k}$ and $v=p_0^{f_0}\cdots p_k^{f_k}$ and then take real logarithms to write $$\log u=e_0\log p_0+\cdots +e_k\log p_k$$ $$\log vN=f_0\log p_0+\cdots +f_k\log p_k+\log N.$$ The idea is that if $|u-vN|$ is small, then $u\approx vN$ and so $\log u\approx\log(vN)$ and so $\log u-\log(vN)$ is small. Moreover $\log u-\log(vN)$ is an integer linear combination of the known real numbers $\{\log p_i:p_i<B\}$ and $\log N$ and spitting out smooth values $u$ and $v$ should be achievable by spitting out small linear combinations of logarithms. Finding small linear combinations is a problem that can be expressed in terms of a close vector problem. If we consider the lattice of integer linear combinations of rows of the matrix $$\begin{pmatrix}1 & 0 & 0 &\cdots & 0 & 0 & \log p_0\\ 0 & 1 & 0 &\cdots & 0 & 0 & \log p_1\\ 0 & 0 & 1 &\cdots & 0 & 0 & \log p_2\\ \vdots & & & \ddots & & \vdots &\vdots\\ 0 & 0 & 0 &\cdots & 1 & 0 & \log p_{k-1}\\ 0 & 0 & 0 & \cdots & 0 & 1 & \log p_k\\ \end{pmatrix}$$ and we find a linear combination close to the vector $(0\ 0\ 0\ldots 0\ 0\ \log N)$, then we have found a pair of reasonably-sized smooth numbers $u$ and $v$ such that $\log u-\log(vN)$ is small. (Schnorr's method gives different weights to the columns to produce many solutions and also to try to guarantee that $u-vN$ is really small; you can think of this as reducing the above lattice with respect to a weighted Euclidean metric).
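As a rough illustration of the lattice above, here is a sketch of how the basis and target might be set up before handing them to a lattice-reduction library (e.g. fpylll or SageMath). The precision scale $C$, the prime bound and the modulus are my own illustrative choices; the paper uses weighted variants of this construction.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60                       # digits carried for the logarithms

def schnorr_style_basis(primes, N, C):
    """Identity block plus a column of scaled logarithms; the target vector
    carries C*log(N) in its last coordinate, as in the matrix above."""
    k = len(primes)
    rows = []
    for i, p in enumerate(primes):
        row = [0] * k + [int(C * Decimal(p).ln())]
        row[i] = 1
        rows.append(row)
    target = [0] * k + [int(C * Decimal(N).ln())]
    return rows, target

basis, target = schnorr_style_basis([2, 3, 5], N=1961, C=10**8)
for row in basis:
    print(row)
print("target:", target)
# A lattice vector with coefficients (c_0, ..., c_k) close to the target encodes
# u = prod_{c_i > 0} p_i^{c_i} and v = prod_{c_i < 0} p_i^{-c_i} with u ≈ vN;
# one would now LLL-reduce this basis and Babai-round the target against it.
```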

The idea is compelling, but there are issues*. Let us consider how close our close vector needs to be. As we will be naively testing $u-vN$ for smoothness, we should aim to find $u$ and $v$ values where $u-vN$ is significantly smaller than in existing methods such as the quadratic sieve (where we test numbers of size $N^{1/2}$) or the number field sieve (where we test numbers of size $\exp(c(\log N)^{2/3}(\log\log N)^{1/2})<N^{o(1)}$). If the final component of the difference vector in our close vector solution is $\epsilon$ then we have a solution where $$\log u=\log(vN)+\epsilon\iff u=vN\exp(\epsilon)$$ so that $$u-vN=vN(\exp(\epsilon)-1)>\epsilon vN.$$ For this to be comparable to the quadratic sieve we will need $\epsilon$ to be less than $N^{-1/2}$, and to be competitive with the number field sieve it needs to be less than $N^{-1+o(1)}$. These are incredibly small, which not only puts a strain on the lattice algorithm, but also on the accuracy of our inputs: each of the logarithms of small primes will need to be expressed to hundreds of decimal places of accuracy, else the approximation errors will swamp $\epsilon$. This is something to watch closely as problems scale up.
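To put a rough number on that precision requirement, here is a back-of-envelope sketch (my own figures, using the quadratic-sieve parity target from the paragraph above, for a 2048-bit modulus):

```python
from decimal import Decimal

# To compete with the quadratic sieve on a 2048-bit modulus, epsilon must be
# below N^(-1/2) ~ 2^(-1024), so the logarithms in the lattice need roughly
# 1024 * log10(2) ≈ 308 decimal digits before rounding error swamps epsilon.
bits = 2048
eps_bits = bits // 2
digits_needed = int(Decimal(eps_bits) * Decimal(2).log10()) + 1
print(digits_needed)   # 309
```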

QAOA: Finding short vectors using quantum computers

The novelty within the new paper seems to be a use of QAOA (quantum approximate optimisation algorithm/ansatz) to improve solutions to close vector problems. I've not yet read anything in the paper that makes their idea unique to the Schnorr lattice. The idea is that given a close-ish vector $\mathbf b=(b_0\ b_1\ldots b_k)$ to a target vector $\mathbf t=(t_0\ t_1\ldots t_k)$ they use QAOA to search the region $$\{\mathbf v=(v_0\ v_1\ldots v_k):|v_i-b_i|\le 1\}$$ (i.e. the hypercube of side length 2 centred on our close-ish vector) for an approximate minimum to the function** $$||\mathbf v-\mathbf t||_2.$$ The minimum of this function would be taken at the closest vector to $\mathbf t$ in the hypercube. This is a search space of size $3^{k+1}$ and so brute force exhaustive search is not a sensible approach for large dimensions. The factoring application will rely on our really close vector lying within the hypercube, which will depend on the starting point. The paper says to derive starting points from Babai's rounding algorithm (i.e. rounding coordinate-wise with respect to the basis) applied to an LLL reduced basis. This will not provide spectacularly good starting points for higher dimensions, but it may be that the process can be iterated to "hill-climb" to iteratively better solutions. As written, however, the paper relies on classical computation having found a close-ish vector that has a really close vector within its hypercube neighbourhood, and this feels like a big assumption as we scale.
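For intuition, here is a classical brute-force stand-in for the refinement step that the QAOA is meant to approximate: exhaust the side-2 hypercube of coefficient offsets around a starting point and keep the lattice vector closest to the target. The basis, scaling and starting coefficients below are illustrative choices of mine, not values from the paper.

```python
import itertools
import math

SCALE = 10**6
primes = [2, 3, 5]
basis = [[1 if j == i else 0 for j in range(len(primes))] + [round(SCALE * math.log(p))]
         for i, p in enumerate(primes)]
target = [0] * len(primes) + [round(SCALE * math.log(1961))]

def lattice_vector(coeffs):
    return [sum(c * basis[i][j] for i, c in enumerate(coeffs)) for j in range(len(target))]

def dist2(v, w):
    return sum((a - b) ** 2 for a, b in zip(v, w))

def hypercube_refine(start_coeffs):
    """Search offsets in {-1, 0, 1}^k around the starting coefficients."""
    best = None
    for offset in itertools.product((-1, 0, 1), repeat=len(basis)):
        coeffs = [c + d for c, d in zip(start_coeffs, offset)]
        d = dist2(lattice_vector(coeffs), target)
        if best is None or d < best[0]:
            best = (d, coeffs)
    return best

print(hypercube_refine([4, 2, 1]))   # 3^3 = 27 candidates here; QAOA targets this search
```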

How good is QAOA at finding approximate minima to multivariate functions in hypercubes? Answers to this question are arcane. The behaviour of the function in the hypercube is modelled as a Hamiltonian, this Hamiltonian is modelled in a circuit and then the energy of the Hamiltonian approximately minimised. Behaviour varies according to how well-conditioned the Hamiltonian is and leaving questions of eigenvalues and Pauli states to quantum.stackexchange.com, I'll say that it's certainly likely to find something closer than the initial Babai starting point, though how much closer I can't say.

Issues with the experimental data

My increasing scepticism derives from looking at the quoted data. Schnorr's method spits out smooth $u$ and $v$ values for which $u-vN$ is supposed to be really small, so that it is automatically smooth. For larger values of $u-vN$ we have to do naive smoothness tests much as we do for existing factoring algorithms and if the $u-vN$ values are around $\sqrt N$ in size then the whole method is unlikely to beat the quadratic sieve, let alone the number field sieve. Looking at p. 11 (Section III.A. of the supplementary) they quote a Lemma of Schnorr*** that says $u-vN$ values of size around $O(p_k)$ should be produced. These would certainly be suitable and very likely to be smooth. The authors conclude Section III by relying on this estimate. In all of their experiments though, $u-vN$ comes up much larger than $p_k$.

In their "3 qubit" experiment to factor 1961 their (scaled) target is $(0\ 0\ 0\ 240)$ and an example Babai starting point of $(0\ 4 \ 4\ 242)$ which corresponds to $u=2025=3^45^2$ $v=1$ and a $u-vN$ value of 64 (which although smooth is considerably bigger than $p_k=5$). For this size of problem it is possible to exhaust the hypercube and note that there are no closer vectors. On page 22 they list all of the factorisations of suitable $u-vN$ that they found. These range in size from 17 to 5453. All of them are bigger than $p_k$, all bar three are bigger than $\sqrt N$ and some are bigger than the number that we wish to factor.

Likewise for their "5 qubit" case (factoring 48567227) on page 24 the value of $u-vN$ varies between 256476 and 222890309. All of them have a prime factor bigger than 11 and all of them are considerably bigger than $\sqrt N$.
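Again, a quick check that even the smallest reported value is already in the size regime that the quadratic sieve tests (the two constants are taken from the figures quoted above):

```python
from math import isqrt

# "5 qubit" case: N = 48567227, smallest reported u - vN value is 256476,
# which is well above sqrt(N).
N = 48567227
print(isqrt(N), 256476 > isqrt(N))    # 6969 True
```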

In the "10 qubit" case (factoring 261980999226229) on page 19 they quote example $u-vN$ values between 10 and 11 decimal digits all bigger than $N^{0.7}$.

They rescue these examples by declaring a different smoothness bound for the $u-vN$ values and naively testing them for smoothness (see p. 16, Section IV.D). Such an approach will be worse in classical post-processing than the quadratic sieve unless the $u-vN$ values can be made consistently less than $N^{1/2}$. Early experimental data does not indicate that this is the case.


(*) - The transcendental number theorist in me is queasy because there are deep results saying that linear forms in logarithms cannot be too small and that finding instances of smooth plus smooth equals smooth is tricky, but I'll try to stick to more concrete worries.

(**) - For the factoring application, a better approach would be simply to minimise the value of the final component which is all we really care about.

(***) - I haven't seen a satisfactory proof of this lemma.

Score:7
cn flag

The consensus among experts seems to be that this does not constitute a break, simply because the classical part of the algorithm still requires exponential time. I don't know whether this could yield any concrete speedups over known classical factorization algorithms.

A pointer: https://groups.google.com/a/list.nist.gov/g/pqc-forum/c/AkfdRQS4yoY/m/3plDftUEAgAJ

See in particular Saarinen's answer:

It should be noted that the paper does not claim that the proposed method is faster than classical factoring methods. When the paper talks about "resources," it omits "running time"; what is merely claimed is that the quantum circuit is very small.

The "sublinearity" means that it can be smaller in qubits than the number being factored. This is a hybrid classical+quantum method; the number N must be held on a classical computer. Based on very rough heuristics, it appears to have an exponential running time in this setting.

The proposed method is not in any way related to Shor's factoring algorithm (which actually runs in polynomial time.) It is instead based on classical Schnorr’s algorithm [29, 30], which is (putting it politely) controversial. Certainly, the experimental analysis in "This destroys the RSA cryptosystem" paper [30] is deeply flawed, and researchers have been unable to find support for its claims. Léo Ducas has a nice set of notes about claims of factoring by methods of this type at: https://github.com/lducas/SchnorrGate

As for the impact on actual PQC algorithms, there is no indication of an asymptotic (exponential) speed-up for CVP, which would potentially impact lattice-based systems. Perhaps something like the "Babai optimizer" can shave off some bits from the concrete security analysis. However, this would be merely a constant-factor improvement.

Scott Aaronson's blogpost on this: https://scottaaronson.blog/?p=6957

See in particular this part:

No, just no

Daniel S avatar
ru flag
Not sure if I agree with Saarinen's statement. The circuit depth quoted for the QAOA part should equate to the run time of the QAOA. Even if the relations returned by the optimiser are not very good at all, only a subexponential number of calls to the optimiser will be needed.
Sam Jaques avatar
us flag
Pretty sure Saarinen's right: the paper states specifically that the depth is for one layer of QAOA. The QAOA is solving a binary optimization problem with $n/\log n$ bits (for an $n$-bit number), so I don't know why we would expect anything less than $2^{O(n/\log n)}$ calls/layers/runtime.
Daniel S avatar
ru flag
@SamJaques Do I misunderstand? That seems to say that QAOA has the same time complexity as brute-force exhaustion.
Geoffroy Couteau avatar
cn flag
Isn't it the whole point? QAOA is not known (and some would say, not even believed) to scale better than brute force
Sam Jaques avatar
us flag
Note that "brute force" and the big-O notation are hiding the usual square-root speed-up from Grover's algorithm. I am probably missing some subtleties here, but since QAOA is a versatile tool that you can apply to *any* combinatorial optimization problem, we shouldn't expect good time complexity in general if NP is not in BQP. The advantage seems to be just for certain problems.
Daniel S avatar
ru flag
I've set up a chat room for those interested in extending the discussion: https://chat.stackexchange.com/rooms/141815/qaoa
kodlu avatar
sa flag
Thanks for a great answer. I accepted the other answer since it filled in more details arising from my shortcomings in this area.
Score:4
in flag

Qubit complexity

The paper says:

"We find that a quantum circuit with 372 physical qubits and a depth of thousands is necessary to challenge RSA-2048 even in the simplest 1D-chain system. Such a scale of quantum resources is most likely to be achieved on NISQ devices in the near future"

A depth of thousands is something that we are not going to see with high fidelity in the next decade. Such depths need quantum error correction, which is what makes estimates like "372 qubits" become "millions of qubits". The above-quoted sentence probably should say "372 logical qubits", not "physical qubits". Table II of this paper shows some estimates of how many "physical qubits" would be needed to implement each "logical qubit". I'm also not sure what "even" in the simplest 1D-chain system means, but it could mean that "in reality", rather than in the "simplest" of scenarios, more qubits and more depth are needed.

Graph complexity

The paper says:

"The topology of the ZZ-items in the problem Hamiltonian is an n-order complete graph (Kn) according to Eq. 4"

They did experiments with up to n=10 qubits, which means that the 10 qubits would be represented by vertices of a $K_{10}$ graph, and ZZ interactions would be needed between all possible pairs of these qubits. That's not hard, but to do RSA-2048 they say that they would need 372 qubits and therefore the $K_{372}$ graph. Having 372 qubits each interact with 371 other qubits is not likely to happen in the next decade (or 7 decades, as the next paragraph explains) without significant losses in performance elsewhere.

In the first five commercial quantum annealers sold by D-Wave, each qubit could interact with at most 6 other qubits. Recently they released a device, for which my team reverse-engineered the connectivity graph based on whatever few sketches they made available at the time, and it had each qubit interacting with at most 15 other qubits. They claim to be extending this connectivity to 20 in 2023-2024, but even if they add 5 connections per qubit per year, it will take more than 70 years for them to have each qubit interacting with up to 371 other qubits. Machines with all-to-all connectivity have been proposed, though, for example the coherent Ising machine and the "digital annealers" made separately by companies like Fujitsu and Hitachi; however, I'm not aware of any graphs like the D-Wave ones below which show orders-of-magnitude speed-ups from these devices over "classical" computation.

QUBO complexity

I'd like to address this comment by Sam Jaques:

"The QAOA is solving a binary optimization problem with n/logn bits (for an n-bit number) so I don't know why we would expect anything less $2^{O(n/\log n)}$ calls/layers/runtime."

Specifically the paper says the following for RSA-2048:

"$n= 2\times 2048/\log 2048∼372$"

In the world of QUBO, 372 is not a lot of bits/qubits. In the reference list of my book on quadratization there are many papers showing that real-world QUBO problems involving 20,000 bits (i.e. binary variables) have been solved. However, there are two important points to make about that:

  • 372 bits with all-to-all coupling is much harder than 20,000 bits with each bit only interacting with a handful of other bits (as would be the case in the above-mentioned references).
  • Finding the global ground state is much harder than finding a state whose energy is close to the ground energy (the above-mentioned references are for a computer vision problem, in which the goal is to get an image that closely resembles an "original" image, meaning that an approximation of the ground state is "good enough" as long as it gives an image that matches to the extent that our human eyeballs can notice).

Nevertheless, I need to "call out" the use of $2^n$ in that comment for at least two reasons, the strongest being described in the bullet points below:

  • As mentioned in the same author's subsequent comment here, Grover's algorithm makes the cost $2^{n/2}$ rather than $2^n$, which for $n=372$ is the difference between $10^{112}$ and $10^{56}$. That's a difference that seems unfair to ignore when talking about actual factorization of real integers, as opposed to a theoretical study of what happens to things inside a big-O when $n \rightarrow \infty$.
  • Finding the global minimum of a binary optimization (QUBO) problem with 372 variables does seem hard (depending on how many copies of the minimum exist; if the function is constant then every input gives the minimum). However, no classical computer has ever factored RSA-260 (862 bits) yet. This would be $2\times 862/\log_2(862) \approx 177$ qubits, which would be $2^{177/2}$ or about $5\times 10^{26}$ operations to find the ground state with Grover's algorithm in the worst-case scenario. There's so much that can be done to reduce the number of qubits, using pre-processing for example in the way I did here, but let's pretend we don't do any of that. $10^{26}$ is only 8 orders of magnitude larger than exascale computing and the zettascale computing that we can reasonably expect to see next decade. Combine that with the fact that single-core D-Wave devices have already been shown to be about 7 orders of magnitude faster than single-core classical computers in this study (remember that we're looking at 177 bits on the x-axis to factor an RSA number that a classical computer has never yet been able to factor):

[Figure: D-Wave benchmark plot (problem size in bits on the x-axis) showing roughly 7 orders of magnitude speed-up of a single-core D-Wave device over single-core classical computation.]

Therefore, considering only the $2^n$ or $2^{n/2}$ QUBO complexity, a feasible-sized fleet of quantum annealers with 177 qubits each, capable of the Grover speed-up that adiabatic "annealers" have been proven able to achieve, could factor RSA-260/RSA-862 with this algorithm, while a next-decade zettascale supercomputer would not be able to find the associated QUBO minimum classically in the same amount of time. If we were considering a problem that could be solved with a graph connectivity more similar to the quantum hardware (rather than with the "complete graph" mentioned in my "Graph complexity" section above), then dismissing quantum annealers due to the QUBO complexity at this number of qubits would be dangerous, because the functional form of the complexity is not the only thing that matters: constants matter too.

Allow some pre-processing and problem-specific optimizations to reduce the number of qubits, and $n=177$ qubits goes down, along with $2^{177/2}$. Allow the user to do something better than the most blind, naive, brute-force search, and the $2^n$ required to find the global ground state with a guarantee (assuming there's only one copy of it!) goes down too. Even on classical computers, the GNFS isn't run with a random polynomial; practitioners spend time finding a polynomial that is good for the specific number that they're factoring. We have to allow the people factoring by optimization (whether on classical computers or quantum annealers) to do better than "brute-force" search.
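A quick numeric check of the orders of magnitude used in the bullet points above (a sketch; the exponents are the ones already quoted there):

```python
from math import log10

print(round(372 * log10(2)))        # ~112 -> 2^372     ~ 10^112
print(round(186 * log10(2)))        # ~56  -> 2^186     ~ 10^56
print(round(88.5 * log10(2), 1))    # ~26.6 -> 2^(177/2) ~ 5 x 10^26
print(26 - 18)                      # 10^26 is 8 orders of magnitude above exascale (10^18 ops/s)
```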

Summary

  • The paper says that factoring RSA-2048 would need 372 qubits and a depth of thousands of gates, but anything with a depth of thousands of gates will likely require millions of qubits to deal with the necessary quantum error correction, or gate fidelities/coherence lifetimes that are much better than anything I anticipate seeing in the next decade.
  • The paper says they require the qubits to be connected via the complete graph $K_n$, which is not likely to become possible in a quantum annealer for 372 qubits in the next 70 years, although it's possible with the coherent Ising machine and digital annealers which haven't yet shown the types of speed-ups that D-Wave has shown for quantum annealers.
  • I would not be so quick to bring up the $2^n$ or $2^{n/2}$ QUBO complexity here, since if we switch from thinking about RSA-2048 to RSA-260/RSA-862, which has still not been factored by GNFS on a classical computer, the number of QUBO variables in the paper's algorithm is dangerously small to dismiss the algorithm immediately based only on the idea of a $2^n$ or $2^{n/2}$ scaling. One of my papers estimated that we would need 5893 logical qubits to factor RSA-230 with QUBO, and even if this were to be brought down by a factor of 10, it would still be much larger than what the authors of the Schnorr-based QUBO algorithm would need for RSA-260, which is only 177 logical qubits.

The algorithm in this paper will not lead to RSA-2048 being factored in the next 70 years without more achievements on the algorithm and/or hardware side, but it seems to be better than any other attempt that uses QUBO (including my own).

poncho avatar
my flag
While I would trust you on most of this answer, I do have one nagging question: one objection this answer brings up is that the paper requires all physical qubit hardware components to be pair-wise connected; is this true? I would naively expect that you could make it work using indirect connections - for example, you could perform SWAP operations to move the qubits you need to interact between the various hardware locations so that the ones you need to perform a ZZ operation on are adjacent. Obviously, this increases the circuit depth, but by a relatively small factor.
Nike Dattani avatar
in flag
@poncho I'll update my answer to consider both cases: QUBO problems can be solved using AQC (in which thousands of qubits are available, and the diagram in my answer shows a 10^7 times speed-up over classical computers, but there's **no such thing as a SWAP**), and using circuit-based QC (in which only dozens of qubits are available, with SWAP gates possible, but plagued so badly by noise that the only analogous result was Google's "supremacy" claim that's [34359738368 worse than it seemed](https://mattermodeling.stackexchange.com/a/3920/5)). My focus has been on AQC, but will update my answer
poncho avatar
my flag
Thanks for clarifying things - I've studied only circuit-based QC...