Score:3

How could ECDSA be broken with prime factorization through Shor's Algorithm?

gm flag

could anyone help me understand how is ECDSA broken using Shor's Algorithm? All the papers I find are too complex to understand, and even though I feel I understand some concepts, some others are a bit more of a mess.

Score:7
ru flag

I'm going to give a more detailed version of @poncho's answer. Let me know if you'd like more detail on any point and I can make it still longer. Ideas that you might need to read additional background on are given in italics, but if other sources aren't clear, let me know.

Short Answer

ECDSA is not broken by factoring into primes, rather both problems are solved by variations of the quantum Fourier transform. Although slightly different, these variations are both known as Shor's algorithm.

Notation warning

In the following, I will use a bad habit of mathematicians to use the "+" sign to mean several different things. In all cases, it will be an operation that combines two members of set to make another member of the set. All uses are examples of abelian groups and it is useful to borrow our intuition of how addition works because the operations in our groups will behave similarly. I'll try to distinguish different groups by writing the sets in different ways. Specifically, I will use "+" for:

  • Adding real numbers in the usual way. I'll write $\mathbb R$ for the set and lower case Greek letters such as $\alpha$ for members of the set;
  • Adding whole numbers mod $\ell$ using clock arithmetic to always get an answer in the range 0 up to $\ell-1$. I'll write $\mathcal L$ for the set and use regular lower case Roman letters such as $a$ for members of the set. I may also multiply members of $\mathcal L$ by writing them side-by-side;
  • Adding points on an elliptic curve with $\ell$ points. I'll write $E$ for the set and use upper case Roman letters such as $G$ or $P$ for members of the set. I'll also write $2P$ for $P+P$ and $aP$ for $\underbrace{P+P+\ldots+P}_{a\text{ terms}}$ where $a$ is a member of $\mathcal L$;
  • Adding ordered pairs of whole number mod $\ell$ so that $(a,b)+(c,d)=((a+c)\mod\ell,(b+d)\mod\ell)$. I'll write $\mathbf T$ for the set and use bold lower case Roman letters such as $\mathbf x$ for members of the set. I'll also write $2\mathbf x$ for $\mathbf x+\mathbf x$ and $a\mathbf x$ for $\underbrace{\mathbf x+\mathbf x+\ldots+\mathbf x}_{a\text{ terms}}$ where $a$ is a member of $\mathcal L$.

Recovering private keys from ECDSA

In ECDSA, there is an elliptic curve point that everyone knows called $G$. A public keys is a different point $P$ where $P=sG$ where $s$ is a member of $\mathcal L$ known only to the signer. If we can find $s$ given only $G$ and $P$, then we can forge signatures and break ECDSA. The problem of finding $s$ given only $G$ and $P$ is called the discrete logarithm problem. I'll partially describe how a moderately-sized quantum computer might do this.

Mapping from $\mathbf T$ to $E$

Given $G$ and $P$, I'm going to define a function that takes members of $\mathbf T$ and returns members of $E$. If the pair $\mathbf x$ is $(a,b)$ where $a$ and $b$ are in $\mathcal L$, then my function will take $\mathbf x$ to the point $aG-bP$. I'll call this function $F$ so that $$F(\mathbf x)=F((a,b))=aG-bP.$$ Mathematically, this is quite a nice function for example if $F(\mathbf y)=Q$ and $F(\mathbf z)=R$ then $F(\mathbf y+\mathbf z)=Q+R=F(\mathbf y)+F(\mathbf z)$ (note that "+" is being used in two different ways here), which is a property know as homomorphism. For the purposes of solving the discrete logarithm problem, it is also important to notice that if $\mathbf d$ is the pair $(s,1)$ then $F(\mathbf x+\mathbf d)=F(\mathbf x)$ for all pairs $\mathbf x$. This is because $$F(\mathbf x+\mathbf d)=F(\mathbf x)+F(\mathbf d)=F(\mathbf x)+sG-P=F(\mathbf x).$$

Periodic functions

You may have encountered expressions similar to $F(\mathbf x+\mathbf d)=F(\mathbf x)$ before. For example: $$\sin(\alpha+2\pi)=\sin(\alpha),$$ for all $\alpha$, where we say that the sine function has period $2\pi$. Transferring the language of the set $\mathbb R$ to the set $\mathbf T$, we might say that the function $F$ has period $\mathbf d$. In our case however, we can evaluate $F$, but we cannot write down $\mathbf d$ (which is our goal in solving the discrete logarithm problem). This shows that our discrete logarithm problem is related to a problem known as period finding.

Period finding and Fourier transforms

There is a very elegant mathematical theory of real periodic functions called the Fourier transform, which in one interpretation says that all sensible, real periodic functions are just sums of multiples of sine functions and that the multipliers can be systematically calculated. We often deal with functions where one (or two) multiplier(s) of a particular sine function(s) are large and all the other multipliers are small. The large multipliers correspond to signal and the small multipliers to noise in the reception of radio waves for example. The Fourier transform gives a very efficient way of separating signal from noise in real functions. There a similar idea called the discrete Fourier transform for our set $\mathbf T$, that will identify our period $\mathbf d$ as the signal part of our function. The problem is that there are a lot of intermediate calculations that are very similar, but take a lot of time to compute and write down even though we only want a simple answer at the end. Computing this simple answer using a Fourier transform on a classical computer therefore takes too much time and space.

Quantum superposition

A lot of inaccurate statements are made about how quantum computers work, so be wary about how this next section is described. Notionally, a quantum computer can exist in a spectrum of possible states that allow us to think of all the intermediate calculations of a Fourier transform being represented in a superposition. Provided that we never try and explicitly write down ("measure") one of the intermediate calculations, a quantum computer can work through this complex intermediate state get to a final answer where the signal and noise components of the function are in superposition. When we stop and ask the quantum computer to write down ("measure") this answer, it is very highly likely that the computer will write down the signal part that we want. In our problem this "signal" should be a period of the function $F$.

...except

There's a small problem in that $\mathbf d$ is not the only period of $F$. It is also true that $F(\mathbf x+2\mathbf d)=F(\mathbf x)$ for all $\mathbf x$. This means our quantum computer might return the answer $\mathbf d=(s,1)$, but it might also return the answer $2\mathbf d=(2s\mod\ell,2)$ or $3\mathbf d=(3s\mod\ell,3)$ and so on. Analogously, our sine fucntion can be said to have period $2\pi$ or $4\pi$ or $6\pi$... This is a minor inconvenience if we can perform mod $\ell$ division: when our computer returns a pair of numbers $(u,v)$, we will just divide $u$ by $v$ mod $\ell$ and get $s$. Fortunately we know how to do division mod $\ell$ to get whole number answers using the extended Euclidean algorithm.

Finally

That is essentially how a quantum computer might break ECDSA (omitting the details of Fourier transforms, which are probably too lengthy to fully describe here). What does all this have to do with factoring into primes? Again omitting some details, if we wish to know the prime factors of a number $N$, then it almost certainly suffices to find a whole number $2D$ such that $2^{x+2D}\mod N=2^x\mod N$ for all integers $x$. This is because we would then have $(2^D+1)(2^D-1)=2^{2D}-1\equiv 0\pmod N$ and roughly half of the prime factors of $N$ will divide the first bracket and roughly half will divide the second bracket. Euclid's algorithm then would let us split out the primes, but only if we can find $D$. If we now consider the function from all of the integers to the integers mod $N$ given by $G(x)=2^x$, we have $G(x+2D)=G(x)$ for all $x$ and we again have a period-finding problem (except that now our set is all of the integers rather than the integers mod $\ell$). Again, this is solvable with the discrete Fourier transform (though not very easily on a classical computer) and again a quantum computer can handle the large intermediate information in superposition before returning $2D$ as the "signal".

poncho avatar
my flag
"I will use a bad habit of mathematicians to use the "+" sign to mean several different things" - perhaps you could recommend a notation distinguishing the (literally infinite) number of meanings that a mathematician may use "+" (e.g. addition in $\mathbb{Z}_{2^{521}-1}$) :-)
Daniel S avatar
ru flag
@poncho On occasions where students have really struggled with multiple uses of the same operator symbol in the same piece of mathematics, I’ve found that subscripting with the group can help e.g. I might use $+_{\mathbb R}$, $+_{\mathcal L}$, $+_E$, $+_{\mathbf T}$ in the above.
Pau T avatar
gm flag
Let's see, this explanation looks easier, but still some doubts arise. I understand the part of finding "s" (private key of ECDSA) by solving the ECDLP. But in the following step I do not understand what this mapping means and what the purpose of it actually is. I did understand how G and P are points, so the key point would be to find the period of how many times the point addition of P has to be done in order to get to this same point P, ¿no?. After doing this, I would know the period, but I do not see how this relates with the following step. What is this mapping thing about?
Pau T avatar
gm flag
Btw thanks to both of you @ poncho and @Daniel S. It is not easy to teach to someone like me who knows literally nothing about this topic, but I understand your explanations quite better than all the papers and research docs I could find with my research.
Daniel S avatar
ru flag
A chat room is appropriate at this point https://chat.stackexchange.com/rooms/141962/introduction-to-shor
fgrieu avatar
ng flag
@Daniel S: when initially explaining ECC, I find it useful to use $\hat +$ for point addition, leaving $+$ for the field operation. We can similarly use $\hat\cdot$ for scalar/point multiplication, and $\cdot$ (perhaps omitted as customary) for field multiplication. This is not battle-tested, but I think it has helped in some of my answers to beginners. Definitely, confusing the two kinds of operators is a most common mistake.
Score:3
my flag

I am researching about how could Shor's Algorithm end up breaking ECDSA, but I do not find what the relationship is between this cryptographic algorithm and prime factorization.

Well, Shor's algorithm really addresses this problem:

Given a function $F$ where the identity $F(a+x) = F(a)$
holds for all $a$, find $x$.

It turns out that, by selecting $F$ properly, we can use it either to factor or to compute discrete logarithms. However, we select different $F$ functions for solving those two problems.

Now, in terms of elliptic curves, the discrete logarithm problem is "given two points $P$, $Q = xP$, find $x$".

So, to apply Shor's to this problem, we define $F(u, v) = uP + vQ$ (note that this $F$ has two inputs; actually, Shor's doesn't really care about that); by defining the $+$ operation properly, we can get Shor's to give us $x, y$ values [1] such that $F(a, b) = F(a+x, b+y)$; that is $aP + bQ = (a+x)P + (b+y)Q$ or $xP + yQ = 0$. Assuming $y$ doesn't happen to be 0, we get $Q = -xy^{-1}P$ [2], which is our solution.

And, tying this back to ECDSA, once we can compute the discrete log of the public key, that's the private key, and that allows the attacker to sign any message he wants.

[1]: Actually, there are lots of such $(x, y)$ pairs - we don't care which one Shor's gives us as long as $y \ne 0$

[2]: $-xy^{-1}$ is mathematical notation; $y^{-1}$ is that value $z$ such that $y \times z \equiv 1 \pmod n$, where $n$ is the number of points on the curve, and $-xy^{-1}$ is the value $u$ such that $u + xy^{-1} \equiv 0 \pmod n$. This is easily computed if we known $x, y$ and $n$ (which we do)

poncho avatar
my flag
@PauT: "how could Shor's Algorithm breakk this with prime number factorization."; that's the point I'm trying to convey; Shor's does more than factorization - it solves a lower level problem that can be used either to factor or compute discrete logs.
poncho avatar
my flag
@PauT: well, I gave you a high level outline on how Shor's could be used to compute discrete logs (and hence break ECDSA). Showing it takes "polynomial time" would require showing that both Shor's and the classical computations at the end take polynomial time (they do).
us flag
For completeness' sake, might it be useful to write out $F$ for prime factorization as well?
I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.