Why are binary extension fields preferred for Shamir secret sharing?

in flag

It is known that Shamir's secret sharing works over any finite field but I don't get it why binary extension fields are preferred?

sa flag

The main reason is that there is no disadvantage to using a binary extension field. Since the computing and communications infrastructure already runs over binary, this is the simplest and most efficient choice in most cases.

Note that when prime fields (Digital signatures) or odd characteristic fields/rings (RSA) or more exotic algebraic structures (Elliptic Curves) are required for security, they are used. So if you were already using the prime field $\mathbb{F}_p$ for another cryptographic primitive and needed to use secret sharing as part of an application, you could.

fgrieu avatar
ng flag
Yes. In other words, since application data is binary or bytes, using any field not a binary field would create the problem that the domain of what the secret sharing can encode does not match the application domain, so some kind of domain extension before encoding, restriction after decoding would be needed.
ar flag

Just to add some further detail to kodlu's excellent answer:

To share some secret data using Shamir's secret sharing over the finite field $\mathrm{GF}(k)$, you first need to encode that data as $m ≥ 1$ elements of the field (which are naturally represented as integers ranging from $0$ to $k-1$), and also assign each share you want to generate a distinct non-zero element of the field as a share ID (i.e. the $x$ coordinate at which the share-generating polynomial is evaluated to produce that share).

You then apply the share generation process, which will give you several shares, each consisting of $m$ elements of the field $\mathrm{GF}(k)$ (plus the share ID). By construction, these shares are uniformly distributed and $t-1$ wise independent, where $t$ is the reconstruction threshold parameter of the scheme, meaning that any subset of $s < t$ of the shares is indistinguishable from $s$ sequences of $m$ field elements chosen independently uniformly at random. In particular, as a weak corollary of this, all of the field elements comprising the shares are necessarily uniformly distributed between $0$ and $k-1$, inclusive.

Now, let's say that the secret data that you want to share is binary, and encoded as a string of $n$ bits. If you happen to be using a binary field, such that $k = 2^b$ (and if $n$ is a multiple of $b$), then mapping the secret into a sequence of field elements is very simple: just cut the $n$-bit string into $m = \frac nb$ pieces of $b$ bits, each of which naturally maps to a field element. And since every element of $\mathrm{GF}(2^b)$ also maps unambiguously to a string of $b$ bits, your shares can also be represented as $n$-bit bitstrings (plus the share ID) simply by concatenating (the binary representations of) the field elements in each share.

(If the data length $n$ is variable, and not necessarily an integer multiple of $b$, you might need to apply some padding to unambiguously indicate its length before sharing it, but that's still only a minor complication. And if your data is encoded as eight-bit bytes, and you don't need more than 255 shares, then you can just use $\mathrm{GF}(2^8)$ and skip the padding.)

What if you don't want to use a binary field, though? Then you have a couple of options:

  1. Choose the largest $b$ such that $2^b < k$, split the data into $b$-bit pieces, map those into field elements and share them. The main down side of this is that the random field elements in the shares will not fit in $b$ bits (since $2^b < k$), so you'll need to encode them using $b+1$ bits each, wasting at least $m$ bits per share. (Variable-length encoding after sharing will not help here, since you can't compress uniformly random data. Variable-length encoding before sharing is insecure, since then your share length would leak information about the secret.)

    You also cannot arrange both $b$ and $b+1$ to be convenient "round" values like 8, 16, 32, 64, 128 or 256, which means that either your data pieces or share pieces will inevitable have a awkward length. For example, let's say you wanted to want to share 256-bit encryption keys; you could either choose $b = 256$, in which case your shares would be 257 bits long (and, if you wanted to use a prime field, you'd need to do the calculations modulo a 257-bit prime), or $b = 255$, in which case you'd need to split your keys over two field elements, resulting in 512 bit shares. Or you could use, say, $b = 32$ (and, say, the prime field $\mathrm{GF}(2^{32}+15)$, which is a little awkward to work with using 64-bit arithmetic, but not impossibly so, especially if you don't need constant-time execution) and end up with $33 \times 8 = 264$ bit shares after some bit shuffling, which might be a decent middle ground.

  2. Use a general radix conversion from binary to base $k$ to optimally encode your data (i.e. interpret the $n$-bit binary data as a number from $0$ to $2^n-1$, then express that number in base $k$), and an inverse radix conversion from base $k$ to convert the shares back to binary. You'll still waste some bits, so your shares will be longer than the data, but only by a constant amount. The down side is that the radix conversion is slower and more complicated to implement than simple bit shuffling. Also, you'll almost certainly need some kind of a padding scheme if your secret length isn't fixed. (And make sure to always use $m = \lceil n \log_k 2 \rceil$, even if your secret value would fit into fewer base $k$ digits, to avoid the share length leaking information about the secret.)

Now, none of these complications is insurmountable, but they do complicate the implementation and make the share encoding less efficient. In comparison, even if you have to implement $\mathrm{GF}(2^b)$ arithmetic from scratch yourself, using a binary field is still simpler.


Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.