Score:1

How to safely and randomly iterate a key derived from Scrypt?

de flag

I'm developing a way to deterministically generate private keys for arbitrary elliptic curves based on some user-input (a brain-wallet). Currently, I'm using the Scrypt password hashing algorithm with robust difficulty parameters to hash a number of input parameters into a key.

The output of Scrypt should be uniformly distributed among $[0, 2^{b})$ where ${b}$ is the number of output bits used from the Scrypt algorithm output. But valid elliptic curve private keys must be less than the curve's finite field order $N$, distributed evenly among $[0, N)$. Thus, using the output of Scrypt directly as a private key mod $N$ would create a small bias in the resulting keys that are generated - bad news, so I need to avoid that.

Normally if you were generating private keys using a secure RNG, you would simply retry the RNG until you got a number less than $N$, and you could safely use that as a private key.

Is there a safe way to deterministically iterate the output of Scrypt, such that we preserve the pseudo-random distribution of Scrypt's output, without having to re-run Scrypt with new parameters?

One way I considered was hashing the output of scrypt with SHA256 or SHA512 until it was less than $N$, but this wouldn't work so well for curves larger than 512 bits, such as P521.

Another less elegant solution is to simply reject any input parameters which produce a key larger than $N$. It should be very rare for it to ever occur, so perhaps I can get away with it? Are there any common curves out there whose order $N$ is not a significant fraction of its next-highest power of two?

Score:1
my flag

Thus, using the output of Scrypt directly as a private key $\bmod N$ would create a small bias in the resulting keys that are generated - bad news, so I need to avoid that.

Actually, the bias wouldn't be large enough to be exploitable; however assuming that you want to avoid that entirely...

Is there a safe way to deterministically iterate the output of Scrypt, such that we preserve the pseudo-random distribution of Scrypt's output, without having to re-run Scrypt with new parameters?

Two obvious approaches:

  • Have Scrypt generate (say) $log_2 N + 128$ bits, and evaluate that $\bmod N$; the resulting bias will be so tiny it will be unmeasurable, much less exploitable

  • Pass the output of Scrypt to SHAKE; then squeeze out $\lceil log_2 N \rceil$ bits (possibly an even number of bytes, if it makes the implementation simpler), and if the value that it returns happens to be larger than $N$, squeeze out more bits.

The second is similar to your idea with SHA256 or SHA512; however because SHAKE allows us to squeeze out as many bits as we like, we can easily extend that to handle P521.

Oh, and since you asked:

Are there any common curves out there whose order $N$ is not a significant fraction of its next-highest power of two?

Well, the Brainpool curves come to mind - I don't know that their use is incredibly common; however I do believe they are used on occasion.

Gilles 'SO- stop being evil' avatar
cn flag
FIPS 186-4 §B.4 allows both approaches, with a specific process in each case. It only requests 64 extra bits for the fixed-input-size negligible-bias approach.
JoeJafarTheJenie avatar
de flag
Wow cool! I didn't know about SHAKE hashes until reading your answer. This is actually a very clean solution (although unbounded). Thanks for the tip about brainpool, their curve orders indeed look pretty nasty. Would love to know more about why adding extra scrypt output bits would reduce the bias
Score:0
cn flag

First, using a private key that's deterministically derived from a password is almost always wrong. Passwords are often compromised or forgotten and therefore need to be changed. If changing the password requires more than updating a small amount of data in one place, you have major operational difficulties. So normally the only thing you should do with a key that's derived from a password is key wrapping, i.e. the password-derived key is only used to symmetrically encrypt a file containing the “real” keys. Those real keys are stored, and they aren't deterministically generated from anything, they're just randomly generated. When the user updates the password, re-encrypt that file with the new password-derived key and delete the previous version.

That being said, to derive an elliptic curve key deterministically, the first step is to consider the curve type.

  • For Curve25519 and Curve448, keys are calculated from a fixed uniformly distributed bit string as specified in RFC 7748 §5. There is a precise process taking a 256-bit or 448-bit uniformly distributed input, forcing certain bits, interpreting the bits as a number (little-endian), and reducing to the canonical representative modulo $p$. Because $p$ is extremely close to the next power of $2$, the modulo reduction leaves the number unchanged with overwhelming probability. Implementations of Curve25519/Curve448 typically (“MUST”) accept keys in non-canonical form, so you don't have to do anything other than provide the uniformly distributed 32-byte or 56-byte string.

  • For Ed25519 and Ed448, private keys are specified in RFC 8032 §5.1.5 and §5.2.5 as a uniform 32-byte or 57-byte string respectively. The signature process starts by hashing this input (concatenated with other strings) and uses the result as an integer.

  • For NIST curves and generally curves in Weierstrass form, there is no single universally accepted process to go from bit strings to private keys. FIPS 186-4 §B.4 describes two possible methods to generate a private key from the output of a random generator, and these methods are equally applicable to deriving a private key from a uniformly distributed bit stream obtained from a symmetric key derivation algorithm. To derive a private key on a curve whose order $p$ is an $n$-bit prime:

    1. (“Extra random bits”) Obtain $n + 64$ bits of uniformly distributed ((pseudo-)random) secret input. Interpret that input as a big-endian integer, reduce it modulo $p-1$ and add $1$ to obtain a number in the range $[1, p-1]$. Some keys are very slightly more likely than others, but since the bias is about $2^{-64}$, it offers no practical advantage to an adversary.
    2. (“Testing candidates”) Obtain $n$ bits of uniformly distributed ((pseudo-)random) secret input. Interpret that input as a big-endian integer. If the result is $\ge p-1$, start the process again from the beginning. Otherwise add $1$. This has no bias at all, and doesn't require manipulating numbers larger than $n$ bits, but it has the downside that you don't know in advance how much uniformly distributed input you'll need: it's potentially unbounded.

    The first method is generally more convenient since you know exactly how much (pseudo-)random input is needed.

If you really need to chain this with a password-based key derivation algorithm such as scrypt, there are two methods.

  • The direct method is to request as much input as you need from scrypt. For Weierstrass curves, this makes the testing-candidates method impractical. So get 32 bytes from scrypt for Curve25519 or Ed25519; 40 bytes for P256R1 or P256K1; 56 bytes for P384R1 or P384K1; 57 bytes for Ed448, etc.
  • The indirect method is to request a fixed number of bits from scrypt. This needs to be enough to not be guessable by brute force. 128 bits (16 bytes) will do nicely. Use this as a seed to an ordinary (not password-based) key derivation function such as one from NIST SP 800-108 or HKDF, or as a seed to a pseudorandom generator algorithm such as one from NIST SP 800-90A. Use that PRNG output to derive the private key.
JoeJafarTheJenie avatar
de flag
Thanks for this! I'm aware of the issues inherent in deterministically generated keys, but for my specific use case, they aren't relevant.
JoeJafarTheJenie avatar
de flag
The "testing candidates" solution i think will not work, as that would mean asking the user for a new password (the 'less elegant' solution I described in OP) “Extra random bits” seems like a good option. Could you help me understand why adding 64 more random bits reduces the bias to $2^{-64}$?
Gilles 'SO- stop being evil' avatar
cn flag
@JoeJafarTheJenie We are given $2^{n-1} \le p \le 2^n-1$. Let $(q,r)$ be the quotient and remainder of the division of $2^{n+64}-1$ by $p-1$. We draw $X$ uniformly in $[0, 2^{n+64}-1]$ and take $X \bmod (p-1)$. Numbers in the range $[0,r]$ have $q+1$ antecedents and numbers in the range $[r+1, p-2]$ have $q$ antecedents. So the values less than $r$ are $1 + 1/q$ times as likely as values larger than $r$. $q = \lfloor (2^{n+64}-1) / (p-1) \rfloor \ge \lfloor (2^{n+64}-1) / (2^n-1) \rfloor \approx 2^{64}$ so the bias $1/q$ is less than $2^{-64}$.
mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.