Score:0

Is keccak256 (and similar hash functions) a suitable KBKDF for 256-bit keys?

For the sake of argument, let's temporarily work under the assumption that proper KBKDFs do not exist.

Would keccak256 be a secure choice for a KBKDF that derives 256-bit keys from a 256-bit master secret $k_{256}$ with an arbitrary-length derivation path $p$? And is this true in general for hash functions that have the properties outlined below?

My thinking is:

  • The input $k_{256}$ is high entropy already (we assume we have a cryptographic-grade key as a starting point), so using a relatively fast function like keccak256 is not a problem
  • If we compute $keccak256(k_{256} || p)$ (or in principle any other construction), we get by definition something of the same or lower entropy level, but I understand that a general goal for hash functions is to avoid discarding entropy to the greatest possible extent (i.e. minimize collisions)
  • $keccak256(k_{256} || p)$ is not susceptible to length-extension attacks, so $p$ does not need to be fixed-length if we want to avoid traditional MACs' nested approach

Is this a correct assessment? If it is, then:

  • What is the advantage of using e.g. HKDF over using keccak256 or a hash function with similar properties, given that we assume $k_{256}$ to be uniform and we don't need any form of stretching?
  • Does this hold for other hashing functions for which the above assumptions hold? E.g. is it also true that for 256 -> 256 derivation, BLAKE2s is sufficient (as opposed to e.g. BLAKE2X or BLAKE2s-based HKDF)?

Taking this further, wouldn't using HKDF be potentially worse than just using keccak256 or a similar hashing function? Since HKDF assumes that it's extracting entropy from a potentially non-uniform input, would it retain more, less, or the same amount of entropy from an input compared to keccak256, or a similar hashing function?

Finally, as an extension to the question, would all of the above still hold true for key lengths lower than 256 bits? E.g. deriving a 128-bit key from a 128-bit master secret $k_{128}$ as

$truncate(128, keccak256(k_{128} || p))$

My thinking is that this holds, under the assumption that the entropy is spread evenly across the entire output, so that truncation discards little to no entropy. However, I have noticed that the prominent Noise protocol framework does truncate the output of its HKDF function in the MixKey operation, but only to generate a key that is never used as an input for key derivation. Instead, it keeps a separate hash-length chaining key that is continuously mixed to derive the next chaining key and is never truncated. Why is this mechanism needed? Is it to reduce even that minimal amount of entropy that truncation could still discard?

Marc Ilunga
I am not sure I understand the purpose of the first paragraph. Such schemes exist and you listed some. The questions suggest a KDF in the sense of "from an existing uniform key". That can be achieved using any suitable PRF: HKDF-Expand (HMAC), keyed BLAKE2, CMAC, KMAC (Keccak), etc.
rrrrrrrrrrrrrrrr
It was nothing more than a (probably misguided) attempt to avoid answers such as "your question is irrelevant because you should really use a proper KBKDF instead of keccak256". As for your next point, that's indeed the thesis I put forward in my question. The purpose of the question is to verify that my reasoning as to why the thesis is true is sound!
Marc Ilunga
Makes sense. To partially answer: that construction is a good KBKDF in theory because the sponge construction "behaves" like a random function, so appending to the key has the same behavior as other PRFs, e.g. HMAC. Resistance to length extension alone is not enough, since it does not guarantee randomness. The assessment about HKDF being weaker is inaccurate, but indeed, we can skip the extraction phase. In practice, though, a proper keyed mode for Keccak such as KMAC would be desirable (not meant to be a "just use" answer, though).
Maarten Bodewes
In general, hash functions can be used as a "poor man's KDF". Length-extension attacks normally don't apply anyway, assuming that the input of the KDF is not controlled by an adversary. I'm also definitely not saying that you should just use KMAC.
Score:2

Typically, when we're using a KDF, we want the output to be uniform and indistinguishable from random, since we want the output to be able to be used as a symmetric key, which should be uniform and indistinguishable from random.

The reason why your construction is secure is that Keccak exhibits those properties. Most Merkle-Damgård functions that output more than 50% of their state size are vulnerable to length-extension attacks, which means they're actually distinguishable from random. That's why HMAC exists: because it isn't vulnerable to those attacks, and it has some nice provable properties involving how its security reduces to that of the compression function.
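As a minimal sketch of the construction from the question, the derivation $keccak256(k_{256} || p)$ might look like the following in Python. Note the assumptions: `hashlib` ships SHA3-256 rather than the original Keccak-256 (same sponge, different padding, so the outputs differ), the name `derive_subkey` is made up for illustration, and the master secret shown is a placeholder rather than a real key.

```python
import hashlib


def derive_subkey(master: bytes, path: bytes) -> bytes:
    """Derive a 32-byte subkey as H(master || path) (hypothetical helper)."""
    if len(master) != 32:
        # A fixed-length key prefix keeps the (master, path) split unambiguous.
        raise ValueError("master secret must be exactly 32 bytes")
    # Assumption: SHA3-256 stands in for keccak256 (same sponge, other padding).
    return hashlib.sha3_256(master + path).digest()


# Placeholder key for illustration only; a real key must come from a CSPRNG.
master = bytes(range(32))
k_enc = derive_subkey(master, b"app/encryption")
k_mac = derive_subkey(master, b"app/mac")
```

Because the master secret has a fixed length and comes first, two different paths can never produce the same hash input.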

The main advantages of using a standard KDF, such as HKDF or a similar design, are twofold. First, you can rely on whatever proofs or security reductions the construction comes with. For example, HMAC-SHA-512/256 requires less security from the hash function than just using SHA-512/256, so if there's an attack, you may get more time to move to a more secure design.

Second, it is much easier to audit and explain your design if you say, "I'm using HKDF with SHA3-256" and your auditor says, "Yup, looks like you are," rather than a custom design. This may sound like a trivial feature, but it is really important in many regulated industries or government agencies, where things like FIPS, PCI, or other requirements come up. Customers also like designs their security teams understand and approve.

HKDF doesn't reduce security in this case because even if you use the extract step, which you don't need and can skip (because your input is uniform), it reduces down to a 256-bit secret (assuming you're using a 256-bit hash function), which is no better or worse than your 256-bit input.
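To illustrate skipping the extract step, here is a sketch of HKDF-Expand as defined in RFC 5869, using the uniform master secret directly as the pseudorandom key. It assumes HMAC-SHA-256 as the underlying PRF (chosen only because it is the most common instantiation), and the key and `info` values are placeholders.

```python
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) instantiated with HMAC-SHA-256."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF-Expand")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i), with T(0) the empty string.
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Placeholder key for illustration only; it is assumed to be uniform already,
# so it is used as the PRK without running HKDF-Extract first.
master = bytes(range(32))
subkey = hkdf_expand(master, b"app/encryption", 32)
```

Here the derivation path plays the role of HKDF's `info` parameter.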

Both variants of BLAKE2 are also secure in this context, just like Keccak would be, because they exhibit the same properties. You could also use KMAC or keyed BLAKE2b in the place of HMAC in HKDF if you wanted, which would also be fine (and would likely have similar provable security properties).
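For BLAKE2, a keyed variant of the same idea could be sketched with the `key` parameter of Python's `hashlib.blake2s`, which accepts keys of up to 32 bytes. The helper name and the placeholder key below are assumptions made for illustration.

```python
import hashlib


def derive_subkey_blake2s(master: bytes, path: bytes) -> bytes:
    """Derive a 32-byte subkey with keyed BLAKE2s (hypothetical helper)."""
    if len(master) != 32:
        raise ValueError("master secret must be exactly 32 bytes")
    # The master secret goes in as the BLAKE2 key; the path is the message.
    return hashlib.blake2s(path, key=master, digest_size=32).digest()


# Placeholder key for illustration only; a real key must come from a CSPRNG.
master = bytes(range(32))
subkey = derive_subkey_blake2s(master, b"app/encryption")
```

Using the built-in keyed mode avoids having to define a custom encoding of key and path into a single message.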

Truncating the output of Keccak256 is also secure. Remember, we want our output to be uniform and indistinguishable from random (so it will pass the next-bit test just like a CSPRNG), so any portion of it is equally acceptable. Taking the first portion is customary. Of course, if your entropy input is only 128 bits, then the security of the output will only be 128 bits.
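The truncated 128-bit case, $truncate(128, keccak256(k_{128} || p))$, could be sketched as follows. As before, SHA3-256 from `hashlib` is an assumed stand-in for keccak256, and the helper name and key are placeholders.

```python
import hashlib


def derive_subkey_128(master: bytes, path: bytes) -> bytes:
    """Derive a 16-byte subkey as the first 128 bits of H(master || path)."""
    if len(master) != 16:
        raise ValueError("master secret must be exactly 16 bytes")
    # Assumption: SHA3-256 stands in for keccak256; keep the leading 16 bytes.
    return hashlib.sha3_256(master + path).digest()[:16]


# Placeholder key for illustration only; a real key must come from a CSPRNG.
master_128 = bytes(range(16))
subkey_128 = derive_subkey_128(master_128, b"app/encryption")
```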
