Score:2

Crypto

What is the main problem with zero padding for AES key?

Terry TS Wong

3/3/23, 8:45 AM

I am trying to understand the logic behind some basic principles of AES key padding. Why do we use different kinds of AES key padding schemes instead of the simplest one, zero padding? Take AES-128 for example: if my key is "cipherkey", how does the padding work, and what is the problem with it?

Sorry if the question is too elementary, but I could not find a good explanation of it.

aes

padding

Terry TS Wong

3/3/23, 9:40 AM

Actually yes, and I did read your answer, thanks. But I would like a more fundamental explanation. Is there a short example you can show of how data is lost or misinterpreted during the encryption/decryption process?

kelalaka

3/3/23, 5:45 PM

Short passwords are never secure; see [Is it easy to crack a hashed phone number?](https://crypto.stackexchange.com/a/81652/18298) and [Just how insecure are VeraCrypt containers encrypted with short passwords?](https://crypto.stackexchange.com/a/81020/18298). Always use a good password generation method such as Diceware.

Score:4

Crypto

fgrieu

3/3/23, 9:58 AM

The question considers turning memorable text like "cipherkey" into an AES key by appending zero bytes to the UTF-8 encoding of this text, until reaching 16, 24, or 32 bytes for AES-128, AES-192, or AES-256.
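A minimal sketch of that padding, assuming UTF-8 encoding and plain zero bytes as described above (the helper name `zero_pad_key` is made up for illustration):

```python
def zero_pad_key(text: str, key_len: int = 16) -> bytes:
    """Encode text as UTF-8 and append zero bytes up to key_len (16, 24 or 32)."""
    raw = text.encode("utf-8")
    if len(raw) > key_len:
        raise ValueError("text is longer than the requested key size")
    return raw.ljust(key_len, b"\x00")


key = zero_pad_key("cipherkey")
print(len(key), key.hex())
# 16 6369706865726b657900000000000000
```

Nine of the sixteen key bytes are the text itself and the other seven are fixed, so the key is exactly as guessable as the text.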

The main problem with this is that it's fast and inexpensive, which is a disaster in the context: it allows an adversary to quickly and inexpensively apply that process to plausible passwords, and test the resulting keys against a ciphertext (with known or known-redundant plaintext), and thus find the key. This is password cracking, and there's a small industry developing mixes of software and hardware for this.
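The attack loop can be sketched as follows. To keep the example runnable with only the Python standard library, the block cipher is replaced by a stand-in (`toy_encrypt_block`, a truncated HMAC-SHA256); that substitution, the word list, and the known plaintext are all assumptions for illustration. A real attack would run AES on the known block instead, with the same loop around it.

```python
import hashlib
import hmac
from typing import Iterable, Optional


def zero_pad_key(text: str, key_len: int = 16) -> bytes:
    # Same zero padding as in the question: UTF-8 bytes, then zeros.
    return text.encode("utf-8").ljust(key_len, b"\x00")


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # STAND-IN for AES-128 on one block. This is NOT AES; it only plays
    # the role of "a keyed function the attacker can evaluate".
    return hmac.new(key, block, hashlib.sha256).digest()[:16]


def crack(ciphertext: bytes, known_plaintext: bytes,
          candidates: Iterable[str]) -> Optional[str]:
    # Try every plausible password: pad it, encrypt the known block,
    # and compare against the observed ciphertext.
    for word in candidates:
        guess = toy_encrypt_block(zero_pad_key(word), known_plaintext)
        if hmac.compare_digest(guess, ciphertext):
            return word
    return None


known_plaintext = b"Dear Sir or Mada"  # hypothetical known first block
ciphertext = toy_encrypt_block(zero_pad_key("cipherkey"), known_plaintext)

wordlist = ["password", "letmein", "secretkey", "cipherkey", "aeskey123"]
print(crack(ciphertext, known_plaintext, wordlist))  # cipherkey
```

Each guess costs one padding step plus one block encryption, which is why a large dictionary can be exhausted so quickly.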

A dictionary of the million most common English words (for some extended definition of "word") would, I think, contain "cipherkey". At least, a Google search found 78,400 occurrences, and it's in the list of registered domains in three common TLDs. With a known plaintext/ciphertext pair, or a ciphertext for a plaintext with some structure, this is going to take a few seconds at most to break for someone with the right tools.

Even though that can be tremendously improved (see next paragraph), "cipherkey" is not a good password. It's advisable to choose a password that's harder to guess. Merely thinking of it as a passphrase rather than a password will help! But that's a secondary problem.

When turning anything memorable (password, passphrase, ...) into a key, one must use a key derivation function intended for passwords, that is, one with a work-factor parameter that allows tuning the cost of the key derivation. If that cost is 0.5 seconds of a powerful computer's time, rather than 20 nanoseconds for the proposed method, the derivation becomes the bottleneck for password cracking, and will likely make attackers at least break a sweat. With a passphrase à la XKCD or Diceware, it can be quite secure.
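A rough way to see the cost difference, as a sketch using `hashlib.scrypt` from the Python standard library (available when Python is built against OpenSSL 1.1 or later). The parameters `n=2**14, r=8, p=1` and the fixed demo salt are illustrative assumptions, not tuned recommendations; in practice the work factor is raised until the derivation takes as long as the application can tolerate.

```python
import hashlib
import time


def fast_key(password: str) -> bytes:
    # The zero-padding "derivation" from the question: essentially free.
    return password.encode("utf-8").ljust(16, b"\x00")


def slow_key(password: str, salt: bytes, n: int = 2**14) -> bytes:
    # scrypt with a tunable work factor n (a power of two).
    # r=8, p=1 and maxmem are assumed demo values.
    return hashlib.scrypt(password.encode("utf-8"), salt=salt,
                          n=n, r=8, p=1,
                          maxmem=64 * 1024 * 1024, dklen=16)


demo_salt = bytes(16)  # fixed value for the timing demo only

start = time.perf_counter()
fast_key("cipherkey")
fast_seconds = time.perf_counter() - start

start = time.perf_counter()
slow_key("cipherkey", demo_salt)
slow_seconds = time.perf_counter() - start

print(f"zero padding: {fast_seconds:.9f} s per guess")
print(f"scrypt:       {slow_seconds:.9f} s per guess")
```

Whatever the exact timings on a given machine, the attacker pays the same per-guess cost as the legitimate user, multiplied by the size of the dictionary.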

One should use salt, which will prevent computing derived keys in advance, and prevent amortizing the cost of such derivations for several password/key pairs. That's another secondary problem.
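A small sketch of salting, again assuming `hashlib.scrypt` with illustrative parameters. The salt is random but not secret; it is stored next to the ciphertext so that the same key can be derived again for decryption. The helper names are made up for this example.

```python
import hashlib
import os
from typing import Tuple


def derive_key(password: str, salt: bytes) -> bytes:
    # Assumed demo parameters; see the scrypt documentation for tuning.
    return hashlib.scrypt(password.encode("utf-8"), salt=salt,
                          n=2**14, r=8, p=1,
                          maxmem=64 * 1024 * 1024, dklen=16)


def new_key(password: str) -> Tuple[bytes, bytes]:
    # Fresh random 16-byte salt per key; returned so it can be stored.
    salt = os.urandom(16)
    return salt, derive_key(password, salt)


salt_a, key_a = new_key("cipherkey")
salt_b, key_b = new_key("cipherkey")

# Same password, different salts: the derived keys differ, so a table
# of keys precomputed without the salt is useless.
print(key_a != key_b)

# With the stored salt, the legitimate user derives the same key again.
print(derive_key("cipherkey", salt_a) == key_a)
```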

It's important that the key derivation function is memory-hard, that is, it makes use of a sizable amount of memory throughout its computation, for that greatly increases the monetary cost of attack, at little cost to legitimate users. Recommendable functions include Argon2 and scrypt. Common but not recommendable functions include PBKDF2, which is not memory-hard, and is among the worst possible uses of CPU time for the purpose of password-to-key derivation: it maximizes the edge of attackers using ASICs, FPGAs, and even GPUs w.r.t. legitimate users using CPUs. I find NIST's recommendation for PBKDF2 consistent with their former push for Dual_EC_DRBG:

> While PBKDF2 is time-hard but not memory-hard, it is so widely deployed that it is not practical (at this time, anyway) to introduce a requirement for a memory-hard key derivation function.

Maarten Bodewes

3/3/23, 10:18 AM

Note that AES itself is not affected, it's the low entropy key / password that's the issue. If the password has too little entropy then a password based key derivation function won't save you either; it only adds a constant amount of security, e.g. ~20 bits for a million iterations of PBKDF2.
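The "~20 bits" figure is just the base-2 logarithm of the iteration count, as this short calculation shows (the function name is made up for illustration):

```python
import math


def added_bits(iterations: int) -> float:
    # Each guess costs `iterations` times more work, which is the same
    # as adding log2(iterations) bits to the attacker's search effort.
    return math.log2(iterations)


for iterations in (1_000, 100_000, 1_000_000):
    print(iterations, round(added_bits(iterations), 1))
# 1000 10.0
# 100000 16.6
# 1000000 19.9
```

So a password drawn from a million-word dictionary (about 20 bits) stretched with a million iterations still costs an attacker only about 2^40 basic operations.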

kelalaka

3/3/23, 6:16 PM

You should advise that, whatever the password hashing mechanism, one needs a good password beforehand. This must be the first piece of advice: Diceware, BIP39, etc.

fgrieu

3/3/23, 7:02 PM

@kelalaka: I agree that a better password is needed, and added that advice. However, I don't see the poor password choice as the _main_ problem. I think it's rather harder to crack a medium-hard password like "cipherkey" is (or was before that question) with 0.1s of memory-hard KDF, especially salted, than it is to crack most memorable passwords with essentially no KDF and direct use as the AES key.

kelalaka

3/3/23, 7:09 PM

Well, it is still parallelizable on CPUs, and I would first try the [~600M pwned password list](https://haveibeenpwned.com/Passwords), though that list is distributed as SHA-1 hashes to test against. One may find unhashed versions somewhere on the deep web.

kelalaka

3/3/23, 7:29 PM

Also, I've always advised using a password manager, like 1Password, KeePass, etc. With these, users only need one good password to secure the vault, and the manager can generate good random passwords for everything else. Firefox and Opera have this built in, and it is becoming more standard.