Score:1

How insecure would a cipher based on iterative hashing be?

de flag

I was just wondering how the following construction could be insecure. I can tell that known-plain attacks are possible, but I'm not sure about anything else.

Let the user pick a password and hash it a sufficiently large number of times with something with preimage resistance like SHA-256. XOR the plaintext with the hash. Hash the hash again. XOR the result with the next block of plaintext...and so on until the end. How would adversaries be able to break this?

jp flag
What is the ciphertext?
Score:11
in flag

TL;DR; Insecure don't use it. Use the standards and well-known mode of operations.


Your construction is $s_i = H^i(password)$ i.e. hash the password $i$ times then use the outputs $s_i$ for each step to x-or with the plaintext to arrive at the ciphertext; $c_i = p_i \oplus s_i$

How would adversaries be able to break this?

There are ways to achieve this

  • The guessing attack: If someone gets some part of your message like at least 256-bit of a full SHA-256 block, then the rest of the messages are decrypted easily. So there is no forward secrecy here..

    One should look at these slides about Hash_DRGB see a better design against state compromises. When producing a Pseudo-Random Sequence compromise of the current state must not leak other states. The proposed design's main failure is this.

  • Your scheme seems to be vulnerable to the forbidden attack that when the $(IV,key)$ pair resue of CTR mode can break the confidentiality of the ciphertext. You don't define an IV into your scheme so you will have the same password and the stream generated by your cipher will produce the same output all the time.

    Either one must change the password every encryption or must introduce standard IV practices to mitigate, otherwise doom!

  • The password security cannot be verified and we usually use Password-Based Key Derivation functions like PBKDF2, Scrypt, or Argon2 to get randomness from passwords. These, at least, provide iteration count to mitigate password search attacks. They have also memory-harness and thread count to reduce the capabilities of the password search attacks. Since your scheme is vulnerable to the forbidden attack, it will be harder to harder for a user to select a new password for every encryption.

    Instead, we use random keys which are either locally generated to use or derived from protocols like DHKE.

  • The security of your scheme highly depends on passwords security and it will be a good attack point since the general user fails to achieve and manage good passwords. Although one can use dice-wire like passwords to generate passwords that have strength around 256-bit, the more encryption the more password to memorize will become infeasible. Instead, using one good password and deriving an encryption key that encrypts the random inputs for your scheme is a better candidate.

    For local encryption the usual practice is; generating a random File Encryption Key FEK for every file. Then using the password of the user derive a Key Encryption Key (KEK) with a good PBKDF like Argon2. With KEK encrypt and decrypt the FEKs then encrypt/decrypt the files. More details is here.

  • Your scheme lacks random access, so it is not useful for many applications like disk encryption.

  • You only requested pre-image resistance, however, you will also need collision resistance. If collisions can easily occur on the generated stream then it will repeat itself and this is an attack point. At least your stream is will not be a Pseudo-Random Sequence. Lucky to you, SHA-256 has collision resistance and the expected cycles ranges are huge;

    Let model SHA-256 as a uniform random function then the probability of element being on the cycle is

    $$\frac{1}{\sqrt{\hspace{.03 in}2\hspace{-0.05 in}\cdot \hspace{-0.04 in}\pi} \cdot 2^{127}}$$

    The average cycle length with the expected value for SHA-256 is $$2^{127} \sqrt{2\pi}$$ Therefore you don't fear short cycles. Ref is Harris, 1960

  • Your scheme doesn't have Ind-CPA security.

What do we have instead of this?

  • We have the well-known CTR mode that can convert any Pseudo-Random Function (PRF) to an encryption scheme since 1979. It is the seminal paper of Whitfield Diffie and Martin Hellman;

This is a huge advantage since it enables a large range of functions to use instead of just PRPs. CTR mode has IV to randomize the encryption and it achieves probabilistic encryption. CTR can achieve Ind-CPA security.

Currently, all the ciphers in TLS mode use CTR mode internally, AES-GCM, AES-CCM, and ChaCha20-Poly1305.

Conclusion

There is no advantage over CTR mode even you add IV to your encryption scheme and even you solve the password problem and the most problematic part: the guessing attack.

Stick the well-known and security bounds guaranteed CTR mode.
Score:2
us flag

Before talking about attacks, I suggest you use random key with enough bits (at least 128 bits) instead of the passwords. You don't need to hash the key many times, just use a good key. If it is derived from the password, use a good Key derivation function to derive the key and use that instead.

Now back to your question, you said it yourself, known plain text attack is possible. And it is disastrously easy to do that. All you need to know is a single block of plain text in order to recover all subsequent plain text by revealing the hash at that block. This is a lot lower level of security compared to what is expected of modern ciphers which are designed to be secure against adversaries capable of performing chosen plaintext or chosen ciphertext attacks. It is not even that difficult to figure out or at least narrow down to feasible number of tries for one block of plaintext based on file formats, few available information on the file owner etc, which can be used to decrypt entire ciphertext.

Actually, long before I also thought of a similar cipher, but instead of just hashing everything iteratively, I thought of doing xor of the left half of hash output with the plaintext block and hash the right half for the next block. This can be randomized by mixing public random number with the key. Another more well known way is using hash in CTR mode to generate the stream like we do with block ciphers as CTR mode does not require you to be able to decrypt the block. As kelalaka pointed out, it provides random access too.

But remember that this cannot be shown to be secure in the plain model and thus you cannot be very sure about its security with hash functions like SHA-2, which was not designed to be used this way. This requires a further assumption that your hash function is close enough to a random oracle to use like this securely. And even if it is secure with common hash functions, it will likely not have widespread use because it will be hell a lot slower than hardware assisted AES or Chacha20 (which also uses CTR mode as mentioned above).

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.