Score:1

Safety of AES-256/CBC/PKCS#7 + randomization and reuse of the IV


To start, I'm by no means an expert in cryptography, or anything close to it. I know the very basics, enough to more or less choose a method to implement and then read up on it so I know what I'm implementing. So please excuse any dumb questions, haha.

With that in mind, I've created an AES-256/CBC/PKCS#7 + HMAC-SHA512 encryption/decryption class for an Android assistant app I'm making (it's supposed to store (very?) personal things locally, and I might publish it on the Play Store, so it's not only for me). That combination is supposed to be highly secure for the next ??(?) years. It might be a bit slower, but I don't mind, at least for now (very little data). However, I also read that there's one problem with it, which arises when the initialization vector gets reused. With CBC, it seems IV reuse doesn't make it possible to recover the data (right?), as it does with other modes, and that's why I chose this one. [If there are any other problems with this method, I'm happy to know about them.]

But it's possible, after some time, to detect patterns and see where messages are equal (equal blocks of 16 bytes). Knowing what such a block means, one could tell that it hasn't changed across successive encrypted versions, for example.

So I had an idea: all the data I encrypt must be encoded using UTF-7. The remaining byte values (128-255) are used as random filler, one inserted per 16 bytes at a random position. For example, a 154 byte is added at index 4 and a 234 byte at index 19. This way the data is always randomized, and blocks that are equal in the plaintext will differ even if the same IV is reused ("random" values can repeat, and I can't check whether I've already used one in this case, so I thought of this to prevent problems).

Is this a good approach? Might it mitigate the problem? Maybe solve it completely for the next infinite years and the method would be now completely safe or at least much safer?

Also, if anything I said is wrong, I'm happy to be corrected! Thanks!

Score:1

The only reason why you should be afraid of IV reuse is if the random number generator is off.

Assuming a fully random IV, you could encrypt about $2^{32}$ blocks with AES-CBC and still have only roughly a one in $2^{64}$ chance of a collision (by the birthday bound). Note that the repeated-input problem isn't limited to the IV; each ciphertext block is used as the "vector" for the next AES block encryption, after all.
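
As a rough check on these numbers, the usual birthday approximation can be written out. It assumes that the $n$ 128-bit blocks involved (IVs and ciphertext blocks alike) behave as independent, uniformly random values, and it only holds while the result is small:

$$
P_{\text{collision}} \approx \frac{n(n-1)}{2 \cdot 2^{128}} \approx \frac{n^2}{2^{129}}
$$

For $n = 2^{32}$ blocks (64 GiB of data) this gives about $2^{64} / 2^{129} = 2^{-65}$, and for $n = 2^{48}$ blocks about $2^{96} / 2^{129} = 2^{-33}$.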

Your idea is to randomize the plaintext blocks somewhat to guard against IV collisions, which are less frequent than collisions among the AES output blocks (assuming that messages are larger than one block). But that won't really help: a collision can still take place, and if the attacker knows most of the plaintext, it will be easy to guess what is in the other block.

There is, however, yet another problem. You apparently have a source of randomness available to randomize the blocks. If you just used that randomness to generate a secure random IV, you would likely not run into the problem in the first place.
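
A minimal sketch of that, assuming a 32-byte AES key is already available: a fresh 16-byte IV is drawn from `SecureRandom` for every message and stored in front of the ciphertext. The class and method names are made up for illustration, and `AES/CBC/PKCS5Padding` is the JCA name for what is effectively PKCS#7 padding with AES. The sketch leaves out the MAC, which would have to be verified before decrypting.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper class; names are illustrative only.
final class CbcRandomIv {
    private static final int IV_BYTES = 16; // AES block size
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Encrypts with a fresh random IV and returns IV || ciphertext. */
    static byte[] encrypt(byte[] key, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // new IV for every single message
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            byte[] out = new byte[IV_BYTES + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, IV_BYTES);
            System.arraycopy(ciphertext, 0, out, IV_BYTES, ciphertext.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the IV from the first 16 bytes and decrypts the rest. */
    static byte[] decrypt(byte[] key, byte[] ivAndCiphertext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(ivAndCiphertext, 0, IV_BYTES));
            return cipher.doFinal(ivAndCiphertext, IV_BYTES,
                    ivAndCiphertext.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Encrypting the same plaintext twice with this helper yields different output each time, because the IV differs, which is exactly the property the padding trick was trying to obtain.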


If you want to protect your data, you are better off deriving or encapsulating message-specific keys from your "master key".
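
One possible way to derive such keys, sketched under assumptions: HMAC-SHA256 is used as a simple key derivation step over a fixed label and a per-message nonce. The label, the nonce parameter and the class name are hypothetical; a standardized KDF such as HKDF would be the more usual choice where a library provides it.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper; not part of the original post.
final class MessageKeys {
    /** Derives a 32-byte key that is specific to one message nonce. */
    static byte[] deriveMessageKey(byte[] masterKey, byte[] messageNonce) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(masterKey, "HmacSHA256"));
            // Fixed label (an assumption) separates this use of the master key from others.
            mac.update("message-key".getBytes(StandardCharsets.US_ASCII));
            mac.update(messageNonce);
            return mac.doFinal(); // 32 bytes, usable as an AES-256 key
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a distinct key per message, an IV collision between two different messages no longer exposes equal plaintext blocks, since the blocks are encrypted under different keys.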

If your source of randomness is not cryptographically secure, you're probably in trouble anyway. You could take a look at AES-SIV mode to mitigate the problem somewhat. In that mode the IV depends on the plaintext message.

What you should definitely not do is to bugger up your protocol to try and hide the shortcomings of the algorithms. That's unlikely to succeed, and it adds all kinds of unnecessary complexity. AES encryption should not require any tapdancing, if the right constructs are used.


Security notes:

  • To protect against future changes I would recommend also including the IV in the MAC calculation. Including the IV only adds 16 bytes to the calculation, which is relatively insignificant. Currently your IV cannot be changed by an attacker, but a change in protocol could make the IV, and therefore your message, vulnerable to change.
  • I would also like to warn you that if only the data is MAC'ed, you are vulnerable to substitution attacks: replacing the data of one file with that of another. You'd need something verifiable / unique in the header / metadata and MAC it together with your ciphertext (we have AEAD schemes to help with that).
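
Both notes can be sketched together, assuming an encrypt-then-MAC layout in which HMAC-SHA512 covers a header, the IV and the ciphertext. The `header` parameter is hypothetical (for example a file name or record ID that the reader can check), as are the class and method names.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper; the header layout is an assumption.
final class EncryptThenMac {
    /** Computes HMAC-SHA512 over len(header) || header || IV || ciphertext. */
    static byte[] tag(byte[] macKey, byte[] header, byte[] iv, byte[] ciphertext) {
        try {
            Mac mac = Mac.getInstance("HmacSHA512");
            mac.init(new SecretKeySpec(macKey, "HmacSHA512"));
            // Length prefix keeps the header/IV boundary unambiguous.
            mac.update(ByteBuffer.allocate(4).putInt(header.length).array());
            mac.update(header);
            mac.update(iv);          // fixed 16 bytes for AES-CBC
            mac.update(ciphertext);
            return mac.doFinal();    // 64 bytes
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Recomputes the tag and compares it with the stored one in constant time. */
    static boolean verify(byte[] macKey, byte[] header, byte[] iv,
                          byte[] ciphertext, byte[] storedTag) {
        return MessageDigest.isEqual(tag(macKey, header, iv, ciphertext), storedTag);
    }
}
```

The tag should be verified before any decryption is attempted. Swapping the contents of two files then fails verification as long as each file is checked against its own expected header.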
DADi590
Well, my idea was that, since I was already using a secure random generator (SecureRandom in Java) for the IV, I could use it again to randomize the data. As random values may repeat themselves, I thought of this. And then what you said, "each ciphertext is used as a 'vector' for the next AES-block-encrypt after all", could also be mitigated, since basically every 16-byte block of the message would be random (in my non-expert head, haha). But I didn't think about how often collisions might actually happen... I just thought "they will happen, so I'll make it harder".
DADi590
But if there's really such a low chance of a collision and my idea won't do much, I'll remove it and go back to normal UTF-8 encoding. Also, about AES-SIV, I can't use that. It's not implemented in Android (Cipher on Android Developers - above Summary in case you're interested). As for the encapsulation part, if I got it right from Wikipedia, it doesn't apply to my case. I don't store the key anywhere (at least for now - no server, just a small project). It's derived from a password that must be entered manually - that's the thing that must be secure enough (better than storing the key locally).
Maarten Bodewes
Run Argon2 or - if you want to use an already implemented one - PBKDF2. Make sure that you only accept ASCII passwords if you want to remain compatible with Java SE, and that you use a hash that has a large enough output size (do not extract more than the hash output size from the function). Add a secure random 16-byte salt and an iteration count that is as high as feasible, and you're all set (you may want to include a version number in your protocol in case you ever want to upgrade it, or the iteration count). `new SecureRandom()` is generally fine, and should give you a well-seeded PRNG.
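
A minimal PBKDF2 sketch along those lines, under some assumptions: it uses the `PBKDF2WithHmacSHA512` algorithm name, which recent Java SE versions provide (availability on a given Android API level should be checked in the platform's `SecretKeyFactory` documentation), and it requests exactly 512 bits so that no more than one hash output is extracted. The class name and the way the output is split into an encryption key and a MAC key are illustrative choices, and the iteration count is left to the caller.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper; names and key split are assumptions.
final class PasswordKeys {
    static final int SALT_BYTES = 16;

    /** Generates a fresh random salt, to be stored next to the ciphertext. */
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /** Derives 64 bytes: bytes 0-31 for AES-256, bytes 32-63 for the HMAC key. */
    static byte[] derive(char[] password, byte[] salt, int iterations) {
        for (char c : password) {
            if (c > 127) { // reject non-ASCII to avoid encoding differences
                throw new IllegalArgumentException("password must be ASCII");
            }
        }
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 512);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA512");
            return factory.generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        } finally {
            spec.clearPassword(); // wipe the spec's internal password copy
        }
    }

    static byte[] encryptionKey(byte[] derived) { return Arrays.copyOfRange(derived, 0, 32); }

    static byte[] macKey(byte[] derived) { return Arrays.copyOfRange(derived, 32, 64); }
}
```

The salt and the iteration count (and a version number, if used) are not secret and can be stored in the file header, so they can be raised later without breaking old files.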
DADi590
About the security notes, I'm already including the IV in the MAC calculation (I calculate the MAC with the ciphertext, IV and MAC key). I'm also using AEAD on this. I read exactly what you said, that it's possible to swap 2 encrypted files and they'll be read anyway. Though, if I included some header inside the file, that might not work. But in any case, yes, I'll use AAD to help prevent this. Thank you for all the info! I'll mark it as the answer!