Score:1

Stream cipher padding


Problem example

Let's say I have a plaintext that is 50 bytes long. I want to encrypt it using the authenticated stream cipher construction ChaCha20-Poly1305.

Poly1305 generates a 128-bit (16-byte) authentication tag, so the encrypted message will be 50 + 16 = 66 bytes long. If I also attach the 12-byte nonce, it will be 78 bytes.

But when I add, for example, 1 more byte to the plaintext, the ciphertext will be 79 bytes long (+1).

What I need

I need to hide the real plaintext length. I have heard about padding oracle attacks, so if I use some padding, or create my own padding scheme (padding until the plaintext length is divisible by 16), will this attack affect me?

My question is: is it safe to use padding with an authenticated stream cipher like ChaCha20-Poly1305?

samuel-lucas6:
Don't invent your own padding scheme. Use [PADME](https://github.com/samuel-lucas6/PADME.NET) for deterministic padding or [Covert's randomised scheme](https://github.com/samuel-lucas6/CovertPadding). These will be more effective at hiding the length. Apply padding (e.g. following [ISO/IEC 7816-4](https://en.wikipedia.org/wiki/Padding_(cryptography)#ISO/IEC_7816-4)) before encryption so the padding bytes are authenticated by ChaCha20-Poly1305. You can then decrypt and remove the padding.
Ergo:
@samuel-lucas6 Thanks, I'll try.
Ergo:
@samuel-lucas6 How should I use PADME? The repository only contains a function for calculating the padding and the padded length. Does it just calculate the number of bytes to append to the plaintext? And what about Covert's randomised scheme?
samuel-lucas6:
That's what I meant by 'following ISO/IEC 7816-4'. PADME and Covert's are for calculating the amount of padding, and then you use ISO/IEC 7816-4 to append the padding because it's reversible without storing the amount of padding. [Here](https://github.com/ektrah/nsec/blob/master/src/Experimental/Iso78164Padding.cs) is an ISO/IEC 7816-4 example. It's also available in [libsodium](https://doc.libsodium.org/padding).
Ergo:
Thanks for your help.
samuel-lucas6:
@kelalaka Will do. Sorry, I have a habit of writing comments because it's quicker.
kelalaka:
comments clean up time..
Score:3

Is it safe to use padding with an authenticated stream cipher like ChaCha20-Poly1305?

Yes, assuming you authenticate the padding so it can't be tampered with. Applying the padding before encryption solves this problem and is the typical approach. Alternatively, padding could be authenticated using the associated data.

I don't want people to know the "correct" plaintext length.

Effective padding is very difficult, so any simple or custom scheme (e.g. padding to a fixed block size) will likely either reveal the approximate length still or incur massive overhead (e.g. double the length of the message).
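
To make that trade-off concrete, here is a small Python sketch of two simple schemes. The function names are illustrative and not taken from any library.

```python
def pad_to_block(length: int, block: int = 16) -> int:
    """Round a length up to the next multiple of `block`."""
    return -(-length // block) * block


def pad_to_power_of_two(length: int) -> int:
    """Round a length (>= 1) up to the next power of two."""
    return 1 << (length - 1).bit_length()


# Fixed 16-byte blocks: cheap, but the padded size still tracks the real size.
for n in (50, 51, 65):
    print(n, "->", pad_to_block(n))

# Powers of two: hides more, but a message just over a boundary almost doubles.
for n in (1024, 1025):
    print(n, "->", pad_to_power_of_two(n))
```

With 16-byte blocks an observer still learns the length to within 15 bytes (50 and 51 both become 64, but 65 becomes 80). With powers of two, a 1025-byte message grows to 2048 bytes.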

Fortunately, there are two padding schemes for this purpose:

  1. PADMÉ is a deterministic scheme designed to limit length leakage with minimal overhead. It was created for encoding padded uniform random blobs (PURBs).
  2. Covert's scheme provides random padding whilst ensuring a minimum fixed amount of padding for small messages. It was created for the Covert file encryption tool, and I'm now using it in Kryptor, my file encryption tool.

Note that these only calculate the amount of padding. You need to use something like ISO/IEC 7816-4 to apply the padding in a reversible manner without storing the amount.
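
As a rough illustration, ISO/IEC 7816-4 padding appends a single 0x80 byte followed by zero bytes, which is why it can be removed without storing the amount. The Python sketch below uses hypothetical helper names and assumes the target length has already been chosen by some other scheme. It is not the API of libsodium or NSec.

```python
def pad_iso7816_4(message: bytes, padded_length: int) -> bytes:
    """Append 0x80 and then zero bytes until the result is padded_length long."""
    # The 0x80 marker is mandatory, so at least one byte of padding is needed.
    if padded_length <= len(message):
        raise ValueError("padded_length must be greater than the message length")
    zeros = padded_length - len(message) - 1
    return message + b"\x80" + b"\x00" * zeros


def unpad_iso7816_4(padded: bytes) -> bytes:
    """Strip trailing zero bytes, then the 0x80 marker that must precede them."""
    stripped = padded.rstrip(b"\x00")
    if not stripped.endswith(b"\x80"):
        raise ValueError("invalid padding")
    return stripped[:-1]


padded = pad_iso7816_4(b"hello", 8)
assert padded == b"hello\x80\x00\x00"
assert unpad_iso7816_4(padded) == b"hello"
```

The padded bytes are what you would pass to ChaCha20-Poly1305 for encryption. Unpadding only happens after the tag has been verified during decryption, so a padding error here cannot be triggered by a tampered ciphertext.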

Also, both have limitations:

  • PADMÉ creates distinguishable file sizes and fails to hide the length of small messages as well as messages that vary on the border between two padding amounts. For example, messages 0-8 bytes long receive no padding, 9 bytes means 1 byte of padding, and so on.
  • Covert's scheme won't protect against statistical attacks when an attacker can observe the same message padded many times, for example. It also leads to slightly more overhead and hasn't been formalised in a paper like PADMÉ.
  • Implementations are rare. Few projects use either scheme, although few projects attempt to hide metadata like file lengths, largely due to complexity and performance/storage costs.
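
The small-message behaviour described for PADMÉ can be checked with a short Python sketch of the length calculation. This follows the algorithm as published for PURBs. The name `padme_length` and the handling of lengths below 2 are assumptions for illustration, not the interface of the linked repository.

```python
def padme_length(length: int) -> int:
    """Return the PADMÉ padded length for a message of `length` bytes."""
    if length < 2:
        # Assumption: lengths 0 and 1 are returned unchanged (log2 is undefined/zero).
        return length
    e = length.bit_length() - 1       # floor(log2(length))
    s = e.bit_length()                # floor(log2(e)) + 1
    last_bits = e - s                 # number of low bits forced to zero
    bit_mask = (1 << last_bits) - 1
    # Round up to the next length whose low `last_bits` bits are zero.
    return (length + bit_mask) & ~bit_mask


for n in (8, 9, 50, 51, 1000):
    print(n, "->", padme_length(n))
```

Lengths up to 8 come back unchanged and 9 becomes 10, matching the example above. 50 and 51 both become 52, and 1000 becomes 1024.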

Implementations
