Choice of nonce for reproducible encryption

AndreKR

4/30/24, 6:01 PM

In my application I have an SQLite database that stores labels for images, like this:

| IMAGE ID | LABEL |
|----------|-------|
| 1        | foo   |
| 1        | bar   |
| 2        | bar   |
| 3        | foo   |

The LABEL column is indexed as it is important that I can efficiently find all images with a certain label.

At rest I would like to encrypt those labels so that no one can learn the actual labels. Unfortunately, encrypting the whole database seems difficult, as it is not officially supported by rusqlite, the library I'm using. So I will have to resort to encrypting the labels before inserting them into the database. Of course it will still be possible to see which images share a label; that is alright.

I am already using XChaCha20Poly1305 in another part of the application, but I'm not married to it if another cipher is better suited.

My question is where to safely get the nonce from.

I believe simply using the same nonce (maybe derived by HKDF together with the key) for all labels would be the infamous nonce reuse that is deadly for AEAD ciphers, including XChaCha20Poly1305? Can I use a hash of the label as nonce, i.e. derive the nonce from the plaintext? Or do I need to generate one random nonce per label and store them in the database, indexed by a hash of the label?

symmetric

nonce

chacha

mentallurg

4/30/24, 8:34 PM

Does this answer your question? [Is it safe to convert a 256-bit nonce...?](https://crypto.stackexchange.com/questions/78442/is-it-safe-to-convert-a-256-bit-nonce-into-a-192-bit-nonce-by-sha-256ing-and-the). Especially see not the accepted answer, but [this one](https://crypto.stackexchange.com/a/78448/27923).

AndreKR

4/30/24, 9:33 PM

@mentallurg Not really. That answer (and question) deals with the risk that the nonce is not unique. My nonces, were I to derive them from the plaintext, would be all but guaranteed to be unique. I'm asking whether it's an issue when there's a relationship between plaintext and nonce.

knaccc

4/30/24, 10:15 PM

Do you need to be able to decrypt the labels? Why not put the random IV into another table, so that you have a `label ciphertext, IV` table? If you only need to query by label, and don't need to decrypt the label, you can just compute an HMAC of the label with a secret key. That effectively gives you a hash of the label that cannot be brute-forced by someone who does not know the secret key.
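A minimal sketch of the keyed-hash idea from this comment, using only Python's standard library. The function name `label_index` and the choice of HMAC-SHA256 are assumptions made for illustration; the comment only says "HMAC with a secret key".

```python
import hashlib
import hmac


def label_index(index_key: bytes, label: str) -> bytes:
    """Return a keyed hash of the label, suitable for an indexed lookup column.

    Assumption: HMAC-SHA256 over the UTF-8 bytes of the label.
    Equal labels give equal outputs under the same key, so equality
    queries still work, but the output cannot be recomputed without the key.
    """
    return hmac.new(index_key, label.encode("utf-8"), hashlib.sha256).digest()
```

The resulting 32 bytes would be stored in place of the plaintext label and queried by exact match. On its own this does not allow recovering the label, which is why it only fits the "query but never decrypt" case.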

AndreKR

4/30/24, 10:44 PM

@knaccc I also need a list of which labels exist, so I need to decrypt them as well. Good idea to use another secret key for the hashes; that should make it even less likely that the derived nonce has some hidden interaction with the encryption.
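For concreteness, the derivation being discussed here could look roughly like the sketch below. It only shows the mechanics of turning a keyed hash of the label into a 24-byte (192-bit) nonce, which is the nonce size of XChaCha20Poly1305. The name `derive_nonce`, the use of HMAC-SHA256, and the truncation are assumptions, and whether this construction is safe is exactly what the question asks.

```python
import hashlib
import hmac

NONCE_LEN = 24  # XChaCha20Poly1305 takes a 192-bit nonce


def derive_nonce(nonce_key: bytes, label: str) -> bytes:
    """Derive a deterministic nonce from the label.

    Assumption: nonce_key is a secret that is independent of the
    encryption key, as suggested in the comment above. The HMAC-SHA256
    output (32 bytes) is truncated to the nonce length.
    """
    tag = hmac.new(nonce_key, label.encode("utf-8"), hashlib.sha256).digest()
    return tag[:NONCE_LEN]
```

With this, the same label always yields the same nonce and therefore the same ciphertext, which is what makes the encrypted column usable as an index.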

Score:1

Crypto

knaccc

5/1/24, 2:20 AM

Here is one approach:

Database clients have a secret key, which is used to encrypt and decrypt labels.
There is a (label ciphertext, IV) table in the database.
Instead of an (image id, label) table, you'd have an (image id, label ciphertext) table.
When creating a new label, your transaction first uses the secret key to decrypt all entries in the (label ciphertext, IV) table and checks whether the label you want to create already exists. If it does not, you pick a random IV, encrypt the label, and insert a row for it into the table. You should NULL-pad the label to the maximum allowable label length before encrypting, so that labels cannot be identified by ciphertext length.
For performance, the client may maintain a cache of existing labels, together with the most recent row in the (label ciphertext, IV) table that it has already examined, so that it only needs to decrypt newer rows.
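
The steps above can be sketched as follows, using Python's built-in `sqlite3` module. This is only an illustration under several assumptions: the table and column names, the maximum label length, the 24-byte IV, and the `encrypt`/`decrypt` callables (which stand for whatever AEAD the application already uses, with the key bound in) are all hypothetical. The client-side cache is left out.

```python
import os
import sqlite3
from typing import Callable

MAX_LABEL_LEN = 64  # assumed maximum label length, in bytes
IV_LEN = 24         # assumed IV size (192 bits, as for XChaCha20Poly1305)

# Hypothetical cipher interface: (iv, data) -> data, with the key already bound.
Cipher = Callable[[bytes, bytes], bytes]


def pad(label: str) -> bytes:
    """NUL-pad the label to a fixed length so ciphertext length leaks nothing."""
    raw = label.encode("utf-8")
    if len(raw) > MAX_LABEL_LEN or b"\x00" in raw:
        raise ValueError("label too long or contains a NUL byte")
    return raw.ljust(MAX_LABEL_LEN, b"\x00")


def unpad(padded: bytes) -> str:
    """Strip the trailing NUL padding again."""
    return padded.rstrip(b"\x00").decode("utf-8")


def get_or_create_label(db: sqlite3.Connection, label: str,
                        encrypt: Cipher, decrypt: Cipher) -> bytes:
    """Return the stored ciphertext for the label, inserting it if it is new."""
    with db:  # commits on success, rolls back on error
        # Decrypt every existing entry to see whether the label already exists.
        for ciphertext, iv in db.execute("SELECT ciphertext, iv FROM labels"):
            if unpad(decrypt(iv, ciphertext)) == label:
                return ciphertext
        # New label: pick a fresh random IV and store ciphertext and IV together.
        iv = os.urandom(IV_LEN)
        ciphertext = encrypt(iv, pad(label))
        db.execute("INSERT INTO labels (ciphertext, iv) VALUES (?, ?)",
                   (ciphertext, iv))
        return ciphertext
```

The returned ciphertext is what would go into the (image id, label ciphertext) table. Because each label is encrypted exactly once, all images with that label share the same ciphertext and the index still works, while every label gets its own random IV.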
