Score:2

How much randomness to reasonably "launder" a compromised TRNG?


Assume I have a physical RNG module that generates $n$-bit random numbers that pass randomness tests such as the Dieharder suite. As it is a black box device with an unknown source of randomness, let us also assume it has potentially been partially compromised: an attacker who knows the workings of the RNG module, given a single previous state, can correctly guess the next state in $2^{n-m}$ steps for $1 \leq m \leq n$.

Let us also assume that I have access to another, verifiably uncompromised but otherwise poor quality entropy source giving $x - p$ bits of entropy per $x$ bits of output for $0 < p < x$. Let us also assume that this source of entropy is significantly slower than the aforementioned RNG module.
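As rough arithmetic (assuming entropy accumulates linearly across outputs), collecting $k$ bits of entropy from a source that yields $x - p$ bits of entropy per $x$ bits of output requires reading about $k \cdot x / (x - p)$ raw bits. A hypothetical helper sketching this:

```python
import math

def raw_bits_needed(k: int, x: int, p: int) -> int:
    """Raw bits to read from a source yielding (x - p) bits of
    entropy per x bits of output in order to collect k bits of
    entropy, assuming entropy accumulates linearly."""
    return math.ceil(k * x / (x - p))

# e.g. a source giving 2 bits of entropy per 8 bits of output:
print(raw_bits_needed(128, 8, 6))  # → 512
```

This is only a sizing estimate; an actual extractor must also account for the min-entropy (not Shannon entropy) of the source and a security margin.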

  1. To "launder" the first RNG source so that it produces truly unpredictable numbers, how much work will I have to do?
    • To cancel the effects of $m$ on the output number, how many bits of entropy will I have to distill and provide from the entropy source?
    • Which operations will I have to perform to ensure that I do not get a compromised number out? Will concatenation + hashing with a cryptographically secure hash function "do the trick"?
    • Conversely, which operations will I have to avoid if I want the tampering to be "laundered" away?
    • How much truncation will I have to do?
  2. Assuming that $m$ is sufficiently large/close to $n$, would it be better to just extract random bits from the second source and concatenate them into one number, bypassing the use of the potentially compromised module?

My assumptions:

  1. I will have to distill $m$ bits of uncompromised random output, concatenate it with the first output, and hash the result. I think truncating the concatenated output or the input number will not help much if I don't know exactly which bits are compromised, or the probability that any given bit is compromised. I should avoid XORing the inputs before applying the hash.

  2. I assume the "laundering" is reasonable up to $m = n/2$, beyond which I should probably just extract randomness from the second source and concatenate the random bits into an $n$-bit number, bypassing the potentially compromised module altogether.
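The concatenate-and-hash idea in assumption 1 can be sketched as follows; `trng1_out` and `trng2_entropy` are hypothetical stand-ins for the two sources, and whether the output is actually safe depends on the min-entropy of the second input, not just its length:

```python
import hashlib
import os

def launder(trng1_out: bytes, trng2_entropy: bytes) -> bytes:
    """Mix a possibly-compromised TRNG output with fresh entropy
    by concatenating and hashing with a cryptographic hash.
    The result is only as unpredictable as the entropy in
    trng2_entropy that the attacker does not know."""
    return hashlib.sha256(trng1_out + trng2_entropy).digest()

# illustrative stand-ins for the two sources
trng1_out = os.urandom(32)      # the black-box module's n-bit output
trng2_entropy = os.urandom(16)  # distilled bits from the slow, trusted source

mixed = launder(trng1_out, trng2_entropy)
print(mixed.hex())
```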

Score:1

Dump TRNG1. Both RC4 and the Mersenne Twister pass Dieharder, so passing statistical tests proves nothing about cryptographic security.

This is crypto, and like pregnancy there's no concept of being a 'little' compromised. If it's compromised, you may be able to predict the next output in just a few steps, e.g. if it's seeded from the time (common), never mind $2^{n-m}$. Computational indistinguishability means that unless you built the RNG yourself, you'll never find out what it's actually doing. Therefore get rid of it. (Or pretend to use it, but that's getting a little bit serious.)

You don't concatenate bits from TRNG2. You measure its entropy as $1 - \epsilon$ bits/bit (a potential can of worms); $\epsilon$ is the 'poorness' metric, the bias away from perfect randomness. Then extract from TRNG2 using standard techniques such as Toeplitz matrices, standard hashing (Pearson, CRC-16, SHA-x), universal hash functions, or, if you're lucky, simple von Neumann debiasing. The achievable speed will depend on $\epsilon$.
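Of the techniques listed, von Neumann debiasing is the simplest to sketch: consume bits in pairs, emit the first bit of each `01`/`10` pair, and drop `00`/`11` pairs. It assumes the bits are independent and identically biased, and throws away throughput in exchange for removing bias:

```python
def von_neumann(bits):
    """Von Neumann debiasing: walk the input in non-overlapping
    pairs; emit 0 for '01', 1 for '10', drop '00' and '11'.
    Unbiased output if the input bits are i.i.d., at reduced rate."""
    out = []
    for a, b in zip(bits[::2], bits[1::2]):
        if a != b:
            out.append(a)
    return out

# biased sample: more ones than zeros
sample = [1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0]
print(von_neumann(sample))  # → [0, 1, 1]
```

Note the yield: 12 input bits produce only 3 output bits here, which is why the answer says speed depends on $\epsilon$.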

Comment from mezenkur: You're right, I was maybe being too optimistic and let my tinkering brain get too far ahead of practical concerns. Thank you for the insight.