Score:1

Derivating random numbers from random numbers

ua flag

If I have a "truly random number" $K$ of $L$ bits (whatever "truly random" means... is it a value from a normal distribution a truly random number, or only uniform distributions are considered "truly random"?), and a "truly random number" $T$ of $M \le L$ bits,

which arithmetic/bitwise algorithms among $K$ and $T$ can generate new truly random numbers? If $M=L$, is $K + T$ or $K\ xor\ T$ a truly random number? or if $M\lt L$, do methods like HKDF-expand-label, HKDF-extract, or just sha256 over $K$ and $T$ generate truly random numbers? (for example, dividing $K$ in $L/M$ blocks of $M$ bits, apply some of these methods and concat their outputs).

I would like to know more about which properties are required for a deterministic algorithm to yield truly random numbers provided its inputs are truly random numbers.

A couple of guesses (assuming $M=L$ for simplicity), $K + T$ is a truly random number, but $K\ bitwise\_and\ T$ is not.

NOTE: My question is on the context of one-time pad ciphers. I want to store the cipher text and $K$ separately, and transfer it over a secure-channel only when decryption is required, but instead of transferring $K$ itself, which could be very long since it must match the plaintext length (which can be very long), I'm thinking on calculating $K$ by derivation from $J$ (of size $L$) and $T$ of size $M$, both being random numbers. $J$ is stored client-side, $T$ is stored server-side and fetched over a secure channel, and $K$ is finally derivated client-side and discarded from memory immediately after decryption. My question above is finally about which derivation function to use so that $K$ is indistinguishable from a truly random number assuming $J$ and $T$ are truly random numbers.

Maarten Bodewes avatar
in flag
You seem to be confusing $M$ (the number of bits of $T$) and $T$ in your question.
sanscrit avatar
ua flag
@MaartenBodewes Yes you were right. Thank you. Fixed now.
Maarten Bodewes avatar
in flag
I think you forgot about 4 of them, tried to fix them, please review.
Score:3
cn flag

You've asked a lot of questions in this question, but three stand out:-

whatever "truly random" means... is it a value from a normal distribution a truly random number, or only uniform distributions are considered "truly random"?

No. A truly random number (distribution) is simply non algebraic, unpredictable yet ergodic. It has no seed and no generating formula. It can only be categorised by very basic group statistics. A real life example is this distribution from a Zener diode based device:-

histogram

There is no commonly accepted name for this distribution. It has a mean and standard deviation, but no algebraic quantile, skewness or entropy. It just exists empirically (with $H_\infty \approx 6$ bits/byte).

I would like to know more about which properties are required for a deterministic algorithm to yield truly random numbers provided its inputs are truly random numbers.

The only major property is that $ \operatorname{X}: \{0,1\}^n \to \ \{0,1\}^m $ with $ m \lt n $. $X$ can be lots of things like von Neumann extractors, CRC functions, matrices, LFSRs and general (universal) hash functions. An important point though is that $X$ does not have be any form of cryptographic function. It's a fallacy to think so, but the security arises from the truly random input bits of length $n$.

I'm thinking on storing client-side a random number J of length L, transferring a pre-generated "truly random token" T instead (uniquely tied to that specific plaintext), and generating K client-side...

You are in fact improving the one time pad :-( Unless you're talking of quantum key distribution, that's not possible, yet a common attempt on this forum. I'm not exactly clear of your proposal, but the giveaway is the phrase generating. One time pads are not generated client side, but rather centrally in a one to one or one to many architecture. Anything else is a pseudo random process or stream cipher construct.

sanscrit avatar
ua flag
I have improved my last note of the question. I hope it's more clear now.
Score:2
ru flag

In terms of bitwise arithmetic/bitwise operators on uniformly distributed independent random numbers (cryptography usually uses uniform random, but there are exceptions e.g. in lattice cryptography), XOR gives a uniform distribution, + gives a triangular distribution, AND gives a distribution where $P(N)=0.25^{w(N)}0.75^{L-w(N)}$ where $w$ is the Hamming weight, product is complicated, difference is triangular, OR gives a distribution like AND but with the roles of $w(N)$ and $L-w(N)$ reversed, NOR has the same distribution as AND, NAND has the same distribution as OR.

The effect of cryptographic hash functions on uniform input is not usually known to be uniform, but is often modelled as such.

For a one-time pad you will want a uniform distribution across all keys of the same length as the plaintext. A universal hash function should achieve this, but make sure that inputs are secret, suitably distributed and only used once.

sanscrit avatar
ua flag
So based on your answer I assume that if I have two keys A and B having L bits of entropy each, A xor B will still have L bits of entropy right?
Daniel S avatar
ru flag
Yes, that is the case, assuming that their generation is independent. (An obvious counter example is A=B).
Score:1
fr flag

All of the algorithms you've mentioned (HKDF and SHA-256) are pseudorandom. In general, any approach you're going to be using here to expand some truly random numbers into other sequences are also pseudorandom. That's because these algorithms are deterministic: if you put the same data (entropy) in, then you'll get the same results.

True random numbers come from a physical source and are not deterministic. That can be radioactive decay, thermal noise, free-running oscillators, or other types of sources. Because these don't necessarily produce equal numbers of ones and zeros and failure is not unknown, usually there is some sort of filtering and processing, which may be a cryptographic algorithm like AES-CBC-MAC or SHA-256, or some sort of bias elimination or reduction like XORing or Von Neumann debiasing, or both.

If you used a TRNG and produced random bits, you could XOR them with more bits from a TRNG and get the same security, but this would be silly because you're using twice as many bits from a TRNG for no gain in security. Similarly, you could perform modular addition over bytes with two TRNG values, but again this has the same limitation. Bitwise AND biases the output, so it would weaken the security.

Since any algorithm to expand the output of a TRNG is going to be pseudorandom instead of truly random, what you're essentially proposing sounds like a stream cipher with a pre-shared key based on a TRNG. However, that's totally fine as a design! As long as you pick a secure stream cipher, that's a legitimate and secure design. However, stream ciphers are not one-time pads and are not provably secure, but they are vastly more convenient to use in real life.

If you do choose to go this approach, I highly recommend you pick an existing, well-designed library to build this. TLS has support for pre-shared keys, and there are of course other libraries which can do this as well for messages.

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.