Importance of Distribution in Cryptography


I have been studying theoretical cryptography, and whenever we learn a scheme in class there is discussion of sampling and distributions. A typical question is: what is the distribution from which the input is sampled? I also get confused by questions such as: how are the samples generated?

I would like a solid understanding of sampling, and of why the distribution matters when sampling in cryptography.


In cryptography, a (probability) distribution is most often discrete, that is a function $F$ from a finite set $\mathcal S$ to the interval $[0,1]$ of $\mathbb R$ such that $$1=\sum_{x\in\mathcal S}F(x)$$

$F(x)$ is to be understood as the probability that $x$ occurs in the circumstance considered. Depending on the context, the set $\mathcal S$ can be, for example, the set of keys, the symbols of some alphabet, chunks of information (e.g. bytes), English texts up to some size, or the possible inputs or outputs of some function.

Often (and unless otherwise apparent or stated) the distribution is assumed uniform, that is constant all over $\mathcal S$. It follows $F(x)=1/\lvert\mathcal S\rvert$ irrespective of $x$ in $\mathcal S$.
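As a minimal sketch of the definition above, a discrete distribution can be modelled as a mapping from the elements of $\mathcal S$ to exact probabilities. The helper names `is_distribution` and `uniform` are illustrative choices, not standard functions.

```python
from fractions import Fraction

def is_distribution(F):
    """True if F (dict: outcome -> probability) has values in [0, 1] summing to 1."""
    return all(0 <= p <= 1 for p in F.values()) and sum(F.values()) == 1

def uniform(S):
    """Uniform distribution over the finite set S: F(x) = 1/|S| for every x."""
    S = set(S)
    return {x: Fraction(1, len(S)) for x in S}

# Uniform distribution over all byte values, and a non-uniform (biased) bit.
byte_dist = uniform(range(256))
biased_bit = {0: Fraction(3, 4), 1: Fraction(1, 4)}

print(is_distribution(byte_dist), byte_dist[0])
print(is_distribution(biased_bit))
```

Exact fractions are used here so that the "sums to 1" check is not disturbed by floating-point rounding.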

Often (and unless otherwise apparent or stated), when some useful unit of information consists of random symbols from the same set (e.g. the bits, bytes, or characters of a key), it is assumed that the same function $F$ applies to every symbol and that the symbols are independent of one another. That is a different notion from uniformity, and orthogonal to it.
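To illustrate that independence and uniformity are orthogonal, here is a small sketch over pairs of bits. The joint distributions are made-up examples, and `marginals` and `is_independent` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

def marginals(joint):
    """Marginal distributions of the first and second symbol of a pair."""
    first, second = {}, {}
    for (a, b), p in joint.items():
        first[a] = first.get(a, 0) + p
        second[b] = second.get(b, 0) + p
    return first, second

def is_independent(joint):
    """True if P(a, b) = P(a) * P(b) for every pair (a, b)."""
    first, second = marginals(joint)
    return all(joint.get((a, b), 0) == first[a] * second[b]
               for a, b in product(first, second))

half = Fraction(1, 2)

# Two independent biased bits: same F for both symbols, independent, not uniform.
biased = {0: Fraction(3, 4), 1: Fraction(1, 4)}
independent_biased = {(a, b): biased[a] * biased[b]
                      for a, b in product(biased, biased)}

# One uniform bit written twice: each symbol is uniform, but the pair is dependent.
repeated_uniform = {(0, 0): half, (1, 1): half}

print(is_independent(independent_biased))
print(is_independent(repeated_uniform))
```

The first pair is independent without being uniform; the second has uniform symbols without being independent.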

Far from all distributions considered in cryptography are uniform and/or independent. For example, the distribution of letters in English plaintext is far from uniform, and adjacent letters are far from independent. In that case it makes more sense to consider the distribution of English words, or the distribution of two or three consecutive letters in some large sample of English text.
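A rough sketch of how such empirical distributions could be estimated is shown below. The sample sentence is a hypothetical stand-in; a meaningful estimate would need a large corpus, so the numbers it produces only illustrate the method.

```python
from collections import Counter

# Hypothetical toy sample; a real estimate would use a large body of English text.
sample_text = ("the quick brown fox jumps over the lazy dog "
               "and then the dog sleeps in the sun")

# Empirical single-letter distribution.
letters = [c for c in sample_text if c.isalpha()]
letter_freq = {c: n / len(letters) for c, n in Counter(letters).items()}

# Counts of adjacent letter pairs (bigrams), taken inside words only.
bigrams = [w[i:i + 2] for w in sample_text.split() for i in range(len(w) - 1)]
bigram_counts = Counter(bigrams)

print(sorted(letter_freq.items(), key=lambda kv: -kv[1])[:5])
print(bigram_counts.most_common(3))
```

Even in this tiny sample some letters and pairs occur much more often than others, which is the kind of structure classical cryptanalysis exploits.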

The notion matters in many areas of cryptography and cryptanalysis. For example, the security of the One-Time Pad (OTP) depends on the pad being uniformly distributed; if the pad is composed of symbols, this requires that the symbols be independent and that each follow the same uniform distribution. The security of the OTP does not, however, depend on the distribution of the plaintext.
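The OTP claim can be checked exactly on a toy case. The sketch below assumes 2-bit messages combined with the pad by XOR; the biased pad values are arbitrary illustrative numbers and `ciphertext_given_plaintext` is a hypothetical helper.

```python
from fractions import Fraction

def ciphertext_given_plaintext(m, pad_dist):
    """Distribution of c = m XOR k for a fixed message m and a pad k drawn from pad_dist."""
    dist = {}
    for k, pk in pad_dist.items():
        dist[m ^ k] = dist.get(m ^ k, 0) + pk
    return dist

uniform_pad = {k: Fraction(1, 4) for k in range(4)}
# Arbitrary non-uniform pad, for contrast.
biased_pad = {0: Fraction(1, 2), 1: Fraction(1, 6), 2: Fraction(1, 6), 3: Fraction(1, 6)}

for m in range(4):
    print(m, ciphertext_given_plaintext(m, uniform_pad),
          ciphertext_given_plaintext(m, biased_pad))
```

With the uniform pad, the ciphertext distribution is the same for every message, so observing the ciphertext says nothing about the plaintext, whatever the plaintext distribution is. With the biased pad, the ciphertext distribution changes with the message, and that difference is what leaks information.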

Generating sample(s) according to some distribution $F$ means picking element(s) $x_i$ from the set $\mathcal S$ according to $F$, forming a (generally ordered) tuple. Unless otherwise stated, the samples (if more than one) are chosen independently, each in such a way that $x_i$ equals $x$ with probability $F(x)$. Depending on context, the choice could instead be made by some deterministic (rather than random) process, such that the actual distribution is (or is assumed to be) indistinguishable from a random distribution per $F$.
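One possible way to draw independent samples from a discrete $F$ is inverse-CDF sampling, sketched below. It assumes the probabilities are given as exact `Fraction` values summing to 1, and it uses Python's `secrets` module as the randomness source. The name `draw_samples` and the 64-bit resolution are illustrative choices; probabilities that are not multiples of $2^{-64}$ are only approximated to that resolution.

```python
import secrets
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate

def draw_samples(F, n=1, resolution_bits=64):
    """Draw n independent samples from F (dict: outcome -> Fraction probability)."""
    outcomes = list(F)
    # Running totals of the probabilities; the last entry is 1 for a valid F.
    cumulative = list(accumulate(F[x] for x in outcomes))
    result = []
    for _ in range(n):
        # Uniform point u in [0, 1) on a grid of step 2**-resolution_bits.
        u = Fraction(secrets.randbits(resolution_bits), 2 ** resolution_bits)
        # Select the first outcome whose cumulative probability exceeds u.
        result.append(outcomes[bisect_right(cumulative, u)])
    return tuple(result)

biased_bit = {0: Fraction(3, 4), 1: Fraction(1, 4)}
print(draw_samples(biased_bit, n=10))
```

Each call to `secrets.randbits` is independent of the others, which is what makes the entries of the returned tuple independent samples. Replacing it with a seeded pseudorandom generator would give the deterministic variant mentioned above, whose output is only assumed to be indistinguishable from samples drawn per $F$.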

