In cryptography, a (probability) distribution is most often discrete, that is, a function $F$ from a finite set $\mathcal S$ to the interval $[0,1]$ of $\mathbb R$ such that $$\sum_{x\in\mathcal S}F(x)=1$$
$F(x)$ is to be understood as the probability that $x$ occurs in the circumstance considered. Depending on circumstances, the set $\mathcal S$ can for example be the set of keys, the symbols of some alphabet or information chunks (e.g. bytes), English text up to some size, or the possible inputs or outputs of some function.
Often (and unless otherwise apparent or stated) the distribution is assumed uniform, that is constant all over $\mathcal S$. It follows $F(x)=1/\lvert\mathcal S\rvert$ irrespective of $x$ in $\mathcal S$.
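As an illustration of the definition, here is a minimal Python sketch that represents a discrete distribution as a dictionary from elements of $\mathcal S$ to exact fractions. The helper names `is_distribution` and `uniform`, and the example `biased_bit`, are assumptions made for this example only.

```python
from fractions import Fraction

def is_distribution(F):
    """True if every value of F lies in [0, 1] and the values sum to exactly 1."""
    return all(0 <= p <= 1 for p in F.values()) and sum(F.values()) == 1

def uniform(S):
    """Uniform distribution over the finite set S: F(x) = 1/|S| for every x."""
    S = set(S)
    return {x: Fraction(1, len(S)) for x in S}

# Uniform distribution over bytes, and a non-uniform distribution over bits.
byte_dist = uniform(range(256))
biased_bit = {0: Fraction(3, 4), 1: Fraction(1, 4)}
```

Exact fractions are used rather than floats so that the "sums to 1" check is not affected by rounding.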
Often (and unless otherwise apparent or stated), when some useful unit of information consists of random symbols from the same set (e.g. bits, bytes, characters of a key), it is assumed that the same function $F$ applies to each symbol whatever the values of the other symbols, that is, the symbols are independent (and identically distributed). That's a different (and orthogonal) notion from uniform.
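To make concrete that independence and uniformity are separate properties, the following sketch works with the joint distribution of a pair of bits. The helpers `marginal` and `is_independent` and both example distributions are hypothetical, chosen only to illustrate the point: one pair has uniform symbols that are fully dependent, the other has non-uniform symbols that are independent.

```python
from fractions import Fraction
from itertools import product

def marginal(joint, index):
    """Distribution of the symbol at position `index` of a pair."""
    m = {}
    for pair, p in joint.items():
        m[pair[index]] = m.get(pair[index], 0) + p
    return m

def is_independent(joint):
    """True if the joint probability is the product of the marginals everywhere."""
    first, second = marginal(joint, 0), marginal(joint, 1)
    return all(joint.get((a, b), 0) == first[a] * second[b]
               for a, b in product(first, second))

half = Fraction(1, 2)
# Second bit is a copy of the first: each bit is uniform, but they are dependent.
repeated_bit = {(0, 0): half, (1, 1): half}
# Two independent draws of a biased bit: independent, but not uniform.
biased = {0: Fraction(3, 4), 1: Fraction(1, 4)}
biased_pair = {(a, b): biased[a] * biased[b] for a, b in product(biased, repeat=2)}
```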
Far from all distributions considered in cryptography are uniform and/or independent. For example, the distribution of letters in English plaintext is far from uniform, and adjacent letters are far from independent. In that case it makes more sense to consider the distribution of English words, or the distribution of two or three consecutive letters in some large sample of English text.
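An empirical distribution of single letters or of consecutive letters can be estimated by counting. The sketch below is only illustrative: the function name `ngram_distribution` is an assumption, and the `sample` sentence is a toy stand-in for the large sample of English text that a meaningful estimate would need.

```python
from collections import Counter
from fractions import Fraction

def ngram_distribution(text, n):
    """Empirical distribution of n consecutive letters in `text`.

    Non-letters are dropped first, so n-grams may span word boundaries.
    """
    letters = [c for c in text.lower() if c.isalpha()]
    grams = ["".join(letters[i:i + n]) for i in range(len(letters) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: Fraction(c, total) for g, c in counts.items()}

# Toy sample (an assumption for illustration; far too short for real statistics).
sample = "the quick brown fox jumps over the lazy dog and then the other dog"
letter_dist = ngram_distribution(sample, 1)
bigram_dist = ngram_distribution(sample, 2)
```

Even on this tiny sample, `letter_dist["t"]` is larger than `letter_dist["z"]`, and the bigram `"th"` is more frequent than `"qu"`.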
The notion matters in many fields of cryptography and cryptanalysis. For example, the security of the One-Time Pad (OTP) depends on the pad having a uniform distribution; and, if the pad is composed of symbols, on the symbols being independent (each following the same distribution whatever the others, and that distribution must be uniform too). But the security of the OTP does not depend on the distribution of the plaintext.
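The OTP claim can be checked exactly on a toy case. The sketch below assumes a 2-bit message XORed with a 2-bit pad chosen independently of the message; `xor_distribution` and all example distributions are hypothetical names and values introduced for illustration.

```python
from fractions import Fraction

def xor_distribution(message_dist, pad_dist, bits):
    """Exact distribution of c = m XOR k, for m and k chosen independently."""
    out = {c: Fraction(0) for c in range(2 ** bits)}
    for m, pm in message_dist.items():
        for k, pk in pad_dist.items():
            out[m ^ k] += pm * pk
    return out

BITS = 2
uniform_pad = {k: Fraction(1, 4) for k in range(4)}
# A pad whose high bit is always 0 (not uniform).
biased_pad = {0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(0)}
# A heavily skewed plaintext distribution.
skewed_message = {0: Fraction(7, 10), 1: Fraction(1, 5), 2: Fraction(1, 10), 3: Fraction(0)}

# With a uniform pad the ciphertext is uniform, for the skewed plaintext
# and for every fixed message alike.
uniform_cases = [xor_distribution(skewed_message, uniform_pad, BITS)]
uniform_cases += [xor_distribution({m: Fraction(1)}, uniform_pad, BITS) for m in range(4)]

# With the biased pad, different messages give different ciphertext distributions.
leak_0 = xor_distribution({0: Fraction(1)}, biased_pad, BITS)
leak_3 = xor_distribution({3: Fraction(1)}, biased_pad, BITS)
```

Every entry of `uniform_cases` equals the uniform distribution, so the ciphertext reveals nothing about the message. With the biased pad, `leak_0` and `leak_3` differ, so the ciphertext does leak information.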
Generating sample(s) according to some distribution $F$ is picking element(s) $x_i$ from the set $\mathcal S$ according to $F$, forming a (generally, ordered) tuple. Unless otherwise stated, the samples (if more than one) are chosen independently, each in such a way that $x_i$ equals $x$ with probability $F(x)$, for every $x$ in $\mathcal S$. Or perhaps (depending on context) the choice could be made by some deterministic (rather than random) process, such that the actual output is (or is assumed) indistinguishable from samples drawn at random per $F$.
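A minimal sketch of independent sampling per $F$ in Python, using the standard library's `random.choices`. The function name `sample` and its `rng` parameter are assumptions for this example. By default it draws from `random.SystemRandom` (operating-system randomness); passing a seeded `random.Random` instead corresponds to the deterministic process mentioned above, and is not suitable for generating keys.

```python
import random

def sample(F, k, rng=None):
    """Return a tuple of k independent samples drawn according to F.

    F maps each element of S to its probability. rng is any object with a
    random.Random-compatible choices() method.
    """
    if rng is None:
        rng = random.SystemRandom()  # OS-provided randomness
    elements = list(F)
    weights = [float(F[x]) for x in elements]
    return tuple(rng.choices(elements, weights=weights, k=k))

# Example: 8 independent samples of a biased bit (illustrative values).
biased_bit = {0: 0.75, 1: 0.25}
draws = sample(biased_bit, 8)

# Deterministic (seeded) generator: reproducible, for testing only.
seeded = sample({"a": 0.5, "b": 0.5, "c": 0.0}, 1000, random.Random(1))
```

Elements with probability 0, such as `"c"` here, never appear in the output.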