Diffusion is the same as mixing.

Read section 8.3 as they suggest. For a structure this simple, much of the diffusion depends on the choice of the rotation constants, and those were optimized experimentally using an evolutionary algorithm.

In the case of AES, the use of an MDS code means that, whenever the input column is nonzero, at least 5 of the 8 bytes (the 4 input bytes plus the 4 output bytes of a column) are always "active" (the minimum weight/distance of an MDS code with parameters $[n,k]=[8,4]$ is $d=n-k+1=5$). This means that you get diffusion (the maximum possible under linear mixing, due to the MDS bound on $d$) of the type

- 1 nonzero byte (of the $(a_{i,j})_{i=1,\ldots,4}$) going to at least 4 nonzero bytes (of the $(b_{i,j})_{i=1,\ldots,4}$)
- 2 nonzero bytes (of the $(a_{i,j})_{i=1,\ldots,4}$) going to at least 3 nonzero bytes (of the $(b_{i,j})_{i=1,\ldots,4}$)
- 3 nonzero bytes (of the $(a_{i,j})_{i=1,\ldots,4}$) going to at least 2 nonzero bytes (of the $(b_{i,j})_{i=1,\ldots,4}$)
- 4 nonzero bytes (of the $(a_{i,j})_{i=1,\ldots,4}$) going to at least 1 nonzero byte (of the $(b_{i,j})_{i=1,\ldots,4}$)

in each of the columns in MixColumns.
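The first two cases in the list above are small enough to check by brute force. The following is a minimal sketch, assuming the standard AES MixColumns matrix with rows that are rotations of $(2,3,1,1)$ and the AES reduction polynomial $x^8+x^4+x^3+x+1$; the helper names (`xtime`, `mix_column`, `min_output_weight`) are made up for this illustration.

```python
from itertools import combinations, product

def xtime(x):
    """Multiply by 2 in GF(2^8) modulo the AES polynomial 0x11B."""
    x <<= 1
    return (x ^ 0x11B) & 0xFF if x & 0x100 else x

# Lookup tables for multiplication by 2 and by 3 in GF(2^8).
MUL2 = [xtime(x) for x in range(256)]
MUL3 = [MUL2[x] ^ x for x in range(256)]

def mix_column(a):
    """Apply the AES MixColumns matrix to one 4-byte column."""
    a0, a1, a2, a3 = a
    return (
        MUL2[a0] ^ MUL3[a1] ^ a2 ^ a3,
        a0 ^ MUL2[a1] ^ MUL3[a2] ^ a3,
        a0 ^ a1 ^ MUL2[a2] ^ MUL3[a3],
        MUL3[a0] ^ a1 ^ a2 ^ MUL2[a3],
    )

def weight(col):
    """Number of nonzero bytes in a column."""
    return sum(1 for byte in col if byte != 0)

def min_output_weight(input_weight):
    """Smallest output weight over all inputs with exactly
    `input_weight` nonzero bytes (exhaustive search)."""
    best = 4
    for positions in combinations(range(4), input_weight):
        for values in product(range(1, 256), repeat=input_weight):
            col = [0, 0, 0, 0]
            for pos, val in zip(positions, values):
                col[pos] = val
            best = min(best, weight(mix_column(col)))
    return best
```

Calling `min_output_weight(1)` and `min_output_weight(2)` should give 4 and 3 respectively, matching the first two bullets. Input weights 3 and 4 work the same way but take far longer in pure Python (about $6.6\cdot 10^7$ and $4.2\cdot 10^9$ columns).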

To explain further, the Singleton bound (see Wikipedia) states that for any code over a symbol alphabet with $q$ elements, the minimum distance (and thus, for a linear code, the minimum weight, i.e., the smallest number of nonzero components in a nonzero codeword)
satisfies
$$
d\leq n-k+1
$$
where $n$ is the length of the codewords and $k$ is the dimension of the code. MDS stands for maximum distance separable: such a code meets the bound with equality, i.e., it has the largest possible minimum distance for the given parameters $n,k$.

We have $q=2^8$ here, since we consider each byte as a code symbol. Since the column of $b$'s is obtained by multiplying the column of $a$'s with an MDS matrix, the two columns taken together form a codeword of length 8 in this code. Thus $d=n-k+1=5$, so whenever the $a$'s column is nonzero, at least 5 of the 8 bytes in the $a$'s column and the corresponding $b$'s column are guaranteed to be nonzero.
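One way to confirm the MDS property without enumerating codewords is the standard criterion that a code with generator matrix $[I \mid M]$ is MDS exactly when every square submatrix of $M$ is nonsingular. The sketch below applies this to the MixColumns matrix; the function names are illustrative only, and the field is assumed to be GF($2^8$) with the AES polynomial.

```python
from itertools import combinations

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def gf_det(m):
    """Determinant over GF(2^8) by Laplace expansion along the first row.
    In characteristic 2 the cofactor signs disappear, so terms are XORed."""
    if len(m) == 1:
        return m[0][0]
    det = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        det ^= gf_mul(entry, gf_det(minor))
    return det

# AES MixColumns matrix (entries are elements of GF(2^8)).
MIX = [
    [2, 3, 1, 1],
    [1, 2, 3, 1],
    [1, 1, 2, 3],
    [3, 1, 1, 2],
]

def is_mds(matrix):
    """True if every square submatrix has a nonzero determinant."""
    n = len(matrix)
    for size in range(1, n + 1):
        for rows in combinations(range(n), size):
            for cols in combinations(range(n), size):
                sub = [[matrix[r][c] for c in cols] for r in rows]
                if gf_det(sub) == 0:
                    return False
    return True
```

For a $4\times 4$ matrix this is only 69 submatrices, so `is_mds(MIX)` runs instantly, whereas a matrix with a repeated $2\times 2$ block of ones, for example, fails the check.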

This is an algebraic design. In the Skein spec they use the somewhat strange expression "5 full diffusions" for AES, presumably related to this value of 5 nonzero bytes.

See here for more details on the AES mixing design.