Neither description is especially clear, particularly if you do not understand the math involved, and the two are very similar. I would say the first book has the better description of confusion and the second has the better description of diffusion. Keep in mind, though, that it is the key that is being diffused into the plaintext to make the ciphertext, and in a "confusing" way, so that you cannot "undiffuse" either one out of the ciphertext.
In fact, with a large enough "confusion" layer, you do not need a "diffusion" layer. Confusion refers to a non-linear operation, and diffusion to a linear operation. Large non-linear operations are VERY computationally expensive (a lookup table for an n-bit S-box needs 2^n entries, so 8 bits is easy and 128 bits is out of the question), which is why a small non-linear operation is combined with linear operations to get the job done.
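To make the linear versus non-linear distinction concrete, here is a minimal Python sketch. The 4-bit table `SBOX4` and the rotate-and-XOR function `linear_layer` are hypothetical toy examples picked for illustration, not parts of any particular cipher. "Linear" here means linear over XOR: the function applied to `a ^ b` gives the same result as XORing the function applied to `a` and to `b` (up to a constant, for the affine case).

```python
from itertools import product

# Arbitrary 4-bit substitution table, chosen only for illustration.
SBOX4 = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
         0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def sbox(x):
    """Toy 'confusion' step: a table lookup on a 4-bit value."""
    return SBOX4[x]


def rotl4(x, n):
    """Rotate a 4-bit value left by n positions."""
    return ((x << n) | (x >> (4 - n))) & 0xF


def linear_layer(x):
    """Toy 'diffusion' step: XOR of rotated copies of the input."""
    return x ^ rotl4(x, 1) ^ rotl4(x, 2)


def is_affine(f):
    """True if f(a ^ b) ^ f(0) == f(a) ^ f(b) for every pair of 4-bit inputs."""
    return all(f(a ^ b) ^ f(0) == f(a) ^ f(b)
               for a, b in product(range(16), repeat=2))


print(is_affine(linear_layer))  # True
print(is_affine(sbox))          # False
```

The rotate-and-XOR function passes the check, so its input/output relationship can be written as a system of linear equations and solved directly. The table lookup fails it, and that failure is what "confusion" buys you.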
What the combination of linear and non-linear mixing does is make the relationship between the ciphertext and key, and between the ciphertext and plaintext, an extremely complex math problem to solve. If someone knows the plaintext and the ciphertext, you do not want them to be able to find the key, and if they know only the ciphertext, you do not want them to be able to find the key or the plaintext.
It helps to see how these apply to a common cipher like AES. AES uses a 128-bit block, an 8-bit non-linear S-box, and a linear matrix multiplication in a finite field that operates on 32 bits of the block at a time, in 4 parallel paths.
In AES, the confusion comes from the S-box (the SubBytes layer), which is used in the round function as well as in the key schedule. Diffusion comes from the matrix operation (the MixColumns layer) combined with row shifting (the ShiftRows layer), so that all input bits are fully mixed over the course of 2 rounds, in combination with how the round subkeys are generated in the key schedule.
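The two-round mixing can be checked with a short Python sketch. It implements only the linear part of an AES round (ShiftRows followed by MixColumns) and tracks a one-byte difference between two states; because these layers are linear over XOR, following the difference alone is enough to see where a change spreads. The state layout (`state[c][r]`, a list of four columns) and the helper names are my own choices for the example, and SubBytes and the key addition are deliberately left out, so this is not a full AES round.

```python
def xtime(b):
    """Multiply a byte by 2 in GF(2^8), reducing by the AES polynomial 0x11B."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B
    return b


def mul3(b):
    """Multiply a byte by 3 in GF(2^8): 3*b = 2*b XOR b."""
    return xtime(b) ^ b


def mix_column(col):
    """Apply the MixColumns matrix (rows are rotations of 2 3 1 1) to one column."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ mul3(a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ mul3(a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ mul3(a3),
        mul3(a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


def shift_rows(state):
    """Rotate row r left by r positions; state[c][r] is column c, row r."""
    return [[state[(c + r) % 4][r] for r in range(4)] for c in range(4)]


def linear_round(state):
    """Linear part of a round only: ShiftRows, then MixColumns on each column."""
    return [mix_column(col) for col in shift_rows(state)]


def active_bytes(state):
    """Count the non-zero bytes, i.e. the positions where two states differ."""
    return sum(1 for col in state for b in col if b)


# Difference between two states that differ in exactly one byte.
diff = [[0] * 4 for _ in range(4)]
diff[0][0] = 0x01

after_one = linear_round(diff)
after_two = linear_round(after_one)

print(active_bytes(diff))       # 1
print(active_bytes(after_one))  # 4
print(active_bytes(after_two))  # 16
```

One changed byte touches a whole column after the first round and all 16 bytes after the second. In the real cipher the S-box sits between these steps, but since it is a one-to-one mapping on each byte, it does not change which positions are affected.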
With enough rounds, the work to solve the math problem becomes harder than brute-forcing the key, and in AES the two are at pretty much the same level. Of course it is more complicated than that, but that is the CliffsNotes version.
Image courtesy of Wikipedia