The Curve25519 curve has $\ell = 2^{252}+27742317777372353535851937790883648493$ possible points in the prime-order group, corresponding to $\ell$ private keys.
However, X25519 refers to a particular method of choosing a private key and performing a Diffie-Hellman exchange.
You could use Curve25519 but avoid X25519 entirely. You would choose any private key $a$ such that $0<a<\ell$, and then use regular variable-base scalar multiplication to compute a shared secret as $aB$, where $B=bG$, $G$ is a well-known base point, and $b$ is the private key of the other party such that $0<b<\ell$.
With X25519, the least significant byte of the little-endian private key is clamped (its three lowest bits are cleared) so that the private key is a multiple of 8. Curve25519 has cofactor 8, so multiplying by a multiple of 8 removes any small-order component of the input point. As a result, the output of the X25519 operation cannot be manipulated if the other party provides you with a public key point which is on the curve but not in the prime-order group. Depending on the situation, this prevents another party from learning a tiny bit of information about your private key. If you did not clamp, you would have to take the extra step of validating that the other party's public key is in the prime-order group, by multiplying it by $\ell$ and checking that the result is the point at infinity.
The most significant byte is altered so that the most significant set bit is always in the same position (bit 255 is cleared and bit 254 is set). This means that the implementation in the original Curve25519 paper always performs the same number of ladder steps and so runs in constant time, thus avoiding leaking information about the private key.
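The two clamping steps described above can be sketched in a few lines of Python. The function name `clamp` is my own choice for illustration, and it returns the scalar as a Python integer rather than as bytes; treat it as a minimal sketch, not a vetted implementation.

```python
import os


def clamp(raw: bytes) -> int:
    """Clamp 32 bytes into an X25519 scalar, returned as a Python int."""
    if len(raw) != 32:
        raise ValueError("expected exactly 32 bytes")
    k = bytearray(raw)
    k[0] &= 248   # clear the 3 lowest bits: the scalar becomes a multiple of 8
    k[31] &= 127  # clear bit 255
    k[31] |= 64   # set bit 254: the top set bit is always in the same position
    return int.from_bytes(k, "little")


if __name__ == "__main__":
    scalar = clamp(os.urandom(32))
    print(scalar % 8 == 0)            # True
    print(scalar.bit_length() == 255)  # True
```

Any 32-byte input is accepted, which is why a private key can simply be 32 random bytes.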
What this means is that you can choose any random 32-byte sequence, and then clamp it to produce a private key. This private key may exceed $\ell$, but this will not matter, since the X25519 scalar multiplication operation is designed to accept oversized private keys. The scalar multiplication operation produces points that are in a cyclic group, so $(a+n\cdot\ell)B=aB$ for any integer value of $n$.
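To make the identity $(a+n\cdot\ell)B=aB$ concrete, here is a rough Python sketch of an x-only Montgomery ladder, following the structure of the pseudocode in RFC 7748 but working on integers instead of byte strings. The names `ladder`, `P`, `L` and `A24`, and the example scalar, are my own choices for illustration. It is not constant time and is only meant for experimenting.

```python
P = 2**255 - 19                                       # field prime
L = 2**252 + 27742317777372353535851937790883648493   # prime group order
A24 = 121665                                          # (486662 - 2) / 4


def ladder(k: int, u: int) -> int:
    """Return the u-coordinate of k * (point with u-coordinate u).

    Illustration only: branches on secret bits, so it is NOT constant time.
    """
    x1, x2, z2, x3, z3, swap = u % P, 1, 0, u % P, 1, 0
    for t in reversed(range(k.bit_length())):
        kt = (k >> t) & 1
        if swap ^ kt:                       # conditional swap of the two points
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = kt
        a, b = (x2 + z2) % P, (x2 - z2) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        c, d = (x3 + z3) % P, (x3 - z3) % P
        da, cb = d * a % P, c * b % P
        x3 = pow(da + cb, 2, P)             # differential addition
        z3 = x1 * pow(da - cb, 2, P) % P
        x2 = aa * bb % P                    # doubling
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return x2 * pow(z2, P - 2, P) % P       # x2 / z2, with 1/0 treated as 0


if __name__ == "__main__":
    a = 2**254 + 8 * 123456789              # arbitrary scalar in clamped form
    # Adding multiples of the group order does not change the result.
    print(ladder(a, 9) == ladder(a + L, 9) == ladder(a + 3 * L, 9))  # True
    # u = 1 is a small-order point; a multiple of 8 sends it to the identity.
    print(ladder(a, 1))                     # 0
```

The second check ties back to the clamping rationale: because the scalar is a multiple of 8, a small-order input such as $u=1$ collapses to the identity, which this ladder reports as 0.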
> public key without a corresponding private key
If $B$ is just a random byte sequence, it will still be treated as a valid point on the curve by X25519. Therefore, if the byte sequence $B$ is not a valid point on the curve, it will be treated as a point $B'$ which is a valid point on the curve. One in 8 valid points on the curve will be in the prime-order group, and so the other 7 will not have an associated private key. The process of converting a private key to a public key will always produce a point in the correct prime-order group, because this involves multiplying by a base point which is definitely on the curve and in the correct prime-order group. Therefore, a valid curve point that is not in the prime-order group could not have been generated by any private key.
> According to my understanding, the mapping $K_S \to K_P$ is one-to-one for all X25519 key pairs $(K_S,K_P)$, but please correct me if I am incorrect.
This would be true if you followed the method I described above for using Curve25519 but avoiding X25519, and if the entire resulting coordinate pair was preserved. This would not be true for X25519, since X25519 allows private keys greater than $\ell$, where more than one private key byte sequence will map to the same public key (as described above). Furthermore, X25519 outputs only an x-coordinate and leaves ambiguity over the sign of the y-coordinate that would be recovered from the x-coordinate (Curve25519 is symmetrical across the x-axis).
> if I were theoretically to compute the corresponding public keys for all $2^{251}$ X25519 private keys, how many (unique†) public keys would I obtain?
This is equivalent to asking:
How many possible values of $a$ are there such that $a=x\bmod \ell$, where $2^{254}\leq x<2^{255}$ and $x$ is a multiple of 8? (Clamping sets bit 254 and clears bit 255, so this range contains exactly $2^{251}$ multiples of 8, matching the number of private keys in the question.) The answer would then have to be adjusted for the ambiguity of the y-coordinate sign, since $a$ and $\ell-a$ give the same X25519 x-coordinate output; my guess is that this roughly halves it. Perhaps a mathematician here can solve this brain teaser. I'd expect there to be collisions due to the $\bmod\ \ell$ operation.
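I have not worked out the full-size answer, but the question can at least be brute-forced on a scaled-down analogue. In the sketch below, the function name `count_toy_public_keys` and the toy parameters are assumptions chosen purely for illustration: a prime `ell = 17`, which sits just above $2^4$ in the same way that $\ell$ sits just above $2^{252}$, and scalars that are multiples of 8 whose top set bit is two positions higher. Whether the same pattern holds at full size would need a proper argument.

```python
from typing import Tuple


def count_toy_public_keys(ell: int, top_bit: int) -> Tuple[int, int]:
    """Brute-force a scaled-down analogue of the counting question.

    Scalars x are the multiples of 8 with 2**top_bit <= x < 2**(top_bit + 1).
    Returns (number of distinct x mod ell,
             number of distinct classes once a and ell - a are merged).
    """
    residues = {x % ell for x in range(2**top_bit, 2**(top_bit + 1), 8)}
    merged = {min(a, ell - a) for a in residues}
    return len(residues), len(merged)


if __name__ == "__main__":
    # Toy analogue: ell = 17 = 2**4 + 1, scalars in [2**6, 2**7).
    print(count_toy_public_keys(17, 6))  # (8, 7)
```

In this toy case the 8 clamped scalars give 8 distinct residues, and merging $a$ with $\ell-a$ leaves 7 distinct x-coordinates.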