One can have $|X_n|\neq |Y_n|$, or even $|X_n| = f(n) \neq \infty = |Y_n|$ for some function $f(n)$.
This occurs in practice, notably in lattice-based cryptography, where $X_n$ is some (ideal) discrete Gaussian, while $Y_n$ is a finite-support approximation to it (so that one can sample from it in constant time).
For a discrete Gaussian of mean $\mu$ and standard deviation $\sigma$, one can for example replace it with that same Gaussian conditioned on lying in the interval $[\mu-10\sigma, \mu+10\sigma]$.
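To get a feel for how good such a truncation is, here is a small numerical sketch (the parameters $\mu = 0$, $\sigma = 3.2$ are made up purely for illustration, not taken from any scheme) that estimates the probability mass the truncation throws away; this tail mass turns out to equal the statistical distance discussed next.

```python
import math

# Illustrative sketch: how much probability mass does a discrete Gaussian on Z
# (mean mu, standard deviation sigma) place outside [mu - 10*sigma, mu + 10*sigma]?
# The parameters below are hypothetical, chosen only for illustration.
mu, sigma = 0.0, 3.2

def rho(x):
    """Unnormalized discrete Gaussian weight exp(-(x - mu)^2 / (2 sigma^2))."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# Approximate the support Z by a very wide window; the mass even further out
# is negligible compared to everything computed below.
support = range(int(mu - 40 * sigma), int(mu + 40 * sigma) + 1)
Z = sum(rho(x) for x in support)

# Tail mass t = Pr[|X - mu| > 10*sigma] under the full (untruncated) distribution.
t = sum(rho(x) for x in support if abs(x - mu) > 10 * sigma) / Z

# Truncating and renormalizing scales every in-range probability by 1/(1 - t)
# and removes the tail, so the statistical distance works out to exactly t:
# (1/2) * ((1 - t) * (t / (1 - t)) + t) = t.
print(t)  # well below 2^-70 for these parameters
```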
What I think you're missing is that being "distinguishable" vs "indistinguishable" is not a binary.
If one ignores pseudorandomness for a bit (and talks purely about statistics), then one often works with the total variation distance (or statistical distance)
$$\Delta(X, Y) = \frac{1}{2}\sum_{x\in\mathsf{supp}(X)\cup\mathsf{supp}(Y)}|\Pr[X = x]-\Pr[Y = x]|.$$
This (essentially) measures how hard it is to distinguish $X$ and $Y$.
For example, if $\Delta(X,Y) = \epsilon$, then roughly $1/\epsilon^2$ samples from the unknown distribution suffice to reliably (say, with probability close to $1$) tell whether the samples come from $X$ or from $Y$; conversely, any test that sees only $q$ samples has distinguishing advantage at most $q\epsilon$.
So if $\epsilon \approx 2^{-n}$, the two distributions cannot be distinguished using polynomially many samples.
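To make the $1/\epsilon^2$ behaviour concrete, here is a toy simulation of my own (not part of the question): two Bernoulli distributions with success probabilities $0.5$ and $0.5+\epsilon$ have statistical distance exactly $\epsilon$, and a simple threshold test on the empirical mean only becomes reliable once it sees on the order of $1/\epsilon^2$ samples.

```python
import random

# Toy sketch (illustrative setup): distinguishing Bernoulli(0.5) from
# Bernoulli(0.5 + eps), whose statistical distance is exactly eps, by
# thresholding the empirical mean of the samples.
eps = 0.05

def guess(num_samples, true_p):
    """Sample from Bernoulli(true_p) and guess which distribution produced the data."""
    mean = sum(random.random() < true_p for _ in range(num_samples)) / num_samples
    return 0.5 + eps if mean > 0.5 + eps / 2 else 0.5

def success_rate(num_samples, trials=2000):
    """Fraction of trials in which the guess identifies the true distribution."""
    correct = 0
    for _ in range(trials):
        true_p = random.choice([0.5, 0.5 + eps])
        correct += guess(num_samples, true_p) == true_p
    return correct / trials

print(success_rate(10))                 # few samples: barely better than guessing
print(success_rate(int(10 / eps**2)))   # ~10/eps^2 samples: succeeds almost always
```

With $\epsilon \approx 2^{-n}$ the same reasoning would demand on the order of $2^{2n}$ samples, which is why such distributions are treated as statistically indistinguishable.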
There are also other situations where one wants to say that distributions with $|X_n|\neq |Y_n|$ are hard to distinguish.
For example, let $X_n$ be uniform over all functions $F: \{0,1\}^n\to\{0,1\}^n$.
Let $Y_n$ be uniform over all invertible functions with this domain/range.
Then $|X_n|\neq |Y_n|$, but one can still bound the distinguishing advantage of any adversary that makes at most $q$ queries to the function (this is the "PRP/PRF switching lemma").
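If I recall the standard statement correctly, it says that for any adversary $A$ making at most $q$ oracle queries,
$$\left|\Pr[A^{F}=1] - \Pr[A^{P}=1]\right| \le \frac{q(q-1)}{2^{n+1}},$$
where $F$ is a uniformly random function and $P$ a uniformly random permutation on $\{0,1\}^n$. So as long as $q \ll 2^{n/2}$, the two oracles are effectively indistinguishable, even though the underlying distributions have supports of different sizes.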