Is there $y\in\{0,1\}^{512}$ with $\emptyset \neq s^{-1}(\{y\})$ and $s^{-1}(\{y\})$ is finite?
For SHA-512 as actually defined, trivially yes, since the input size is limited to $2^{128}-1$ bit. That makes $s^{-1}(\{y\})$ true for any $y\in\{0,1\}^{512}$. And $y$ the hash of the empty bitstring meets the condition $\emptyset \neq s^{-1}(\{y\})$. In the following we assume this $2^{128}-1$ bit limit is removed.
My bets are that the proposition is now false, by the following argument (not a proof).
It seems plausible that every 512-bit hash is reached by SHA-512 for some at most 895-bit message (the largest bloc-aligned message requiring a single round). Argument: under an ideal block cipher model for the block cipher at the heart of a SHA-512 round, each of the hash of such message is random and independent in $\{0,1\}^{512}$. By the coupon collector bound, we expect to get all values after roughly $2^{520.5}$ draws, and the tail estimate essentially rules out (probability $<2^{-(2^{384})}$) we would not get them all after $2^{896}-1$ attempts. That realistically could only fail if our model is extremely bad.
It's even more plausible that every 512-bit hash is reached by SHA-512 for an at most 1919-bit message (the largest bloc-aligned message requiring two rounds), because with two rounds the number of possible inputs in the first rounds reaches $2^{1024}$, making it (by argument 1) likely that close to $2^{512}$ values can reach the 512-bit state input of the second round; and we get $2^{896}-1$ message input blocks as in the first round. Thus we are even more way past the coupon collector bound, and more importantly since we have 512 more input bits that vary it seems even more unlikely our model could be that bad.
This reasoning for the second round can be repeated for any larger number of rounds, leading to the conclusion that each value in $\{0,1\}^{512}$ likely is reached at most round.
It would be even more surprizing that when we let the message length modulo $2^{128}$ take all the possible values, any of the $2^{512}$ values is reached for no message $2^{128}$ to $2^{129-1}$ bit: we have nearly an additional 128 bit of input that can vary.
And then since the message length is allowed to grow indefinitely, we get that likely, any $y$ has infinite $s^{-1}(\{y\})$.
The $\emptyset \neq s^{-1}(\{y\})$ part of the proposition makes it even much less likely to hold. Let's suppose that the 512-bit initial value of SHA-512 is reached again (internally) after $2^{119}$ rounds for at least one (say $w$) of the $2^{(2^{129})}$ different messages starts of $2^{129}$ bits, which is likely per a slightly modified version of 3 above (we have full freedom for every of the $2^{119}$ 1024-bit input blocks of $w$). Now if a certain $y\in\{0,1\}^{512}$ is $\operatorname{SHA-512}(x)$, then it's also $\operatorname{SHA-512}(w\mathbin\|x)$ and more generally $\operatorname{SHA-512}(w\mathbin\|\ldots\mathbin\|w\mathbin\|x)$ thus constructively disproving the proposition.