The *average* number of inputs generating the same output is simply the number of inputs divided by the number of outputs, which is
$$
2^{384}/2^{128}
$$
in this case. To clarify, as noted in a comment, "average" here is taken over the collection of $(2^{128})^{2^{384}}$
possible functions from 384-bit bitstrings to 128-bit bitstrings, since the assumption is that we draw one of those functions uniformly at random.

Since a good hash function can be modelled as a random mapping, we can say more about this, based on the *balls-into-bins* paradigm.

To start, see the Wikipedia article on the balls-into-bins problem for more details. In its notation, we would have $m=2^{384}$ balls (inputs) and $n=2^{128}$ bins (outputs). For example, once $m$ is sufficiently larger than $n \ln n$, the probability that every output is reached by some input approaches 1; this is the so-called coupon collector problem. Here $m=n^3$ exceeds $n \ln n$ by an enormous margin. We can also estimate the load of the bin with the most balls, the fewest balls, and so on. The paper by Raab and Steger, "Balls into Bins: A Simple and Tight Analysis", has more technical details.
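The coupon collector threshold can be seen in a toy simulation. The following is only a sketch at a drastically scaled-down size ($n=50$ bins rather than $2^{128}$); the helper names `all_bins_hit` and `coverage_rate`, the trial count, and the seed are illustrative choices, not part of the original argument.

```python
import math
import random


def all_bins_hit(m: int, n: int, rng: random.Random) -> bool:
    """Throw m balls into n bins uniformly at random; True if no bin stays empty."""
    seen = set()
    for _ in range(m):
        seen.add(rng.randrange(n))
        if len(seen) == n:
            return True
    return False


def coverage_rate(m: int, n: int, trials: int = 200, seed: int = 1) -> float:
    """Fraction of trials in which every bin received at least one ball."""
    rng = random.Random(seed)
    return sum(all_bins_hit(m, n, rng) for _ in range(trials)) / trials


n = 50
threshold = n * math.log(n)                    # n ln n, roughly 196 balls
few = coverage_rate(n, n)                      # m = n, far below n ln n
many = coverage_rate(int(3 * threshold), n)    # m = 3 n ln n, well above it
print(f"m = n: {few:.3f}   m = 3 n ln n: {many:.3f}")
```

With far fewer than $n \ln n$ balls, essentially no trial covers every bin, while a small constant multiple of $n \ln n$ already covers all bins in almost every trial.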

As noted in a comment, the number of inputs resulting in a fixed hash output value can be approximated by a Poisson distribution with mean $m/n=2^{256}$. Since a Poisson distribution has variance equal to its mean, this is in turn well approximated by a Gaussian with mean $\mu=2^{256}$ and standard deviation $\sigma=\sqrt{2^{256}}=2^{128}.$
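As a rough sanity check of the "standard deviation is the square root of the mean" rule, one can simulate a small random mapping. This is a minimal sketch with toy parameters ($n=256$ bins, $m=n^2$ balls); the function name `bin_loads` and the seed are assumptions made for illustration.

```python
import random
import statistics


def bin_loads(m: int, n: int, seed: int = 0) -> list[int]:
    """Assign m inputs to n outputs uniformly at random and count each output's load."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(m):
        loads[rng.randrange(n)] += 1
    return loads


n = 256
m = n * n
loads = bin_loads(m, n)

mean_load = statistics.fmean(loads)      # always exactly m / n
std_load = statistics.pstdev(loads)      # empirical spread across bins
predicted_std = (m / n) ** 0.5           # Poisson prediction: sqrt(mean)

print(f"mean {mean_load:.1f}, std {std_load:.2f}, predicted std {predicted_std:.2f}")
```

The empirical standard deviation should land close to the predicted value of 16, up to sampling noise.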

Note that this is only a model, though usually very accurate, since any given hash function is deterministic.

**Edit:**

The paper by Raab and Steger also has results about how many inputs map to the *most popular* hash output. We have $m=n^3$ here with $n=2^{128}.$ Therefore we are in the last case of Theorem 1, where $m$ is large compared to $n$, namely $m\geq c n (\ln n)^3.$ (In fact, ignoring the constant $c$, even $m>2^{148}$ would put us into this case since $\log_2\left((\ln 2^{128})^3\right) \approx 19.4.$)
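The threshold arithmetic can be checked in a few lines by working with base-2 logarithms, since the numbers themselves are far too large for floating point. This sketch ignores the unspecified constant $c$; the variable names are illustrative.

```python
import math

log2_n = 128                      # n = 2^128 hash outputs
log2_m = 384                      # m = 2^384 inputs

ln_n = log2_n * math.log(2)       # ln(2^128), about 88.7
log2_polylog = 3 * math.log2(ln_n)        # log2 of (ln n)^3
log2_threshold = log2_n + log2_polylog    # log2 of n (ln n)^3

print(f"log2((ln n)^3)   = {log2_polylog:.2f}")
print(f"log2(n (ln n)^3) = {log2_threshold:.2f}")
print(f"m exceeds the threshold: {log2_m > log2_threshold}")
```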

Applying this result, we see that the maximum load is, with high probability, equal to
$$
\frac{m}{n}+\sqrt{\frac{2 m \ln n}{n}(1-o(1))}
$$
where the $o(1)$ term goes to zero as $n$ grows. Dropping it, a quick calculation with $\ln(2^{128}) \approx 88.7 \approx 2^{6.47}$ gives
$$
2^{256}+\sqrt{2^{385-128} \ln (2^{128})}\approx 2^{256}+\sqrt{2^{257+6.47}}\approx 2^{256}+2^{131.7}
$$
as the typical maximum load in a bin (i.e., for the most popular hash output). It is interesting that this value lies at $\sqrt{2 \ln n}\,\sigma \approx 13.3 \sigma$ (about $2^{3.7}\sigma$) above the mean when compared to the Gaussian approximation given by Ilmari Karonen in a comment.
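The shape of the maximum-load formula can be illustrated on a toy instance that keeps the relation $m=n^3$ but uses $n=64$ instead of $2^{128}$. This is only a sketch: at such a small $n$ the $o(1)$ correction is not negligible, so the observed maximum is expected to fall somewhat below the leading-order prediction. The function name `max_load` and the seed are illustrative assumptions.

```python
import math
import random


def max_load(m: int, n: int, seed: int = 0) -> int:
    """Throw m balls into n bins uniformly at random and return the fullest bin's load."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(m):
        loads[rng.randrange(n)] += 1
    return max(loads)


n = 64
m = n ** 3                                          # same m = n^3 relation as in the text
mean = m / n                                        # average load per bin
leading_term = math.sqrt(2 * m * math.log(n) / n)   # sqrt(2 m ln n / n), o(1) dropped

observed = max_load(m, n)
print(f"mean load          : {mean:.0f}")
print(f"predicted max load : {mean + leading_term:.0f}")
print(f"observed max load  : {observed}")
```

The observed excess over the mean should be of the same order as the leading term, which is the qualitative behaviour the theorem describes.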