I will only address the HKDF part.

HKDF was introduced in the following paper: https://eprint.iacr.org/2010/264.pdf

In this context, HMAC is used for two somewhat distinct purposes: 1) randomness extraction and 2) variable (input/output) length PRF.

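The key swap happens in the randomness-extraction step. Here we are given keying material $IKM$ that is not (pseudo)uniformly random, and we want to derive a key $PRK$ that is pseudorandom (i.e., computationally indistinguishable from random). Concretely, $PRK = HMAC(salt, IKM)$: the secret $IKM$ goes into the message slot, while the (possibly public) salt goes into the key slot.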

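As you noted, HMAC has also been shown to be a PRF. However, we cannot rely on PRF security to argue the security of $PRK$: PRF security assumes a secret, uniformly random key, whereas here the key slot holds the salt, which may be public, and the secret input $IKM$ is not uniform. Instead, the paper argues that this use of HMAC is suitable as a computational randomness extractor (see Section 6).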
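To make the two roles of HMAC concrete, here is a minimal sketch of extract-then-expand using only Python's standard `hmac` and `hashlib` modules. It assumes SHA-256 and follows the HKDF definition as standardized in RFC 5869 (not mentioned above, but it is the usual reference); the function names are my own.

```python
import hashlib
import hmac

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size  # 32 bytes for SHA-256

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """Randomness extraction: the salt is the HMAC key, IKM is the message."""
    if not salt:
        salt = bytes(HASH_LEN)  # an absent salt defaults to HashLen zero bytes
    return hmac.new(salt, ikm, HASH).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Variable-output-length PRF: here PRK really is the (pseudorandom) HMAC key."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]

# Example: derive 42 bytes from non-uniform input keying material.
prk = hkdf_extract(b"some public salt", b"non-uniform input keying material")
okm = hkdf_expand(prk, b"context info", 42)
```

Note the asymmetry: in `hkdf_extract` the secret is the message, while in `hkdf_expand` the secret is the key. Only the second call matches the usual PRF setting.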

Speaking of PRFs, an interesting thing to note is that some security proofs, such as those for TLS, actually rely on the so-called PRF-ODH assumption (https://eprint.iacr.org/2017/517.pdf). Applied to the use of HKDF in TLS: recall that the two parties exchange DH shares $(g^x, g^y)$. One variant of the assumption roughly says that the function $F(K, X) = HMAC(X, K)$ remains a PRF even if the attacker is given access to an oracle $\mathcal{O}(T, v) = F(T^x, v)$; this is argued under the assumption that the underlying compression function is a random oracle. (Omitted here: the restrictions on the queried values $(T, v)$ and on the maximal number of queries.)

Note that the function $F$ above has keyspace $\langle G \rangle$, the group used for the DH exchange. So in the context of PRF-ODH we are dealing with a key that is uniformly random over the keyspace.
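As a rough illustration of the shape of $F$ and of the oracle $\mathcal{O}$, here is a toy sketch. Everything in it is an assumption made for illustration: the modulus `P`, the base `G`, the fixed-length `encode` helper, and the choice of SHA-256 are mine, the parameters are not secure, and the query restrictions of the actual PRF-ODH game are not enforced.

```python
import hashlib
import hmac

# Toy parameters, for illustration only; a real protocol uses a standardized group.
P = 2**127 - 1  # a Mersenne prime, far too small for real use
G = 3           # assumed base element; not claimed to generate a prime-order subgroup

def encode(elem: int) -> bytes:
    """Hypothetical fixed-length big-endian encoding of a group element."""
    return elem.to_bytes(16, "big")

def F(K: int, X: bytes) -> bytes:
    """F(K, X) = HMAC(X, K): the group element K is the PRF key,
    but it is passed to HMAC in the message slot."""
    return hmac.new(X, encode(K), hashlib.sha256).digest()

def make_oracle(x: int):
    """Return the oracle O(T, v) = F(T^x, v) for a fixed secret exponent x."""
    def oracle(T: int, v: bytes) -> bytes:
        return F(pow(T, x, P), v)
    return oracle

# Example: querying the oracle on the peer's share g^y yields F(g^{xy}, v).
x, y = 0x1234567, 0x89ABCDE
oracle = make_oracle(x)
assert oracle(pow(G, y, P), b"label") == F(pow(G, x * y, P), b"label")
```

In the real game, the query in the last line is exactly the kind that the omitted restrictions would forbid, since it hands the attacker the challenge value directly.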

P.S: consider reading this answer as well https://crypto.stackexchange.com/a/30461/58690