The question asks different things in its title and in its body, leading to radically different answers.
> Can I use a cryptographic hash function such as sha256 for Randomness Extraction?
Yes for standard cryptographic hash functions including SHA-256, any SHA-2 or SHA-3 function, Blake2b (even SHA-1 or MD5 if one does not mind making a poor first impression), and any of their truncations. They are suitable to transform a semi random input into a shorter, uniformly random bit string, assuming there is enough entropy in the semi random input, and assuming that this input is not prepared with knowledge of the hash function. To illustrate why that last condition is required: if the input $x$ was filtered or tweaked so that $\operatorname{SHA-256}(x)$ has its first bit $0$, then $\operatorname{SHA-256}(x)$ would be easily distinguishable from random.
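As a minimal sketch of the above (the names `extract` and `filtered_input` are hypothetical, chosen for illustration only), here is extraction with truncated SHA-256, together with the kind of input filtering that breaks it:

```python
import hashlib
import os


def extract(semi_random: bytes, out_len: int = 16) -> bytes:
    """Hash the input with SHA-256 and keep the first out_len bytes."""
    if not 1 <= out_len <= hashlib.sha256().digest_size:
        raise ValueError("out_len must be between 1 and 32 bytes")
    return hashlib.sha256(semi_random).digest()[:out_len]


def filtered_input() -> bytes:
    """Input prepared with knowledge of the hash: resample until the
    first bit of SHA-256(x) is 0."""
    while True:
        x = os.urandom(32)
        if (hashlib.sha256(x).digest()[0] & 0x80) == 0:
            return x


if __name__ == "__main__":
    # Honest input: output is as good as the entropy of the input allows.
    print(extract(os.urandom(64)).hex())
    # Filtered input: plenty of entropy, yet the first output bit is always 0.
    print(all(extract(filtered_input())[0] < 0x80 for _ in range(100)))
```

The filtered input still carries close to 255 bits of entropy, yet the output is trivially distinguishable from random, which is why the independence condition cannot be dropped.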
> I want to transform a semi random input to a shorter, uniformly random bit string. Assuming there is enough entropy in the semi random input, can I use a collision resistant hash function to extract randomness?
No, in general. Collision resistance is not a sufficient property. E.g. the function defined by $F(x)=\operatorname{SHA-256}(x)\mathbin\|\operatorname{SHA-256}(x)$ is just as collision-resistant as SHA-256 is, but is not suitable for randomness extraction, as its output is easily distinguishable from random.
Collision resistance is not a necessary property either. E.g. SHA-256 truncated to its first 8 bytes is not collision-resistant, but is suitable for randomness extraction.
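To make the counterexample $F$ concrete, here is a small sketch (function names are hypothetical) of $F$ and of a distinguisher that tells its output from random without knowing anything about the input:

```python
import hashlib
import os


def doubled_hash(x: bytes) -> bytes:
    """F(x) = SHA-256(x) || SHA-256(x): a collision in F is a collision in SHA-256."""
    d = hashlib.sha256(x).digest()
    return d + d


def looks_doubled(output: bytes) -> bool:
    """Distinguisher: report True when both halves of the output are equal."""
    half = len(output) // 2
    return len(output) % 2 == 0 and output[:half] == output[half:]


if __name__ == "__main__":
    print(looks_doubled(doubled_hash(os.urandom(64))))  # always True
    print(looks_doubled(os.urandom(64)))                # True only with probability 2**-256
```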
Security is met if the entropy extraction function is a random member of a computationally secure Pseudo-Random Function Family (PRF).
Again, it's critical that the "semi random input" is not somewhat biased with knowledge related to the function used for randomness extraction. For example: no matter how large and random most of the input is, if one input bit is prepared with knowledge of the rest of the input and of the otherwise ideal extraction function, then it's easy to get the first bit of the output equal to 0 with probability near 75%. It suffices to try one value of that bit, and switch to the other value only when the first output bit comes out as 1; the second try succeeds half the time, giving $1/2+1/4=3/4$.
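The 75% figure can be checked empirically. The following is a rough simulation, assuming SHA-256 stands in for the "otherwise ideal" extraction function and that the adversarially chosen bit is stored as a trailing byte of value 0 or 1 (both are illustration choices, and the function names are hypothetical):

```python
import hashlib
import os


def first_bit(x: bytes) -> int:
    """Most significant bit of SHA-256(x)."""
    return hashlib.sha256(x).digest()[0] >> 7


def biased_input(rest: bytes) -> bytes:
    """Choose a single trailing bit knowing `rest` and the hash function:
    keep 0 if it already yields first output bit 0, otherwise switch to 1."""
    candidate = rest + b"\x00"
    if first_bit(candidate) == 0:
        return candidate
    return rest + b"\x01"


def estimate(trials: int = 20000) -> float:
    """Fraction of trials in which the first output bit is 0."""
    zeros = sum(first_bit(biased_input(os.urandom(32))) == 0 for _ in range(trials))
    return zeros / trials


if __name__ == "__main__":
    print(estimate())  # close to 0.75 rather than 0.5
```

Here 256 bits of the input are perfectly random, and a single controlled bit is enough to move the first output bit from a 50% to roughly a 75% chance of being 0.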
Update following comment:
> I found some (direct academic source of the "yes") for HMAC, but if the hash function is not keyed, would this still be the case?
Yes for any standard cryptographic hash function that is usable to instantiate a PRF (that is, when the kind of length-extension property SHA-256 has does not matter). But I'm afraid I can't point to a source, because my "not somewhat biased with knowledge related to the function used for randomness extraction" condition is both necessary and hard to formalize other than as "without knowledge of the key of a PRF", and there's no key in a hash. Hence the formal statements use HMAC, which has a key input and is an archetypal PRF.
We can rightly argue that the constants in the definition of a hash like SHA-256 play the role of a key, both in practice and in the proof of the Merkle-Damgård construction used in SHA-256. It's widely believed that SHA-256 passes the PRF experiment when we use the 32-byte IV as key; and that's even provable under an ideal cipher model for the cipher underlying the Davies-Meyer compression function. But that's not part of the design goals of SHA-256, nor a very standard result, and we need weaker hypotheses to prove that HMAC is a PRF.
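For comparison, here is a minimal sketch of the keyed variant using Python's standard `hmac` module. The function name `hmac_extract` is hypothetical, and the key is assumed to be chosen at random, independently of the input:

```python
import hashlib
import hmac
import os


def hmac_extract(key: bytes, semi_random: bytes, out_len: int = 32) -> bytes:
    """Keyed extraction: HMAC-SHA-256 of the input under `key`, truncated to out_len bytes."""
    if not 1 <= out_len <= hashlib.sha256().digest_size:
        raise ValueError("out_len must be between 1 and 32 bytes")
    return hmac.new(key, semi_random, hashlib.sha256).digest()[:out_len]


if __name__ == "__main__":
    key = os.urandom(32)  # assumption: drawn after, or independently of, the input
    print(hmac_extract(key, b"semi random input with enough entropy").hex())
```

With a key the input's preparer does not know, "prepared without knowledge of the function" becomes a condition one can state and reason about, which is what the unkeyed hash lacks.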