First, let me define the construction. AES-GCM, when used in the suggested way, is equivalent to $$\operatorname{KDF}(K,S,I)=\operatorname{AES}_K(S\|0^{32})\oplus \operatorname{GHASH}_{\operatorname{AES}_K(0^{128})}(I)$$
where $\operatorname{GHASH}_H(M)\approx \sum_i H^iM_i$ for message blocks $M_i$ of 128 bits each, with the multiplications and additions done in a finite field of $2^{128}$ elements in polynomial representation. This is only an approximation of GHASH, but it is sufficient for the discussion below.
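To make the simplified formula concrete, here is a minimal Python sketch of it. It assumes the reduction polynomial $x^{128}+x^7+x^2+x+1$ and a plain integer encoding (bit $i$ is the coefficient of $x^i$), so it is not bit-compatible with real GCM, which uses a reflected bit order and appends a length block. The names `gf_mul`, `simplified_ghash` and `toy_kdf` are made up for illustration, and `mask` merely stands in for $\operatorname{AES}_K(S\|0^{32})$.

```python
# Assumed reduction polynomial: x^128 + x^7 + x^2 + x + 1
R = (1 << 128) | 0x87


def gf_mul(a, b):
    """Multiply two field elements (integers below 2**128) modulo R."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in the field is XOR
        b >>= 1
        a <<= 1                  # multiply a by x
        if a >> 128:
            a ^= R               # reduce as soon as the degree reaches 128
    return result


def simplified_ghash(h, blocks):
    """Approximation from the text: sum over i of H^i * M_i, with i starting at 1."""
    acc = 0
    power = h
    for m in blocks:
        acc ^= gf_mul(power, m)
        power = gf_mul(power, h)
    return acc


def toy_kdf(mask, h, blocks):
    """mask is a placeholder for AES_K(S || 0^32); h is a placeholder for AES_K(0^128)."""
    return mask ^ simplified_ghash(h, blocks)
```

The AES calls are deliberately left out: for the arguments that follow, only the structure "block cipher output XOR polynomial in $H$" matters.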
Is that construction secure…
I'll call this KDF construction secure if it behaves like a PRF in both inputs, subject to the named constraints, i.e. unique inputs yield independent pseudo-random outputs.
[Assuming] the salt / nonce is globally unique among the whole lifetime of the key and over all device IDs?
As you can see from the above expression, as long as your salts are unique, the AES output masks any GHASH result, so the derived key will look random and unpredictable. Given the answer to the next part, however, it is questionable whether the associated data provides any useful benefit, besides potentially obscuring a salt-collision event slightly.
[Assuming] the salt / "nonce" is only unique for a specific ID and there can be several devices that share the same salt by accident (due to a collision of random values)?
Unfortunately not. If two derivations use the same $(K,S)$ pair with two different inputs $I,I'$ and an adversary learns the corresponding output keys $k,k'$, then the AES term cancels and the adversary learns $k\oplus k' = \operatorname{GHASH}_{\operatorname{AES}_K(0^{128})}(I) \oplus \operatorname{GHASH}_{\operatorname{AES}_K(0^{128})}(I')\approx \sum_i H^i(I'_i\oplus I_i)$. This is a polynomial equation in the single unknown $H=\operatorname{AES}_K(0^{128})$ with known coefficients $I'_i\oplus I_i$, so $H$ is one of its roots, and for short inputs there are only a few candidates. Having recovered $H$, the adversary can compute $k\oplus \operatorname{GHASH}_{\operatorname{AES}_K(0^{128})}(I) = \operatorname{AES}_K(S\|0^{32})$ and from there predict the derived key for any $I$ for this fixed $(K,S)$ pair.
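The recovery can be sketched in Python for the simplest case, where $I$ is a single block, so that under the simplified GHASH formula $k\oplus k' = H\cdot(I\oplus I')$ and $H$ follows from one field division. Everything here is an illustrative assumption rather than real GCM: the reduction polynomial $x^{128}+x^7+x^2+x+1$ with a plain integer bit order, the helper names, and the random values `h` and `mask` standing in for $\operatorname{AES}_K(0^{128})$ and $\operatorname{AES}_K(S\|0^{32})$.

```python
import secrets

# Assumed reduction polynomial: x^128 + x^7 + x^2 + x + 1
R = (1 << 128) | 0x87


def gf_mul(a, b):
    """Multiply two field elements (integers below 2**128) modulo R."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> 128:
            a ^= R
    return result


def gf_inv(a):
    """Invert a nonzero element by computing a^(2^128 - 2) with square-and-multiply."""
    result, base, exponent = 1, a, (1 << 128) - 2
    while exponent:
        if exponent & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exponent >>= 1
    return result


# Secrets the adversary never sees directly.
h = secrets.randbits(128)     # placeholder for AES_K(0^128)
mask = secrets.randbits(128)  # placeholder for AES_K(S || 0^32), same (K, S) throughout


def derive(i_block):
    """Simplified single-block KDF: mask XOR H * I."""
    return mask ^ gf_mul(h, i_block)


# The adversary learns two derived keys for two different inputs under the same (K, S).
i1, i2 = 0x1111, 0x2222
k1, k2 = derive(i1), derive(i2)

# k1 ^ k2 = H * (i1 ^ i2), so H is obtained by dividing out the known difference.
recovered_h = gf_mul(k1 ^ k2, gf_inv(i1 ^ i2))
recovered_mask = k1 ^ gf_mul(recovered_h, i1)

# With H and the mask in hand, the key for any further input is predictable.
i3 = 0xDEADBEEF
predicted = recovered_mask ^ gf_mul(recovered_h, i3)
assert predicted == derive(i3)
```

For multi-block inputs the division is replaced by finding the roots of the polynomial in $H$, which may leave a handful of candidates instead of one.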