Is the slowness of multiplication really such a big problem that multiplication can't find widespread use in fast lightweight encryption algorithms? On IoT devices or RFID chips it can probably be a problem, but on computers and smartphones, an encryption algorithm based on multiplication shouldn't be a problem, should it?
Part of the issue appears to be the definition of 'lightweight' and the platforms it targets. The CPUs on smartphones are actually quite capable; I would not characterize those platforms (or laptop computers) as necessarily 'lightweight'. Lightweight crypto is generally designed with microcontrollers in mind; typically, those microcontrollers don't have built-in 64x64-bit multiplication instructions.
Now, modular multiplication (for modulus a power of 2) can be implemented by a series of shifts and conditional additions; certainly doable, but considerably more expensive than an addition operation.
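To make the cost concrete, here is a minimal sketch of the shift and conditional-add approach. The function name `mul_mod_2n` and the default word size of 32 bits are assumptions chosen for illustration; a real microcontroller implementation would be written in C or assembly, but the loop structure would be the same.

```python
def mul_mod_2n(x, y, n=32):
    """Multiply x by y modulo 2**n using only shifts and conditional additions.

    Hypothetical helper for illustration; n is the assumed word size in bits.
    """
    mask = (1 << n) - 1
    x &= mask
    y &= mask
    acc = 0
    for _ in range(n):
        if y & 1:                  # conditional addition
            acc = (acc + x) & mask
        x = (x << 1) & mask        # shift the multiplicand left, dropping overflow
        y >>= 1                    # move on to the next multiplier bit
    return acc
```

Each of the `n` iterations costs a test, a shift and possibly an addition, which is why a single multiplication ends up costing on the order of `n` additions when there is no hardware multiplier.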
The other issue would appear to be that modular multiplication isn't as wonderful as you would have hoped. For this discussion, I'll limit myself to multiplication modulo a power of 2 (multiplication modulo a prime doesn't have these issues, but it does have issues around the range not being a power of 2).
Modular multiplication does not have any 'rightward' propagation; for example, flipping the high bit of one of the inputs can affect only the high bit of the output; the other bits are unaffected. Of course, modular addition has the same issue; however, it's also cheaper.
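The lack of propagation toward the lower bits is easy to check experimentally. The sketch below assumes a 32-bit word; `mul_mod` and `output_difference` are hypothetical helper names, not part of any library.

```python
def mul_mod(x, y, n=32):
    """Product of x and y modulo 2**n (n is an assumed word size)."""
    return (x * y) & ((1 << n) - 1)


def output_difference(x, y, bit, n=32):
    """XOR difference in the product when a single bit of x is flipped."""
    return mul_mod(x, y, n) ^ mul_mod(x ^ (1 << bit), y, n)
```

Flipping bit `k` of `x` changes the product by a multiple of $2^k$, so `output_difference(x, y, k)` always has its low `k` bits equal to zero. For `k = 31` the result is either `0` (when `y` is even) or `0x80000000` (when `y` is odd).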
Modular multiplication does have strong differentials; the strongest is based around the identity $(-x)*y = -(x*y)$ (and the modulus operation does not break this up).
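As a quick sanity check of that identity modulo $2^{32}$, here is a small sketch. The word size and the helper names `neg`, `mul` and `negation_relation_holds` are assumptions made for the example.

```python
N = 32                  # assumed word size in bits
MASK = (1 << N) - 1


def neg(x):
    """Two's-complement negation modulo 2**N."""
    return (-x) & MASK


def mul(x, y):
    """Multiplication modulo 2**N."""
    return (x * y) & MASK


def negation_relation_holds(x, y):
    """Check that (-x)*y == -(x*y) modulo 2**N."""
    return mul(neg(x), y) == neg(mul(x, y))
```

The relation holds for every pair of inputs, not just with some probability: an attacker who negates one input knows the output is negated as well.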
Both of these issues can be designed around in a proper design; however, the fact that you have to do so makes it less attractive. In addition, it raises the question: why not use multiplication in $GF(2^k)$ instead? If we're doing a shift/add implementation, a double/xor implementation of Galois multiplication isn't much more expensive, and it avoids the above two issues...
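For comparison, a double/xor multiplication in $GF(2^k)$ might look like the following sketch. The defaults are assumptions picked for illustration: $k = 8$ with the AES reduction polynomial $x^8 + x^4 + x^3 + x + 1$ (`0x11B`); the function name `gf_mul` is hypothetical, and inputs are assumed to be below $2^k$.

```python
def gf_mul(a, b, k=8, poly=0x11B):
    """Multiply a and b in GF(2**k) using doubling and XOR.

    Assumes a, b < 2**k and that poly is an irreducible polynomial of
    degree k, given with its top bit included (0x11B is the AES polynomial).
    """
    acc = 0
    high = 1 << k
    for _ in range(k):
        if b & 1:
            acc ^= a       # conditional XOR replaces the conditional addition
        a <<= 1            # doubling: multiply a by x
        if a & high:
            a ^= poly      # reduce modulo the field polynomial
        b >>= 1
    return acc
```

The loop has the same shape as an integer shift/add multiplier, with XOR in place of addition plus one extra conditional XOR per step for the reduction. With these defaults, `gf_mul(0x57, 0x83)` gives `0xC1`, matching the worked example in the AES specification.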