This is the kind of problem one runs into when pods are schedulable anywhere. You're on the right track with affinity rules.
You could make the pods within a Deployment's ReplicaSet express anti-affinity for each other, so that they spread across nodes. This makes scheduling somewhat heavier, but it does keep the loss of a single node from turning into a cascading failure. It also does a reasonable job of spreading pods across failure domains, though that's more of a side effect. A minimal sketch of that approach follows (the Deployment name and the app: myapp label are placeholders; kubernetes.io/hostname is the standard per-node topology key):
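kind: Deployment
apiVersion: apps/v1
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      affinity:
        podAntiAffinity:
          # hard rule: never co-locate two pods carrying app: myapp on the same node
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: myapp
            topologyKey: kubernetes.io/hostname
      containers:
      - name: pause
        image: k8s.gcr.io/pause:3.1
Note that with the required (hard) form, any replicas beyond the number of eligible nodes stay Pending, which is one reason the spread constraints below are often the better fit.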
However, there is a better way to accomplish this: pod topology spread constraints. With a spread constraint, the scheduler keeps pods balanced across failure domains (whether those are AZs or nodes), and with whenUnsatisfiable set to DoNotSchedule, a pod that would break the balance stays Pending instead of being placed.
You can write the constraints so that pods are guaranteed to be spread among nodes, and so that a node failure won't cause "bunching". Take a look at this example pod:
kind: Pod
apiVersion: v1
metadata:
  name: mypod
  labels:
    foo: bar
spec:
  topologySpreadConstraints:
  - maxSkew: 1                        # pod counts may differ by at most 1 between zones
    topologyKey: zone                 # assumes nodes carry a "zone" label (the well-known label is topology.kubernetes.io/zone)
    whenUnsatisfiable: DoNotSchedule  # leave the pod Pending rather than violate the constraint
    labelSelector:
      matchLabels:
        foo: bar
  - maxSkew: 1                        # and by at most 1 between nodes
    topologyKey: node                 # assumes nodes carry a "node" label (kubernetes.io/hostname is the well-known equivalent)
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        foo: bar
  containers:
  - name: pause
    image: k8s.gcr.io/pause:3.1
This can be combined with affinity rules if you also do not want a Deployment's ReplicaSet to schedule alongside other Deployments on the same node, further reducing the "bunching" effect. A soft anti-affinity is typically appropriate there, so the scheduler will try not to colocate those workloads but will still schedule them when there's no better option.
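If you go that route, a sketch of the soft anti-affinity, added under the pod template's spec alongside the topologySpreadConstraints above, might look like this (the app: other-workload label is a placeholder for whatever labels the other deployment's pods carry):
spec:
  affinity:
    podAntiAffinity:
      # soft rule: prefer nodes not already running the other workload,
      # but schedule anyway if no such node is available
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: other-workload
          topologyKey: kubernetes.io/hostname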