I have some ZFS pools, and every month or so during normal operation, the main pool starts to rebuild (resilver) using its spare and a seemingly random drive. Looking at dmesg, I see this:
[Wed Nov 24 13:20:37 2021] audit: type=1400 audit(1637781634.835:321): apparmor="STATUS" operation="profile_replace" info="same as current profile, skipping" profile="unconfined" name="snap.canonical-livepatch.hook.connect-plug-etc-update-motd-d" pid=2454976 comm="apparmor_parser"
[Wed Nov 24 13:20:38 2021] loop27: detected capacity change from 0 to 8
[Wed Nov 24 13:24:48 2021] sde: sde1 sde9
[Wed Nov 24 13:31:26 2021] sdl: sdl1 sdl9
[Wed Nov 24 15:15:02 2021] kauditd_printk_skb: 42 callbacks suppressed
This is days after the system was last rebooted for software updates. I presume the messages for sde and sdl indicate that the drives somehow dropped off the system and were rediscovered? At boot time, the partition message is followed by an "attached disk" message, but that is not the case here.
I'm looking for reasons. sde is connected to a sas9201 card and from there to a separate box housing all those drives. sdl is connected to the motherboard controller. Note that it always seems to be 2 drives, on different controllers, at around the same time, and it has been different drives every time. I would love to blame cabling or something simple, but two controllers at about the same time and different drives on each occasion make that unlikely, and would also seem to rule out a controller issue. This system has been up for over a year and only started doing this a few months ago.
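To check whether the rescans really do cluster in time across controllers, something like the following could be run over saved dmesg output in the timestamp format shown above. This is only a rough sketch: the function names and the 15-minute window are my own arbitrary choices, and it assumes the partition lines always look like `sde: sde1 sde9`.

```python
import re
from datetime import datetime

# Matches partition-rescan lines such as
# "[Wed Nov 24 13:24:48 2021] sde: sde1 sde9" (format assumed from the log above).
RESCAN = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<disk>sd[a-z]+):(?:\s+sd[a-z]+\d+)+\s*$"
)

def find_rescans(lines):
    """Return (timestamp, disk) tuples for every partition-rescan line."""
    events = []
    for line in lines:
        m = RESCAN.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts").strip(), "%a %b %d %H:%M:%S %Y")
            events.append((ts, m.group("disk")))
    return events

def group_by_window(events, window_seconds=900):
    """Group events whose gap to the previous event is within the window.

    The 900-second default is an arbitrary assumption, not a known threshold.
    """
    groups = []
    for ts, disk in sorted(events):
        if groups and (ts - groups[-1][-1][0]).total_seconds() <= window_seconds:
            groups[-1].append((ts, disk))
        else:
            groups.append([(ts, disk)])
    return groups

sample = [
    "[Wed Nov 24 13:20:38 2021] loop27: detected capacity change from 0 to 8",
    "[Wed Nov 24 13:24:48 2021] sde: sde1 sde9",
    "[Wed Nov 24 13:31:26 2021] sdl: sdl1 sdl9",
    "[Wed Nov 24 15:15:02 2021] kauditd_printk_skb: 42 callbacks suppressed",
]

for group in group_by_window(find_rescans(sample)):
    print([disk for _, disk in group], "starting", group[0][0])
```

Collecting these groups over several months would show whether the two drives always land within minutes of each other, which might help line the events up against other scheduled jobs.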
The system otherwise operates normally with no issues. The glitch, whatever it is, causes nothing other than the ZFS pool rebuild; nothing is lost and nothing else misbehaves. Both the system and the disk array box are connected to a UPS. I see no other messages in the log files indicating any issue, and nothing that says a device went away.
I've been fishing for known bugs of some sort and don't see any. It's a strange issue. There are no drive errors on these drives and nothing unusual in SMART.
Is there anything I can do to further debug this? Something to enable, or setting to change? Suggestions?