We run a platform that uses Linux bridging to filter traffic and logs that activity to a MySQL server. Occasionally a unit experiences very high latency, and in the lead-up we often see repeated page allocation failures in the mpt3sas driver, logged to /var/log/messages. These seem to occur at times of high system load, yet on a system with seemingly sufficient memory. I do not have the expertise to read these logs properly and was hoping someone might have some insight.
I have tried setting vm.min_free_kbytes = 65536 (and we are using vm.reclaim_mode = 1), but that doesn't seem to alleviate the problem. Does anyone have any ideas? Logs follow:
localhost kernel: [21572436.601597] sas3ircu: page allocation failure: order:4, mode:0xcc0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0
localhost kernel: [21572436.601601] CPU: 2 PID: 22663 Comm: sas3ircu Tainted: G W O #1
localhost kernel: [21572436.601602] Hardware name: XXXXXXXXXXX , BIOS 3.1 06/06/2018
localhost kernel: [21572436.601602] Call Trace:
localhost kernel: [21572436.601609] dump_stack+0x7c/0x9c
localhost kernel: [21572436.601612] warn_alloc.cold+0x7b/0xdf
localhost kernel: [21572436.601615] ? _cond_resched+0x15/0x30
localhost kernel: [21572436.601617] ? __alloc_pages_direct_compact+0x141/0x150
localhost kernel: [21572436.601618] __alloc_pages_slowpath+0xd88/0xdc0
localhost kernel: [21572436.601622] ? node_reclaim+0x2b1/0x310
localhost kernel: [21572436.601624] ? get_page_from_freelist+0xaf/0x3a0
localhost kernel: [21572436.601625] __alloc_pages_nodemask+0x2bf/0x310
localhost kernel: [21572436.601628] __dma_direct_alloc_pages+0x137/0x220
localhost kernel: [21572436.601630] dma_direct_alloc_pages+0x1c/0x80
localhost kernel: [21572436.601639] _ctl_do_mpt_command+0x724/0xc40 [mpt3sas]
localhost kernel: [21572436.601642] ? ima_file_check+0x59/0x80
localhost kernel: [21572436.601646] _ctl_compat_mpt_command+0xd1/0x100 [mpt3sas]
localhost kernel: [21572436.601651] _ctl_ioctl_main+0x4e0/0xb80 [mpt3sas]
localhost kernel: [21572436.601655] ? __ia32_compat_sys_ioctl+0x189/0x210
localhost kernel: [21572436.601656] __ia32_compat_sys_ioctl+0x189/0x210
localhost kernel: [21572436.601659] do_int80_syscall_32+0x6e/0x1d0
localhost kernel: [21572436.601660] entry_INT80_compat+0x85/0x90
localhost kernel: [21572436.601669] Mem-Info:
localhost kernel: [21572436.601672] active_anon:9743919 inactive_anon:513867 isolated_anon:0
localhost kernel: [21572436.601672] active_file:35892 inactive_file:14339 isolated_file:0
localhost kernel: [21572436.601672] unevictable:0 dirty:398 writeback:1 unstable:0
localhost kernel: [21572436.601672] slab_reclaimable:51419 slab_unreclaimable:4912133
localhost kernel: [21572436.601672] mapped:18355 shmem:22661 pagetables:53364 bounce:0
localhost kernel: [21572436.601672] free:1065699 free_pcp:351 free_cma:0
localhost kernel: [21572436.601675] Node 0 active_anon:38975676kB inactive_anon:2055468kB active_file:143568kB inactive_file:57356kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:73420kB dirty:1592kB writeback:4kB shmem:90644kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
localhost kernel: [21572436.601675] Node 0 DMA free:15884kB min:12kB low:24kB high:36kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15968kB managed:15884kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
localhost kernel: [21572436.601678] lowmem_reserve[]: 0 1784 64117 64117
localhost kernel: [21572436.601679] Node 0 DMA32 free:255804kB min:1892kB low:3788kB high:5684kB active_anon:170384kB inactive_anon:80484kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:1965184kB managed:1899648kB mlocked:0kB kernel_stack:0kB pagetables:56kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
localhost kernel: [21572436.601682] lowmem_reserve[]: 0 0 62333 62333
localhost kernel: [21572436.601683] Node 0 Normal free:3991108kB min:63624kB low:127460kB high:191296kB active_anon:38805292kB inactive_anon:1974984kB active_file:143684kB inactive_file:57032kB unevictable:0kB writepending:1596kB present:65011712kB managed:63836092kB mlocked:0kB kernel_stack:5604kB pagetables:213400kB bounce:0kB free_pcp:1404kB local_pcp:232kB free_cma:0kB
localhost kernel: [21572436.601686] lowmem_reserve[]: 0 0 0 0
localhost kernel: [21572436.601687] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15884kB
localhost kernel: [21572436.601694] Node 0 DMA32: 14687*4kB (UME) 10010*8kB (UME) 7183*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB (H) 0*4096kB = 255804kB
localhost kernel: [21572436.601697] Node 0 Normal: 297793*4kB (UM) 129409*8kB (UM) 110330*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3991724kB
localhost kernel: [21572436.601701] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
localhost kernel: [21572436.601702] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
localhost kernel: [21572436.601702] 107240 total pagecache pages
localhost kernel: [21572436.601707] 34281 pages in swap cache
localhost kernel: [21572436.601708] Swap cache stats: add 18740072, delete 18705912, find 159408767/161694352
localhost kernel: [21572436.601708] Free swap = 4913860kB
localhost kernel: [21572436.601708] Total swap = 33554424kB
localhost kernel: [21572436.601709] 16748216 pages RAM
localhost kernel: [21572436.601709] 0 pages HighMem/MovableOnly
localhost kernel: [21572436.601709] 310310 pages reserved
localhost kernel: [21572436.601710] 0 pages cma reserved
localhost kernel: [21572436.601710] 0 pages hwpoisoned
localhost kernel: [21572436.601711] failure at drivers/scsi/mpt3sas/mpt3sas_ctl.c:763/_ctl_do_mpt_command()!
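
For anyone reading along: the failing request is order:4, i.e. 16 contiguous pages (64 kB with 4 kB pages), so the per-zone free-list lines ("Node 0 Normal: 297793*4kB ...") are the part of the dump that shows how many blocks of each size were free. Below is a minimal Python sketch for tallying those lines. The function names are hypothetical, it assumes 4 kB base pages, and it ignores the migrate-type letters in parentheses, so treat it as a reading aid rather than a diagnosis.

```python
import re

# Matches "count*sizekB" entries such as "297793*4kB" in a free-list line.
BLOCK_RE = re.compile(r"(\d+)\*(\d+)kB")

PAGE_KB = 4  # assumption: 4 kB base pages, matching the smallest column in the dump


def free_blocks_by_order(line):
    """Return {order: free block count} for one per-zone free-list line."""
    blocks = {}
    for count, size_kb in BLOCK_RE.findall(line):
        # Block size is PAGE_KB * 2**order, so recover the order from the size.
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        blocks[order] = int(count)
    return blocks


def blocks_at_or_above(line, order):
    """Count free blocks large enough to satisfy an allocation of `order`."""
    return sum(n for o, n in free_blocks_by_order(line).items() if o >= order)


# The "Node 0 Normal" line copied from the log above (prefix trimmed).
normal = ("Node 0 Normal: 297793*4kB (UM) 129409*8kB (UM) 110330*16kB (UME) "
          "0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB "
          "= 3991724kB")

print(blocks_at_or_above(normal, 4))  # order 4 = 16 pages = 64 kB
```

Run against the Normal zone line from the dump, this prints 0: roughly 3.9 GB is free in that zone, but all of it sits in blocks of 16 kB or smaller. The same function can be pointed at the DMA and DMA32 lines for comparison.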