Score:1

Debugging Mariadb randomly crashing

br flag

These are the logs journalctl -u mariadb -f from the past days at the moment of the crash:

Mar 27 14:25:35 new-server.novalocal mariadbd[30060]: 2023-03-27 14:25:35 40923 [Warning] Access denied for user 'root'@'182.160.117.92' (using password: YES)
Mar 27 14:25:36 new-server.novalocal mariadbd[30060]: 2023-03-27 14:25:36 40924 [Warning] Access denied for user 'root'@'182.160.117.92' (using password: YES)
Mar 27 14:25:37 new-server.novalocal mariadbd[30060]: 2023-03-27 14:25:37 40925 [Warning] Access denied for user 'root'@'182.160.117.92' (using password: YES)
Mar 27 14:25:38 new-server.novalocal mariadbd[30060]: 2023-03-27 14:25:38 40926 [Warning] Access denied for user 'root'@'182.160.117.92' (using password: YES)
Mar 27 14:56:50 new-server.novalocal mariadbd[30060]: 2023-03-27 14:56:50 41477 [Warning] Hostname 'zg-1220c-69.stretchoid.com' does not resolve to '192.241.234.8'.
Mar 27 14:56:50 new-server.novalocal mariadbd[30060]: 2023-03-27 14:56:50 41477 [Note] Hostname 'zg-1220c-69.stretchoid.com' has the following IP addresses:
Mar 27 14:56:50 new-server.novalocal mariadbd[30060]: 2023-03-27 14:56:50 41477 [Note]  - 127.0.0.1
Mar 27 14:56:50 new-server.novalocal mariadbd[30060]: 2023-03-27 14:56:50 41477 [Warning] Aborted connection 41477 to db: 'unconnected' user: 'unauthenticated' host: '192.241.234.8' (This connection closed normally without aut
Mar 27 15:14:38 new-server.novalocal systemd[1]: mariadb.service: main process exited, code=killed, status=9/KILL
Mar 27 15:14:38 new-server.novalocal systemd[1]: Unit mariadb.service entered failed state.
Mar 27 15:14:38 new-server.novalocal systemd[1]: mariadb.service failed.
Mar 27 15:14:43 new-server.novalocal systemd[1]: mariadb.service holdoff time over, scheduling restart.
Mar 27 15:14:43 new-server.novalocal systemd[1]: Stopped MariaDB 10.6.4 database server.
Mar 27 15:14:43 new-server.novalocal systemd[1]: Starting MariaDB 10.6.4 database server...
Mar 27 15:14:44 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:44 0 [Note] Using unique option prefix 'character-set-client' is error-prone and can break in the future. Please use the full name 'character-set-client-ha
Mar 27 15:14:44 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:44 0 [Warning] /usr/sbin/mariadbd: ignoring option '--character-set-client-handshake' due to invalid value 'utf8mb4'
Mar 27 15:14:44 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:44 0 [Note] Using unique option prefix 'key_buffer' is error-prone and can break in the future. Please use the full name 'key_buffer_size' instead.
Mar 27 15:14:44 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:44 0 [Note] /usr/sbin/mariadbd (server 10.6.4-MariaDB) starting as process 18423 ...
Mar 27 15:14:44 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:44 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
Mar 27 15:14:44 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:44 0 [Note] InnoDB: Number of pools: 1
Mar 27 15:14:44 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:44 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
Mar 27 15:14:44 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:44 0 [Note] InnoDB: Using Linux native AIO
Mar 27 15:14:44 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:44 0 [Note] InnoDB: Initializing buffer pool, total size = 8589934592, chunk size = 134217728
Mar 27 15:14:44 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:44 0 [Note] InnoDB: Completed initialization of buffer pool
Mar 27 15:14:44 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:44 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=549981281636,549981636145
Mar 27 15:14:51 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:51 0 [Note] InnoDB: Starting final batch to recover 3777 pages from redo log.
Mar 27 15:14:55 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:55 0 [Note] InnoDB: 128 rollback segments are active.
Mar 27 15:14:56 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:56 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
Mar 27 15:14:56 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:56 0 [Note] InnoDB: Creating shared tablespace for temporary tables
Mar 27 15:14:56 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:56 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
Mar 27 15:14:56 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:56 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
Mar 27 15:14:56 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:56 0 [Note] InnoDB: 10.6.4 started; log sequence number 550168329917; transaction id 2374116794
Mar 27 15:14:56 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:56 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
Mar 27 15:14:56 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:56 0 [Note] Plugin 'FEEDBACK' is disabled.
Mar 27 15:14:56 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:56 0 [Note] Server socket created on IP: '0.0.0.0'.
Mar 27 15:14:56 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:56 0 [Note] Server socket created on IP: '::'.
Mar 27 15:14:56 new-server.novalocal mariadbd[18423]: 2023-03-27 15:14:56 0 [Note] /usr/sbin/mariadbd: ready for connections.
Mar 27 15:14:56 new-server.novalocal mariadbd[18423]: Version: '10.6.4-MariaDB'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server
Mar 27 15:14:56 new-server.novalocal systemd[1]: Started MariaDB 10.6.4 database server.
Mar 27 15:15:04 new-server.novalocal mariadbd[18423]: 2023-03-27 15:15:04 12 [Warning] IP address '45.84.208.80' could not be resolved: Name or service not known
Mar 27 15:25:12 new-server.novalocal mariadbd[18423]: 2023-03-27 15:25:12 0 [Note] InnoDB: Buffer pool(s) load completed at 230327 15:25:12
Mar 27 15:25:34 new-server.novalocal mariadbd[18423]: 2023-03-27 15:25:34 403 [ERROR] mariadbd: Table './abac/abac_logger' is marked as crashed and should be repaired
Mar 27 15:25:34 new-server.novalocal mariadbd[18423]: 2023-03-27 15:25:34 403 [Warning] Checking table:   './abac/abac_logger'
Mar 27 16:38:30 new-server.novalocal mariadbd[18423]: 2023-03-27 16:38:30 3625 [Warning] Aborted connection 3625 to db: 'unconnected' user: 'unauthenticated' host: 'nat.en.net.ua' (This connection closed normally without authe
Mar 27 16:38:30 new-server.novalocal mariadbd[18423]: 2023-03-27 16:38:30 3626 [Warning] Access denied for user 'root'@'nat.en.net.ua' (using password: NO)
Mar 27 16:38:30 new-server.novalocal mariadbd[18423]: 2023-03-27 16:38:30 3627 [Warning] Access denied for user 'root'@'nat.en.net.ua' (using password: YES)
Mar 27 16:38:31 new-server.novalocal mariadbd[18423]: 2023-03-27 16:38:31 3629 [Warning] Access denied for user 'root'@'nat.en.net.ua' (using password: YES)
Mar 27 16:38:31 new-server.novalocal mariadbd[18423]: 2023-03-27 16:38:31 3630 [Warning] Access denied for user 'root'@'nat.en.net.ua' (using password: YES)
Mar 27 16:38:31 new-server.novalocal mariadbd[18423]: 2023-03-27 16:38:31 3631 [Warning] Access denied for user 'root'@'nat.en.net.ua' (using password: YES)
Mar 27 16:38:31 new-server.novalocal mariadbd[18423]: 2023-03-27 16:38:31 3632 [Warning] Access denied for user 'root'@'nat.en.net.ua' (using password: YES)
Mar 27 16:38:31 new-server.novalocal mariadbd[18423]: 2023-03-27 16:38:31 3633 [Warning] Access denied for user 'root'@'nat.en.net.ua' (using password: YES)

Additional info:

  • Server version: 10.6.4-MariaDB MariaDB Server
  • CentOS Linux release 7.9.2009 (Core)
  • 16GB of RAM, and about 9GB Innodb size (soon going to increase both)

Any feedback on what could be the reason of the crash?

Update 1: Found cause thanks to comment from @AlexD

dmesg -T | egrep -i 'killed process'
[Sun Mar 26 00:25:56 2023] Killed process 22107 (mariadbd), UID 27, total-vm:11617328kB, anon-rss:8841236kB, file-rss:0kB, shmem-rss:0kB
[Mon Mar 27 15:05:47 2023] Killed process 30060 (mariadbd), UID 27, total-vm:11726556kB, anon-rss:8919200kB, file-rss:0kB, shmem-rss:0kB

And also:

dmesg -T | less
[Fri Feb 24 11:55:34 2023] Out of memory: Kill process 28666 (mariadbd) score 630 or sacrifice child
[Fri Feb 24 11:55:34 2023] Killed process 28666 (mariadbd), UID 27, total-vm:12708960kB, anon-rss:10227836kB, file-rss:0kB, shmem-rss:0kB
[Fri Feb 24 12:46:34 2023] php-fpm[22177]: segfault at 7fa2bb08ca31 ip 0000559801a1e551 sp 00007fff0fca6610 error 4 in php-fpm[5598017bd000+3a6000]
[Fri Feb 24 13:43:02 2023] php-fpm[13587]: segfault at 7fa29c026e31 ip 0000559801a1e551 sp 00007fff0fca6aa0 error 4 in php-fpm[5598017bd000+3a6000]
[Sat Feb 25 00:51:50 2023] sh (11581): drop_caches: 3
[Sun Feb 26 00:51:48 2023] sh (10278): drop_caches: 3
[Mon Feb 27 00:51:48 2023] sh (1543): drop_caches: 3
[Tue Feb 28 00:51:46 2023] sh (29036): drop_caches: 3
[Tue Feb 28 08:56:36 2023] mariadbd invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[Tue Feb 28 08:56:36 2023] mariadbd cpuset=/ mems_allowed=0
[Tue Feb 28 08:56:36 2023] CPU: 2 PID: 9536 Comm: mariadbd Kdump: loaded Not tainted 3.10.0-1160.36.2.el7.x86_64 #1
[Tue Feb 28 08:56:36 2023] Hardware name: RDO OpenStack Compute, BIOS 1.10.2-3.el7_4.1 04/01/2014
[Tue Feb 28 08:56:36 2023] Call Trace:
[Tue Feb 28 08:56:36 2023]  [<ffffffff86183559>] dump_stack+0x19/0x1b
[Tue Feb 28 08:56:36 2023]  [<ffffffff8617e5f8>] dump_header+0x90/0x229
[Tue Feb 28 08:56:36 2023]  [<ffffffff85d0f18b>] ? cred_has_capability+0x6b/0x120
[Tue Feb 28 08:56:36 2023]  [<ffffffff85bc252d>] oom_kill_process+0x2cd/0x490
[Tue Feb 28 08:56:36 2023]  [<ffffffff85d0f25c>] ? selinux_capable+0x1c/0x40
[Tue Feb 28 08:56:36 2023]  [<ffffffff85bc2c1a>] out_of_memory+0x31a/0x500
[Tue Feb 28 08:56:36 2023]  [<ffffffff85bc9834>] __alloc_pages_nodemask+0xad4/0xbe0
[Tue Feb 28 08:56:36 2023]  [<ffffffff85c19388>] alloc_pages_current+0x98/0x110
[Tue Feb 28 08:56:36 2023]  [<ffffffff85bbdfe7>] __page_cache_alloc+0x97/0xb0
[Tue Feb 28 08:56:36 2023]  [<ffffffff85bc0f80>] filemap_fault+0x270/0x420
[Tue Feb 28 08:56:36 2023]  [<ffffffffc046391e>] __xfs_filemap_fault+0x7e/0x1d0 [xfs]
[Tue Feb 28 08:56:36 2023]  [<ffffffffc0463b1c>] xfs_filemap_fault+0x2c/0x30 [xfs]
[Tue Feb 28 08:56:36 2023]  [<ffffffff85bee78a>] __do_fault.isra.61+0x8a/0x100
[Tue Feb 28 08:56:36 2023]  [<ffffffff85beed3c>] do_read_fault.isra.63+0x4c/0x1b0
[Tue Feb 28 08:56:36 2023]  [<ffffffff85bf6580>] handle_mm_fault+0xa20/0xfb0
[Tue Feb 28 08:56:36 2023]  [<ffffffff86190653>] __do_page_fault+0x213/0x500
[Tue Feb 28 08:56:36 2023]  [<ffffffff86190a26>] trace_do_page_fault+0x56/0x150
[Tue Feb 28 08:56:36 2023]  [<ffffffff8618ffa2>] do_async_page_fault+0x22/0xf0
[Tue Feb 28 08:56:36 2023]  [<ffffffff8618c7a8>] async_page_fault+0x28/0x30
[Tue Feb 28 08:56:36 2023] Mem-Info:
[Tue Feb 28 08:56:36 2023] active_anon:3700632 inactive_anon:232131 isolated_anon:0
 active_file:2112 inactive_file:2402 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:7681 slab_unreclaimable:19877
 mapped:57615 shmem:259768 pagetables:20522 bounce:0
 free:34106 free_pcp:271 free_cma:0
[Tue Feb 28 08:56:36 2023] Node 0 DMA free:15908kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[Tue Feb 28 08:56:36 2023] lowmem_reserve[]: 0 2812 15866 15866
jp flag
SIGKILL indicates that it is most probably killed by OOM killer.
Alexandru Trandafir Catalin avatar
br flag
Thanks @AlexD I was thinking of that but not sure as I haven't found before an actual log proving it, but I did now thanks to your comment! I'll update the post with the findings!
Score:1
br flag

Answering my own question with the solution found thanks to a comment:

@AlexD: SIGKILL indicates that it is most probably killed by OOM killer.

Based on that, if you do more research on OOM Killer, you find you that you can find killed system processes by running:

dmesg -T | egrep -i 'killed process'

And also browsing the full logs with:

dmesg -T | less

With that the cause was found, and the final solution for now was to increment the server RAM because by the size of Innodb, the operating system doesn't have enough remaining RAM.

I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.