Galera cluster node fails with "InnoDB: Conflicting lock on table" error

I have a Galera Cluster (MariaDB 10.5) that consists of 3 nodes (one of which is an arbitrator).

Last week I had two incidents: both nodes just stopped responding, and the following was found in the logs:

2023-04-10 23:35:42 1 [ERROR] InnoDB: Conflicting lock on table: `DB`.`Table` index: PRIMARY that has lock                                                                                                                                                      
RECORD LOCKS space id 4776 page no 1156 n bits 208 index PRIMARY of table `DB`.`Table` trx id 5338132732 lock_mode X locks rec but not gap                                                                                                                      
Record lock, heap no 132 PHYSICAL RECORD: n_fields 12; compact format; info bits 0                                                                                                                                                                                            
 0: len 4; hex 800c9f7f; asc     ;;                                                                                                                                                                                                                                           
 1: len 6; hex 00013e2d70fc; asc   >-p ;;                                                                                                                                                                                                                                     
 2: len 7; hex 35000002c3267e; asc 5    &~;;                                                                                                                                                                                                                                  
 3: len 30; hex 39393730356136322d633566352d346531352d396366642d323531313764; asc 99705a62-c5f5-4e15-9cfd-25117d; (total 36 bytes);                                                                                                                                           
 4: len 7; hex 99afd578941f5e; asc    x  ^;;                                                                                                                                                                                                                                  
 5: len 7; hex 99afd578ea0442; asc    x  B;;                                                                                                                                                                                                                                  
 6: SQL NULL;                                                                                                                                                                                                                                                                 
 7: len 6; hex 414354495645; asc ACTIVE;;                                                                                                                                                                                                                                     
 8: len 23; hex 556e7469746c656420476f6f676c652052657669657773; asc Name;;                                                                                                                                                                                 
 9: len 4; hex 80000004; asc     ;;                                                                                                                                                                                                                                           
 10: len 4; hex 80012d67; asc   -g;;                                                                                                                                                                                                                                          
 11: SQL NULL;                                                                                                                                                                                                                                                                

2023-04-10 23:35:42 1 [ERROR] InnoDB: WSREP state:                                                                                                                                                                                                                            
2023-04-10 23:35:42 1 [ERROR] WSREP: Thread BF trx_id: 5338132734 thread: 1 seqno: 2620904850 client_state: exec client_mode: high priority transaction_mode: replaying applier: 1 toi: 0 local: 0 query: INSERT INTO `DB`.`Table2` (`id`,`pid`,`createdAt`,`updatedAt`,`settings`,`appliedMigrations`,`TableId`,`appReleaseId`) VALUES (NULL,'497b7e26-27da-4e59-8db3-8d8a584e9c06',TIMESTAMP'2023-04-10 23:35:41.487000',TIMESTAMP'2023-04-10 23:35:41.487000','"partialy displayed json string goes here
2023-04-10 23:35:42 1 [ERROR] WSREP: Thread BF trx_id: 5338132732 thread: 2 seqno: 2620904849 client_state: exec client_mode: high priority transaction_mode: committing applier: 1 toi: 0 local: 0 query: NULL                                                               
2023-04-10 23:35:42 0x7edc87c30700  InnoDB: Assertion failure in file /home/buildbot/buildbot/build/mariadb-10.5.9/storage/innobase/lock/lock0lock.cc line 674
InnoDB: We intentionally generate a memory trap.                                                                                                                                                                                                          
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/                                                                                                                                                                                         
InnoDB: If you get repeated assertion failures or crashes, even                                                                                                                                                                                           
InnoDB: immediately after the mysqld startup, there may be                                                                                                                                                                                                
InnoDB: corruption in the InnoDB tablespace. Please refer to                                                                                                                                                                                              
InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/                                                                                                                                                                                          
InnoDB: about forcing recovery.                                                                                                                                                                                                                           
230410 23:35:42 [ERROR] mysqld got signal 6 ;                                                                                                                                                                                                             
This could be because you hit a bug. It is also possible that this binary                                                                                                                                                                                 
or one of the libraries it was linked against is corrupt, improperly built,                                                                                                                                                                               
or misconfigured. This error can also be caused by malfunctioning hardware.                                                                                                                                                                               

To report this bug, see https://mariadb.com/kb/en/reporting-bugs                                                                                                                                                                                          

We will try our best to scrape up some info that will hopefully help                                                                                                                                                                                      
diagnose the problem, but since we have already crashed,                                                                                                                                                                                                  
something is definitely wrong and this may fail.                                                                                                                                                                                                          

Server version: 10.5.9-MariaDB-1:10.5.9+maria~focal-log                                                                                                                                                                                                   
key_buffer_size=134217728                                                                                                                                                                                                                                 
read_buffer_size=2097152                                                                                                                                                                                                                                  
max_used_connections=887                                                                                                                                                                                                                                  
max_threads=2002                                                                                                                                                                                                                                          
thread_count=321                                                                                                                                                                                                                                          
It is possible that mysqld could use up to                                                                                                                                                                                                                
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 12482229 K  bytes of memory                                                                                                                                                         
Hope that's ok; if not, decrease some variables in the equation.                                                                                                                                                                                          

Thread pointer: 0x7f17404c1a18                                                                                                                                                                                                                            
Attempting backtrace. You can use the following information to find out                                                                                                                                                                                   
where mysqld died. If you see no messages after this, something went                                                                                                                                                                                      
terribly wrong...                                                                                                                                                                                                                                         
stack_bottom = 0x7edc87c2fd58 thread_stack 0x49000                                                                                                                                                                                                        
2023-04-10 23:35:44 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT2.44926S), skipping check                                                                                                                                              
/usr/sbin/mariadbd(my_print_stacktrace+0x32)[0x564838d1caf2]                                                                                                                                                                                              
/usr/sbin/mariadbd(handle_fatal_signal+0x485)[0x5648387726f5]      
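
To see at a glance which transactions are involved, the "Thread BF" lines and the "RECORD LOCKS" line can be pulled apart with a short script. This is only a rough sketch: the regular expressions are written against the lines pasted above and are assumed, not guaranteed, to match the format printed by other MariaDB versions. The names `parse_bf_threads`, `lock_holder` and `SAMPLE` are made up for this illustration, and the query text in `SAMPLE` is abbreviated.

```python
import re

# Abbreviated copies of three lines from the log above (query text shortened).
SAMPLE = """\
RECORD LOCKS space id 4776 page no 1156 n bits 208 index PRIMARY of table `DB`.`Table` trx id 5338132732 lock_mode X locks rec but not gap
2023-04-10 23:35:42 1 [ERROR] WSREP: Thread BF trx_id: 5338132734 thread: 1 seqno: 2620904850 client_state: exec client_mode: high priority transaction_mode: replaying applier: 1 toi: 0 local: 0 query: INSERT ...
2023-04-10 23:35:42 1 [ERROR] WSREP: Thread BF trx_id: 5338132732 thread: 2 seqno: 2620904849 client_state: exec client_mode: high priority transaction_mode: committing applier: 1 toi: 0 local: 0 query: NULL
"""

# Assumed line format, taken from the log in this post only.
BF_LINE = re.compile(
    r"WSREP: Thread BF trx_id: (?P<trx_id>\d+) thread: (?P<thread>\d+) "
    r"seqno: (?P<seqno>\d+) .*?transaction_mode: (?P<mode>\w+) "
    r"applier: (?P<applier>\d) toi: (?P<toi>\d) local: (?P<local>\d)"
)
NUMERIC_FIELDS = ("trx_id", "thread", "seqno", "applier", "toi", "local")


def parse_bf_threads(log_text):
    """Return one dict per 'Thread BF' line, with numeric fields as ints."""
    threads = []
    for match in BF_LINE.finditer(log_text):
        row = match.groupdict()
        for key in NUMERIC_FIELDS:
            row[key] = int(row[key])
        threads.append(row)
    return threads


def lock_holder(log_text, threads):
    """Return the BF thread whose trx id appears on the RECORD LOCKS line."""
    match = re.search(r"RECORD LOCKS .*? trx id (\d+)", log_text)
    if match is None:
        return None
    holder_id = int(match.group(1))
    return next((t for t in threads if t["trx_id"] == holder_id), None)


if __name__ == "__main__":
    bf_threads = parse_bf_threads(SAMPLE)
    for t in bf_threads:
        print(t["trx_id"], t["seqno"], t["mode"], "local" if t["local"] else "applier")
    print("lock held by seqno:", lock_holder(SAMPLE, bf_threads)["seqno"])
```

Run against the lines above, this shows that the record lock is held by trx 5338132732 (seqno 2620904849, committing) and that both threads are marked `applier: 1` and `local: 0`, i.e. both are replicated write sets rather than local client transactions.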

Is this issue caused by the transaction itself, or could it be a cluster misconfiguration? Where should I look?

asktyagi
Please add more logs, especially those immediately after the ones above, and check whether a crash is reported.
Vladimir Ivanenko
Hi @asktyagi, thanks for looking into it. Sure, I've just added additional logs.
asktyagi
This looks like a bug to me, either https://jira.mariadb.org/browse/MDEV-25405 or https://jira.mariadb.org/browse/MDEV-25427, but the MariaDB team can give you a definitive answer.
Vladimir Ivanenko
Thank you @asktyagi, that sounds reasonable. I see both bugs are fixed, so an update should probably help. I appreciate your help!
Vladimir Ivanenko
Three weeks after the cluster was updated, no further incidents have occurred, so the update did the job.