I have a master and 3 slaves running the buggy 8.0.29 release of MySQL. Yesterday 2 of my slaves stopped synchronizing, and I couldn't complete a full resync from a fresh dump because I kept getting "error 2013 lost connection to SQL server" when reloading the dump file on either of the downed slaves. I tried to resolve the error by editing my cnf file with various directives, but that didn't work. Neither did increasing the server's RAM.
That's when I found out that **23 tables in 7 databases I host had been corrupted by replay corruption from the redo log bug in 8.0.29**. We've been running this version for a while, so none of our backups go back far enough to predate it.
So I'm painted into a corner: SQL dumps aren't working, I have no real backups, and only 1 master and 1 slave remain. Luckily I was able to stop replication on my remaining slave and take a VM-level backup of it with the databases quiesced. That slave is back up and fully synced with the master now, so I can fall back on the backup if things go south while upgrading.
From what I understand, the fix is to upgrade with empty redo logs. I also understand that impacted tables may (?) be fixed by running OPTIMIZE TABLE on them. My questions are the following:
- Has anyone been in a similar situation? What steps did you take (or should I take) to get whole again?
- Will a MySQL server restart sufficiently empty the redo logs so I can proceed with a MySQL version upgrade?
- If not, what should I do?
- Will optimizing my tables remove this corruption?
- If so, should I optimize before or after the MySQL upgrade?
- My total dump file is 30 GB. Should I optimize just the bad tables? Probably.
- What other resources or advice can you offer?
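
To make the "just the bad tables" question concrete, here is a rough Python sketch of how I would generate the statements for only the affected tables rather than touching everything. The schema and table names are placeholders (not my real ones), and the helper names are made up for illustration. Whether OPTIMIZE TABLE actually clears this corruption, and whether it belongs before or after the upgrade, is exactly what I'm asking, so treat this as a sketch of the idea and not a known-good procedure.

```python
# Hypothetical list of (schema, table) pairs flagged as corrupted.
# These names are placeholders for illustration only.
AFFECTED_TABLES = [
    ("shop_db", "orders"),
    ("shop_db", "customers"),
    ("blog_db", "posts"),
]


def quote_ident(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_statements(tables):
    """Return CHECK TABLE and OPTIMIZE TABLE statements for each listed table."""
    statements = []
    for schema, table in tables:
        qualified = f"{quote_ident(schema)}.{quote_ident(table)}"
        # CHECK TABLE first, so the state before the rebuild is recorded.
        statements.append(f"CHECK TABLE {qualified};")
        # OPTIMIZE TABLE is the rebuild I am asking about.
        statements.append(f"OPTIMIZE TABLE {qualified};")
    return statements


if __name__ == "__main__":
    # Print the statements so they can be reviewed before running anything.
    for stmt in build_statements(AFFECTED_TABLES):
        print(stmt)
```

The output is plain SQL that I could review and then feed to the mysql client one table at a time, which seems safer than rebuilding everything in a 30 GB data set in one go.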