
MySQL Service Not Starting in Cluster with DRBD


I have a high-availability cluster with two nodes. After maintenance carried out by an external company, my MySQL resource stopped working when the system was restarted.

Executing the drbd-overview command gives the following output.

Primary node:
0:home Connected Primary/Secondary UpToDate/UpToDate C r-----
1:storage Connected Secondary/Primary UpToDate/UpToDate C r-----
2:mysql StandAlone Secondary/Unknown UpToDate/Outdated r-----

Secondary node:
0:home Connected Secondary/Primary UpToDate/UpToDate C r-----
1:storage Connected Primary/Secondary UpToDate/UpToDate C r-----
2:mysql StandAlone Primary/Unknown UpToDate/Outdated r-----
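
The StandAlone / Unknown connection state on resource 2 (mysql) on both nodes, while each side still considers its own data UpToDate, is the typical signature of a DRBD split-brain. As a minimal sketch, any resource that has lost replication can be flagged from the output above; the drbd_overview_sample variable here is just the pasted text, used as sample data:

```shell
# Any resource whose connection state (2nd field) is not "Connected"
# has lost replication. Sample lines copied from the secondary node above.
drbd_overview_sample='0:home Connected Secondary/Primary UpToDate/UpToDate C r-----
1:storage Connected Primary/Secondary UpToDate/UpToDate C r-----
2:mysql StandAlone Primary/Unknown UpToDate/Outdated r-----'

echo "$drbd_overview_sample" | awk '$2 != "Connected" { print $1, "is", $2 }'
# prints: 2:mysql is StandAlone
```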

I really don't know what the problem is. On other forums I was told that it is a MySQL problem and that I should start the service with the following command:

/etc/init.d/mysql start

but this doesn't work.

I checked the /var/lib/mysql directory on both nodes. I realized that I don't have the ibdatadir file on node 2 but I do have it on node 1. I don't know if this has something to do with it.

File clsstd2.err:

mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
/usr/sbin/mysqld: Table 'mysql.plugin' doesn't exist
[ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
InnoDB: Operating system error number 13 in a file operation.
InnoDB: The error means mysqld does not have the access rights to the directory.
InnoDB: File name ./ibdata1
InnoDB: File operation call: 'create'
InnoDB: Cannot continue operation
mysqld_safe mysqld from pid file /var/lib/mysql/clsstd2.pid ended
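
Operating system error number 13 is EACCES (permission denied). Assuming the DRBD-backed filesystem is mounted at /var/lib/mysql on the node where MySQL should run, a quick sanity check might look like this; the mysql:mysql ownership is the usual packaged default, an assumption here:

```shell
# Check who owns the datadir and the InnoDB system tablespace
ls -ld /var/lib/mysql
ls -l /var/lib/mysql/ibdata1

# If ownership is wrong (e.g. root:root after a restore or remount),
# restore it so mysqld can open and create files there:
chown -R mysql:mysql /var/lib/mysql
```

Note that these checks are only meaningful on the node where the DRBD device for mysql is Primary and mounted; on the peer the directory will look empty.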

Executing the crm_mon -1 command gives the following:

=========
Stack: openais
Current DC: clsstd1 - partition with quorum
Version: 1.1.5-1.1.e15-01e86afaaa6da8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
4 Resources configured.
==========
Online: [ clsstd1 clsstd2 ]
Full list of resources:
Resource Group: all

virtual_ip_1        (ocf::heartbeat:IPaddr2):       stopped
virtual_ip_2        (ocf::heartbeat:IPaddr2):       stopped
virtual_ip_3        (ocf::heartbeat:IPaddr2):       stopped

fs_home (ocf::heartbeat:Filesystem):    stopped
fs_mysql    (ocf::heartbeat:Filesystem):    stopped
fs_storage  (ocf::heartbeat:Filesystem):    stopped
mysql       (ocf::heartbeat:mysql):     stopped
httpd       (ocf::heartbeat:apache):        stopped
swengined   (lsb:/user/lib/ocf/resource.d/streamwide/swengine):   stopped

Master/Slave Set: ms_drbd_home [drbd_home]
Masters: [ clsstd1 ]
Slaves: [ clsstd2 ]
Master/Slave Set: ms_drbd_mysql [drbd_mysql]
Masters: [ clsstd1 ]
Slaves: [ clsstd2 ]
Master/Slave Set: ms_drbd_storage [drbd_storage]
Masters: [ clsstd1 ]
Slaves: [ clsstd2 ]

Migration summary:
*  Node clsstd1:
*  Node clsstd2:
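
Since every resource in the group is stopped, it can also help to look at Pacemaker's failure history and clear it once the underlying problem is fixed. A sketch using the crm shell and crm_mon that ship with Pacemaker 1.1; the resource names are taken from the crm_mon output above:

```shell
# Show all resources (including inactive ones) plus fail counts
# and any failed actions:
crm_mon -1rf

# After repairing the DRBD device, clear the failure history so
# Pacemaker will try to start the resources again:
crm resource cleanup mysql
crm resource cleanup fs_mysql
```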
asktyagi: Please update the question with the MySQL logs.
Matt Kereczman: "I checked the /var/lib/mysql directory on both nodes. I realized that I don't have the ibdatadir file on node 2 but I do have it on node 1. I don't know if this has something to do with it." That is normal. Only on the node where DRBD is Primary (and the DRBD device is mounted) will you be able to access the filesystem that's being replicated. It looks like you might have a split-brain in your DRBD device. I would try failing the mysql service over to the peer node to see if it starts successfully there, and let us know how that goes.
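
If the split-brain suspicion is confirmed, DRBD 8.x can be reconnected manually. A sketch of the standard manual recovery, assuming the resource is named mysql as in the drbd-overview output; choose the victim node carefully, because its local changes to this resource are thrown away:

```shell
# On the node whose copy of the data you are willing to discard (victim):
drbdadm disconnect mysql
drbdadm secondary mysql
drbdadm connect --discard-my-data mysql    # DRBD 8.4+ syntax
# (older DRBD 8.3: drbdadm -- --discard-my-data connect mysql)

# On the node whose data you want to keep (survivor):
drbdadm connect mysql    # precede with "drbdadm disconnect mysql" if StandAlone
```

After the resync finishes, both sides should show Connected and UpToDate/UpToDate again, and the cluster can be allowed to start the mysql resource group.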
Iván Jf: Hello, thanks for answering. As I mentioned before, I have looked for help in other forums and have not been able to solve the problem. I've searched and I can't find the MySQL logs; the only file that exists inside the directories is clsstd2.err, which describes the MySQL problems. I have implemented solutions recommended on other forums without success. I'm new to this and didn't quite understand the part "I would try failing the mysql service over to the peer node to see if it starts successfully there, and let us know how that goes".
Iván Jf: I have edited the question and added the content of the clsstd2.err file.
Dok: Is this cluster managed by Pacemaker? If so, what is the state of the cluster? Anything from a `crm_mon -1`? Perhaps it's an older cluster running RHCS or Heartbeat?
Iván Jf: Yes, it is managed by Pacemaker and running Heartbeat; I have added an image of this to the question.