NFS cluster with Pacemaker and Corosync fails with stale file handle on failover

Nicola Urbinati

8/23/24, 8:32 AM

I have a 3-node Pacemaker/Corosync cluster on Ubuntu 22.04 that manages some NFS shares.

I share two GlusterFS volumes (used for sharded storage across the three nodes), one SSD-backed and one HDD-backed:

showmount -e
Export list for nfs2:
/HDD5T/nfsshare/exports/HDD5T x.x.x.x/24
/HDD5T/nfsshare/exports       x.x.x.x/24
/HDD5T/nfsshare/exports/SDD2T x.x.x.x/24

On each host, I mount the HDD Gluster volume on /HDD5T (I need the whole path to be shared), then define the full exports path, and finally mount the SSD Gluster volume on the corresponding subdirectory (/path/SDD2T).

This is the context.
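The mount order described above can be sketched as a dry-run script that only prints the commands. It is an illustration under stated assumptions, not the actual cluster configuration: the Gluster volume names (hdd5t, ssd2t) and the server name (localhost) are hypothetical, and the mkdir step assumes both export directories are created on the HDD volume.

```shell
#!/bin/sh
# Dry-run sketch of the mount order described in the post.
# It only prints the commands; nothing is mounted or created.
# ASSUMPTIONS (not from the original post): the Gluster volume names
# "hdd5t" and "ssd2t" and the server name "localhost" are hypothetical.
HDD_VOL="localhost:/hdd5t"   # hypothetical volume name
SSD_VOL="localhost:/ssd2t"   # hypothetical volume name
EXPORT_ROOT="/HDD5T/nfsshare/exports"

plan_mounts() {
    # 1. HDD-backed volume first, so the whole export path lives on it.
    echo "mount -t glusterfs $HDD_VOL /HDD5T"
    # 2. Export directories on top of the HDD volume (assumed step).
    echo "mkdir -p $EXPORT_ROOT/HDD5T $EXPORT_ROOT/SDD2T"
    # 3. SSD-backed volume last, nested inside the export tree.
    echo "mount -t glusterfs $SSD_VOL $EXPORT_ROOT/SDD2T"
}

plan_mounts
```

The point of the sketch is the nesting: the SSD volume is a separate filesystem mounted inside a tree that otherwise lives entirely on the HDD volume.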

When I put one of the nodes in standby, the NFS service starts on another node. When this happens (regardless of which nodes are involved), clients that had already mounted the two shares can still access the share backed by the SSD but not the one backed by the HDD. Every command fails with "cannot access '/mnt/glusterfs/HDD5T': Stale file handle".

So why does the SSD share keep working while the HDD share does not?

Thank you very much.


server

nfs

cluster
