Score:0

Remote call of NRPE command fails in one case, while succeeding locally in all

I have an amazingly strange issue monitoring a CIFS (SMB) shared folder mounted on Linux machines with Nagios + NRPE.

The NRPE process runs on the Linux machines under a dedicated user, nrpe:

# systemctl status nrpe
  nrpe.service - Nagios Remote Program Executor
   Loaded: loaded (/usr/lib/systemd/system/nrpe.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2023-05-02 14:46:47 IDT; 20h ago
     Docs: http://www.nagios.org/documentation
  Process: 30216 ExecStopPost=/bin/rm -f /run/nrpe/nrpe.pid (code=exited, status=0/SUCCESS)
 Main PID: 30218 (nrpe)
   CGroup: /system.slice/nrpe.service
           └─30218 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -f

# ps -ef | grep nrpe
nrpe     30218     1  0 May02 ?        00:00:05 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -f

The monitoring command is defined in its configuration file, /etc/nagios/nrpe.cfg, this way:

command[check_backups_share]=/usr/lib64/nagios/plugins/check_disk -w 7% -c 5% -p /mnt/backups

If I run the command manually as the nrpe user, it succeeds on all machines:

# sudo -u nrpe bash
bash-4.2$ /usr/lib64/nagios/plugins/check_disk -w 7% -c 5% -p /mnt/backups
DISK OK - free space: /mnt/backups 2571991 MiB (61.32% inode=-);| /mnt/backups=1622248MiB;3900643;3984528;0;4194240

However, if I call it remotely from Nagios, it succeeds on one machine and fails on another:

$ /usr/local/nagios/libexec/check_nrpe -2 -H Machine01 -c check_backups_share
DISK OK - free space: /mnt/backups 2575536 MiB (61.40% inode=-);| /mnt/backups=1618703MiB;3900643;3984528;0;4194240

$ /usr/local/nagios/libexec/check_nrpe -2 -H Machine02 -c check_backups_share
DISK CRITICAL - /mnt/backups is not accessible: Permission denied

All other remote NRPE commands on Machine02 succeed. Moreover, if I unmount the /mnt/backups folder on Machine02, the check also succeeds (it then reports on the root filesystem). But when the share is mounted, I get this Permission denied error.

The folder is mounted identically on all machines, with the same credentials and options. In the /etc/fstab file:

//Backups-Server/backups  /mnt/backups      cifs    vers=3.0,credentials=/path/to/creds    0 0

So:

  • all credentials, permissions, users and groups are the same;
  • the command executed locally on all machines under the same user produces the same results;
  • but when executed remotely, it fails on one machine, complaining about permissions, while succeeding on all the others;
  • and the nrpe process that executes it is configured the same way on all machines and has the same permissions.

So what on earth could this be?

Update:

Solved, see below.

Nikita Kipriyanov
Extraordinary claims require extraordinary proof. "Identical"? Well, what distro is it? On each machine, show `ls -la` on the *mounted* share, `ls -Z` to see the SELinux labels if applicable (don't assume, *check* with `sestatus`), and `getfacl` with the appropriate attributes. (I'd also check `getattr`, just in case.) Finally, did you perhaps forget to restart `nrpe` after changing the configuration?
Cat Mucius
Thanks, @NikitaKipriyanov! The OS:
# uname -a
Linux Machine01 3.10.0-1062.el7.x86_64 #1 SMP Thu Jul 18 20:25:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.7 (Maipo)
Cat Mucius
# ls -alZ /mnt/backups/
drwxr-xr-x. root root system_u:object_r:cifs_t:s0 .
drwxr-xr-x. root root system_u:object_r:mnt_t:s0 ..
drwxr-xr-x. root root system_u:object_r:cifs_t:s0 Machine01
drwxr-xr-x. root root system_u:object_r:cifs_t:s0 Machine02
Cat Mucius
# getfacl /mnt/backups/
getfacl: Removing leading '/' from absolute path names
# file: mnt/backups/
# owner: root
# group: root
user::rwx
group::r-x
other::r-x
Cat Mucius
# sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: enforcing
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Max kernel policy version: 31
Cat Mucius
All these outputs are identical on both machines, the problematic one and the healthy one. I'm not sure what you mean by `getattr`; could you please show an example?
Nikita Kipriyanov
You can see for yourself that this is unreadable. Please [edit](https://serverfault.com/posts/1130248/edit) your question and put it there. // As for the problem, I strongly suspect SELinux involvement. Check the booleans with `semanage boolean -l`. Set SELinux to Permissive (temporarily) and inspect the logs; it will log instead of denying the action, so if the check begins working you'll be able to see what exactly prevented it.
Cat Mucius
@NikitaKipriyanov, you're absolutely right, SELinux is the culprit here: as soon as I set it to Permissive mode with `setenforce Permissive`, the problem disappeared; as soon as I reverted this with `setenforce Enforcing`, it resumed. However, when I run `semanage boolean -l` on both machines, it produces exactly the same results (compared with `diff`), and the mode is `Enforcing` on both, the problematic one and the healthy one. What else can I check?
Cat Mucius
This also helped: `semanage permissive -a nrpe_t`.
Nikita Kipriyanov
So, while in Permissive, what was logged? What would have been rejected if it were running in Enforcing?
Cat Mucius
I found this entry in the `journalctl -xe | grep check_disk` output: `May 04 13:19:13 Machine02 kernel: type=1400 audit(1683195553.116:22386): avc: denied { getattr } for pid=24826 comm="check_disk" path="/mnt/backups" dev="cifs" ino=3458764513820542747 scontext=system_u:system_r:nrpe_t:s0 tcontext=system_u:object_r:cifs_t:s0 tclass=dir permissive=0`. Does this tell you anything?
Nikita Kipriyanov
Yes. In general, you follow this [procedure with audit2allow](https://wiki.centos.org/HowTos/SELinux#Creating_Custom_SELinux_Policy_Modules_with_audit2allow) to create a custom permission. I suggest you read the whole linked page. I am too lazy to reproduce the problem to demonstrate its use and write an answer, but now I suppose you'll be able to do it yourself.
Cat Mucius
@NikitaKipriyanov, thanks a lot! `audit2allow` solved the problem - as far as I understand, it generated a policy allowing processes labeled with `nrpe_t` to access CIFS-mounted folders.
Score:1

Update:

Thanks to @NikitaKipriyanov, the problem is solved.

I found multiple entries in /var/log/messages:

[root@Machine02 /]# cat /var/log/messages | grep avc | grep check_disk

May  7 17:32:01 Machine02 kernel: type=1400 audit(1683469921.442:63546): avc:  denied  { getattr } for  pid=20931 comm="check_disk" path="/mnt/backups" dev="cifs" ino=3458764513820542747 scontext=system_u:system_r:nrpe_t:s0 tcontext=system_u:object_r:cifs_t:s0 tclass=dir permissive=0

May  7 17:32:21 Machine02 kernel: type=1400 audit(1683469941.508:63549): avc:  denied  { getattr } for  pid=20973 comm="check_disk" path="/mnt/backups" dev="cifs" ino=3458764513820542747 scontext=system_u:system_r:nrpe_t:s0 tcontext=system_u:object_r:cifs_t:s0 tclass=dir permissive=0

May  7 17:32:41 Machine02 kernel: type=1400 audit(1683469961.575:63552): avc:  denied  { getattr } for  pid=21025 comm="check_disk" path="/mnt/backups" dev="cifs" ino=3458764513820542747 scontext=system_u:system_r:nrpe_t:s0 tcontext=system_u:object_r:cifs_t:s0 tclass=dir permissive=0
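
The entries above are the same denial repeated on every check run. As a rough sketch, repeated denials for one command can be collapsed into unique tuples with a count. The `summarize_avc` helper below is hypothetical (it is not part of any SELinux tool) and assumes the syslog-style AVC format shown above:

```shell
# Hypothetical helper: summarise AVC denials for one command name.
# Reads log lines on stdin and prints one line per unique denial:
#   <count> <permission(s)> <scontext> <tcontext> <tclass>
summarize_avc() {
    grep 'avc:' | grep "comm=\"$1\"" |
    # Pull the denied permission(s) and the three context fields out of each record.
    sed -n 's/.*denied *{ *\([^}]*[^ }]\) *}.* scontext=\([^ ]*\) tcontext=\([^ ]*\) tclass=\([^ ]*\).*/\1 \2 \3 \4/p' |
    # Count identical denials and strip the padding that uniq -c adds.
    sort | uniq -c | sed 's/^ *//'
}

# Usage (path and command name taken from this thread):
#   summarize_avc check_disk < /var/log/messages
```

A single output line for many log entries suggests that one rule is enough to cover them.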

I generated a policy module to allow the denied access:

[root@Machine02 /]# cat /var/log/messages | grep avc | grep check_disk | audit2allow -M check_disk_policy

This produced two files:

  • check_disk_policy.pp - binary
  • check_disk_policy.te - textual, containing this:
module nrpe 1.0;

require {
    type nrpe_t;
    type cifs_t;
    class dir getattr;
}

#============= nrpe_t ==============

#!!!! The file '/mnt/backups' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /mnt/backups
allow nrpe_t cifs_t:dir getattr;
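
The allow rule in the generated module follows directly from the fields of the AVC record: the type in scontext is the source, the type in tcontext is the target, tclass is the object class, and the braces hold the denied permission. A minimal sketch of that mapping, for illustration only (`avc_to_allow` is a hypothetical helper; audit2allow does considerably more, such as building the require block):

```shell
# Hypothetical helper: print the allow rule implied by one AVC denial line.
# The body runs in a subshell ( ... ) so its variables do not leak to the caller.
avc_to_allow() (
    line=$1
    # Denied permission(s) between the braces, e.g. "getattr"
    perm=$(printf '%s\n' "$line" | sed -n 's/.*denied *{ *\([^}]*[^ }]\) *}.*/\1/p')
    # The SELinux type is the third colon-separated field of a context
    src=$(printf '%s\n' "$line" | sed -n 's/.*scontext=[^:]*:[^:]*:\([^:]*\):.*/\1/p')
    tgt=$(printf '%s\n' "$line" | sed -n 's/.*tcontext=[^:]*:[^:]*:\([^:]*\):.*/\1/p')
    class=$(printf '%s\n' "$line" | sed -n 's/.*tclass=\([^ ]*\).*/\1/p')
    if [ -z "$perm" ] || [ -z "$src" ] || [ -z "$tgt" ] || [ -z "$class" ]; then
        exit 1   # not a parsable AVC denial; only the subshell exits
    fi
    # Several permissions must be wrapped in braces in policy syntax
    case $perm in
        *" "*) perm="{ $perm }" ;;
    esac
    printf 'allow %s %s:%s %s;\n' "$src" "$tgt" "$class" "$perm"
)

# Usage, with $line holding one of the log entries from /var/log/messages:
#   avc_to_allow "$line"
#   allow nrpe_t cifs_t:dir getattr;
```

Reading the rule this way also shows how narrow it is: it grants only getattr, and only on directories labeled cifs_t.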

Running restorecon -R -v /mnt/backups didn't help, so I loaded the policy:

[root@Machine02 /]# semodule -i check_disk_policy.pp

After that the error disappeared. As I understand it, this policy allows processes running in the nrpe_t domain (the /usr/sbin/nrpe process and the child processes it spawns, such as /usr/lib64/nagios/plugins/check_disk) to read the attributes (getattr) of directories on CIFS shares, which are labeled cifs_t.
